My Path Towards Data @ Netflix
by Lisa Herzog
Have you ever heard of the game “Two truths, one lie”? The rules are simple:
- Prepare three statements about yourself
- Two true statements, one false statement
- Ask your audience to guess which of your three statements is the lie
Are you ready? Let’s see if you can catch my lie:
- Childhood: I grew up in a family of teachers: my grandmother and all my aunts and uncles are teachers and, guess what, my cousin is a teacher too. My father, a mathematics and science teacher, would get so enthusiastic about applied math that he would regularly try to convince my friends to do ‘fun’ DIY experiments when they were visiting me at home. My grandmother, on the other hand, was an excellent storyteller who would captivate and inspire us with her stories (some fictional, some true and some a blend of both).
- Career Path: When I graduated from high school in the early 2000s in Germany, I knew exactly what type of career I wanted to pursue: my father’s passion for applied math inspired me to study Econometrics at Maastricht University in the Netherlands. Shortly after I commenced my studies in Econometrics, I discovered the world of Data Science, and it was love at first sight. After completing my Bachelor’s degree in Econometrics, I decided to specialise in “Data Science for Decision-Making” and shortly after graduating I landed a job in Data Science at Netflix.
- Analytics Engineering @ Netflix: If you asked me about my dream job as a kid, I would typically give one of two answers: “I want to become a detective” or “I want to become a writer”. While I have never pursued my childhood aspirations, I consider my current role as an Analytics Engineer to be a blend of both: we start with a question, collect and validate evidence, identify the “story” behind the evidence, and once we have made sense of it, we share our insights with our partners.
Have you guessed which statement is false? Let’s find out if you are right.
You might have guessed it — statement 2 is a lie!
I don’t have a quantitative degree in mathematics, statistics or computer science and have built most of my knowledge and experience through books, online courses, mentorship and hobby projects. So in case you are dreaming about a career in data but don’t have a degree in math or science — don’t be discouraged!
There are plenty of great resources that you can leverage to break into data.
In the next couple of sections I want to tell you my story and share my favourite data resources with you!
My Path Into Data
My path towards data science was non-linear. When I graduated from high school in a small German town in the early 2000s, I had never heard of Data Science and Analytics, I had never heard of Silicon Valley, and, like many high school graduates, I had no clue what type of career I wanted to pursue. There was one thing I knew for sure, however: I could not see myself working in tech. Why? The idea of working in tech conjured up images of dark office rooms with (primarily male) programmers in hoodies and a working reality in which creativity, communication and social interaction did not have a place (oh boy was I wrong about this). After much consideration, I decided to study International Business with the hope that I could specialise later with more knowledge and experience under my belt.
I discovered the world of data science by accident during an open day at Maastricht University. My original plan was to visit information lectures about traditional business master's programs (process management looked like the most promising candidate) and then I got lost (I do have a horrible sense of direction). I sat down in a lecture hall expecting an information session on master's programs in process management and was therefore slightly baffled when the presenter kicked off with “Welcome to our lecture on Data Science in Decision-Making”. I did not want to be rude and leave early, so I stayed. In less than 30 minutes, my view on Data Science and tech was reversed; I realised that:
- Data-Powered Use Cases: Data Science enables many exciting use cases ranging from sentiment classification to what-if scenario simulation models (GenAI was not a thing yet)
- Creativity & Communication: Problem-Solving in tech requires creativity, a broad range of skills and exceptional communication skills (identifying and selling data-powered use cases, finding creative solutions to coding challenges, change management)
And after only 30 minutes I decided to take a leap of faith and pivot into data. I am not going to lie, pivoting into data was tough in the beginning. The master's program was designed to train business students to become “data translators”, people who could serve as a bridge between business and tech. In the short span of a year, we covered data-powered use cases, quantitative methodologies and their applications, and unstructured data (e.g. text and image processing). But since I was brand new to the world of data, keeping up meant many evenings spent with digital mentors such as Josh Starmer’s StatQuest, Kirill Eremenko’s Python for Data Science and many more. When I graduated, I was glad that the evening work had paid off: I had landed my first job in Data! That first job gave me access to a large network of amazing mentors: one mentor spent hours and hours of her time reviewing my code and helped me to level up my coding skills, another mentor taught me statistics, and yet another mentor taught me to leverage personas when communicating with a non-technical audience.
Fast-forward a couple of years and I could not believe my eyes when I spotted a message from a Netflix recruiter in my inbox inviting me to interview for an Analytics Engineering opportunity in Studio Production Data Science and Engineering. This opportunity felt like a dream come true: for as long as I can remember, I have spent hours watching “Behind the Scenes” features and the Oscars, and watching movies has always been a way to explore unknown worlds and cultures through stories. Throughout the interview process, I was won over by the competence of my interviewers and the uniqueness of Netflix’s culture, and the rest is history.
Key Takeaways: you don’t need a quantitative degree to land a job in Data. There are plenty of great resources that can enable you to grow the skills you need.
Which resources? Find out below ⬇️⬇️⬇️
Analytics Practices and Resources
SQL
SQL is a database language that enables you to retrieve, combine and manipulate data. To give a concrete example, in Studio Production DSE we leverage SQL to answer questions about the operational health (time/cost/quality) of content production, for example:
- How many titles (movies or series) are we launching this year?
- How much did it cost to produce title X, and did we spend more than our budget?
- For title X, where did we spend the most?
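To make these questions a bit more tangible, here is a minimal sketch of how they could be expressed in SQL, run through Python's built-in `sqlite3` module. The `titles` and `spend` tables, their columns and all the numbers are made up for illustration; they are assumptions, not the actual Studio Production data model.

```python
import sqlite3

# Hypothetical schema and numbers, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE titles (
        title_id INTEGER PRIMARY KEY,
        name TEXT,
        launch_year INTEGER,
        budget REAL
    );
    CREATE TABLE spend (
        title_id INTEGER REFERENCES titles(title_id),
        category TEXT,
        amount REAL
    );
    INSERT INTO titles VALUES
        (1, 'Title A', 2024, 100.0),
        (2, 'Title B', 2024, 80.0),
        (3, 'Title C', 2025, 50.0);
    INSERT INTO spend VALUES
        (1, 'cast', 60.0), (1, 'locations', 55.0),
        (2, 'cast', 30.0), (2, 'post-production', 40.0),
        (3, 'cast', 20.0);
""")

# How many titles are we launching in a given year?
launches_2024 = conn.execute(
    "SELECT COUNT(*) FROM titles WHERE launch_year = ?", (2024,)
).fetchone()[0]

# How much did each title cost, and did we spend more than our budget?
# (SQLite returns the comparison as 1 for true and 0 for false.)
cost_vs_budget = conn.execute("""
    SELECT t.name, SUM(s.amount) AS actual, t.budget,
           SUM(s.amount) > t.budget AS over_budget
    FROM titles t
    JOIN spend s ON s.title_id = t.title_id
    GROUP BY t.title_id, t.name, t.budget
    ORDER BY t.title_id
""").fetchall()

# For a given title, where did we spend the most?
top_category = conn.execute("""
    SELECT category, amount FROM spend
    WHERE title_id = ?
    ORDER BY amount DESC
    LIMIT 1
""", (1,)).fetchone()

print(launches_2024)
print(cost_vs_budget)
print(top_category)
```

The three queries cover the operations mentioned in most introductory SQL courses: filtering and counting, joining and aggregating, and sorting.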
Resources
- SQL: A good starting point is SQL for Data Analytics and Business Intelligence by 365 Careers, which provides an overview of all essential SQL operations (aggregation, table joins and window functions).
- Working with Real-Life Data: Once you have mastered SQL syntax, it is important to get your hands on real-life data (courses typically use very polished data sets). Working with real-life data sets (see Kaggle for published datasets) enables you to build experience with cleaning your data and with interpreting and resolving error messages. And rest assured, whatever error message you encounter, it is very likely that someone else has encountered it before and has found a solution, so you can rely on Google (and ChatGPT) to find an answer to your coding problem.
Data Preprocessing
Preprocessing your data involves selecting the information needed for your analysis (using SQL or Python), filtering your data, and cleaning it. In Studio Production DSE, the majority of the data we work with is user-entered, which can result in missing data and inconsistencies. Using SQL and Python enables us to identify and correct both.
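As a small illustration of what "identify and correct missing data and inconsistencies" can look like, here is a sketch in plain Python. The records, field names and alias table are invented for this example, and in practice you would more likely reach for Pandas; plain Python is used here only to keep the example dependency-free.

```python
# Hypothetical user-entered records; field names and values are invented.
raw_rows = [
    {"title": "Title A", "country": " germany ", "spend": "1200.50"},
    {"title": "Title B", "country": "Germany", "spend": ""},
    {"title": "Title C", "country": "DE", "spend": "980"},
    {"title": "Title D", "country": None, "spend": "n/a"},
]

# Map the spellings we know about onto one canonical value.
COUNTRY_ALIASES = {"germany": "Germany", "de": "Germany"}
# Strings that users typically type when they have no value to enter.
MISSING_MARKERS = {"", "n/a", "na", "none"}


def is_missing(value):
    """Return True for None and for common 'no value' placeholders."""
    return value is None or value.strip().lower() in MISSING_MARKERS


def clean_country(value):
    """Trim whitespace and unify known spellings; return None if missing."""
    if is_missing(value):
        return None
    stripped = value.strip()
    return COUNTRY_ALIASES.get(stripped.lower(), stripped)


def clean_spend(value):
    """Parse spend as a float; return None when the entry is missing."""
    if is_missing(value):
        return None
    return float(value)


cleaned = [
    {
        "title": row["title"],
        "country": clean_country(row["country"]),
        "spend": clean_spend(row["spend"]),
    }
    for row in raw_rows
]

# Flag rows that still need follow-up instead of silently guessing values.
needs_review = [
    row["title"]
    for row in cleaned
    if row["country"] is None or row["spend"] is None
]

print(cleaned)
print(needs_review)
```

Note that missing values are flagged rather than filled in: whether to impute, drop or chase up a missing entry depends on the question you are trying to answer.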
Resources
- Data Cleaning: For an excellent data cleaning guide see Mahesh Tiwari’s Guide for Data Cleaning.
- Python: a good starting point is Kirill Eremenko’s Python A-Z, a very thorough course on Python fundamentals (loops, data types, matrix manipulations and visualisation). For a specialisation in Data Preprocessing (using a library called Pandas), Data Analysis with Pandas by Boris Paskhaver is a great resource.
Statistics
Statistics enables you to decide to what extent you can generalise findings beyond your sample, makes you aware of the assumptions each methodology places on its input data, and helps you choose the best methodology to answer a question. To provide a concrete example, in Studio Production DSE, we leverage forecasting methodologies to predict cash flow per production, which enables Production to anticipate spend and ensure that spend obligations are met throughout the production lifecycle.
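To give a feel for what a forecast is at its simplest, the sketch below fits a straight line to cumulative spend with ordinary least squares and extrapolates it. This is a deliberately naive stand-in chosen for illustration, not the forecasting methodology used in Studio Production DSE, and the weekly figures are invented.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical cumulative spend per production week (arbitrary units).
weeks = [1, 2, 3, 4, 5]
cumulative_spend = [10.0, 21.0, 29.0, 41.0, 49.0]

slope, intercept = fit_line(weeks, cumulative_spend)

# Extrapolate the fitted trend to the upcoming weeks.
forecast = {week: intercept + slope * week for week in (6, 7, 8)}

for week, value in forecast.items():
    print(f"week {week}: {value:.1f}")
```

A straight line assumes spend grows at a constant rate, which is exactly the kind of prerequisite you need to check before trusting a method: real production spend may well be lumpy or seasonal, in which case a different methodology would be the better choice.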
Resources
- Statistics: A resource that I have found incredibly useful is StatQuest by Josh Starmer, a channel focused on statistics and machine learning that provides intuitive explanations and concrete examples. Many of the chapters also have a themed song that will haunt you for weeks, for example “Calculating p-values is kinda fun and not just when you are done”.
Defining Meaningful Metrics
Working as an Analytics Engineer involves developing meaningful metrics for our cross-functional partners. In Studio Production DSE, we partner with Directors and VPs in Production Management and Content Operations to develop metrics that measure the operational health (time/cost/quality) of content production, for example spend overages (actual spend vs. budget) per production or production slate, and delays per content production phase (actual vs. planned milestones).
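As a rough sketch of how the two example metrics could be computed for a single production, consider the snippet below. The record layout, phase names, dates and amounts are all hypothetical and only serve to show the "actual vs. planned" arithmetic.

```python
from datetime import date

# Hypothetical production record; all names and figures are invented.
production = {
    "budget": 100.0,
    "actual_spend": 112.0,
    "milestones": {
        # phase: (planned end date, actual end date)
        "pre-production": (date(2024, 3, 1), date(2024, 3, 1)),
        "shoot": (date(2024, 6, 15), date(2024, 6, 29)),
        "post-production": (date(2024, 10, 1), date(2024, 10, 8)),
    },
}

# Spend overage: actual spend vs. budget, in absolute and relative terms.
overage = production["actual_spend"] - production["budget"]
overage_pct = 100 * overage / production["budget"]

# Delay per phase: actual vs. planned milestone, in days.
delays = {
    phase: (actual - planned).days
    for phase, (planned, actual) in production["milestones"].items()
}

print(f"overage: {overage:.1f} ({overage_pct:.1f}%)")
print(delays)
```

Reporting the overage both in absolute terms and as a percentage of budget makes productions of very different sizes comparable across a slate.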
Resources
What defines a meaningful metric? You could ask yourself the following questions:
- Relevant: is your metric aligned with the overall (business) objective?
- Actionable: are your partners able to influence this metric?
- Quantifiable: are you able to measure this metric?
- Simple: are you able to explain the metric in less than five minutes?
Okay — this sounds great on paper but how do you build experience with setting meaningful metrics?
Something that I have found very useful is setting annual goals and developing metrics that help you track your progress towards the goal.
Okay, okay — let me give you an example: let’s suppose you want to complete a half marathon by the end of the year. What metrics would help you track your progress towards your goal? There are two components to successfully completing a half marathon: mastering the distance, and mastering your speed. Knowing this you could set goals, metrics and targets:
- Frequency: I want to run three times per week (metric: # runs per week)
- Weekly Distance: I want to run 30 kilometers every week (metric: km per week)
- Speed: I want to run at a pace of 6:00 min per km (metric: pace in minutes per km)
Once you have set these goals, go through a mental checklist. Are your metrics relevant, actionable, quantifiable and simple?
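If you keep a training log, tracking these three metrics takes only a few lines of Python. The sketch below assumes a hypothetical log of (date, distance, duration) entries and groups them by ISO calendar week; the log format and the numbers are made up for illustration.

```python
from datetime import date

# Hypothetical training log: (date, distance in km, duration in minutes).
runs = [
    (date(2024, 5, 6), 8.0, 50.0),
    (date(2024, 5, 8), 10.0, 61.0),
    (date(2024, 5, 11), 12.0, 72.0),
    (date(2024, 5, 14), 9.0, 53.0),
    (date(2024, 5, 18), 14.0, 85.0),
]

TARGET_RUNS = 3     # runs per week
TARGET_KM = 30.0    # kilometers per week
TARGET_PACE = 6.0   # minutes per kilometer (lower is faster)


def weekly_metrics(log):
    """Group runs by ISO week and compute the three metrics per week."""
    weeks = {}
    for day, km, minutes in log:
        key = tuple(day.isocalendar())[:2]  # (ISO year, ISO week number)
        week = weeks.setdefault(key, {"runs": 0, "km": 0.0, "minutes": 0.0})
        week["runs"] += 1
        week["km"] += km
        week["minutes"] += minutes
    for week in weeks.values():
        # Average pace for the week: total minutes divided by total km.
        week["pace"] = week["minutes"] / week["km"]
    return weeks


metrics = weekly_metrics(runs)

for (year, week_no), m in sorted(metrics.items()):
    on_track = (
        m["runs"] >= TARGET_RUNS
        and m["km"] >= TARGET_KM
        and m["pace"] <= TARGET_PACE
    )
    print(year, week_no, m["runs"], m["km"], round(m["pace"], 2), on_track)
```

Each metric is checked against its own target, which keeps the feedback actionable: a week can hit the frequency and distance goals and still show that the pace needs work.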
Problem-Solving
Problem-solving involves asking the right questions to understand the context and impact of a request, translating a vague question into a specific hypothesis, and choosing the right type of methodology. In Studio Production, data projects typically start with a scoping session with our cross-functional partners. In scoping sessions, we ask questions to understand 1) what type of insights are needed, 2) what use cases will be enabled or powered by the requested insights, and 3) how the insights fit into the bigger picture (e.g. company objectives). Once scoping is finalised, we typically prioritise the request against all other requests on our roadmap.
Resources
- Prioritisation: prioritisation will depend on your problem space. Having said that, it is always useful to ask yourself: How will insight X influence my partners’ decisions and/or workflows? (It is useful to think through a couple of what-if scenarios, e.g. if my metric showed X, how would this influence decisions or workflows?) How does that change in decisions or workflows impact the business? How does insight X fit into the bigger picture (e.g. annual company priorities and strategy)?
- Problem-Solving: if you are solving problems in a business context, consulting interview guides are a great resource, for example Case in Point by Cosentino, a comprehensive guide to common business problems and problem-solving approaches.
Data Storytelling
Data storytelling involves crafting a narrative for your target audience and choosing the most effective visuals to support your story. In Studio Production, Data Storytelling best practices enable us to talk to our cross-functional partners in their own language. When pitching an idea, for example, we focus conversations on the key questions that would be answered and the use cases that could be enabled by specific insights (vs. providing a list of metrics or functionalities). When developing an insights tool, we leverage usability testing to catch data inaccuracies, identify usability issues and understand how the information fits into the user’s workflows and use cases.
Resources
- Storytelling: An excellent resource for data storytelling is Storytelling with Data by Cole Nussbaumer Knaflic (see this link for a visual summary of the key concepts). Don’t Make Me Think by Steve Krug is an excellent resource to learn more about usability.