TERRY GROSS, HOST:

This is FRESH AIR. The rise of on-demand video content delivered over the Internet has made it possible to watch many movies and TV shows any time, anywhere. But with so many choices available at our fingertips, deciding what to watch can be a bit daunting. In an attempt to help viewers find something that appeals to them, Netflix presents its subscribers with personalized viewing recommendations. Our tech contributor Alexis Madrigal explains how and why they do it.

ALEXIS MADRIGAL, BYLINE: In the old days, a movie genre was a simple communal category - Action/Adventure, Comedy, Drama. One had to locate oneself in the drama aisle in the video store and then look for just the right thing - a dark road trip movie with a strong female lead? A-ha: "Thelma and Louise." But Netflix, the movie streaming and DVD rental service, doesn't work like that. It recommends genres that are intensely, almost bizarrely personalized.

Netflix might tell you not just that you like road trip movies but that you like understated, romantic road trip movies, dark road trip thrillers, road trip art house movies, road trip musicals, or of course, Canadian independent road trip movies.

That's because seven years ago, Todd Yellin, a film-obsessed executive at Netflix, set out to break down every movie into data. He hired aspiring screenwriters and paid them to watch movies and rate their level of romance, gore, quirkiness, and even plot resolution. In a sense, Yellin wanted to reverse-engineer all the Hollywood formulas so that Netflix could mathematically show you the movies they knew you would like.

Now it's become one of the company's big selling points. Netflix doesn't just provide streaming movies and TV shows - it knows you. Thinking about how specific Netflix could get, I started to wonder just how many micro-genres does Netflix really have? A friend pointed out that the web addresses for the categories in the Netflix database were sequentially numbered and that I could type through each URL one by one and figure out all the micro-genres.

The first brought up African-American Crime Documentaries. The second pulled up Scary Cult Movies from the 1980s. The next was Tear-Jerkers from the 1970s. After a couple more minutes, I tried entering 10,000 just to see if the database was really that big. Japanese Horror Movies from the 1960s was in that slot. There was no way I could copy and paste tens of thousands of genre titles by hand.

So I wrote a simple script, a little piece of code, that would copy the names to a list. I set it up to run and then I waited as the script kept copying and pasting for more than 20 hours. I found that Netflix has 76,897 separate categories. To my knowledge, no one outside Netflix has ever compiled this massive data set before, and now we can really understand how the system works.
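A minimal sketch of the kind of script described above, not Madrigal's actual code. It assumes the category pages are numbered sequentially, as the segment says; the URL pattern, the function names, and the stand-in page fetcher are all hypothetical, and the ID numbers assigned to the sample categories are illustrative.

```python
from typing import Callable, Dict, Iterable, Optional

# Hypothetical URL pattern: the segment does not give the real address.
URL_TEMPLATE = "https://example.com/genre?id={genre_id}"


def collect_genre_names(
    genre_ids: Iterable[int],
    fetch_title: Callable[[str], Optional[str]],
) -> Dict[int, str]:
    """Visit each numbered URL in turn and record the category name found there.

    fetch_title is supplied by the caller: it takes a URL and returns the
    category name, or None when that number has no category.
    """
    names: Dict[int, str] = {}
    for genre_id in genre_ids:
        url = URL_TEMPLATE.format(genre_id=genre_id)
        title = fetch_title(url)
        if title:  # skip numbers that have no category behind them
            names[genre_id] = title.strip()
    return names


# Stand-in for a real page fetch, using category names quoted in the segment.
# The ID-to-name pairing is an assumption made for illustration.
_FAKE_PAGES = {
    URL_TEMPLATE.format(genre_id=1): "African-American Crime Documentaries",
    URL_TEMPLATE.format(genre_id=2): "Scary Cult Movies from the 1980s",
    URL_TEMPLATE.format(genre_id=3): "Tear-Jerkers from the 1970s",
    URL_TEMPLATE.format(genre_id=10000): "Japanese Horror Movies from the 1960s",
}


def fake_fetch(url: str) -> Optional[str]:
    """Return the canned category name for a URL, or None if there is none."""
    return _FAKE_PAGES.get(url)


if __name__ == "__main__":
    found = collect_genre_names([1, 2, 3, 4, 10000], fake_fetch)
    for genre_id, name in found.items():
        print(genre_id, name)
```

Passing the fetcher in as a parameter keeps the loop itself testable without touching the network; a real run would swap in a function that downloads each page and extracts its title.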

The micro-genres are formed from Netflix's version of Mad Libs, an algorithm that takes all the tags in Netflix's system and combines them based on specific criteria, especially the number of movies fitting the category. Traditional genres like Drama form the center of each micro-genre but Netflix can toss in actors and directors and a bunch of descriptors, including time period, location, age level, and the squishier human words - the adjectives.
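The Mad Libs idea can be sketched roughly as follows. This is an illustration under stated assumptions, not Netflix's algorithm: the catalog, the tag lists, the naming template, and the `min_titles` threshold are all invented here. It combines an optional adjective, a core genre, and an optional decade, and keeps a combination only when enough titles carry every tag in it.

```python
from itertools import product
from typing import Dict, List, Optional, Set

# Invented sample catalog: each title maps to the set of tags it carries.
CATALOG: Dict[str, Set[str]] = {
    "Title A": {"Dark", "Thrillers", "1980s"},
    "Title B": {"Dark", "Thrillers", "1980s"},
    "Title C": {"Romantic", "Dramas", "1970s"},
    "Title D": {"Romantic", "Dramas", "1980s"},
    "Title E": {"Dark", "Dramas", "1970s"},
}

# None means "leave this slot of the template empty".
ADJECTIVES: List[Optional[str]] = [None, "Dark", "Romantic"]
GENRES: List[str] = ["Dramas", "Thrillers"]
DECADES: List[Optional[str]] = [None, "1970s", "1980s"]


def build_micro_genres(min_titles: int = 2) -> Dict[str, int]:
    """Return micro-genre names mapped to the number of matching titles.

    A combination survives only if at least min_titles titles carry
    every tag in it (an assumed stand-in for the "specific criteria").
    """
    result: Dict[str, int] = {}
    for adjective, genre, decade in product(ADJECTIVES, GENRES, DECADES):
        required = {tag for tag in (adjective, genre, decade) if tag}
        count = sum(1 for tags in CATALOG.values() if required <= tags)
        if count < min_titles:
            continue
        # Fill in the template: "[Adjective] Genre [from the Decade]".
        name = " ".join(part for part in (adjective, genre) if part)
        if decade:
            name += f" from the {decade}"
        result[name] = count
    return result


if __name__ == "__main__":
    for name, count in build_micro_genres().items():
        print(f"{name}: {count} titles")
```

Raising `min_titles` prunes the thinly populated combinations, which is one plausible way a title-count rule could decide which categories exist at all.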

These are really what make Netflix's movie genres seem uncannily precise. Netflix's favorite adjective is romantic, which appears in 5,272 categories. Following it are foreign, classic, dark, British, critically acclaimed, suspenseful, gritty, independent, visually striking, family, violent, and feel good. But not all the adjectives are used thousands of times.

Some of the least used adjectives are telling, too - experimental, screwball, Satanic, stoner, visionary, and Depression Era. Hollywood's a popularity contest, though, so we have to ask: Which actor is the most Netflix famous? That is to say, which actor appears in the most Netflix micro-genres? The number two answer is precisely who you might expect - Bruce Willis - who has 17 dedicated categories including violent action thrillers starring Bruce Willis.

But the actor with the most categories dedicated to himself is not Tom Cruise or Angelina Jolie or Jackie Chan or Meryl Streep or Clint Eastwood or Doris Day, but Raymond Burr, star of "Perry Mason." Why? I have no idea. Even Todd Yellin, who created the Netflix system, was baffled by the number of Burr categories, and that's the interesting thing about wandering through Netflix's big data - only some of the logic that drives these categories feels human.
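Tallies like the adjective and actor counts above could be produced from a list of category names along these lines. This is a sketch, not the method actually used: the sample names are invented apart from the Bruce Willis category quoted in the segment, and treating "starring ..." at the end of a name as the actor marker is an assumption.

```python
import re
from collections import Counter
from typing import Iterable, List

# Sample category names. Only the first is quoted in the segment;
# the rest are invented for illustration.
SAMPLE_NAMES: List[str] = [
    "Violent Action Thrillers starring Bruce Willis",
    "Dark Mysteries starring Raymond Burr",
    "Gritty Dramas starring Raymond Burr",
    "Romantic Dramas from the 1970s",
    "Dark Romantic Foreign Movies",
]


def count_adjectives(names: Iterable[str], adjectives: Iterable[str]) -> Counter:
    """Count how many category names contain each adjective as a whole word."""
    adjective_list = list(adjectives)
    counts: Counter = Counter()
    for name in names:
        lowered = name.lower()
        for adjective in adjective_list:
            pattern = r"\b" + re.escape(adjective.lower()) + r"\b"
            if re.search(pattern, lowered):
                counts[adjective] += 1
    return counts


def count_starring(names: Iterable[str]) -> Counter:
    """Count categories per actor, assuming names end in 'starring <actor>'."""
    counts: Counter = Counter()
    for name in names:
        match = re.search(r"\bstarring (.+)$", name, flags=re.IGNORECASE)
        if match:
            counts[match.group(1).strip()] += 1
    return counts


if __name__ == "__main__":
    print(count_adjectives(SAMPLE_NAMES, ["romantic", "dark", "gritty", "violent"]))
    print(count_starring(SAMPLE_NAMES).most_common())
```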

But perhaps that's exactly what we like about Netflix's recommendations. They take our taste, break it down into its constituent parts, and spit it back to us in new and revealing ways. Netflix's strange machine wants to make us happy and to do so it must know us and our culture in ways that are not always obvious to humans. How else do we explain the Raymond Burr phenomenon? That's the eye of software staring into the American soul.

(SOUNDBITE OF MUSIC, "PERRY MASON THEME")

GROSS: Alexis Madrigal is a senior editor at The Atlantic and a visiting scholar at Berkeley's Center for Science, Technology, Medicine, and Society. He created an interactive webpage where you can explore the 76,897 genres in Netflix's database and have fun creating new ones. The webpage includes a Netflix genre generator. You'll find a link to that page on our website, freshair.npr.org.

Copyright © 2014 NPR. All rights reserved. Visit our website terms of use and permissions pages at www.npr.org for further information.

NPR transcripts are created on a rush deadline by a contractor for NPR, and accuracy and availability may vary. This text may not be in its final form and may be updated or revised in the future. Please be aware that the authoritative record of NPR’s programming is the audio.
