Ever since "House of Cards", Netflix has increasingly come to the attention of Chinese audiences, and plenty of myths and legends have grown up around its recommendation system, its "big data analysis", and so on. This article summarizes the ideas and some of the methods behind Netflix's recommendations, as I pieced them together from the material I could find. The highly technical details of the algorithms are beyond me, so I will only explain the logic. I am writing this first to organize my own thinking and second to learn as much as I can, and I am sharing it in the hope that it sparks discussion.
1. Wait for the wind
Looking back at history, Netflix is a typical forward-looking company.
In August 1997, only a few months after the DVD player came to market, Reed Hastings and Marc Randolph founded Netflix, and in March 1998 they launched the world's first online DVD rental store. With only 30 employees, it offered 925 titles, which was almost every movie available on DVD at the time.
In 1999, they launched a brand new monthly subscription model. For the first time, users could enjoy a service free of annoyances such as late fees, shipping fees, and handling fees. The new model was more user-friendly than the per-movie rental method Netflix had used until then, and it quickly earned the company a reputation in the industry. The old per-movie rental model was retired in 2000.
In 2001, as DVD players became cheaper and cheaper, they became one of the most popular Christmas gifts of the year. Netflix rode this wave, and its number of users grew significantly in 2002. This was the first tailwind in the four years since Netflix was founded, and in hindsight one has to admire the company's vision.
In 2005, they noticed that despite the lack of high-definition content, YouTube's streaming service (which can be simply understood as online playback) was still very popular, so they gave up their hardware product, the Netflix Box, and pivoted to streaming. The streaming service was launched in 2007. As network bandwidth increased and costs fell in the years that followed, Netflix, with its first-mover advantage, once again grew enormously.
In 2006, an algorithm competition called the Netflix Prize was launched. Netflix offered a $1 million prize to developers who could improve its movie recommendation algorithm. As of the fourth quarter of 2012, Netflix had 29.4 million subscribers worldwide.
In 2012, Netflix began experimenting with original content, and it launched "House of Cards" in 2013. The superb quality of the show and the decision to release the entire season at once made it an instant worldwide hit.
In April this year, Netflix's global subscribers reached 125 million, across more than 190 countries and regions. As of today, its market value has surpassed that of Disney, making it the sixth largest Internet company in the world.
Looking back on Netflix's 21-year history, the timing and direction of each transformation seem so accurate as to feel almost natural. But if we look past the phenomena to the essence, and look for what stayed constant amid all the change, one thing has to be mentioned: personalized recommendation. One could even say that personalized recommendation is Netflix's homemade fan: the third tailwind is one the company created for itself.
There is no clear information about whether Netflix had a recommendation mechanism in the days of mailing rental DVDs. But the company did attach great importance to data from the very beginning and started collecting it early: questionnaires were enclosed in the mailed envelopes so that users could rate the movies. This rating data is one of the cornerstones of the Netflix recommendation system.
"Personalized recommendation" has always been Netflix's killer feature. Its head start in data accumulation and algorithm development makes it very hard to overtake in this area. Today, 80% of the content that users watch on Netflix comes from recommendations.
2. Deconstruct Hollywood
I think the biggest reason Netflix's recommendation system achieves its goals so efficiently is that the company taught the machine to understand movies. In an article called "How Netflix Reverse Engineered Hollywood" (published in 2014), Alexis C. Madrigal starts from Netflix's recommendation categories and explains how the company deconstructed Hollywood and then built a recommendation system for its users.
On the Netflix homepage you will see rows of movies. Each row is a category, officially called an altgenre, or "micro-genre", and each category contains a series of movies. Both the categories and the movies in them are recommended for you.
Some of these categories have very precise and amusing names: Emotional Fight-the-System Documentaries, Period Pieces About Royalty Based on Real Life, Foreign Satanic Stories from the 1980s.
So where did these genres come from? The author, Alexis, did something quite unusual:
He crawled every category on Netflix, 76,897 in total. He then analyzed the vocabulary and grammar of these category names in depth and built a "genre generator" that produces results similar to Netflix's own. He even gave the formula: region + adjectives + noun genre + based on + set in + era + about (subject matter) + age range (Region… + Adjectives… + Noun Genre… + Based On… + Set In… + From the… + About… + For Age X to Y).
But so far we have only seen the result of Netflix's deconstruction of Hollywood. Where did it all start?
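The slot order in that formula can be illustrated with a minimal Python sketch. This is not Alexis's generator or anything Netflix uses: the function names, the slot keys, the tiny word lists, and the `keep_probability` parameter are all assumptions made up for illustration. The only thing taken from the article is the order of the slots, with the noun genre treated as the one required slot.

```python
import random

# Slot order taken from the formula above; the key names are hypothetical.
SLOT_ORDER = [
    "region", "adjectives", "noun_genre", "based_on",
    "set_in", "from_the", "about", "for_age",
]

# Illustrative word lists only; not Netflix's real vocabulary.
VOCABULARY = {
    "region": ["Foreign", "British"],
    "adjectives": ["Emotional", "Gritty"],
    "noun_genre": ["Documentaries", "Dramas", "Satanic Stories"],
    "based_on": ["Based on Real Life", "Based on Books"],
    "set_in": ["Set in Europe"],
    "from_the": ["from the 1980s"],
    "about": ["About Royalty"],
    "for_age": ["For Ages 8 to 10"],
}


def build_genre(**slots):
    """Join the given slot values in formula order; noun_genre is required."""
    unknown = set(slots) - set(SLOT_ORDER)
    if unknown:
        raise ValueError(f"unknown slots: {sorted(unknown)}")
    if "noun_genre" not in slots:
        raise ValueError("noun_genre is required")
    return " ".join(slots[name] for name in SLOT_ORDER if name in slots)


def random_genre(rng, keep_probability=0.4):
    """Pick a random value for the noun genre and for some optional slots."""
    chosen = {}
    for name in SLOT_ORDER:
        if name == "noun_genre" or rng.random() < keep_probability:
            chosen[name] = rng.choice(VOCABULARY[name])
    return build_genre(**chosen)


if __name__ == "__main__":
    print(build_genre(region="Foreign", noun_genre="Satanic Stories",
                      from_the="from the 1980s"))
    print(random_genre(random.Random(0)))
```

Filling only the region, noun genre, and era slots reproduces one of the category names quoted earlier, which suggests that most categories use just a few of the available slots.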
In 2006, Todd Yellin, Netflix's vice president of product, led a group of engineers who spent several months writing a 24-page document called "Netflix Quantum Theory". It describes how to use "microtags" to break movies down.
The document serves as a training manual, so that different people share the same understanding of the microtags and thousands of movies can be deconstructed systematically and consistently. Today the manual has grown to 36 pages.
The 36-page training manual describes how to rate a film's sexually suggestive content, its goriness, its level of romance, and even characteristics of its plot. It also explains how to tag a film's ending and the "social acceptability" of its lead characters. More importantly, each tag carries a rating from 1 to 5.
Take a superhero movie as an example: its tags will include "four main characters". For the character Matt Murdock, the tags record the actor's name, the character's name, that he is "heroic", that he is a lawyer, and so on.
In this way, Netflix has deconstructed almost every movie, using fine-grained, accurate microtags and ratings to teach the recommendation system to recognize and interpret movies.
Even more remarkable, tagging for Netflix is a real job. Netflix formed a team and pays its members to watch movies and tag them. Some media outlets have interviewed a "tagger" about what the job is like, and the account makes for interesting reading.
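To make the idea of rated microtags concrete, here is a small Python sketch of how such a record might be stored and compared. The class names, field names, and the `tag_distance` function are hypothetical, and the tag names are placeholders. The article does not describe Netflix's actual schema, only that each tag carries a rating from 1 to 5 and that characters are described by attributes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Character:
    """Hypothetical record for one tagged character."""
    actor_name: str
    character_name: str
    traits: List[str] = field(default_factory=list)


@dataclass
class TaggedTitle:
    """Hypothetical record for one title and its rated microtags."""
    title: str
    ratings: Dict[str, int] = field(default_factory=dict)
    characters: List[Character] = field(default_factory=list)

    def rate(self, tag: str, score: int) -> None:
        # The manual described above uses a 1 to 5 scale for each tag.
        if not 1 <= score <= 5:
            raise ValueError("microtag ratings must be between 1 and 5")
        self.ratings[tag] = score


def tag_distance(a: TaggedTitle, b: TaggedTitle) -> Optional[float]:
    """Mean absolute rating difference over the tags both titles share.

    An illustrative similarity measure, not Netflix's method.
    Returns None when the titles have no tags in common.
    """
    shared = a.ratings.keys() & b.ratings.keys()
    if not shared:
        return None
    return sum(abs(a.ratings[t] - b.ratings[t]) for t in shared) / len(shared)


if __name__ == "__main__":
    movie = TaggedTitle("Example superhero movie")
    movie.rate("goriness", 3)
    movie.rate("romance", 2)
    movie.characters.append(
        Character("(actor name)", "Matt Murdock", ["heroic", "lawyer"])
    )
    other = TaggedTitle("Another title", ratings={"goriness": 1, "romance": 2})
    print(tag_distance(movie, other))
```

Once every title is described on the same scales, comparing two titles reduces to comparing numbers, which is what lets a machine "understand" movies in the sense used here.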
3. Deconstruct users
Around 2012, Netflix's recommendation system underwent a major strategic change. The official technical blog described the causes and consequences of this change in an article titled "Netflix Recommendations: Beyond the 5 stars" (published in two parts):
In the era of renting DVDs by mail, Netflix could collect users' ratings, but the way users actually watched movies was invisible to the platform. With the growth of the streaming business, Netflix finally had the chance to see many more sides of its users. So they realized:
"Everything is a Recommendation."
This idea has spawned more detailed and in-depth user recommendations.
The official blog post describes Netflix as "lucky", because the company has a large amount of relevant data and talented people who can apply that data to its products.
The following are the data sources used by Netflix to optimize the recommendation system:
- Millions of user ratings, still growing by millions every day;
- Item popularity, used as the algorithm baseline;
- Millions of stream plays, including duration, time of day, and device type;
- Millions of items that users add to their queues every day;
- Rich metadata for each item;
- How each item was presented and what effect that presentation had;
- Users' social data;
- Millions of user search terms;
- External data, such as box office results or movie reviews;
- And of course, the data actually used goes well beyond this list.
After the shift to streaming, all user behavior takes place within the platform, which gives Netflix an excellent environment for observing its users. The company knows not only what users have watched but also how they watched it: when they watched, for how long, where they paused, where they rewound, where they stopped, and so on. All of this behavioral data is an expression of user preferences.
By analyzing this behavioral data and matching it against the movie data obtained from deconstructing Hollywood, Netflix makes its recommendations more accurate.
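The matching step can be sketched roughly as follows. This is a toy illustration under my own assumptions, not Netflix's algorithm: the `ViewEvent` fields, the `engagement` formula, the 0.25 rewatch bonus, and the averaging in `tag_preferences` and `rank_candidates` are all invented for the example. The idea is simply to turn viewing behavior into an implicit score, aggregate that score over the tags of the watched titles, and then rank unseen titles by how well their tags fit.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class ViewEvent:
    """Hypothetical record of one viewing session."""
    title: str
    minutes_watched: float
    runtime_minutes: float
    rewatched: bool = False


def engagement(event: ViewEvent) -> float:
    """Implicit preference in [0, 1]: completion ratio plus a rewatch bonus."""
    if event.runtime_minutes <= 0:
        raise ValueError("runtime_minutes must be positive")
    completion = min(event.minutes_watched / event.runtime_minutes, 1.0)
    bonus = 0.25 if event.rewatched else 0.0  # arbitrary illustrative weight
    return min(completion + bonus, 1.0)


def tag_preferences(events: Iterable[ViewEvent],
                    title_tags: Dict[str, List[str]]) -> Dict[str, float]:
    """Average engagement per tag over the titles the user has watched."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for event in events:
        score = engagement(event)
        for tag in title_tags.get(event.title, ()):
            totals[tag] += score
            counts[tag] += 1
    return {tag: totals[tag] / counts[tag] for tag in totals}


def rank_candidates(candidates: Iterable[str],
                    title_tags: Dict[str, List[str]],
                    preferences: Dict[str, float]) -> List[str]:
    """Sort candidate titles by the mean preference of their tags."""
    def score(title: str) -> float:
        tags = title_tags.get(title, ())
        if not tags:
            return 0.0
        return sum(preferences.get(tag, 0.0) for tag in tags) / len(tags)
    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    tags = {
        "A": ["political", "drama"], "B": ["comedy"],
        "C": ["political", "drama"], "D": ["comedy"],
    }
    history = [ViewEvent("A", 60, 60), ViewEvent("B", 10, 100)]
    prefs = tag_preferences(history, tags)
    print(rank_candidates(["D", "C"], tags, prefs))
```

In this toy setup, a user who finished a political drama and abandoned a comedy early would see other political dramas ranked ahead of other comedies, even though the user never gave an explicit rating.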