Tuesday, July 28, 2009

The Art of Predicting Behavior when People Differ

Suppose I give you information about what movies I have enjoyed in the past, and you observe my answers to your survey questions about whether I am a Rambo guy or a guy who likes to cry at the movies. Suppose that for a new movie, you observe the demographics and types of people who have chosen to see it and what their ex-post "grade" for the movie was. Using all of this information, can you write down a good predictive model for guessing whether I will like a given new movie? The NY Times talks about NETFLIX's attempt to build a better model. If this company can make better suggestions to people, and people ex-post agree with those suggestions, then the company's profits will rise.

From a heterogeneity standpoint, some obvious issues arise. If I am watching a movie with my wife, are my tastes different than if she isn't around? Are my preferences a function of what movies I've seen in the last 2 months? Are there diminishing returns to "sci-fi" thrillers, or increasing returns?

As in the Dixit-Stiglitz model, do I have a taste for "diversity"? Are there people who jump around across categories?

It appears that this marketing literature assumes that a person can be modeled as a fixed effect (Kahn likes James Bond) plus a moving error term (as he gets older he likes a Bald Bond). But is the fixed effect "fixed"? Do people's tastes change as we age? Given that age is observable, this would be testable; the challenge is modeling the dynamics of the unobserved preference shifter. For example, if I start reading books about England in the 19th century, do I only want to see movies from that time period?
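
To make the "fixed effect plus error term" idea concrete, here is one way it could be written down. This is only a sketch: the notation (r, alpha, gamma, u, rho, and so on) is my own shorthand and is not taken from the NY Times piece or from any particular paper. Let r_{ijt} be person i's grade for movie j at time t. The first line is the baseline with a truly fixed taste; the second and third lines let taste drift with observable age and with an unobserved, persistent preference shifter.

```latex
\begin{align}
  r_{ijt} &= \alpha_i + \varepsilon_{ijt}
    && \text{(fixed taste plus noise)} \\
  r_{ijt} &= \alpha_i + \gamma \,\text{age}_{it} + u_{it} + \varepsilon_{ijt}
    && \text{(taste drifts with age and an unobserved shifter)} \\
  u_{it}  &= \rho \, u_{i,t-1} + \eta_{it}
    && \text{(assumed dynamics of the shifter)}
\end{align}
```

Under these assumptions, testing whether the fixed effect is really "fixed" with respect to age amounts to testing whether gamma equals zero. The hard part is the third line: u is never observed, so its persistence (rho here) has to be inferred from how a person's grades move over time.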

Holding observable attributes constant, how unpredictable are movie watchers? In a world featuring diversity, only the subset of movie watchers who know that they are "average" may listen to NETFLIX, because they know that NETFLIX can figure out their preferences based on their data sets. The true freaks will never participate, because they know that they are odd and that the NETFLIX predictions are unlikely to pick a good movie for the particular freak in question.
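
A toy illustration of this point, with made-up numbers: suppose the prediction for a viewer is simply the average grade given by other people with the same observables. This is a deliberately crude stand-in, not a description of what NETFLIX actually does, and the function names and ratings below are hypothetical.

```python
from statistics import mean

def predict_from_peers(peer_ratings):
    """Predict a viewer's grade as the mean grade of observably similar viewers."""
    return mean(peer_ratings)

def abs_error(own_rating, peer_ratings):
    """Absolute gap between a viewer's true grade and the peer-based prediction."""
    return abs(own_rating - predict_from_peers(peer_ratings))

# Hypothetical grades (1 to 5) from viewers who share my observable attributes.
peers = [4, 4, 5, 3, 4]

typical_viewer = 4  # someone who is "average" given observables
odd_viewer = 1      # a "true freak" with the same observables

typical_error = abs_error(typical_viewer, peers)
odd_error = abs_error(odd_viewer, peers)
```

In this example the prediction is exactly right for the typical viewer and off by three grades for the odd one, even though both look identical on observables. A viewer who knows he is the odd one has little reason to trust the suggestion.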

For NETFLIX to make a lot of money, they need all of us to be consistent and average (holding observables constant) so that their predictions are useful for the consumer.