I wanted to follow up on the previous post regarding pandora.com. Pandora is a bit of an anachronism in today’s world of artificial intelligence, a relic from our linear regression past. Our what now? Well, back in the good old days (say, the 1990s), people did research where they measured some outputs (in Pandora’s case, how much people like a given song) and then correlated them with some inputs (for Pandora, the qualities that make up the song). The statistical method one uses to do this correlation is linear regression - we assume that if we like a little of something, we’ll like more of that something even more. This is what Pandora does - it learns that you like electric guitar riffs, or “unintelligible lyrical style” (yes, this is an actual classification in use by Pandora). It then suggests songs that also have these qualities. Its recommendations are based on the painstaking work of over 50 full-time employees, whose sole job is to listen to music and categorize it on dozens of different characteristics.
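To make the linear regression idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two attributes, the 0-to-1 scores, the ratings, and the function names `predict` and `fit_linear` are all invented, and this is not a description of how Pandora actually works. It fits one weight per song attribute from a listener's past ratings, then ranks unheard songs by their predicted rating.

```python
# Hypothetical data: each song is described by two hand-scored attributes
# in the range 0..1 (electric guitar riffs, unintelligible lyrics), paired
# with one listener's rating. All numbers are invented for illustration.
rated_songs = [
    ((0.9, 0.8), 4.5),
    ((0.8, 0.2), 3.6),
    ((0.1, 0.1), 1.4),
    ((0.0, 0.6), 1.6),
    ((0.5, 0.5), 3.0),
    ((0.2, 0.9), 2.5),
]


def predict(weights, bias, attributes):
    """Linear model: a baseline plus a weighted sum of the attribute scores."""
    return bias + sum(w * a for w, a in zip(weights, attributes))


def fit_linear(data, steps=20000, learning_rate=0.1):
    """Fit weights and bias by gradient descent on mean squared error."""
    n_features = len(data[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(steps):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for attributes, rating in data:
            error = predict(weights, bias, attributes) - rating
            for i, a in enumerate(attributes):
                grad_w[i] += error * a
            grad_b += error
        weights = [w - learning_rate * g / len(data)
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(data)
    return weights, bias


weights, bias = fit_linear(rated_songs)

# Hypothetical unheard songs, ranked by predicted rating (highest first).
candidates = {
    "guitar-heavy song": (0.9, 0.3),
    "quiet ballad": (0.1, 0.2),
}
ranked = sorted(candidates,
                key=lambda name: predict(weights, bias, candidates[name]),
                reverse=True)
print(ranked)
```

The point of the sketch is the shape of the model: a bigger weight on an attribute means that more of that attribute pushes the predicted rating up in a straight line, which is exactly the "more of something is even better" assumption.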
These days, recommendation algorithms are much more likely to follow the Amazon algorithm - call it our networked present. Amazon generates your recommendations based on what similar people have purchased. For example, I recently purchased Nudge: Improving Decisions about Health, Wealth, and Happiness by Richard Thaler (which looks excellent, by the way). Because of this, Amazon thinks I might also like Sway: The Irresistible Pull of Irrational Behavior by Ori Brafman. This seems reasonable, I suppose: if I am interested in the behavioral finance of nudging, I would probably be interested in the behavioral finance of swaying as well. The networked present is so popular because it is so easy. Amazon doesn’t need a full-time staff of reviewers classifying books as “behavioral finance”, or “one-word titles followed by a colon”. The purchase data is just sort of…there. No need to go out and get it.
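The networked approach can be sketched just as briefly. The version below is a simple co-purchase counter, not Amazon's actual algorithm: the customers, the baskets, the placeholder titles "Book C" and "Book D", and the function names are all hypothetical. It counts how often two titles show up in the same customer's purchase history and recommends the most frequent companions.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical purchase histories. "Book C" and "Book D" are placeholders.
purchases = {
    "customer_1": {"Nudge", "Sway"},
    "customer_2": {"Nudge", "Sway", "Book C"},
    "customer_3": {"Nudge", "Book D"},
    "customer_4": {"Sway", "Book C"},
}


def co_purchase_counts(purchases):
    """For every title, count how many customers also bought each other title."""
    counts = defaultdict(Counter)
    for basket in purchases.values():
        for bought, also_bought in permutations(basket, 2):
            counts[bought][also_bought] += 1
    return counts


def recommend(item, counts, k=3):
    """Return up to k titles most often bought alongside `item`.

    Ties are broken alphabetically so the result is deterministic.
    """
    companions = counts.get(item, Counter())
    ranked = sorted(companions.items(), key=lambda pair: (-pair[1], pair[0]))
    return [title for title, _ in ranked[:k]]


counts = co_purchase_counts(purchases)
print(recommend("Nudge", counts))
```

Notice that nothing in the code knows what any book is about. It only needs the purchase records, which is why this approach is so cheap to run.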
For all their simplicity, networked algorithms have some significant weaknesses. They tend to promote closed systems - if I write a new book entitled “Shove: The Economics of Smackdowns”, someone needs to buy both my book and Nudge before the network will make the match. Networked algorithms also have a natural tendency towards the popular - since a lot of people have by definition bought the popular item, it will show up frequently in the recommendation algorithm, while the less popular option will go unnoticed.
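Both weaknesses show up even in a toy co-purchase counter. The sketch below uses invented baskets and hypothetical titles ("Bestseller", "Niche Title") and is only meant to illustrate the two effects: a new book that nobody has bought alongside anything else never gets matched, and the most widely bought title rises to the top of everyone's list.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical baskets. "Shove" has been bought once, on its own.
baskets = [
    {"Nudge", "Bestseller"},
    {"Nudge", "Bestseller"},
    {"Nudge", "Niche Title"},
    {"Sway", "Bestseller"},
    {"Sway", "Bestseller"},
    {"Shove"},
]

# Count, for every title, how often each other title shares a basket with it.
counts = defaultdict(Counter)
for basket in baskets:
    for bought, also_bought in permutations(basket, 2):
        counts[bought][also_bought] += 1


def recommend(item):
    """Titles bought alongside `item`, most frequent first (ties alphabetical)."""
    companions = counts.get(item, Counter())
    ranked = sorted(companions.items(), key=lambda pair: (-pair[1], pair[0]))
    return [title for title, _ in ranked]


# Closed system: the new book has no co-purchases, so it has no matches
# and never appears in anyone else's list.
print(recommend("Shove"))

# Popularity: the widely bought title tops the list for both books.
print(recommend("Nudge"))
print(recommend("Sway"))
```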
What is a person to do? Well, the best sites use a little bit of both algorithm types. For example, Netflix has devoted considerable resources to its recommendation algorithm, which uses both network effects and linear effects like the genre of the movie and the main actor(s). Amazon also allows you to rate and categorize products, to improve your recommendations. In sum, it pays to practice algorithm inclusivism.
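One simple way to combine the two signals is a weighted blend. This is only a sketch under assumptions: the scores, the `alpha` weight, and the function names are invented, and it is not meant to describe what Netflix or Amazon actually do. Each candidate gets a content score (how well its attributes match your tastes) and a network score (how often similar people bought it), both scaled to 0..1.

```python
# Hypothetical scores in the range 0..1 for three candidate books.
# "Shove" is new, so it has no purchase history and a network score of 0.
candidates = {
    "Sway": {"content": 0.8, "network": 1.0},
    "Bestseller": {"content": 0.1, "network": 0.6},
    "Shove": {"content": 0.9, "network": 0.0},
}


def hybrid_score(scores, alpha):
    """Blend the two signals; alpha is the weight given to the content score."""
    return alpha * scores["content"] + (1 - alpha) * scores["network"]


def rank(candidates, alpha=0.5):
    """Order candidate titles by blended score, highest first."""
    return sorted(candidates,
                  key=lambda title: hybrid_score(candidates[title], alpha),
                  reverse=True)


print(rank(candidates, alpha=0.0))  # network signal only
print(rank(candidates, alpha=0.5))  # equal blend of both signals
```

With `alpha=0.0` the new book sits at the bottom because it has no purchase history. With an equal blend, its content score is enough to lift it above the merely popular title.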