How does Amazon’s collaborative-filtering recommendation engine work?
Amazon makes heavy use of an item-to-item collaborative filtering approach. This essentially means that for each item X, Amazon builds a neighbourhood of related items S(X); whenever you buy or look at an item, Amazon then recommends items from that item’s neighbourhood. That’s why, when you sign in to Amazon and look at the front page, your recommendations are mostly of the form “You viewed… Customers who viewed this also viewed…”.
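To make the idea concrete, here is a minimal sketch of serving recommendations from precomputed neighbourhoods. The item names, the `NEIGHBOURS` table, and the `recommend` function are all made up for illustration; this is not Amazon's actual code or data.

```python
# Hypothetical precomputed neighbourhoods S(X): item -> related items,
# listed from most to least similar. The data is invented for this sketch.
NEIGHBOURS = {
    "dune": ["foundation", "hyperion", "neuromancer"],
    "emma": ["persuasion", "jane_eyre"],
}

def recommend(history, neighbours, k=3):
    """Return up to k unseen items drawn from the neighbourhoods of
    the items in `history`, starting with the most recent item."""
    seen = set(history)
    picks = []
    for item in reversed(history):  # most recently viewed item first
        for candidate in neighbours.get(item, []):
            if candidate not in seen and candidate not in picks:
                picks.append(candidate)
            if len(picks) == k:
                return picks
    return picks

print(recommend(["emma", "foundation", "dune"], NEIGHBOURS))
# → ['hyperion', 'neuromancer', 'persuasion']
```

Note that the lookup only needs the items you have touched, not a model of you as a user, which is why it can react as soon as you view something new.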
Other approaches. This item-to-item approach can be contrasted to:
• A user-to-user collaborative filtering approach. This finds users similar to you (e.g., users who bought a lot of items in common with you) and suggests items that they’ve bought but you haven’t.
• A global factorization approach. Rather than looking at individual items in isolation (in the item-to-item approach, if you and I both buy a book X, Amazon will make essentially the same recommendations based on X, regardless of what we’ve bought in the past), a global approach would look at all the items you’ve bought, and try to detect properties that characterize what you like. For example, if you buy a lot of science fiction books and also a lot of romance books, a global-approach algorithm might try to recommend you books with both science fiction and romance elements.
Pros/cons of the item-to-item approach:
• Pros over the user-to-user approach: Amazon (and most applications) has many more users than items, so it’s computationally simpler to find similar items than it is to find similar users. Finding similar users is also a difficult algorithmic task, since individual users often have a very wide range of tastes, but individual items usually belong to relatively few genres.
• Pros over the factorization approach: Simpler to implement. Faster to update recommendations: as soon as you buy a new book, Amazon can make a new recommendation in the item-to-item approach, whereas a factorization approach would have to wait until the factorization has been recomputed. The item-to-item approach can also be more easily leveraged in several areas, not only in the recommendations made to you, but also in the “similar items/other customers also bought” section when you look at a particular item.
• Cons of the item-to-item approach: You don’t get very much diversity or surprise in item-to-item recommendations, so recommendations tend to be kind of “obvious” and boring.
How to find similar items. Since the item-to-item approach makes crucial use of similar items, here’s a high-level view of how to do it. First, associate each item with the set of users who have bought or looked at it. The similarity between any two items could then be a normalized measure of the number of users they have in common (for example, the Jaccard index) or the cosine similarity between the two items (imagine each item as a vector, with a 1 in the ith element if user i has bought it, and 0 otherwise).
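The two similarity measures described above can be sketched in a few lines of Python. The toy `item_users` data and the `neighbourhood` helper are assumptions made for illustration. For 0/1 vectors the dot product is just the number of shared users, so both measures reduce to set operations.

```python
from math import sqrt

# Toy data (made up): item -> set of users who bought or viewed it.
item_users = {
    "A": {"u1", "u2", "u3"},
    "B": {"u2", "u3", "u4"},
    "C": {"u5"},
}

def jaccard(a, b):
    """Jaccard index: shared users divided by users of either item."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cosine(a, b):
    """Cosine similarity of the 0/1 user vectors of two items."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))

def neighbourhood(item, item_users, sim=cosine, k=2):
    """Top-k other items with non-zero similarity to `item`."""
    scores = [
        (sim(item_users[item], users), other)
        for other, users in item_users.items()
        if other != item
    ]
    scores = [(s, o) for s, o in scores if s > 0]
    scores.sort(key=lambda pair: (-pair[0], pair[1]))  # best first, ties by name
    return [other for _, other in scores[:k]]

print(jaccard(item_users["A"], item_users["B"]))  # → 0.5
print(neighbourhood("A", item_users))             # → ['B']
```

This brute-force version compares every pair of items, which is fine for a toy example; a real catalogue would need something smarter, such as only comparing items that share at least one user.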
Let’s make a distinction:
a) Recommender systems are the application: as a company, you want to recommend books on Amazon or movies on Netflix in order to increase your customer base.
b) Collaborative filtering, on the other hand, refers to a modelling approach.
There are two main approaches or models used to make recommendations:
i) Collaborative filtering – Here, existing ratings given by users or customers for books or movies are used to predict ratings for movies they haven’t watched or books they haven’t read. If the predicted rating is high, you may want to recommend that book or movie to the customer. Matrix factorization approaches are common here.
ii) Content-based filtering – In collaborative filtering, we don’t use the features or information of the users as such (what genres a user likes or dislikes, age, gender, etc.) to make predictions. It’s all done by inferring from existing users, who, in a manner of speaking, collaborate to make a prediction. Content-based filtering, on the other hand, uses such features (of the users and of the items themselves) to make predictions.
iii) Hybrid – One can obviously mix the above two approaches. For example, what if the user is new and hasn’t rated any movies or books? Perhaps his/her background information can be used instead. When to use which approach depends on the scale of the data, what kind of data is available, how much training time and memory one has, etc.
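As a rough illustration of the matrix factorization idea mentioned under collaborative filtering, here is a minimal sketch that learns a small vector of latent factors per user and per item by stochastic gradient descent. The ratings, the function names, and the hyperparameters (`epochs`, `lr`, `reg`) are all assumptions chosen for this toy example, not a reference implementation.

```python
import random

# Toy ratings (made up): (user index, item index, rating on a 1-5 scale).
ratings = [
    (0, 0, 5.0), (0, 1, 4.0),
    (1, 0, 4.0), (1, 2, 1.0),
    (2, 1, 5.0), (2, 2, 2.0),
]
n_users, n_items, n_factors = 3, 3, 2

def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item factor vectors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def rmse(P, Q, data):
    """Root-mean-square error over the observed ratings."""
    err = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in data)
    return (err / len(data)) ** 0.5

def train(data, n_users, n_items, n_factors,
          epochs=200, lr=0.02, reg=0.02, seed=0):
    """Fit user factors P and item factors Q with plain SGD and L2 penalty."""
    rng = random.Random(seed)
    P = [[rng.uniform(0, 1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.uniform(0, 1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in data:
            e = r - predict(P, Q, u, i)  # error on this observed rating
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (e * qi - reg * pu)
                Q[i][f] += lr * (e * pu - reg * qi)
    return P, Q

P, Q = train(ratings, n_users, n_items, n_factors)
print("training RMSE:", rmse(P, Q, ratings))
print("predicted rating for user 0 on unseen item 2:", predict(P, Q, 0, 2))
```

No user or item features appear anywhere in this sketch: everything is inferred from the ratings alone, which is exactly why a brand-new user with no ratings is a problem for it and why a hybrid approach can help.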