Session: Algorithms (Collaborative Filtering)

Date: Saturday, October 20, 08:30-10:15

  • Distributed collaborative filtering with domain specialization

    by Shlomo Berkovsky, Tsvi Kuflik, Francesco Ricci

    User data scarcity has long been cited as one of the major problems of collaborative filtering recommender systems. That is, if two users do not share a sufficiently large set of items for which both of their ratings are known, then the user-to-user similarity computation is not reliable, and a rating prediction for one user cannot be based on the ratings of the other. This paper shows that this problem can be solved, and that the accuracy of collaborative recommendations can be improved, by: a) partitioning the collaborative user data into specialized and distributed repositories, and b) aggregating information coming from these repositories. The paper explores a content-dependent partitioning of collaborative movie ratings, where the ratings are partitioned according to the genre of the movie, and presents an evaluation of four aggregation approaches. The evaluation demonstrates that the aggregation improves on the accuracy of a centralized system containing the same ratings and proves the feasibility and advantages of a distributed collaborative filtering scenario.


  • Complex-network theoretic clustering for identifying groups of similar listeners in p2p systems

    by Amelie Anglade, Marco Tiemann, Fabio Vignoli

    This article presents an approach to automatically create virtual communities of users with similar music preferences in a distributed system. Our goal is to create personalized music channels for each community using the content its members share in peer-to-peer networks. To extract these communities, a complex-network theoretic approach is chosen. A fully connected graph of users is created using epidemic protocols. We show that the created graph sufficiently converges to a graph created with a centralized algorithm after a small number of protocol iterations. To find suitable techniques for creating user communities, we analyze graphs created from real-world recommender datasets and identify specific properties of these datasets. Based on these properties, different graph-based community-extraction techniques are chosen and evaluated. We select a technique that exploits the identified properties to create clusters of music listeners. The suitability of this technique is validated using a music dataset and two large movie datasets. On a graph of 6,040 peers, the selected technique assigns at least 85% of the peers to optimal communities, and obtains a mean classification error of less than 0.05% over the remaining peers that are not assigned to the best community.


  • Robust collaborative filtering

    by Bhaskar Mehta, Thomas Hofmann, Wolfgang Nejdl

    The widespread deployment of recommender systems has led to user feedback of varying quality. While some users faithfully express their true opinion, many provide noisy ratings, which can be detrimental to the quality of the generated recommendations. The presence of noise can violate modeling assumptions and may thus lead to instabilities in estimation and prediction. Even worse, malicious users can deliberately insert attack profiles in an attempt to bias the recommender system to their benefit.

    Robust statistics is an area within statistics where estimation methods have been developed that deteriorate more gracefully in the presence of unmodeled noise and slight departures from modeling assumptions. In this work, we study how such robust statistical methods, in particular M-estimators, can be used to generate stable recommendations even in the presence of noise and spam. To that end, we present a Robust Matrix Factorization algorithm and study its stability. We conclude that M-estimators do not add significant stability to recommendation; however, the presented algorithm can outperform existing recommendation algorithms in its recommendation quality.


  • A recursive prediction algorithm for collaborative filtering recommender systems

    by Jiyong Zhang, Pearl Pu

    Collaborative filtering (CF) is a successful approach for building online recommender systems. The fundamental process of the CF approach is to predict how a user would rate a given item based on the ratings of some nearest-neighbor users (user-based CF) or nearest-neighbor items (item-based CF). In the user-based CF approach, for example, the conventional prediction procedure is to find some nearest-neighbor users of the active user who have rated the given item, and then aggregate their rating information to predict the rating for the given item. In practice, due to data sparseness, we have observed that a large proportion of users are filtered out because they have not rated the given item, even though they are very close to the active user. In this paper we present a recursive prediction algorithm, which allows those nearest-neighbor users to join the prediction process even if they have not rated the given item. In our approach, if a required rating value is not provided explicitly by the user, we predict it recursively and then integrate it into the prediction process. We study various strategies for selecting nearest-neighbor users for this recursive process. Our experiments show that the recursive prediction algorithm is a promising technique for improving the prediction accuracy of collaborative filtering recommender systems.
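
    The recursive idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the Pearson similarity, the mean-centered weighted average, the names predict, _estimate, depth and k, and the toy ratings are all assumptions made for illustration, and the neighbor-selection strategies the paper studies are not reproduced.

```python
from math import sqrt


def mean(ratings):
    """Average of one user's known ratings."""
    return sum(ratings.values()) / len(ratings)


def pearson(a, b):
    """Pearson correlation over co-rated items (0.0 if fewer than two)."""
    common = set(a) & set(b)
    if len(common) < 2:
        return 0.0
    ma, mb = mean(a), mean(b)
    num = sum((a[i] - ma) * (b[i] - mb) for i in common)
    den = sqrt(sum((a[i] - ma) ** 2 for i in common)) * \
        sqrt(sum((b[i] - mb) ** 2 for i in common))
    return num / den if den else 0.0


def _estimate(data, user, item, depth, k, visited):
    """Return an estimated rating, or None if no neighbor gives evidence."""
    if item in data[user]:
        return float(data[user][item])
    visited = visited | {user}
    # Hypothetical neighbor selection: the k most similar unvisited users,
    # whether or not they have rated the item.
    neighbors = sorted(
        ((pearson(data[user], data[v]), v) for v in data if v not in visited),
        reverse=True,
    )[:k]
    num = den = 0.0
    for sim, v in neighbors:
        if sim <= 0:
            continue
        if item in data[v]:
            r = float(data[v][item])
        elif depth > 0:
            # Recursive step: predict the neighbor's missing rating first.
            r = _estimate(data, v, item, depth - 1, k, visited)
            if r is None:
                continue
        else:
            continue  # conventional CF: neighbor is filtered out
        num += sim * (r - mean(data[v]))
        den += sim
    return mean(data[user]) + num / den if den else None


def predict(data, user, item, depth=1, k=2):
    """Predict a rating; fall back to the user's mean if nothing is known."""
    r = _estimate(data, user, item, depth, k, frozenset())
    return mean(data[user]) if r is None else r


# Toy data (made up): ann's only similar neighbor, bob, has not rated "x".
data = {
    "ann": {"a": 5, "b": 1, "c": 3},
    "bob": {"a": 4, "b": 2, "c": 3, "d": 5, "e": 1},
    "cat": {"d": 5, "e": 1, "x": 5, "f": 3},
}
```

    With depth=0 the sketch behaves like the conventional procedure and can only return ann's mean rating for "x", because bob is filtered out. With depth=1, bob's rating for "x" is first estimated from cat and then used in the prediction for ann.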

