Session 8: Matrix Factorization

Date: Thursday, Oct 9, 16:15-18:00
Moderator: Harald Steck

  • GASGD: Stochastic Gradient Descent for Distributed Asynchronous Matrix Completion via Graph Partitioning

    by Fabio Petroni and Leonardo Querzoni

    Latent factor models for matrix completion are known to be an effective way to build recommender systems. Currently, stochastic gradient descent (SGD) is considered one of the best latent factor-based algorithms for matrix completion. In this paper we discuss GASGD, a distributed asynchronous variant of SGD for large-scale matrix completion, that (i) leverages data partitioning schemes based on graph partitioning techniques, (ii) exploits specific characteristics of the input data and (iii) introduces an explicit parameter to tune the synchronization frequency among the computing nodes. We empirically show how, thanks to these features, GASGD achieves a fast convergence rate while incurring a smaller communication cost than current asynchronous distributed SGD implementations.

  • A Framework for Matrix Factorization based on General Distributions

    by Josef Bauer and Alexandros Nanopoulos

    In this paper we extend the current state-of-the-art matrix factorization method for recommendations to general probability distributions. As shown in previous work, the standard method, called “Probabilistic Matrix Factorization”, is based on a normal distribution assumption. While there exists work extending this method to other distributions, these extensions are restrictive, and we show experimentally on a real data set that it is worthwhile to consider more general distributions that have not been used in the literature. Our contribution lies in providing a flexible and easy-to-use framework for matrix factorization with almost no limitation on the form of the distribution used. Our approach is based on maximum likelihood estimation, and a key ingredient of our proposed method is automatic differentiation. This allows the corresponding optimization algorithm to be derived automatically, without the need to derive it manually for each distributional assumption, while remaining computationally efficient. Thus, with our method it is easy to use a wide range of even complicated distributions for any data set.
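The maximum likelihood view can be sketched in a few lines: the only distribution-specific piece is a per-rating negative log-likelihood, and the optimizer never changes. The paper obtains the gradient by automatic differentiation; to keep this sketch dependency-free, a central finite difference in the prediction plays that role here. Function and parameter names are ours, not the paper's.

```python
import random

def fit_mf_mle(ratings, n_users, n_items, neg_loglik, k=4,
               lr=0.05, epochs=300, eps=1e-5, seed=0):
    """Sketch of distribution-agnostic matrix factorization by maximum
    likelihood. `neg_loglik(pred, r)` is the negative log-likelihood of
    observing rating r given the model prediction pred. The gradient of
    the loss w.r.t. the prediction is approximated numerically (a
    stand-in for the automatic differentiation used in the paper)."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = sum(P[u][f] * Q[i][f] for f in range(k))
            # d(-log L)/d(pred) by central difference
            g = (neg_loglik(pred + eps, r) - neg_loglik(pred - eps, r)) / (2 * eps)
            for f in range(k):  # chain rule through the inner product
                pu, qi = P[u][f], Q[i][f]
                P[u][f] -= lr * g * qi
                Q[i][f] -= lr * g * pu
    return P, Q
```

With a Gaussian likelihood, `neg_loglik = lambda pred, r: 0.5 * (pred - r) ** 2`, this recovers the classical squared-error PMF updates; swapping in, say, a Laplace likelihood `abs(pred - r)` changes the model without any rederivation.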

  • Speeding Up the Xbox Recommender System Using a Euclidean Transformation for Inner-Product Spaces

    by Yoram Bachrach, Yehuda Finkelstein, Ran Gilad-Bachrach, Liran Katzir, Noam Koenigstein, Nir Nice and Ulrich Paquet

    A prominent approach in collaborative filtering based recommender systems is to use dimensionality reduction (matrix factorization) techniques that map users and items into low-dimensional vectors. In such systems, a higher inner product between a user vector and an item vector indicates that the item better suits the user’s preference. Traditionally, retrieving the most suitable items was done by scoring and sorting all items. Real-world online recommender systems must adhere to strict response-time constraints, so when the number of items is very large, scoring all items becomes infeasible. We propose a novel order-preserving transformation that maps the maximum inner product search problem to a Euclidean-space nearest neighbor search problem. Utilizing this transformation, we study the efficiency of several (approximate) nearest neighbor data structures. Our final solution is based on a novel use of the PCA-Tree data structure in which results are augmented using paths one Hamming distance away from the query (neighborhood boosting). The end result is a system that allows approximate matches (items with a relatively high inner product, but not necessarily the highest one). We evaluate our techniques on two large-scale recommendation datasets, Xbox Movies and Yahoo Music, and show that this technique allows trading off a slight degradation in recommendation quality for a significant improvement in retrieval time.
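The reduction described in the abstract has a compact form: append to each item vector a coordinate that lifts it onto a sphere of radius equal to the largest item norm, and give the query a zero in that coordinate; nearest-neighbor distance then decreases exactly as the inner product increases. A sketch (function names are ours):

```python
import math

def augment_items(items):
    """Map item vectors for maximum-inner-product search into a space
    where Euclidean nearest neighbor gives the same ranking: append
    sqrt(phi^2 - ||x||^2) to each item x, with phi the largest item
    norm, so every augmented item lies on a sphere of radius phi."""
    phi = max(math.sqrt(sum(c * c for c in x)) for x in items)
    out = []
    for x in items:
        norm2 = sum(c * c for c in x)
        out.append([math.sqrt(max(phi * phi - norm2, 0.0))] + list(x))
    return out

def augment_query(q):
    # The query gets 0 in the extra coordinate, so
    # ||x~ - q~||^2 = phi^2 + ||q||^2 - 2<x, q>:
    # minimizing Euclidean distance maximizes the inner product.
    return [0.0] + list(q)
```

After this transformation, off-the-shelf (approximate) nearest neighbor structures such as the PCA-Tree studied in the paper can serve the top inner-product items without scoring the whole catalog.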

  • Gradient Boosting Factorization Machines

    by Chen Cheng, Fen Xia, Tong Zhang, Irwin King and Michael Lyu

    Recommendation techniques have been well developed over the past decades. Most of them build models based only on the user-item rating matrix. However, in the real world there is plenty of auxiliary information available in recommendation systems, which can be utilized as additional features to improve recommendation performance. We refer to recommendation with auxiliary information as context-aware recommendation. The Factorization Machine (FM) is one of the most successful context-aware recommendation models. FM models pairwise interactions between all features; each feature's latent vector is shared across all the factorized interaction parameters in which that feature is involved. In practice, there are tens of context features, and not all pairwise feature interactions are useful. Thus, an important challenge for context-aware recommendation is how to effectively select “good” interaction features. In this paper we focus on this problem and propose a greedy interaction feature selection algorithm based on gradient boosting. We then propose a novel Gradient Boosting Factorization Machine (GBFM) model that incorporates the feature selection algorithm and Factorization Machines into a unified framework. Experimental results on both synthetic and real datasets demonstrate the efficiency and effectiveness of our algorithm compared to other state-of-the-art methods.
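For reference, the second-order FM model that GBFM builds on scores a feature vector x as w0 + Σᵢ wᵢxᵢ + Σᵢ<ⱼ ⟨vᵢ, vⱼ⟩xᵢxⱼ, and the pairwise term can be computed in linear time per factor via a standard identity. A generic sketch (not the authors' code):

```python
def fm_predict(x, w0, w, V):
    """Second-order Factorization Machine prediction
        y(x) = w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j,
    computed in O(n*k) using the identity
        sum_{i<j} <v_i, v_j> x_i x_j
          = 0.5 * sum_f [(sum_i V[i][f] x_i)^2 - sum_i (V[i][f] x_i)^2].
    V[i] is the k-dimensional latent vector of feature i, shared across
    every pairwise interaction that feature participates in."""
    n, k = len(x), len(V[0])
    y = w0 + sum(w[i] * x[i] for i in range(n))
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s2 = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        y += 0.5 * (s * s - s2)
    return y
```

Because every latent vector is shared across all of a feature's interactions, pruning uninformative interactions, which is what GBFM's greedy gradient-boosting selection targets, cannot be done by simply zeroing individual parameters.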

  • Exploiting Temporal Influence in Online Recommendation

    by Robert Palovics, Andras Benczur, Tamas Kiss, Levente Kocsis and Erzsebet Frigo

    In this paper we give methods for time-aware music recommendation in a social media service, with the potential of exploiting immediate temporal influences between users. We consider events in which a user listens to an artist for the first time, shortly after a friend listened to the same artist. We train a blend of matrix factorization methods that model the relation between the influencer, the influenced user and the artist, with both the individual factor decompositions and their blending weights learned by variants of stochastic gradient descent (SGD). Special care is needed because influence events form a subset of the positive implicit feedback data, so we have to cope with two different definitions of the positive and negative implicit training data. In addition, the time-aware setting requires online learning and evaluation methods. While SGD can easily be trained online, evaluation by traditional measures is cumbersome, since the top recommendations may differ from one time to another. Our experiments are carried out over the two-year scrobble history of 70,000 Last.fm users and show a 4% increase in recommendation quality from predicting temporal influences.
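The influence events defined above can be collected with a single scan over time-ordered scrobbles. This is one illustrative reading of the definition with hypothetical names and data layout, not the authors' implementation:

```python
def influence_events(listens, friends, window):
    """Collect candidate influence events: user u listens to artist a for
    the first time, and some friend v of u listened to the same artist at
    most `window` time units earlier. `listens` is a time-sorted list of
    (timestamp, user, artist) scrobbles; `friends` maps each user to a set
    of friends. Returns (influencer, influenced, artist) triples."""
    first_listen = {}   # (user, artist) -> first timestamp seen
    last_listen = {}    # (user, artist) -> most recent timestamp
    events = []
    for t, u, a in listens:
        if (u, a) not in first_listen:      # u's first listen to artist a
            first_listen[(u, a)] = t
            for v in friends.get(u, ()):
                tv = last_listen.get((v, a))
                if tv is not None and 0 <= t - tv <= window:
                    events.append((v, u, a))
        last_listen[(u, a)] = t
    return events
```

Because these triples are only a subset of the positive implicit feedback, the paper has to maintain the two separate definitions of positive and negative training data mentioned in the abstract.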
