Paper Session 6: Algorithms: Large-Scale, Constraints and Evaluation

Date: Wednesday, Sept 18, 2019, 14:00-15:30
Location: Auditorium
Chair: Tao Ye

  • [LP] Efficient Similarity Computation for Collaborative Filtering in Dynamic Environments
    by Olivier Jeunen, Koen Verstrepen, Bart Goethals

    The problem of computing all pairwise similarities in a large collection of vectors is a well-known and common data mining task. As the number and dimensionality of these vectors keep increasing, however, currently existing approaches are often unable to meet the strict efficiency requirements imposed by the environments they need to perform in. Real-time neighbourhood-based collaborative filtering (CF) is one example of such an environment in which performance is critical. In this work, we present a novel algorithm for efficient and exact similarity computation between sparse, high-dimensional vectors. Our approach exploits the sparsity that is inherent to implicit feedback data-streams, entailing significant gains compared to other methods. Furthermore, as our model learns incrementally, it is naturally suited for dynamic real-time CF environments. We propose a MapReduce-inspired parallelisation procedure along with our method, and show how even more speed-up can be achieved. Additionally, in many real-world systems, many items are actually not recommendable at any given time, due to recency, stock, seasonality, or enforced business rules. We exploit this fact to further improve the computational efficiency of our approach. Experimental evaluation on both real-world and publicly available datasets shows that our approach scales up to millions of processed user-item interactions per second, and advances the state of the art.
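
    As a rough illustration of incremental, exact similarity computation over a sparse implicit-feedback stream, the sketch below keeps item-item co-occurrence counts and updates them one interaction at a time. It is not the authors' algorithm: the class name IncrementalCosine, its methods, and the choice of cosine similarity over binary feedback are assumptions made for this example only.

```python
from collections import defaultdict
from math import sqrt


class IncrementalCosine:
    """Hypothetical helper: item-item cosine similarity over binary
    implicit feedback, updated one (user, item) interaction at a time."""

    def __init__(self):
        self.user_items = defaultdict(set)   # user -> items seen so far
        self.co_counts = defaultdict(int)    # (a, b) with a < b -> co-occurrences
        self.item_counts = defaultdict(int)  # item -> number of distinct users

    def update(self, user, item):
        seen = self.user_items[user]
        if item in seen:
            return  # binary feedback: repeated interactions add nothing
        # Only pairs involving the new item change, so the work per update
        # is proportional to this user's history, not to the catalogue size.
        for other in seen:
            key = (item, other) if item < other else (other, item)
            self.co_counts[key] += 1
        seen.add(item)
        self.item_counts[item] += 1

    def similarity(self, a, b):
        count_a = self.item_counts.get(a, 0)
        count_b = self.item_counts.get(b, 0)
        if count_a == 0 or count_b == 0:
            return 0.0
        if a == b:
            return 1.0
        key = (a, b) if a < b else (b, a)
        return self.co_counts.get(key, 0) / sqrt(count_a * count_b)


model = IncrementalCosine()
stream = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "b"), ("u3", "a")]
for user, item in stream:
    model.update(user, item)
sim_ab = model.similarity("a", "b")  # 2 co-occurrences / sqrt(3 * 2)
```

    Restricting the inner loop to currently recommendable items would be one way to exploit the recommendability constraint mentioned in the abstract; that variation is likewise an assumption and is not shown here.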

  • [LP] Personalized Diffusions for Top-N Recommendation
    by Athanasios N. Nikolakopoulos, Dimitris Berberidis, George Karypis, Georgios B. Giannakis

    This paper introduces PERDIF, a novel framework for learning personalized diffusions over item-to-item graphs for top-n recommendation. PERDIF learns the teleportation probabilities of a time-inhomogeneous random walk with restarts capturing a user-specific underlying item exploration process. Such an approach can lead to significant improvements in recommendation accuracy, while also providing useful information about the users in the system. Per-user fitting can be performed in parallel and very efficiently even in large-scale settings. A comprehensive set of experiments on real-world datasets demonstrates the scalability as well as the qualitative merits of the proposed framework. PERDIF achieves high recommendation accuracy, outperforming state-of-the-art competing approaches, including several recently proposed methods relying on deep neural networks.

  • [LP] Sampling-Bias-Corrected Neural Modeling for Large Corpus Item Recommendations
    by Xinyang Yi, Ji Yang, Lichan Hong, Derek Zhiyuan Cheng, Lukasz Heldt, Aditee Kumthekar, Zhe Zhao, Li Wei, Ed Chi

    Many recommendation systems retrieve and score items from a very large corpus. A common recipe to handle data sparsity and power-law item distribution is to learn item representations from their content features. Apart from many content-aware systems based on matrix factorization, we consider a modeling framework using a two-tower neural net, with one of the towers (item tower) encoding a wide variety of item content features. A general recipe for training such two-tower models is to optimize loss functions calculated from in-batch negatives, which are items sampled from a random mini-batch. However, in-batch loss is subject to sampling biases, potentially hurting model performance, particularly in the case of a highly skewed distribution. In this paper, we present a novel algorithm for estimating item frequency from streaming data. Through theoretical analysis and simulation, we show that the proposed algorithm can work without requiring a fixed item vocabulary, and is capable of producing unbiased estimation and being adaptive to item distribution change. We then apply the sampling-bias-corrected modeling approach to build a large scale neural retrieval system for YouTube recommendations. The system is deployed to retrieve personalized suggestions from a corpus with tens of millions of videos. We demonstrate the effectiveness of sampling-bias correction through offline experiments on two real-world datasets. We also conduct live A/B tests to show that the neural retrieval system leads to improved recommendation quality for YouTube.
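
    One common way to correct in-batch softmax for sampling bias is to subtract the log of each item's estimated sampling probability from its logit before normalizing. The sketch below illustrates that idea in plain Python; it is an assumption-laden example rather than the system described in the abstract. The function name corrected_in_batch_loss and its arguments are hypothetical, and the sampling probabilities are taken as given instead of being estimated from a stream.

```python
import math


def corrected_in_batch_loss(logits, sampling_probs):
    """Mean softmax cross-entropy over a batch in which row i's positive
    item is column i and every other column serves as an in-batch negative.

    logits[i][j]      -- score of query i against item j of the same batch
    sampling_probs[j] -- assumed estimate of the probability that item j
                         is drawn into a random batch (must be > 0)
    """
    n = len(logits)
    total = 0.0
    for i in range(n):
        # Correction: frequently sampled items have their logits reduced
        # relative to rarely sampled ones.
        corrected = [
            logits[i][j] - math.log(sampling_probs[j]) for j in range(n)
        ]
        # Numerically stable log-sum-exp over the corrected logits.
        top = max(corrected)
        log_norm = top + math.log(sum(math.exp(c - top) for c in corrected))
        total += log_norm - corrected[i]
    return total / n


batch_logits = [[2.0, 0.5], [0.1, 1.5]]
uncorrected = corrected_in_batch_loss(batch_logits, [1.0, 1.0])
uniform = corrected_in_batch_loss(batch_logits, [0.5, 0.5])
skewed = corrected_in_batch_loss([[0.0, 0.0], [0.0, 0.0]], [0.9, 0.1])
```

    With uniform sampling probabilities the correction shifts every logit by the same constant, so the loss is unchanged; it only matters when the item distribution is skewed.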

  • [LP] Leveraging Post-click Feedback for Content Recommendations
    by Hongyi Wen, Longqi Yang, Deborah Estrin

    Implicit feedback (e.g., clicks) is widely used in content recommendations. However, clicks only reflect user preferences according to their first impressions. They do not capture the extent to which users continue to engage with the content. Our analysis of two real-world datasets shows that more than half of the clicks on music and short videos are followed by skips. In this paper, we leverage post-click feedback, e.g., skips and completions, to improve the training and evaluation of content recommenders. Specifically, we experiment with existing collaborative filtering algorithms and find that they perform poorly against post-click-aware ranking metrics. Based on these insights, we develop a generic probabilistic framework to fuse click and post-click signals. We show how our framework can be applied to improve pointwise and pairwise recommendation models. Our approach is shown to outperform existing methods by 18.3% and 2.5% respectively in terms of Area Under the Curve (AUC) on the short-video and music datasets. We discuss the effectiveness of our approach across content domains and trade-offs in weighting various user feedback signals.

  • [LP] When Actions Speak Louder than Clicks: A Combined Model of Purchase Probability and Long-term Customer Satisfaction
    by Gal Lavee, Noam Koenigstein, Oren Barkan

    Maximizing sales and revenue is an important goal of online commercial retailers. Recommender systems are designed to maximize users’ click or purchase probability, but often disregard users’ eventual satisfaction with purchased items. As a result, such systems promote items with high appeal at the selling stage (e.g. an eye-catching presentation) over items that would yield more satisfaction to users in the long run. This work presents a novel unified model that considers both goals and can be tuned to balance between them according to the needs of the business scenario. We propose a multi-task probabilistic matrix factorization model with a dual task objective: predicting binary purchase/no purchase variables combined with predicting continuous satisfaction scores. Model parameters are optimized using Variational Bayes, which allows learning a posterior distribution over model parameters. This model allows making predictions that balance the two goals of maximizing the probability of an immediate purchase and maximizing user satisfaction and engagement down the line. These goals lie at the heart of most commercial recommendation scenarios, and enabling their balance has the potential to improve value for millions of users worldwide. Finally, we present an experimental evaluation on different types of consumer retail datasets that demonstrates the benefits of the model over popular baselines on a number of well-known ranking metrics.

  • [LP] Uplift-based Evaluation and Optimization of Recommenders
    by Masahiro Sato, Janmajay Singh, Sho Takemori, Takashi Sonoda, Qian Zhang, Tomoko Ohkuma

    Recommender systems aim to increase user actions such as clicks and purchases. Typical evaluations of recommenders regard the purchase of a recommended item as a success. However, the item may have been purchased even without the recommendation. An uplift is defined as an increase in user actions caused by recommendations. Situations with and without a recommendation cannot both be observed for a specific user-item pair at a given time instance, making uplift-based evaluation and optimization challenging. This paper proposes new evaluation metrics and optimization methods for the uplift in a recommender system. We apply a causal inference framework to estimate the average uplift for the offline evaluation of recommenders. Our evaluation protocol leverages both purchase and recommendation logs under a currently deployed recommender system, to simulate the cases both with and without recommendations. This enables the offline evaluation of the uplift for newly generated recommendation lists. For optimization, we need to define positive and negative samples that are specific to an uplift-based approach. For this purpose, we deduce four classes of items by observing purchase and recommendation logs. We derive the relative priorities among these four classes in terms of the uplift and use them to construct both pointwise and pairwise sampling methods for uplift optimization. Through dedicated experiments with three public datasets, we demonstrate the effectiveness of our optimization methods in improving the uplift.
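
    To make the idea of deducing four classes from purchase and recommendation logs concrete, the sketch below buckets logged user-item pairs by whether the item was recommended and whether it was purchased. The class labels, the log layout, and the function names are assumptions made for this example; the relative priorities among the classes and the sampling methods built on them are not reproduced here.

```python
from collections import Counter


def classify(recommended, purchased):
    """Assign a logged user-item pair to one of four hypothetical classes."""
    if recommended:
        return "R-P" if purchased else "R-NP"    # recommended, (not) purchased
    return "NR-P" if purchased else "NR-NP"      # not recommended, (not) purchased


def class_counts(log):
    """Count pairs per class; each log row is (user, item, recommended, purchased)."""
    return Counter(classify(rec, buy) for _user, _item, rec, buy in log)


example_log = [
    ("u1", "i1", True, True),
    ("u1", "i2", True, False),
    ("u2", "i1", False, True),
    ("u2", "i3", False, False),
    ("u3", "i2", True, False),
]
counts = class_counts(example_log)
```

    A purchase in the not-recommended class is exactly the case the abstract warns about: the item was bought without any recommendation, so counting every purchase of a recommended item as a success can overstate the recommender's effect.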
