- [PA] Learning to Ride a Buy-Cycle: A Hyper-Convolutional Model for Next Basket Repurchase Recommendation
by Ori Katz (Microsoft, Israel, Technion, Israel), Oren Barkan (Microsoft, Israel, The Open University, Israel), Noam Koenigstein (Microsoft, Israel, Tel-Aviv University, Israel), Nir Zabari (Microsoft, Israel, The Hebrew University, Israel)
The problem of Next Basket Recommendation (NBR) addresses the challenge of recommending items for the next basket of a user, based on her sequence of prior baskets. In this paper, we focus on a variation of this problem in which we aim to predict repurchases, i.e., we wish to recommend to a user only items she has purchased before. We coin this problem Next Basket Repurchase Recommendation (NBRR). Over the years, a variety of models have been proposed to address the problem of NBR; however, the problem of NBRR has been overlooked. Although the two problems are highly related and often solved by the same methods, repurchase recommendation calls for a different approach. In this paper, we share insights from our experience of facing the challenge of NBRR. In light of these insights, we propose a novel hyper-convolutional model to leverage the behavioral patterns of repeated purchases. We demonstrate the effectiveness of the proposed model on three publicly available datasets, where it is shown to outperform other existing methods across multiple metrics.
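To make the NBRR setup concrete: since candidates are restricted to items the user has already bought, even a simple frequency-and-recency scorer is a meaningful baseline. The following sketch is illustrative only (it is not the paper's hyper-convolutional model; the function name and decay parameter are assumptions):

```python
from collections import Counter

def nbrr_baseline(baskets, decay=0.9, top_k=5):
    """Score a user's previously purchased items for the next basket.

    baskets: list of baskets (each an iterable of item ids), oldest first.
    Items bought more often and more recently score higher; only items
    purchased before are candidates, as the NBRR setting requires.
    """
    scores = Counter()
    n = len(baskets)
    for t, basket in enumerate(baskets):
        weight = decay ** (n - 1 - t)  # recent baskets weigh more
        for item in set(basket):       # count each item once per basket
            scores[item] += weight
    return [item for item, _ in scores.most_common(top_k)]
```

A user who bought milk in every basket but eggs only recently would see milk ranked first, then eggs; items never purchased are never recommended.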
Full text in ACM Digital Library
|
- [IN] A Lightweight Transformer for Next-Item Product Recommendation
by Jeffrey Mei (Wayfair LLC, United States), Cole Zuber (Wayfair LLC, United States), Yasaman Khazaeni (Wayfair LLC, United States)
We apply a transformer using sequential browse history to generate next-item product recommendations. Interpreting the learned item embeddings, we show that the model is able to implicitly learn price, popularity, style and functionality attributes without being explicitly passed these features during training. Our real-life tests of this model on Wayfair’s international stores show mixed results (but an overall win). Diagnosing the cause, we identify a useful metric (average number of customers browsing each product) to ensure good model convergence. We also find limitations of using standard metrics like recall and nDCG, which do not correctly account for the positional effects of showing items on the Wayfair website, and empirically determine a more accurate discount factor.
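The nDCG limitation described above can be made concrete by treating the positional discount as a pluggable function rather than a fixed formula. The sketch below uses the standard log2 discount as the default; the steeper alternative shown in the usage note is a made-up stand-in for whatever factor the authors determined empirically:

```python
import math

def dcg(relevances, discount):
    """Discounted cumulative gain with an arbitrary positional discount."""
    return sum(rel * discount(i) for i, rel in enumerate(relevances))

def ndcg_at_k(ranked_rels, k, discount=lambda i: 1.0 / math.log2(i + 2)):
    """nDCG@k; `discount` maps a 0-based rank position to its weight.

    The default is the standard logarithmic discount; a site whose UI
    strongly favors top positions may warrant a steeper one.
    """
    rels = ranked_rels[:k]
    ideal = sorted(ranked_rels, reverse=True)[:k]
    idcg = dcg(ideal, discount)
    return dcg(rels, discount) / idcg if idcg > 0 else 0.0
```

For example, `ndcg_at_k(rels, 10, discount=lambda i: 0.5 ** i)` penalizes low positions much more sharply than the log discount, changing which of two rankings is scored as better.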
Full text in ACM Digital Library
|
- [REP] Streaming Session-Based Recommendation: When Graph Neural Networks meet the Neighborhood
by Sarah Latifi (University of Klagenfurt, Austria), Dietmar Jannach (University of Klagenfurt, Austria)
In a number of application areas of recommender systems it is important to frequently update the underlying models, e.g., because of a continuous stream of new items that can be recommended or due to rapidly changing interest trends within a community. Moreover, when individual short-term user interests may also change from visit to visit, session-based recommendation techniques are required, leading to the problem of streaming session-based recommendation (SSR). Such problem settings have attracted increased interest in recent years, and different deep learning architectures were proposed that support fast updates of the underlying prediction models when new data arrive.
In a recent paper, a method based on Graph Neural Networks (GNN) was proposed as being superior to previous methods for the SSR problem. The baselines in the reported experiments included different machine learning models. However, several studies have shown that often conceptually simpler methods, e.g., based on nearest neighbors, can be highly effective for session-based recommendation problems. In this work, we report a similar phenomenon for the streaming configuration. We first reproduce the results of the mentioned GNN method and then show that simpler methods are able to outperform this complex state-of-the-art neural method on two datasets. Overall, our work points to continued methodological issues in the academic community, e.g., in terms of the choice of baselines and reproducibility.
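The "conceptually simpler" nearest-neighbor methods referred to above can be sketched as a generic session-kNN: find past sessions similar to the current one and score their items by similarity. This is an illustrative baseline, not the authors' exact implementation; names and the choice of cosine similarity over binary session vectors are assumptions:

```python
from collections import defaultdict

def sknn_scores(current_session, past_sessions, k=50):
    """Session-kNN: score candidate items from the k past sessions most
    similar to the current session (cosine over binary item sets)."""
    cur = set(current_session)
    sims = []
    for sid, sess in enumerate(past_sessions):
        s = set(sess)
        overlap = len(cur & s)
        if overlap:
            sims.append((overlap / (len(cur) ** 0.5 * len(s) ** 0.5), sid))
    sims.sort(reverse=True)  # most similar sessions first
    scores = defaultdict(float)
    for sim, sid in sims[:k]:
        for item in set(past_sessions[sid]) - cur:  # unseen items only
            scores[item] += sim
    return dict(scores)
```

Because the model is just the stored sessions, "updating" it under a stream amounts to appending new sessions, which is part of why such baselines remain competitive in streaming settings.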
Full text in ACM Digital Library
|
- [PA] Self-Supervised Bot Play for Transcript-Free Conversational Recommendation with Rationales
by Shuyang Li (UC San Diego, United States), Bodhisattwa Prasad Majumder (UC San Diego, United States), Julian McAuley (UC San Diego, United States)
Conversational recommender systems offer a way for users to engage in multi-turn conversations to find items they enjoy. For users to trust an agent and give effective feedback, the recommender system must be able to explain its suggestions and rationales. We develop a two-part framework for training multi-turn conversational recommenders that provide recommendation rationales that users can effectively interact with to receive better recommendations. First, we train a recommender system to jointly suggest items and explain its reasoning via subjective rationales. We then fine-tune this model to incorporate iterative user feedback via self-supervised bot-play. Experiments on three real-world datasets demonstrate that our system can be applied to different recommendation models across diverse domains to achieve state-of-the-art performance in multi-turn recommendation. Human studies show that systems trained with our framework provide more useful, helpful, and knowledgeable suggestions in warm- and cold-start settings. Our framework allows us to use only product reviews during training, avoiding the need for expensive dialog transcript datasets that limit the applicability of previous conversational recommender agents.
Full text in ACM Digital Library
|
- [PA] Off-Policy Actor Critic for Recommender Systems
by Minmin Chen (Google, United States), Can Xu (Google Inc, United States), Vince Gatto (Google, United States), Devanshu Jain (Google, United States), Aviral Kumar (Google, United States), Ed Chi (Google, United States)
Industrial recommendation platforms are increasingly concerned with how to make recommendations that cause users to enjoy their long-term experience on the platform. Reinforcement learning emerged naturally as an appealing approach for its promise in 1) combating the feedback loop effect resulting from myopic system behaviors; and 2) sequential planning to optimize long-term outcomes. Scaling RL algorithms to production recommender systems serving billions of users and content items, however, has proven hard. Sample inefficiency and instability of online RL hinder its widespread adoption in production. Offline RL enables the use of off-policy data and batch learning. On the other hand, it faces major learning challenges due to distribution shift.
A REINFORCE agent [3] was successfully tested for YouTube recommendation, showing significant improvement over a sophisticated supervised learning production system. Off-policy correction was employed to learn from logged data. To control variance in learning, the authors adopted a one-step approximation to the full trajectory correction. This, however, introduces bias in learning, producing sub-optimal policies for the defined long-term outcome. Here we share the key designs in setting up an off-policy actor-critic agent for production recommender systems. It extends [3] with a critic network that estimates the value of any state-action pair under the learned target policy through temporal difference learning, addressing the aforementioned bias. We demonstrate in offline and live experiments that the new framework outperforms the baseline and improves long-term user experience.
An interesting discovery during our investigation is that recommendation agents, which commonly employ a softmax policy parameterization, can end up being too pessimistic about out-of-distribution (OOD) actions. This phenomenon contrasts with findings in the general RL community and suggests new research directions in advancing RL for recommender systems.
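A single update of the kind described above, combining the one-step importance weight from [3] with a TD-trained critic supplying the advantage, might be sketched as follows. The linear critic, softmax actor, and all names are simplifying assumptions for illustration, not the production design:

```python
import numpy as np

def off_policy_ac_update(theta, w, s_feat, a, logged_prop, reward, sp_feat,
                         gamma=0.97, lr_actor=0.01, lr_critic=0.1):
    """One off-policy actor-critic step on a logged (s, a, r, s') tuple.

    theta: actor weights, shape (n_actions, d), defining a softmax policy.
    w: critic weights, shape (d,), giving a linear value estimate V(s).
    logged_prop: behavior policy's probability of the logged action a.
    """
    logits = theta @ s_feat
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    rho = probs[a] / logged_prop            # one-step importance weight
    td_target = reward + gamma * (w @ sp_feat)
    td_error = td_target - (w @ s_feat)     # TD error serves as advantage
    # critic: temporal-difference update
    w = w + lr_critic * td_error * s_feat
    # actor: importance-weighted policy gradient with critic advantage
    grad_logp = -np.outer(probs, s_feat)    # grad of log softmax, all rows
    grad_logp[a] += s_feat                  # plus the taken action's term
    theta = theta + lr_actor * rho * td_error * grad_logp
    return theta, w
```

Replacing the Monte Carlo return of plain REINFORCE with `td_error` from the critic is what addresses the bias of the one-step trajectory correction discussed in the abstract.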
Full text in ACM Digital Library
|