RecSys 2023: Posters Day 3

Posters Day 3

Date: Friday September 22
Room: Hall 405

RES: Interface Design to Mitigate Inflation in Recommender Systems
by Rana Shahout (Technion), Yehonatan Peisakhovsky (Technion), Sasha Stoikov (Cornell Tech) and Nikhil Garg (Cornell Tech).

Recommendation systems rely on user-provided data to learn about item quality and provide personalized recommendations. An implicit assumption when aggregating ratings into item quality is that ratings are strong indicators of item quality. In this work, we analyze this assumption using data collected from a music discovery application. Our study focuses on two factors that cause rating inflation: heterogeneous user rating behavior and the dynamics of personalized recommendations. We show that user rating behavior is significantly variable, leading to item quality estimates that reflect the users who rated an item more than the item quality itself. Additionally, items that are more likely to be shown via personalized recommendations can experience a substantial increase in their exposure and potential bias toward them. To mitigate these effects, we conducted a randomized controlled trial where the rating interface was modified. This resulted in a substantial improvement in user rating behavior and a reduction in item quality inflation. These findings highlight the importance of carefully considering the assumptions underlying recommendation systems and designing interfaces that encourage accurate rating behavior.
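As an illustration of the first factor named in the abstract above (heterogeneous user rating behavior), the following minimal sketch compares a raw item average with an average of per-user mean-centered ratings. The toy data, the function names, and the centering correction itself are assumptions made for illustration; they are not the interface intervention or the analysis used in the paper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy ratings: (user, item, rating on a 1-5 scale).
# "generous" rates everything high; "strict" rates everything low.
RATINGS = [
    ("generous", "A", 5), ("generous", "B", 5), ("generous", "C", 4),
    ("strict", "B", 3), ("strict", "C", 1), ("strict", "D", 2),
]


def raw_item_means(ratings):
    """Average the raw ratings per item, ignoring who gave them."""
    by_item = defaultdict(list)
    for _user, item, rating in ratings:
        by_item[item].append(rating)
    return {item: mean(values) for item, values in by_item.items()}


def centered_item_means(ratings):
    """Average per item after subtracting each user's own mean rating."""
    by_user = defaultdict(list)
    for user, _item, rating in ratings:
        by_user[user].append(rating)
    user_mean = {user: mean(values) for user, values in by_user.items()}

    by_item = defaultdict(list)
    for user, item, rating in ratings:
        by_item[item].append(rating - user_mean[user])
    return {item: mean(values) for item, values in by_item.items()}


if __name__ == "__main__":
    print(raw_item_means(RATINGS))
    print(centered_item_means(RATINGS))
```

In this toy data, item A ranks above item B on the raw average only because A was rated solely by the generous user; after centering, B ranks above A.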

Full text in ACM Digital Library
RES: Towards Self-Explaining Sequence-Aware Recommendation
by Alejandro Ariza-Casabona (University of Barcelona), Maria Salamo (Universitat de Barcelona), Ludovico Boratto (University of Cagliari) and Gianni Fenu (University of Cagliari).

Self-explaining models are becoming an important perk of recommender systems, as they help users understand the reason behind certain recommendations, which encourages them to interact more often with the platform. In order to personalize recommendations, modern recommender approaches make the model aware of the user behavior history for interest evolution representation. However, existing explainable recommender systems do not consider the past user history to further personalize the explanation based on the user interest fluctuation. In this work, we propose a SEQuence-Aware Explainable Recommendation model (SEQUER) that is able to leverage the sequence of user-item review interactions to generate better explanations while maintaining recommendation performance. Experiments validate the effectiveness of our proposal on multiple recommendation scenarios. Our source code and preprocessed datasets are available at https://tinyurl.com/SEQUER-RECSYS23.

Full text in ACM Digital Library
RES: Looks Can Be Deceiving: Linking User-Item Interactions and User’s Propensity Towards Multi-Objective Recommendations
by Patrik Dokoupil (Department of Software Engineering, Charles University), Ladislav Peska (Faculty of Mathematics and Physics, Charles University, Prague, Czechia) and Ludovico Boratto (University of Cagliari).

Multi-objective recommender systems (MORS) provide suggestions to users according to multiple (and possibly conflicting) goals. When a system optimizes its results at the individual-user level, it tailors them to a user’s propensity towards the different objectives. Hence, the capability to understand users’ fine-grained needs towards each goal is crucial. In this paper, we present the results of a user study in which we monitored the way users interacted with recommended items, as well as their self-proclaimed propensities towards relevance, novelty and diversity objectives. The study was divided into several sessions, where users evaluated recommendation lists originating from a relevance-only single-objective baseline as well as MORS. We show that although MORS-based recommendations attracted fewer selections, their presence in the early sessions is crucial for users’ satisfaction in the later stages. Surprisingly, the self-proclaimed willingness of users to interact with novel and diverse items is not always reflected in the recommendations they accept. Post-study questionnaires provide insights on how to deal with this matter, suggesting that MORS-based results should be accompanied by elements that allow users to understand the recommendations, so as to facilitate their acceptance.

Full text in ACM Digital Library
RES: Ti-DC-GNN: Incorporating Time-Interval Dual Graphs for Recommender Systems
by Nikita Severin (HSE University), Andrey Savchenko (Sber AI Lab), Dmitrii Kiselev (Artificial Intelligence Research Institute (AIRI)), Maria Ivanova (Sber AI Lab), Ivan Kireev (Sber AI Lab) and Ilya Makarov (Artificial Intelligence Research Institute (AIRI)).

Recommender systems are essential for personalized content delivery and have become increasingly popular in recent years. However, traditional recommender systems are limited in their ability to capture complex relationships between users and items. Recently, dynamic graph neural networks (DGNNs) have emerged as a promising solution for improving recommender systems by incorporating temporal and sequential information in dynamic graphs. In this paper, we propose a novel method, “Ti-DC-GNN” (Time-Interval Dual Causal Graph Neural Networks), based on an intermediate representation of graph evolution as a sequence of time-interval graphs. The main parts of the method are two novel forms of interval graphs, the graph of causality and the graph of consequence, which explicitly preserve inter-relationships between edges (user-item interactions). Local and global message passing are developed based on edge memory to identify both short-term and long-term dependencies. Experiments on several well-known datasets show that our method consistently outperforms modern temporal GNNs with node memory alone in dynamic edge prediction tasks.

Full text in ACM Digital Library
RES: Of Spiky SVDs and Music Recommendation
by Darius Afchar (Deezer Research), Romain Hennequin (Deezer Research) and Vincent Guigue (AgroParisTech).

The truncated singular value decomposition is a widely used methodology in music recommendation for direct similar-item retrieval or embedding musical items for downstream tasks. This paper investigates a curious effect that we show naturally occurring on many recommendation datasets: spiking formations in the embedding space. We first propose a metric to quantify this spiking organization’s strength, then mathematically prove its origin tied to underlying communities of items of varying internal popularity. With this new-found theoretical understanding, we finally open the topic with an industrial use case of estimating how music embeddings’ top-k similar items will change over time under the addition of data.

Full text in ACM Digital Library
RES: Topic-Level Bayesian Surprise and Serendipity for Recommender Systems
by Tonmoy Hasan (UNC Charlotte) and Razvan Bunescu (UNC Charlotte).

A recommender system that optimizes its recommendations solely to fit a user’s history of ratings for consumed items can create a filter bubble, wherein the user does not get to experience items from novel, unseen categories. One approach to mitigate this undesired behavior is to recommend items with high potential for serendipity, namely surprising items that are likely to be highly rated. In this paper, we propose a content-based formulation of serendipity that is rooted in Bayesian surprise and use it to measure the serendipity of items after they are consumed and rated by the user. When coupled with a collaborative-filtering component that identifies similar users, this enables recommending items with high potential for serendipity. To facilitate the evaluation of topic-level models for surprise and serendipity, we introduce a dataset of book reading histories extracted from Goodreads, containing over 26 thousand users and close to 1.3 million books, where we manually annotate 450 books read by 4 users in terms of their time-dependent, topic-level surprise. Experimental evaluations show that models that use Bayesian surprise correlate much better with the manual annotations of topic-level surprise than distance-based heuristics, and also obtain better serendipitous item recommendation performance.
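For readers unfamiliar with the term, Bayesian surprise is commonly defined as the KL divergence between a posterior belief (after an observation) and the prior belief (before it). The sketch below applies that definition to a smoothed categorical distribution over topics. The topic names, the additive smoothing, and all function names are assumptions made for illustration; the paper's content-based formulation may differ.

```python
import math


def topic_distribution(counts, topics, alpha=1.0):
    """Turn topic counts into a smoothed probability distribution."""
    total = sum(counts.get(t, 0.0) for t in topics) + alpha * len(topics)
    return {t: (counts.get(t, 0.0) + alpha) / total for t in topics}


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same topics."""
    return sum(p[t] * math.log(p[t] / q[t]) for t in p if p[t] > 0)


def bayesian_surprise(history_counts, book_counts, topics, alpha=1.0):
    """KL(posterior || prior), where the posterior also counts the new book."""
    prior = topic_distribution(history_counts, topics, alpha)
    updated = {
        t: history_counts.get(t, 0.0) + book_counts.get(t, 0.0) for t in topics
    }
    posterior = topic_distribution(updated, topics, alpha)
    return kl_divergence(posterior, prior)


if __name__ == "__main__":
    topics = ["fantasy", "history", "science"]  # hypothetical topics
    history = {"fantasy": 10}  # hypothetical reading history
    print(bayesian_surprise(history, {"fantasy": 1}, topics))
    print(bayesian_surprise(history, {"science": 1}, topics))
```

In this toy setup, a science book yields a larger surprise than another fantasy book for a reader whose history is all fantasy, because it shifts the topic distribution further from the prior.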

Full text in ACM Digital Library
RES: Progressive Horizon Learning: Adaptive Long Term Optimization for Personalized Recommendation
by Congrui Yi (Amazon), David Zumwalt (Amazon), Zijian Ni (Amazon) and Shreya Chakrabarti (Amazon).

As B2C companies such as Amazon, Netflix, and Spotify scale, personalized recommender systems are often needed to further drive long-term business growth in acquisition, engagement, and retention of customers. However, long-term metrics associated with these goals can require several months to mature. Additionally, deep personalization demands a large volume of training data that take a long time to collect. These factors incur substantial lead time for training a model to optimize a long-term metric. Before such a model is deployed, a recommender system has to rely on a simple policy (e.g. random) to collect customer feedback data for training, inflicting high opportunity cost and delaying optimization of the target metric. Besides, as customer preferences can shift over time, a large temporal gap between inputs and outcome poses a high risk of data staleness and suboptimal learning. Existing approaches involve various compromises. For instance, contextual bandits often optimize short-term surrogate metrics with simple model structure, which can be suboptimal in the long run, while Reinforcement Learning approaches rely on an abundance of historical data for offline training, which essentially means long lead time before deployment. To address these problems, we propose Progressive Horizon Learning Recommender (PHLRec), a personalized model that can progressively learn metric patterns and adaptively evolve from short- to long-term optimization over time. Through simulations and real data experiments, we demonstrated that PHLRec outperforms competing methods, achieving optimality in both deployment speed and long-term metric performance.

Full text in ACM Digital Library
RES: Stability of Explainable Recommendation
by Sairamvinay Vijayaraghavan (Department of Computer Science, University of California, Davis) and Prasant Mohapatra (Department of Computer Science, University of California, Davis).

Explainable Recommendation has been gaining attention over the last few years in industry and academia. Explanations provided along with recommendations for each user in a recommender system framework have many uses, particularly reasoning why a suggestion is provided and how well an item aligns with a user’s personalized preferences. Hence, explanations can play a huge role in influencing users to purchase products. However, the reliability of the explanations under varying scenarios has not been strictly verified from an empirical perspective. Unreliable explanations can bear strong consequences, such as attackers leveraging explanations to manipulate and tempt users into purchasing target items that the attackers want to promote. In this paper, we study the vulnerability of existing feature-oriented explainable recommenders, particularly analyzing their performance under different levels of external noise added to model parameters. We conducted experiments by analyzing three important state-of-the-art explainable recommenders trained on two widely used e-commerce recommendation datasets of different scales. We observe that all the explainable models are vulnerable to increased noise levels. Experimental results verify our hypothesis that the ability to explain recommendations decreases with increasing noise levels and that adversarial noise in particular contributes to a much stronger decrease. Our study presents an empirical verification on the topic of robust explanations in recommender systems, which can be extended to different types of explainable recommenders.

Full text in ACM Digital Library
RES: Interpretable User Retention Modeling in Recommendation
by Rui Ding (Northeastern University), Ruobing Xie (WeChat, Tencent), Xiaobo Hao (WeChat, Tencent), Xiaochun Yang (Northeastern University), Kaikai Ge (WeChat, Tencent), Xu Zhang (WeChat, Tencent), Jie Zhou (WeChat, Tencent) and Leyu Lin (WeChat, Tencent).

Recommendation usually focuses on immediate accuracy metrics like CTR as training objectives. User retention rate, which reflects the percentage of today’s users that will return to the recommender system in the next few days, should be paid more attention to in real-world systems. User retention is the most intuitive and accurate reflection of user long-term satisfaction. However, most existing recommender systems are not focused on user retention-related objectives, since their complexity and uncertainty make it extremely hard to discover why a user will or will not return to a system and which behaviors affect user retention. In this work, we conduct a series of preliminary explorations on discovering and making full use of the reasons for user retention in recommendation. Specifically, we make a first attempt to design a rationale contrastive multi-instance learning framework to explore the rationale and improve the interpretability of user retention. Extensive offline and online evaluations with detailed analyses of a real-world recommender system verify the effectiveness of our user retention modeling. We further reveal the real-world interpretable factors of user retention from both user surveys and explicit negative feedback quantitative analyses to facilitate future model designs.

Full text in ACM Digital Library
RES: Deep Exploration for Recommendation Systems
by Zheqing Zhu (Meta AI, Stanford University) and Benjamin Van Roy (Stanford University).

Modern recommendation systems ought to benefit by probing for and learning from delayed feedback. Research has tended to focus on learning from a user’s response to a single recommendation. Such work, which leverages methods of supervised and bandit learning, forgoes learning from the user’s subsequent behavior. Where past work has aimed to learn from subsequent behavior, there has been a lack of effective methods for probing to elicit informative delayed feedback. Effective exploration through probing for delayed feedback becomes particularly challenging when rewards are delayed and sparse. To address this, we develop deep exploration methods for recommendation systems. In particular, we formulate recommendation as a sequential decision problem and demonstrate benefits of deep exploration over single-step exploration. Our experiments are carried out with high-fidelity industrial-grade simulators and establish large improvements over existing algorithms.

Full text in ACM Digital Library
RES: Ex2Vec: Characterizing Users and Items from the Mere Exposure Effect
by Bruno Sguerra (Deezer Research) and Romain Hennequin (Deezer Research).

The traditional recommendation framework seeks to connect users and content by finding the best possible match based on users’ past interactions. However, a good content recommendation is not necessarily similar to what the user has chosen in the past. One limitation of basing future interaction on what happened in the past is that it ignores the fact that both sides of the problem are dynamic. As humans, users naturally evolve, learn, forget, and get bored; they change their perspective of the world and, in consequence, of the recommendable content. In this work we present Ex2Vec, our framework for accounting for the dynamics of the human side of the recommendation problem. We introduce the Mere Exposure Effect as a common phenomenon in music streaming platforms. We then present our model, which leverages the effect to jointly characterize users and music. We validate our model by predicting future music consumption based on repetition and discuss its implications.

Full text in ACM Digital Library
RES: TALLRec: An Effective and Efficient Tuning Framework to Align Large Language Model with Recommendation
by Keqin Bao (University of Science and Technology of China), Jizhi Zhang (University of Science and Technology of China), Yang Zhang (University of Science and Technology of China), Wenjie Wang (National University of Singapore), Fuli Feng (University of Science and Technology of China) and Xiangnan He (University of Science and Technology of China).

The impressive performance of Large Language Models (LLMs) across various fields has encouraged researchers to investigate their potential in recommendation tasks. To harness the LLMs’ extensive knowledge and powerful generalization abilities, initial efforts have tried to design instructions for recommendation tasks through In-context Learning. However, the recommendation performance of LLMs remains limited due to (i) significant differences between LLMs’ language-related pre-training tasks and recommendation tasks, and (ii) inadequate recommendation data during the LLMs’ pre-training. To fill the gap, we consider further tuning LLMs for recommendation tasks. To this end, we propose a lightweight tuning framework for LLMs-based recommendation, namely LLM4Rec, which constructs the recommendation data as tuning samples and utilizes LoRA for lightweight tuning. We conduct experiments on two datasets, validating that LLM4Rec is highly efficient w.r.t. computing costs (e.g., a single RTX 3090 is sufficient for tuning LLaMA-7B), and meanwhile, it can significantly enhance the recommendation capabilities of LLMs in the movie and book domains, even with limited tuning samples (< 100 samples). Furthermore, LLM4Rec exhibits strong generalization ability in cross-domain recommendation. Our code and data are available at https://anonymous.4open.science/r/LLM4rec.
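The abstract above mentions LoRA for lightweight tuning. As a rough reminder of the idea, LoRA keeps a pretrained weight matrix W frozen and learns a low-rank update, so the effective weight is W + (alpha / r) * B A, where only A and B are trained. The sketch below shows that arithmetic in plain Python; the helper names and the tiny matrices are hypothetical, and this is not the framework's actual training code.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def lora_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A.

    w: frozen weight, shape d_out x d_in
    a: trainable factor, shape r x d_in
    b: trainable factor, shape d_out x r (usually initialised to zeros)
    """
    rank = len(a)
    delta = matmul(b, a)
    scale = alpha / rank
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


def trainable_parameters(d_out, d_in, rank):
    """Number of parameters in A and B, versus d_out * d_in for full tuning."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    w = [[1.0, 0.0], [0.0, 1.0]]
    a = [[1.0, 2.0]]      # rank r = 1
    b = [[1.0], [0.0]]
    print(lora_weight(w, a, b, alpha=2.0))
    print(trainable_parameters(4096, 4096, rank=8), "vs", 4096 * 4096)
```

With B initialised to zeros the effective weight equals W, so tuning starts from the pretrained model's behavior; the parameter count shows why a low rank keeps tuning cheap.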

Full text in ACM Digital Library
RES: Initiative transfer in conversational recommender systems
by Yuan Ma (University of Duisburg-Essen) and Jürgen Ziegler (University of Duisburg-Essen).

Conversational recommender systems (CRS) are increasingly designed to offer mixed-initiative dialogs in which the user and the system can take turns in starting a communicative exchange, for example, by asking questions or stating preferences. However, whether and when users make use of the mixed-initiative capabilities in a CRS and which factors influence their behavior are not yet well understood. We report an online study investigating user interaction behavior, especially the transfer of initiative between user and system in a real-time online CRS. We assessed the impact of dialog initiative at the system start as well as of several psychological user characteristics. To collect interaction data, we developed a CRS framework and implementation for the domain of smartphones. Two groups of participants on Prolific (total n=143) used the system, which started with either a system-initiated or a user-initiated dialog. In addition to interaction data, we measured several psychological factors as well as users’ subjective assessment of the system through questionnaires. We found that: 1. Most users tended to take over the initiative from the system or stay in user-initiated mode when it was offered initially. 2. Starting the dialog in user-initiated mode led to fewer interactions needed for selecting a product than starting in system-initiated mode. 3. The user’s initiative transfer was mainly affected by their personal interaction preferences (especially initiative preference). 4. The initial modes of the mixed-initiative CRS did not affect the user experience, but the occurrence of initiative transfers in the dialog negatively affected the degree of user interest and excitement. The results can inform the design and potential personalization of CRS.

Full text in ACM Digital Library
RES: Time-Aware Item Weighting for the Next Basket Recommendations
by Aleksey Romanov (National Research University Higher School of Economics), Oleg Lashinin (Tinkoff), Marina Ananyeva (National Research University Higher School of Economics) and Sergey Kolesnikov (Tinkoff.AI).

In this paper we study the next basket recommendation problem. Recent methods use different approaches to achieve better performance. However, many of them do not use information about the time of prediction and time intervals between baskets. To fill this gap, we propose a novel method, Time-Aware Item-based Weighting (TAIW), which takes timestamps and intervals into account. We provide experiments on three real-world datasets, and TAIW outperforms well-tuned state-of-the-art baselines for next-basket recommendations. In addition, we show the results of an ablation study and a case study of a few items.
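To make the role of prediction time and basket intervals concrete, here is a minimal sketch of one generic way to weight items by recency: each past basket contributes to its items a weight that decays exponentially with the gap to the prediction time. The exponential form, the `decay` parameter, and the function names are assumptions chosen for illustration; this is not the TAIW method from the paper.

```python
import math
from collections import defaultdict


def time_decayed_scores(baskets, predict_time, decay=0.1):
    """Score items by summing exp(-decay * gap) over baskets containing them.

    baskets: list of (timestamp, items) pairs; timestamps share one unit
    (for example days) with predict_time.
    """
    scores = defaultdict(float)
    for timestamp, items in baskets:
        gap = predict_time - timestamp
        if gap < 0:
            continue  # skip baskets that happen after the prediction time
        weight = math.exp(-decay * gap)
        for item in set(items):
            scores[item] += weight
    return dict(scores)


def recommend(baskets, predict_time, k=2, decay=0.1):
    """Return the k highest-scoring items, breaking ties alphabetically."""
    scores = time_decayed_scores(baskets, predict_time, decay)
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]


if __name__ == "__main__":
    history = [  # hypothetical purchase history, timestamps in days
        (0, ["milk", "bread"]),
        (1, ["milk"]),
        (29, ["eggs"]),
        (30, ["eggs", "bread"]),
    ]
    print(recommend(history, predict_time=31))
    print(recommend(history, predict_time=31, decay=0.0))
```

Setting `decay=0.0` reduces the scores to plain purchase counts, which shows what is lost when timestamps are ignored: in the toy history all three items tie on frequency, while the decayed scores favor the recently bought ones.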

Full text in ACM Digital Library
RES: Is ChatGPT Fair for Recommendation? Evaluating Fairness in Large Language Model Recommendation
by Jizhi Zhang (University of Science and Technology of China), Keqin Bao (University of Science and Technology of China), Yang Zhang (University of Science and Technology of China), Wenjie Wang (National University of Singapore), Fuli Feng (University of Science and Technology of China) and Xiangnan He (University of Science and Technology of China).

The resounding triumph of the Large Language Models (LLMs) has ushered in a novel LLM for recommendation (LLM4rec) paradigm. Notwithstanding, the capacity of LLM4rec to provide equitable recommendations remains uncharted due to the potential presence of societal prejudices in LLMs. In order to avert the plausible hazard of employing LLM4rec, we scrutinize the fairness of LLM4rec with respect to the users’ sensitive attributes. Owing to the disparity between LLM4rec and the conventional recommendation paradigm, there are challenges in utilizing the conventional recommendation fairness benchmark directly. To explore the fairness of recommendations under the LLM4rec, we propose a new benchmark Fairness in Large language models for Recommendation (FairLR), which consists of carefully designed metrics and a dataset that considers eight sensitive attributes in two recommendation scenarios: music and movie. We utilize our FairLR benchmark to examine ChatGPT and expose that it still demonstrates bias towards certain sensitive attributes while making recommendations. Our code and dataset can be found at https://anonymous.4open.science/r/FairLR-751D/.

Full text in ACM Digital Library
RES: Multiple Connectivity Views for Session-based Recommendation
by Yaming Yang (School of Artificial Intelligence, Peking University), Jieyu Zhang (University of Washington), Yujing Wang (School of Artificial Intelligence, Peking University), Zheng Miao (School of Artificial Intelligence, Peking University) and Yunhai Tong (Peking University).

Session-based recommendation (SBR), which makes the next-item recommendation based on previous anonymous actions, has drawn increasing attention. The last decade has seen multiple deep learning-based modeling choices applied on SBR successfully, e.g., recurrent neural networks (RNNs), convolutional neural networks (CNNs), graph neural networks (GNNs), and each modeling choice has its intrinsic superiority and limitation. We argue that these modeling choices differentiate from each other by (1) the way they capture the interactions between items within a session and (2) the operators they adopt for composing the neural network, e.g., convolutional operator or self-attention operator.

In this work, we dive deep into the former as it is relatively unique to the SBR scenario, while the latter is shared by general neural network modeling techniques. We first introduce the concept of connectivity view to describe the different item interaction patterns at the input level. Then, we develop the Multiple Connectivity Views for Session-based Recommendation (MCV-SBR), a unified framework that incorporates different modeling choices in a single model through the lens of connectivity view. In addition, MCV-SBR allows us to effectively and efficiently explore the search space of the combinations of connectivity views by the Tree-structured Parzen Estimator Approach (TPE) algorithm. Finally, on three widely used SBR datasets, we verify the superiority of MCV-SBR by comparing the searched models with state-of-the-art baselines. We also conduct a series of studies to demonstrate the efficacy and practicability of the proposed connectivity view search algorithm, as well as other components in MCV-SBR.

Full text in ACM Digital Library
LBR: Climbing crags repetitive choices and recommendations
by Iustina Ivanova (Independent Researcher).

Outdoor sport climbing in Northern Italy attracts climbers from around the world. While this country has many rock formations, it offers enormous possibilities for adventurous people to explore the mountains. Unfortunately, this great potential causes a problem in finding suitable destinations (crags) to visit for climbing activity. Existing recommender systems in this domain address this issue and suggest potentially interesting items to climbers utilizing a content-based approach. These systems understand users’ preferences from their past logs recorded in an electronic training diary. At the same time, some sports people have a behavioral tendency to revisit the same place for subjective reasons. It might be related to weather and seasonality (for instance, some crags are suitable for climbing in winter/summer only), the users’ preferences (when climbers like specific destinations more than others), or personal goals to be achieved in sport (when climbers plan to try some routes again). Unfortunately, current climbing crags recommendations do not adapt when users demonstrate these repetitive behavior patterns. Sequential recommender systems can capture such users’ habits since their architectures were designed to model users’ next item choice by learning from their previous decision manners. To understand to which extent these sequential recommendations can predict the following crags choices in sport climbing, we analyzed a scenario when climbers show repetitious decisions. Further, we present a data set from collected climbers’ e-logs in the Arco region (Italy) and applied several sequential recommender systems architectures for predicting climbers’ following crags’ visits from their past logs. We evaluated these recommender systems offline and compared ranking metrics with the other reported results on the different data sets. 
The work concludes that sequential models achieve accuracy comparable to that reported in other studies and show promise for predicting climbers’ subsequent visits and recommending crags.

Full text in ACM Digital Library
LBR: Uncertainty-adjusted Inductive Matrix Completion with Graph Neural Networks
by Petr Kasalicky (Singapore Management University, School of Computing and Information Systems), Antoine Ledent (Singapore Management University, School of Computing and Information Systems) and Rodrigo Alves (Czech Technical University, Faculty of Information Technology).

We propose a robust recommender systems model which performs matrix completion and a ratings-wise uncertainty estimation jointly. Whilst the prediction module is purely based on an implicit low-rank assumption imposed via nuclear norm regularization, our loss function is augmented by an uncertainty estimation module which learns an anomaly score for each individual rating via a Graph Neural Network: data points deemed more anomalous by the GNN are downregulated in the loss function used to train the low-rank module. The whole model is trained in an end-to-end fashion, allowing the anomaly detection module to tap on the supervised information available in the form of ratings. Thus, our model’s predictors enjoy the favourable generalization properties that come with being chosen from small function space (i.e., low-rank matrices), whilst exhibiting the robustness to outliers and flexibility that comes with deep learning methods. Furthermore, the anomaly scores themselves contain valuable qualitative information. Experiments on various real-life datasets demonstrate that our model outperforms standard matrix completion and other baselines, confirming the usefulness of the anomaly detection module.

Full text in ACM Digital Library
LBR: An Exploration of Sentence-Pair Classification for Algorithmic Recruiting
by Mesut Kaya (Aalborg University Copenhagen) and Toine Bogers (IT University of Copenhagen).

Recent years have seen a rapid increase in the application of computational approaches to different HR tasks, such as algorithmic hiring, skill extraction, and monitoring of employee satisfaction. Much of the recent work on estimating the fit between a person and a job has used representation learning to represent both resumes and job vacancies computationally and determine the degree to which they match. A common approach to this task is Sentence-BERT, which uses a Siamese network to encode resumes and job descriptions into fixed-length vectors and estimates how well they match based on the similarity between those vectors. In our paper, we adapt BERT’s next-sentence prediction task—predicting whether one sentence is likely to follow another in a given context—to the task of matching resumes with job descriptions. Using historical data on past (mis)matches between job-resume pairs, we fine-tune BERT for this downstream task. Through a combination of offline and online experiments on data from a large Scandinavian job portal, we show that this approach performs significantly better than Sentence-BERT and other state-of-the-art approaches for determining person-job fit.

Full text in ACM Digital Library
LBR: Power Loss Function in Neural Networks for Predicting Click-Through Rate
by Ergun Biçici (Huawei R&D Center Turkey).

Loss functions guide machine learning models towards concentrating on the error most important to improve upon. We introduce power loss functions for neural networks and apply them on imbalanced click-through rate datasets. Power loss functions decrease the loss for confident predictions and increase the loss for error-prone predictions. They improve both AUC and F1 and produce better calibrated results. We obtain improvements in the results on four different classifiers and on two different datasets. We obtain significant improvements in AUC that reach 0.44% for DeepFM on the Avazu dataset.

Full text in ACM Digital Library
LBR: Towards Health-Aware Fairness in Food Recipe Recommendation
by Mehrdad Rostami (University of Oulu), Mohammad Aliannejadi (University of Amsterdam) and Mourad Oussalah (University of Oulu).

Food recommendation systems play a crucial role in suggesting personalized recommendations designed to help users find food and recipes that align with their preferences. However, many existing food recommendation systems have overlooked the important aspect of considering the health and nutritional value of recommended foods, thereby limiting their effectiveness in generating truly healthy recommendations. Our preliminary analysis indicates that users tend to respond positively to unhealthy food and recipes. As a result, existing food recommender systems that neglect health considerations often assign high scores to popular items, inadvertently encouraging unhealthy choices among users. In this study, we propose the development of a fairness-based model that prioritizes health considerations. Our model incorporates fairness constraints from both the user and item perspectives, integrating them into a joint objective framework. Experimental results conducted on real-world food datasets demonstrate that the proposed system not only maintains the ability of food recommendation systems to suggest users’ favorite foods but also improves the health factor compared to unfair models, with an average enhancement of approximately 35%.

Full text in ACM Digital Library
LBR A Model-Agnostic Framework for Recommendation via Interest-aware Item Embeddings
by Amit Kumar Jaiswal (University of Surrey) and Yu Xiong (University of Surrey).

Item representation holds significant importance in recommendation systems, which encompass domains such as news, retail, and videos. Retrieval and ranking models utilise item representation to capture the user-item relationship based on user behaviours. Existing representation learning methods primarily focus on optimising item-based mechanisms, such as attention and sequential modelling. However, these methods lack a modelling mechanism to directly reflect user interests within the learned item representations. Consequently, they capture user interests only indirectly and may therefore be less effective. To address this challenge, we propose a novel Interest-aware Capsule network (IaCN) recommendation model, a model-agnostic framework that directly learns interest-oriented item representations. IaCN serves as an auxiliary task, enabling the joint learning of both item-based and interest-based representations. This framework adopts existing recommendation models without requiring substantial redesign. We evaluate the proposed approach on benchmark datasets, exploring various scenarios involving different deep neural networks, behaviour sequence lengths, and joint learning ratios of interest-oriented item representations. Experimental results demonstrate significant performance enhancements across diverse recommendation models, validating the effectiveness of our approach.

Full text in ACM Digital Library
DEM Re2Dan: Retrieval of medical documents for e-Health in Danish
by Antonela Tommasel (ISISTAN Research Institute, CONICET-UNCPBA), Rafael Pablos (Aarhus Universitet) and Ira Assent (Aarhus Universitet).

With the clinical environment becoming more data-reliant, healthcare professionals now have unparalleled access to comprehensive clinical information from numerous sources. One of the main issues, then, is how to avoid overloading practitioners with large amounts of (irrelevant) information while guiding them to the relevant documents for specific patient cases. Additional challenges arise from the shortness of queries and the presence of long (and possibly noisy) contextual information. This demo presents Re2Dan, a web-based retrieval and recommender system for Danish medical documents. Re2Dan leverages several techniques to improve the quality of retrieved documents. First, it combines lexical and semantic searches to understand the meaning and context of user queries, allowing the retrieval of documents that are conceptually similar to the user’s query. Second, it recommends similar queries, allowing users to discover related documents and insights. Third, when given contextual information (e.g., from patients’ clinical notes), it suggests medical concepts to expand the user query, enabling a more focused search scope and thus obtaining more accurate recommendations. Preliminary analyses showed the effectiveness of the recommender in improving the relevance and comprehensiveness of recommendations, thereby assisting healthcare professionals in finding relevant information for informed decision-making.

Full text in ACM Digital Library
DEM Improving Group Recommendations using Personality, Dynamic Clustering and Multi-Agent MicroServices
by Patrícia Alves (GECAD/LASI – ISEP, Polytechnic of Porto), André Martins (GECAD/LASI – ISEP, Polytechnic of Porto), Paulo Novais (ALGORITMI/LASI, University of Minho) and Goreti Marreiros (GECAD/LASI, ISEP, Polytechnic of Porto).

The complexity associated with group recommendations calls for strategies to mitigate several problems, such as the group’s heterogeneity and conflicting preferences, the emotional contagion phenomenon, the cold-start problem, and the group members’ needs and concerns, while providing recommendations that satisfy all members at once. In this demonstration, we show how we implemented a Multi-Agent Microservice to represent the tourists in a mobile Group Recommender System for Tourism prototype. A novel dynamic clustering process is presented to help minimize the group’s heterogeneity and conflicting preferences. To help solve the cold-start problem, the preliminary tourist attraction preferences and travel-related preferences and concerns are predicted using the tourists’ personality, while taking the tourists’ disabilities and fears into account. Although no previous interaction data is needed to build the tourists’ profiles, since we predict the tourists’ preferences, the tourist agents learn from each other by using association rules to find patterns in the tourists’ profiles and in the ratings given to Points of Interest to refine the recommendations.

Full text in ACM Digital Library
IND Nonlinear Bandits Exploration for Recommendations
by Yi Su (Google) and Minmin Chen (Google).

The paradigm of framing recommendations as (sequential) decision-making processes has gained significant interest. To achieve long-term user satisfaction, these interactive systems need to strike a balance between exploitation (recommending high-reward items) and exploration (exploring uncertain regions for potentially better items). Classical bandit algorithms like Upper-Confidence-Bound and Thompson Sampling, and their contextual extensions with linear payoffs, have exhibited strong theoretical guarantees and empirical success in managing the exploration-exploitation trade-off. However, building efficient exploration-based systems for real-world, large-scale industrial recommender systems powered by deep neural networks remains understudied. In addition, these systems are often multi-stage, multi-objective and response-time sensitive. In this talk, we share our experience in addressing these challenges in building exploration-based industrial recommender systems. Specifically, we adopt the Neural Linear Bandit algorithm, which effectively combines the representation power of deep neural networks with the simplicity of linear bandits, to incorporate exploration in DNN-based recommender systems. We introduce exploration capability to both the nomination and ranking stages of the industrial recommender system. In the context of the ranking stage, we delve into the extension of this algorithm to accommodate the multi-task setup, enabling exploration in systems with multiple objectives. Moving on to the nomination stage, we address the development of efficient bandit algorithms tailored to factorized bi-linear models. These algorithms play a crucial role in facilitating maximum inner product search, which is commonly employed in large-scale retrieval systems. We validate our algorithms and present findings from real-world live experiments.

Full text in ACM Digital Library
IND Navigating the Feedback Loop in Recommender Systems: Insights and Strategies from Industry Practice
by Ding Tong (Netflix), Qifeng Qiao (Netflix), Ting-Po Lee (Netflix), James McInerney (Netflix) and Justin Basilico (Netflix).

Understanding and measuring the impact of feedback loops in industrial recommender systems is challenging, which leads to the deterioration they cause being underestimated. In this study, we define open and closed feedback loops and investigate the unique reasons behind the emergence of feedback loops in the industry, drawing from real-world examples that have received limited attention in prior research. We highlight the measurement challenges associated with capturing the full impact of feedback loops using traditional online A/B tests. To address this, we propose the use of offline evaluation frameworks as surrogates for long-term feedback loop bias, supported by a practical simulation system using real data. Our findings provide valuable insights for optimizing the performance of recommender systems operating under feedback loop conditions.

Full text in ACM Digital Library
IND Leveling Up the Peloton Homescreen: A System and Algorithm for Dynamic Row Ranking
by Natalia Chen (Peloton Interactive), Nganba Meetei (Peloton Interactive), Nilothpal Talukder (Peloton Interactive) and Alexey Zankevich (Peloton Interactive).

At Peloton, we constantly strive to improve the member experience by highlighting personalized content that speaks to each individual user. One area of focus is our landing page, the homescreen, consisting of numerous rows of class recommendations used to captivate our users and guide them through our growing catalog of workouts. In this paper, we discuss a strategy we have used to increase the rate of workouts started from our homescreen through a Thompson sampling approach to row ranking, enhanced further with a collaborative filtering method based on user similarity calculated from workout history.
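
As a rough illustration of Thompson sampling applied to row ranking, the sketch below keeps a Beta posterior over each row's workout-start rate and orders rows by one random draw from each posterior. It is not the Peloton system: the `RowRanker` class, the row names, and the Beta(1, 1) prior are hypothetical, and the collaborative filtering enhancement mentioned in the abstract is left out.

```python
import random


class RowRanker:
    """Beta-Bernoulli Thompson sampling over homescreen rows (illustrative only)."""

    def __init__(self, rows, seed=None):
        # Hypothetical Beta(1, 1) prior: one pseudo-success and one pseudo-failure per row.
        self.successes = {row: 1 for row in rows}
        self.failures = {row: 1 for row in rows}
        self.rng = random.Random(seed)

    def rank(self):
        # Draw one plausible start rate per row, then sort rows by that draw.
        draws = {
            row: self.rng.betavariate(self.successes[row], self.failures[row])
            for row in self.successes
        }
        return sorted(draws, key=draws.get, reverse=True)

    def update(self, row, started_workout):
        # Record whether an impression of this row led to a workout start.
        if started_workout:
            self.successes[row] += 1
        else:
            self.failures[row] += 1


# Example usage with made-up row names.
ranker = RowRanker(["new_classes", "popular", "for_you"], seed=0)
order = ranker.rank()
ranker.update(order[0], started_workout=True)
```

Rows with little feedback have wide posteriors, so they occasionally draw a high value and get shown near the top; as evidence accumulates, the ordering settles on the rows with the highest observed start rates.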

Full text in ACM Digital Library
IND Creating the next generation of news experience on ekstrabladet.dk with recommender systems
by Johannes Kruse (DTU Compute & Ekstra Bladet), Kasper Lindskow (Ekstra Bladet), Michael Riis Andersen (DTU Compute) and Jes Frellsen (DTU Compute).

With the uptake of algorithmic personalization, news organizations increasingly have to trust automated systems with decisions previously considered editorial, e.g., prioritizing news for readers. In a case study carried out by Ekstra Bladet, the Platform Intelligent News project demonstrates how recommender systems successfully enhanced click-through rates (CTR) for multiple segments at ekstrabladet.dk while still prioritizing the news organization’s editorial values.

Full text in ACM Digital Library
IND From Research to Production: Towards Scalable and Sustainable Neural Recommendation Models on Commodity CPU Hardware
by Vihan Lakshman (ThirdAI), Anshumali Shrivastava (Rice University/ThirdAI), Tharun Medini (ThirdAI), Nicholas Meisburger (ThirdAI Corp), Joshua Engels (ThirdAI), David Torres Ramos (ThirdAI), Benito Geordie (ThirdAI), Pratik Pranav (ThirdAI), Shubh Gupta (ThirdAI), Yashwanth Adunukota (ThirdAI) and Siddharth Jain (ThirdAI).

In the last decade, large-scale deep learning has fundamentally transformed industrial recommendation systems. However, this revolutionary technology remains prohibitively expensive due to the need for costly and scarce specialized hardware, such as GPUs, to train and serve models. In this talk, we share our multi-year journey at ThirdAI in developing efficient neural recommendation models that can be trained and deployed on commodity CPU machines without the need for costly accelerators like GPUs. In particular, we discuss the limitations of the current GPU-based ecosystem in machine learning, why recommendation systems are amenable to the strengths of CPU devices, and present results from our efforts to translate years of academic research into a deployable system that fundamentally shifts the economics of training and operating large-scale machine learning models.

Full text in ACM Digital Library
IND Contextual Multi-Armed Bandit for Email Layout Recommendation
by Yan Chen (Wayfair), Emilian Vankov (Wayfair), Linas Baltrunas (Netflix), Preston Donovan (Wayfair), Akash Mehta (Wayfair) and Benjamin Schroeder (Wayfair).

We present the use of a contextual multi-armed bandit approach to improve the personalization of marketing emails sent to Wayfair’s customers. Emails are a critical outreach tool as they economically unlock a significant amount of revenue. We describe how we formulated our problem of selecting the optimal personalized email layout to use as a contextual multi-armed bandit problem. We also explain how we approximated a solution with an Epsilon-greedy strategy. We detail the thorough evaluations we ran, including offline experiments, an off-policy evaluation, and an online A/B test. Our results demonstrate that our approach is able to select personalized email layouts that lead to significant gains in topline business metrics including engagement and conversion rates.
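
The epsilon-greedy strategy named above can be sketched as follows. This is a minimal illustration under loud assumptions, not Wayfair's implementation: the context is reduced to a single hashable segment label, rewards are tracked as running means per (context, layout) pair, and the `EpsilonGreedyLayoutPicker` class and its parameters are hypothetical.

```python
import random
from collections import defaultdict


class EpsilonGreedyLayoutPicker:
    """Epsilon-greedy choice of an email layout per context (illustrative only)."""

    def __init__(self, layouts, epsilon=0.1, seed=None):
        self.layouts = list(layouts)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)   # (context, layout) -> number of observations
        self.means = defaultdict(float)  # (context, layout) -> mean observed reward

    def choose(self, context):
        # Explore: with probability epsilon, pick a layout uniformly at random.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.layouts)
        # Exploit: pick the layout with the highest mean reward for this context.
        return max(self.layouts, key=lambda layout: self.means[(context, layout)])

    def update(self, context, layout, reward):
        # Incremental running-mean update for the observed reward.
        key = (context, layout)
        self.counts[key] += 1
        self.means[key] += (reward - self.means[key]) / self.counts[key]
```

A production system would more plausibly replace the per-segment means with a model that predicts reward from context features; the exploration step stays the same.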

Full text in ACM Digital Library
IND Accelerating Creator Audience Building through Centralized Exploration
by Buket Baran (Spotify), Guilherme Dinis Junior (Spotify), Antonina Danylenko (Spotify), Olayinka S. Folorunso (Spotify), Gösta Forsum (Spotify), Maksym Lefarov (Spotify), Lucas Maystre (Spotify) and Yu Zhao (Spotify).

On Spotify, multiple recommender systems enable personalized user experiences across a wide range of product features. These systems are owned by different teams and serve different goals, but all of these systems need to explore and learn about new content as it appears on the platform. In this work, we describe ongoing efforts at Spotify to develop an efficient solution to this problem, by centralizing content exploration and providing signals to existing, decentralized recommendation systems (a.k.a. exploitation systems). We take a creator-centric perspective, and argue that this approach can dramatically reduce the time it takes for new content to reach its full potential.

Full text in ACM Digital Library
IND Efficient Data Representation Learning in Google-scale Systems
by Derek Cheng (Google DeepMind), Ruoxi Wang (Google DeepMind), Wang-Cheng Kang (Google DeepMind), Benjamin Coleman (Google DeepMind), Yin Zhang (Google DeepMind), Jianmo Ni (Google DeepMind), Jonathan Valverde (Google DeepMind), Lichan Hong (Google DeepMind) and Ed Chi (Google DeepMind).

“Garbage in, garbage out” is a familiar maxim to ML practitioners and researchers, because the quality of a learned data representation is crucial to the quality of any ML model that consumes it as an input. To handle systems that serve billions of users at millions of queries per second (QPS), we need representation learning algorithms with significantly improved efficiency. At Google, we have dedicated thousands of iterations to developing a set of powerful techniques that efficiently learn high-quality data representations. We have thoroughly validated these methods through offline evaluation and online A/B testing, and deployed them in over 50 models across major Google products. In this paper, we consider a generalized data representation learning problem that allows us to identify feature embeddings and crosses as common challenges. We propose two solutions: 1. Multi-size Unified Embedding to learn high-quality embeddings; and 2. Deep Cross Network V2 for learning effective feature crosses. We discuss the practical challenges we encountered and solutions we developed during deployment to production systems, compare with SOTA methods, and report offline and online experimental results. This work sheds light on the challenges and opportunities for developing next-gen algorithms for web-scale systems.
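
To make the idea of an explicit feature cross concrete, the sketch below computes one cross layer in the form commonly given for Deep Cross Network V2, x_{l+1} = x_0 ⊙ (W x_l + b) + x_l, where ⊙ is the element-wise product. It is a toy, dependency-free illustration: the function name and the example weights are hypothetical, and nothing here reflects the deployed models described in the abstract.

```python
def cross_layer(x0, xl, weight, bias):
    """One cross layer: x_{l+1} = x0 * (W @ xl + b) + xl, with * applied element-wise."""
    # Linear projection of the current layer input: W @ xl + b.
    projected = [
        sum(w_ij * x_j for w_ij, x_j in zip(row, xl)) + b_i
        for row, b_i in zip(weight, bias)
    ]
    # Multiply element-wise by the original input x0, then add the residual xl.
    return [x0_i * p_i + xl_i for x0_i, p_i, xl_i in zip(x0, projected, xl)]


# Toy example: two features, identity weight matrix, zero bias (all values made up).
x0 = [1.0, 2.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
x1 = cross_layer(x0, x0, identity, [0.0, 0.0])
```

Because every layer multiplies by the original input again, stacking layers raises the degree of the feature interactions the network can represent, while the residual term keeps the lower-order terms.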

Full text in ACM Digital Library
IND Beyond Labels: Leveraging Deep Learning and LLMs for Content Metadata
by Saurabh Agrawal (Tubi), John Trenkle (Tubi) and Jaya Kawale (Tubi).

Content metadata plays a very important role in movie recommender systems as it provides valuable information about various aspects of a movie such as genre, cast, plot synopsis, box office summary, etc. Analyzing the metadata can help understand user preferences and generate personalized recommendations catering to the niche tastes of the users. It can also help with content cold starting when the recommender system has little or no interaction data available to perform collaborative filtering. In this talk, we will focus on one particular type of metadata – genre labels. Genre labels associated with a movie or a TV series, such as “horror”, “comedy” or “romance”, help categorize a collection of movies into different themes and correspondingly set up the audience’s expectations for a title. We present some of the challenges associated with using genre label information via traditional methods and propose a new way of examining the genre information that we call the Genre Spectrum. The Genre Spectrum helps capture the various nuanced genres in a title, and our offline and online experiments corroborate the effectiveness of the approach.

Full text in ACM Digital Library
DS User-Centric Conversational Recommendation: Adapting the Need of User with Large Language Models
by Gangyi Zhang (University of Science and Technology of China).

Conversational recommender systems (CRS) promise to provide a more natural user experience for exploring and discovering items of interest through ongoing conversation. However, effectively modeling user preferences during conversations and generating personalized recommendations in real time remain challenging problems. Users often express their needs in a vague and evolving manner, and CRS must adapt to capture the dynamics and uncertainty in user preferences to have productive interactions.

This research develops user-centric methods for building conversational recommendation systems that can understand complex and changing user needs. We propose a graph-based conversational recommendation framework that represents multi-turn conversations as reasoning over a user-item-attribute graph. Enhanced conversational path reasoning incorporates graph neural networks to improve representation learning in this framework. To address uncertainty and dynamics in user preferences, we present the vague preference multi-round conversational recommendation scenario and an adaptive vague preference policy learning solution that employs reinforcement learning to determine recommendation and preference elicitation strategies tailored to the user.

Looking to the future, large language models offer promising opportunities to enhance various aspects of CRS, including user modeling, policy learning, and response generation. Overall, this research takes a user-centered perspective in designing conversational agents that can adapt to the inherent ambiguity involved in natural language dialogues with people.

Full text in ACM Digital Library
DS Advancing Automation of Design Decisions in Recommender System Pipelines
by Tobias Vente (University of Siegen).

Recommender systems have become essential in domains like streaming services, social media platforms, and e-commerce websites. However, the development of a recommender system involves a complex pipeline with preprocessing, data splitting, algorithm and model selection, and postprocessing stages. Every stage of this pipeline requires critical design decisions that influence the performance of the recommender system. To ease design decisions, automated machine learning (AutoML) techniques have been adapted to the field of recommender systems, resulting in various AutoRecSys libraries. Nevertheless, these libraries lack library independence and limit flexibility in integrating automation techniques from different sources. In response, our research aims to enhance the usability of AutoML techniques for design decisions in recommender system pipelines. We focus on developing flexible and library-independent automation techniques for algorithm selection, model selection, and postprocessing steps. By enabling developers to make informed choices and easing the recommender system development process, we decrease the developer’s effort while improving the performance of the recommender systems. Moreover, we want to analyze the cost-to-benefit ratio of automation techniques in recommender systems, evaluating the computational overhead and the resulting improvements in predictive performance. Our objective is to leverage AutoML concepts to automate design decisions in recommender system pipelines, reduce manual effort, and enhance the overall performance and usability of recommender systems.

Full text in ACM Digital Library
DS Demystifying Recommender Systems: A Multi-faceted Examination of Explanation Generation, Impact, and Perception
by Giacomo Balloccu (Università degli Studi di Cagliari).

Recommender systems have become an integral component of the digital landscape, impacting a multitude of services and industries ranging from e-commerce to entertainment and beyond. By offering personalised suggestions, these systems address a fundamental problem of our modern information society: information overload. As users face a deluge of choices, recommender systems help sift through this immense sea of possibilities, delivering a personalised subset of options that align with user preferences and historical behaviour.

However, despite their considerable utility, recommender systems often operate as “black boxes,” obscuring the rationale behind recommendations. This opacity can engender mistrust and undermine user engagement, thus attenuating the overall effectiveness of the system. Researchers have emphasized the importance of explanations in recommender systems, highlighting how explanations can enhance system transparency, foster user trust, and improve decision-making processes, thereby enriching user experiences and yielding potential business benefits. Yet, a significant gap persists in the current state of human-understandable explanations research. While recommender systems have grown increasingly complex, our capacity to generate clear, concise, and relevant explanations that reflect this complexity remains limited. Crafting explanations that are both understandable and reflective of sophisticated algorithmic decision-making processes poses a significant challenge, especially in a manner that meets the user’s cognitive and contextual needs.

Full text in ACM Digital Library
DS Enhanced Privacy Preservation for Recommender Systems
by Ziqing Wu (NTU).

My research focuses on privacy preservation for recommender systems, specifically on the following aspects: first, how to better address users’ realistic privacy concerns and offer enhanced privacy control by considering what sensitive information to share, and with whom, in decentralized recommender systems; second, how to enhance the privacy preservation capability of LLM-based recommender systems; last, how to formulate uniform metrics to compare the privacy-preservation efficacy of recommender systems.

Full text in ACM Digital Library
DS Retrieval-augmented Recommender System: Enhancing Recommender Systems with Large Language Models
by Dario Di Palma (Politecnico di Bari).

Recommender Systems (RSs) play a pivotal role in delivering personalized recommendations across various domains, from e-commerce to content streaming platforms. Recent advancements in natural language processing have introduced Large Language Models (LLMs) that exhibit remarkable capabilities in understanding and generating human-like text. RSs are renowned for their effectiveness and proficiency within clearly defined domains; nevertheless, they are limited in adaptability and incapable of providing recommendations for unexplored data. Conversely, LLMs exhibit contextual awareness and strong adaptability to unseen data. Combining these technologies creates a potent tool for delivering contextual and relevant recommendations, even in cold-start scenarios characterized by high data sparsity. The proposal aims to explore the possibilities of integrating LLMs into RSs, introducing a novel approach called Retrieval-augmented Recommender Systems, which combines the strengths of retrieval-based and generation-based models to enhance the ability of RSs to provide relevant suggestions.

Full text in ACM Digital Library


RecSys 2023 (Singapore)
