Posters Day 1

Date: Wednesday September 20
Room: Hall 405

  • [RES] A Probabilistic Position Bias Model for Short-Video Recommendation Feeds
    by Olivier Jeunen (ShareChat UK).

    Modern web-based platforms often show ranked lists of recommendations to users, in an attempt to maximise user satisfaction or business metrics. Typically, the goal of such systems boils down to maximising the exposure probability (conversely, minimising the rank) for items that are deemed “reward-maximising” according to some metric of interest. This general framing comprises music or movie streaming applications, as well as e-commerce, restaurant or job recommendations, and even web search. Position bias or user models can be used to estimate exposure probabilities for each use-case, specifically tailored to how users interact with the presented rankings. A unifying factor in these diverse problem settings is that typically only one or several items will be engaged with (clicked, streamed, purchased, et cetera) before a user leaves the ranked list. Short-video feeds on social media platforms diverge from this general framing in several ways, most notably that users do not tend to leave the feed after, for example, liking a post. Indeed, seemingly infinite feeds invite users to scroll further down the ranked list. For this reason, existing position bias or user models tend to fall short in such settings, as they do not accurately capture users’ interaction modalities. In this work, we propose a novel and probabilistically sound personalised position bias model for feed recommendations. We focus on a 1st-level feed in a hierarchical structure, where users may enter a 2nd-level feed via any given 1st-level item. We posit that users come to the platform with a given scrolling budget that is drawn according to a discrete power-law distribution, and show how the survival function of said distribution can be used to obtain closed-form estimates for personalised exposure probabilities. Empirical insights gained through data from a large-scale social media platform show that our probabilistic position bias model captures empirical exposure more accurately than existing models and paves the way for improved unbiased evaluation and learning-to-rank.

    Full text in ACM Digital Library
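
The scrolling-budget idea in the abstract above can be illustrated with a minimal sketch. It assumes a truncated discrete power law over budgets 1..max_rank with a single exponent `alpha`; the truncation, the function name and the parameter values are assumptions made for illustration, and the personalised, closed-form estimator of the paper is not reproduced here. Under this assumption, an item at rank k is exposed whenever the budget is at least k, so the exposure probability is the survival function of the budget distribution.

```python
def exposure_probabilities(alpha, max_rank):
    """Return P(budget >= k) for k = 1..max_rank.

    Assumes (hypothetically) that the scrolling budget follows a discrete
    power law P(budget = b) proportional to b ** -alpha, truncated at max_rank.
    """
    weights = [b ** -alpha for b in range(1, max_rank + 1)]
    total = sum(weights)
    exposure = [0.0] * max_rank
    tail = 0.0
    # Accumulate the tail mass from the last rank backwards (survival function).
    for k in range(max_rank - 1, -1, -1):
        tail += weights[k]
        exposure[k] = tail / total
    return exposure
```

The first rank is always exposed, and a larger `alpha` means budgets concentrate on small values, so exposure decays faster with rank.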

  • [RES] ADRNet: A Generalized Collaborative Filtering Framework Combining Clinical and Non-Clinical Data for Adverse Drug Reaction Prediction
    by Haoxuan Li (Center for Data Science, Peking University), Taojun Hu (Peking University), Zetong Xiong (Zhongnan University of Economic and Law), Chunyuan Zheng (University of California, San Diego), Fuli Feng (University of Science and Technology of China), Xiangnan He (University of Science and Technology of China) and Xiao-Hua Zhou (Peking University).

    Adverse drug reaction (ADR) prediction plays a crucial role in both health care and drug discovery for reducing patient mortality and enhancing drug safety. Recently, many studies have been devoted to effectively predicting drug-ADR incidence rates. However, these methods either did not effectively utilize non-clinical data, i.e., physical, chemical, and biological information about the drug, or did little to establish a link between content-based and pure collaborative filtering during the training phase. In this paper, we first formulate the prediction of multi-label ADRs as a drug-ADR collaborative filtering problem and, to the best of our knowledge, are the first to provide extensive benchmark results of previous collaborative filtering methods on two large publicly available clinical datasets. Then, by exploiting the easily accessible drug characteristics from non-clinical data, we propose ADRNet, a generalized collaborative filtering framework combining clinical and non-clinical data for drug-ADR prediction. Specifically, ADRNet has a shallow collaborative filtering module and a deep drug representation module, which can exploit the high-dimensional drug descriptors to further guide the learning of low-dimensional ADR latent embeddings, thereby incorporating the benefits of both collaborative filtering and representation learning. Extensive experiments are conducted on two publicly available real-world drug-ADR clinical datasets and two non-clinical datasets to demonstrate the accuracy and efficiency of the proposed ADRNet.

    Full text in ACM Digital Library

  • [RES] Using Learnable Physics for Real-Time Exercise Form Recommendations
    by Abhishek Jaiswal (Indian Institute of Technology Kanpur), Gautam Chauhan (Indian Institute of Technology Kanpur) and Nisheeth Srivastava (Indian Institute of Technology Kanpur).

    Good posture and form are essential for safe and productive exercising. Even in gym settings, trainers may not be readily available for feedback. Rehabilitation therapies and fitness workouts can thus benefit from recommender systems that provide real-time evaluation. In this paper, we present an algorithmic pipeline that can diagnose problems in exercise technique and offer corrective recommendations, with high sensitivity and specificity, in real-time. We use MediaPipe for pose recognition, count repetitions using peak-prominence detection, and use a learnable physics simulator to track motion evolution for each exercise. A test video is diagnosed based on deviations from the prototypical learned motion using statistical learning. The system is evaluated on six full and upper body exercises. These real-time interactive suggestions, delivered via low-cost equipment like smartphones, will allow exercisers to rectify potential mistakes, making self-practice feasible while reducing the risk of workout injuries.

    Full text in ACM Digital Library

  • [RES] ReCon: Reducing Congestion in Job Recommendation using Optimal Transport
    by Yoosof Mashayekhi (Ghent University), Bo Kang (Ghent University), Jefrey Lijffijt (Ghent University) and Tijl de Bie (Ghent University).

    Recommender systems may suffer from congestion, meaning that there is an unequal distribution of the items in how often they are recommended. Some items may be recommended much more than others. Recommenders are increasingly used in domains where items have limited availability, such as the job market, where congestion is especially problematic: Recommending a vacancy—for which typically only one person will be hired—to a large number of job seekers may lead to frustration for job seekers, as they may be applying for jobs where they are not hired. This may also leave vacancies unfilled and result in job market inefficiency. We propose a novel approach to job recommendation called ReCon, accounting for the congestion problem. Our approach is to use an optimal transport component to ensure a more equal spread of vacancies over job seekers, combined with a job recommendation model in a multi-objective optimization problem. We evaluated our approach on two real-world job market datasets. The evaluation results show that ReCon has good performance on both congestion-related (e.g., Congestion) and desirability (e.g., NDCG) measures.

    Full text in ACM Digital Library
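
To make the role of optimal transport in the abstract above more concrete, here is a minimal sketch of entropy-regularised optimal transport (Sinkhorn iterations) that spreads recommendation mass evenly over vacancies. It is not the ReCon training objective, which combines the transport component with a recommendation model in a multi-objective optimization. The function name, the uniform marginals and the `epsilon` value are assumptions chosen for illustration.

```python
import math


def sinkhorn_plan(scores, epsilon=0.5, n_iters=200):
    """Entropic optimal transport plan between job seekers (rows) and vacancies (columns).

    `scores` is a hypothetical relevance matrix; the cost is taken to be the
    negative score. Both marginals are assumed uniform, so every vacancy
    receives the same total mass.
    """
    n_users, n_items = len(scores), len(scores[0])
    # Gibbs kernel: a higher score means a lower cost and a larger kernel entry.
    kernel = [[math.exp(s / epsilon) for s in row] for row in scores]
    row_target = 1.0 / n_users
    col_target = 1.0 / n_items
    u = [1.0] * n_users
    v = [1.0] * n_items
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the target marginals.
        u = [row_target / sum(kernel[i][j] * v[j] for j in range(n_items))
             for i in range(n_users)]
        v = [col_target / sum(kernel[i][j] * u[i] for i in range(n_users))
             for j in range(n_items)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(n_items)]
            for i in range(n_users)]
```

Even when every job seeker scores the same vacancy highest, the column constraint forces the plan to assign equal total mass to each vacancy, which is the congestion-reducing effect the abstract describes.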

  • [RES] Optimizing Long-term Value for Auction-Based Recommender Systems via On-Policy Reinforcement Learning
    by Ruiyang Xu (Meta AI), Jalaj Bhandari (Meta AI), Dmytro Korenkevych (Meta AI), Fan Liu (Meta), Yuchen He (Meta), Alex Nikulkov (Meta AI) and Zheqing Zhu (Meta AI).

    Auction-based recommender systems are prevalent in online advertising platforms, but they are typically optimized to allocate recommendation slots based on immediate expected return metrics, neglecting the downstream effects of recommendations on user behavior. In this study, we employ reinforcement learning to optimize for long-term return metrics in an auction-based recommender system. Utilizing temporal difference learning, a fundamental reinforcement learning algorithm, we implement a one-step policy improvement approach that biases the system towards recommendations with higher long-term user engagement metrics. This optimizes value over long horizons while maintaining compatibility with the auction framework. Our approach is based on dynamic programming ideas which show that our method provably improves upon the existing auction-based base policy. Through an online A/B test conducted on an auction-based recommender system, which handles billions of impressions and users daily, we empirically establish that our proposed method outperforms the current production system in terms of long-term user engagement metrics.

    Full text in ACM Digital Library

  • [RES] Analysis Operations for Constraint-based Recommender Systems
    by Sebastian Lubos (Institute of Software Technology – Graz University of Technology), Viet-Man Le (Graz University of Technology), Alexander Felfernig (TU Graz) and Thi Ngoc Trang Tran (Graz University of Technology).

    Constraint-based recommender systems support users in the identification of complex items such as financial services and digital cameras. Such recommender systems enable users to find an appropriate item within the scope of a conversational process. In this context, relevant items are determined by matching user preferences with a corresponding product (item) assortment on the basis of a pre-defined set of constraints. The development and maintenance of constraint-based recommenders is often an error-prone activity – specifically with regard to the scoping of the offered item assortment. In this paper, we propose a set of offline analysis operations that provide insights to assess the quality of a constraint-based recommender system before the system is deployed for productive use. The operations include, among others, automated analysis of feature restrictiveness and item (product) accessibility. We analyze usage scenarios of the proposed analysis operations on the basis of a simplified example digital camera recommender.

    Full text in ACM Digital Library

  • [RES] Bootstrapped Personalized Popularity for Cold Start Recommender Systems
    by Iason Chaimalas (University College London), Duncan Walker (British Broadcasting Corporation), Edoardo Gruppi (University College London), Ben Clark (British Broadcasting Corporation) and Laura Toni (University College London).

    Recommender Systems are severely hampered by the well-known Cold Start problem, identified by the lack of information on new items and users. This has led to research efforts focused on data imputation and augmentation models as predominantly data pre-processing strategies, yet their improvement of cold-user performance is largely indirect and often comes at the price of a reduction in accuracy for warmer users. To address these limitations, we propose Bootstrapped Personalized Popularity (B2P), a novel framework that improves performance for cold users (directly) and cold items (implicitly) via popularity models personalized with item metadata. B2P is scalable to very large datasets and directly addresses the Cold Start problem, so it can complement existing Cold Start strategies. Experiments on a real-world Enterprise dataset (anonymized) and a public dataset demonstrate that B2P (1) significantly improves cold-user performance, (2) boosts warm-user performance for bootstrapped models by lowering their training sparsity, and (3) improves total recommendation accuracy at a competitive diversity level relative to existing high-performing Collaborative Filtering models. We demonstrate that B2P is a powerful and scalable framework for strongly cold datasets.

    Full text in ACM Digital Library

  • [RES] Beyond the Sequence: Statistics-driven Pre-training for Stabilizing Sequential Recommendation Model
    by Sirui Wang (Meituan Group), Peiguang Li (Meituan Group), Yunsen Xian (Meituan Group) and Hongzhi Zhang (Meituan Group).

    The sequential recommendation task aims to predict the item that a user is interested in according to his/her historical action sequence. However, inevitable random actions, i.e., a user randomly accessing an item among multiple candidates or clicking several items in random order, cause the sequence to fail to provide stable and high-quality signals. To alleviate the issue, we propose the StatisTics-Driven Pre-training framework (called STDP for short). The main idea of the work lies in exploring the use of statistical information along with the pre-training paradigm to stabilize the optimization of the recommendation model. Specifically, we derive two types of statistical information: item co-occurrence across sequences and attribute frequency within the sequence. We then design the following pre-training tasks: 1) The co-occurred items prediction task, which encourages the model to distribute its attention over multiple suitable targets instead of just focusing on the next item, which may be unstable. 2) We generate a paired sequence by replacing items with their co-occurred items and enforce its representation to be close to the original one, thus enhancing the model’s robustness to random noise. 3) To reduce the impact of random actions on the user’s long-term preferences, we encourage the model to capture sequence-level frequent attributes. The significant improvement over six datasets demonstrates the effectiveness and superiority of the proposal, and further analysis verifies the generalization of the STDP framework to other models.

    Full text in ACM Digital Library

  • [RES] Personalized Category Frequency prediction for Buy It Again recommendations
    by Amit Pande (Target), Kunal Ghosh (Target) and Rankyung Park (Target).

    Buy It Again (BIA) recommendations are crucial to retailers to help improve user experience and site engagement by suggesting items that customers are likely to buy again based on their own repeat purchasing patterns. Most existing BIA studies analyze guests’ personalized behaviour at item granularity. This finer level of granularity might be appropriate for small businesses or small datasets for search purposes. However, this approach can be infeasible for big retailers like Amazon, Walmart, or Target which have hundreds of millions of guests and tens of millions of items. For such data sets, it is more practical to have a coarse-grained model that captures customer behaviour at the item category level. In addition, customers commonly explore variants of items within the same categories, e.g., trying different brands or flavors of yogurt. A category-based model may be more appropriate in such scenarios. We propose a recommendation system called a hierarchical PCIC model that consists of a personalized category model (PC model) and a personalized item model within categories (IC model). PC model generates a personalized list of categories that customers are likely to purchase again. IC model ranks items within categories that guests are likely to reconsume within a category. The hierarchical PCIC model captures the general consumption rate of products using survival models. Trends in consumption are captured using time series models. Features derived from these models are used in training a category-grained neural network. We compare PCIC to twelve existing baselines on four standard open datasets. PCIC improves NDCG up to 16% while improving recall by around 2%. We were able to scale and train (over 8 hours) PCIC on a large dataset of 100M guests and 3M items where repeat categories of a guest outnumber repeat items. PCIC was deployed and A/B tested on the site of a major retailer, leading to significant gains in guest engagement.

    Full text in ACM Digital Library

  • [RES] Generative Next-Basket Recommendation
    by Wenqi Sun (Renmin University of China), Ruobing Xie (WeChat, Tencent), Junjie Zhang (Renmin University of China), Wayne Xin Zhao (Renmin University of China), Leyu Lin (WeChat Search Application Department, Tencent) and Ji-Rong Wen (Renmin University of China).

    Next-basket Recommendation (NBR) refers to the task of predicting a set of items that a user will purchase in the next basket. However, most existing works merely focus on the relevance between user preferences and predicted items, ignoring the essential relationships among items in the next basket, which often results in over-homogenization of items. In this work, we present a novel Generative next-basket Recommendation model (GeRec), a new NBR paradigm that generates the recommended items one by one to form the next basket via an autoregressive decoder. This generative NBR paradigm contributes to capturing and considering item relationships inside each basket in both training and serving. Moreover, we jointly consider both item- and basket-level contextual information of users to better capture their multi-granularity preferences. Extensive experiments on three real-world datasets demonstrate the effectiveness of our model.

    Full text in ACM Digital Library

  • [RES] Adversarial Sleeping Bandit Problems with Multiple Plays: Algorithm and Ranking Application
    by Jianjun Yuan (Expedia Group), Wei Lee Woon (Expedia Group) and Ludovik Coba (Expedia Group).

    This paper presents an efficient algorithm to solve the sleeping bandit with multiple plays problem in the context of an online recommendation system. The problem involves bounded, adversarial loss and unknown i.i.d. distributions for arm availability. The proposed algorithm extends the sleeping bandit algorithm for single arm selection and is guaranteed to achieve theoretical performance with regret upper bounded by $O(kN^2\sqrt{T\log T})$, where $k$ is the number of arms selected per time step, $N$ is the total number of arms, and $T$ is the time horizon.

    Full text in ACM Digital Library

  • [RES] Collaborative filtering algorithms are prone to mainstream-taste bias
    by Pantelis Analytis (University of Southern Denmark) and Philipp Hager (University of Amsterdam).

    Collaborative filtering has been the main steam engine of the recommender systems community since the early 1990s. Collaborative filtering (and other) algorithms, however, have been predominantly evaluated by aggregating results across users or user groups. These performance averages hide large disparities: an algorithm may perform very well for some users (or groups) and very poorly for others. We show that performance variation is large and systematic. In experiments on three large scale datasets and using an array of collaborative filtering algorithms, we demonstrate the large performance disparities for different users across algorithms and datasets. We then show that performance variation is systematic and that two key features that characterize users, their mean taste similarity with other users and the dispersion in taste similarity, can explain performance variation better than previously identified features. We use these two features to visualize algorithm performance for different users, and point out that this mapping can be used to capture different categories of users that have been proposed before. Our results demonstrate an extensive mainstream-taste bias in all collaborative filtering algorithms, and they imply a fundamental fairness limitation that needs to be mitigated.

    Full text in ACM Digital Library

  • [RES] Hessian-aware Quantized Node Embeddings for Recommendation
    by Huiyuan Chen (Visa Research), Kaixiong Zhou (Rice University), Kwei-Herng Lai (Rice University), Chin-Chia Michael Yeh (Visa Research), Yan Zheng (Visa Research), Xia Hu (Rice University) and Hao Yang (Visa Research).

    Graph Neural Networks (GNNs) have achieved state-of-the-art performance in recommender systems. Nevertheless, the process of searching and ranking from a large item corpus usually incurs high latency, which limits the widespread deployment of GNNs in industry-scale applications. To address this issue, many methods quantize user/item representations into the binary embedding space to reduce space requirements and accelerate inference. Also, they use the Straight-through Estimator (STE) to prevent zero gradients during back-propagation. However, the STE often causes a gradient mismatch problem, leading to sub-optimal results.

    In this work, we present the Hessian-aware Quantized GNN (HQ-GNN) as an effective solution for discrete representations of users/items that enable fast retrieval. HQ-GNN is composed of two components: a GNN encoder for learning continuous node embeddings and a quantized module for compressing full-precision embeddings into low-bit ones. Consequently, HQ-GNN benefits from both lower memory requirements and faster inference speeds compared to vanilla GNNs. To address the gradient mismatch problem in STE, we further consider the quantized errors and their second-order derivatives for better stability. The experimental results on several large-scale datasets show that HQ-GNN achieves a good balance between latency and performance.

    Full text in ACM Digital Library

  • [RES] Scalable Approximate NonSymmetric Autoencoder for Collaborative Filtering
    by Martin Spišák (GLAMI.cz and Faculty of Mathematics and Physics, Charles University, Prague, Czechia), Radek Bartyzal (GLAMI.cz), Antonín Hoskovec (GLAMI.cz and Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University in Prague, Czechia), Ladislav Peška (Faculty of Mathematics and Physics, Charles University, Prague, Czechia) and Miroslav Tůma (Faculty of Mathematics and Physics, Charles University, Prague, Czechia).

    In the field of recommender systems, shallow autoencoders have recently gained significant attention. One of the most highly acclaimed shallow autoencoders is EASE, favored for its competitive recommendation accuracy and simultaneous simplicity. However, the poor scalability of EASE (both in time and especially in memory) severely restricts its use in production environments with vast item sets. In this paper, we propose a hyperefficient factorization technique for sparse approximate inversion of the data-Gram matrix used in EASE. The resulting autoencoder, SANSA, is an end-to-end sparse solution with prescribable density and almost arbitrarily low memory requirements (even for training). As such, SANSA allows us to effortlessly scale the concept of EASE to millions of items and beyond.

    Full text in ACM Digital Library
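
For context on why the abstract above singles out the inversion of the data-Gram matrix, the standard closed-form solution of EASE is recalled below. The notation is an assumption of this note rather than something taken from the abstract: $X$ is the user-item interaction matrix, $\lambda$ the L2 regularisation weight, and $B$ the item-item weight matrix.

```latex
\begin{aligned}
\hat{B} &= \arg\min_{B} \; \lVert X - XB \rVert_F^2 + \lambda \lVert B \rVert_F^2
          \quad \text{s.t. } \operatorname{diag}(B) = 0, \\
P &= \left( X^\top X + \lambda I \right)^{-1}, \\
\hat{B}_{ij} &=
  \begin{cases}
    0 & \text{if } i = j, \\
    -\dfrac{P_{ij}}{P_{jj}} & \text{otherwise.}
  \end{cases}
\end{aligned}
```

Both $X^\top X$ and $P$ are square in the number of items, and $P$ is dense in general, which is the memory bottleneck that a sparse approximate inverse is meant to avoid.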

  • [RES] M3REC: A Meta-based Multi-scenario Multi-task Recommendation Framework
    by Zerong Lan (Dalian University of Technology), Yingyi Zhang (Dalian University of Technology) and Xianneng Li (Dalian University of Technology).

    Users in recommender systems exhibit multi-behavior in multiple business scenarios on real-world e-commerce platforms. A crucial challenge in such systems is to make recommendations for each business scenario at the same time. On top of this, multiple predictions (e.g., Click Through Rate and Conversion Rate) need to be made simultaneously in order to improve the platform revenue. Research on making recommendations for several business scenarios belongs to the field of Multi-Scenario Recommendation (MSR), while Multi-Task Recommendation (MTR) mainly attempts to solve the possible problems in collaboratively executing different recommendation tasks. However, existing research has paid attention to either MSR or MTR, ignoring the integration of MSR and MTR, which faces the issue of conflict between scenarios and tasks. To address the above issue, we propose a Meta-based Multi-scenario Multi-task RECommendation framework (M3REC) to serve multiple tasks in multiple business scenarios with a unified model. However, integrating MSR and MTR in a proper manner is non-trivial due to: 1) Unified representation problem: Users’ and items’ representations are non-i.i.d. across different scenarios and tasks, which introduces inconsistency into recommendations. 2) Synchronous optimization problem: Task distributions vary across scenarios, and a unified optimization method is needed to optimize multiple tasks in multiple scenarios. Thus, to represent users and items in a unified way, we design a Meta-Item-Embedding Generator (MIEG) and a User-Preference Transformer (UPT). The MIEG module can generate initialized item embeddings from item features through meta-learning technology, and the UPT module can transfer user preferences from other scenarios. Besides, the M3REC framework uses a specifically designed backbone network together with a task-specific aggregate gate so that all tasks can be optimized across multiple business scenarios within one model. Experiments on two public datasets have shown that M3REC outperforms the compared state-of-the-art MSR and MTR methods.

    Full text in ACM Digital Library

  • [RES] Large Language Model Augmented Narrative Driven Recommendations
    by Sheshera Mysore (University of Massachusetts Amherst), Andrew McCallum (University of Massachusetts) and Hamed Zamani (University of Massachusetts Amherst).

    Narrative-driven recommendation (NDR) presents an information access problem where users solicit recommendations with verbose descriptions of their preferences and context, for example, travelers soliciting recommendations for points of interest while describing their likes/dislikes and travel circumstances. These requests are increasingly important with the rise of natural language-based conversational interfaces for search and recommendation systems. However, NDR lacks abundant training data for models, and current platforms commonly do not support these requests. Fortunately, classical user-item interaction datasets contain rich textual data, e.g., reviews, which often describe user preferences and context — this may be used to bootstrap training for NDR models. In this work, we explore using large language models (LLMs) for data augmentation to train NDR models. We use LLMs for authoring synthetic narrative queries from user-item interactions with few-shot prompting and train retrieval models for NDR on synthetic queries and user-item interaction data. Our experiments demonstrate that this is an effective strategy for training small-parameter retrieval models that outperform other retrieval and LLM baselines for narrative-driven recommendation.

    Full text in ACM Digital Library

  • [LBR] OutRank: Speeding up AutoML-based Model Search for Large Sparse Data sets with Cardinality-aware Feature Ranking
    by Blaž Škrlj (Outbrain) and Blaž Mramor (Outbrain).

    The design of modern recommender systems relies on understanding which parts of the feature space are relevant for solving a given recommendation task. However, real-world data sets in this domain are often characterized by their large size, sparsity, and noise, making it challenging to identify meaningful signals. Feature ranking represents an efficient branch of algorithms that can help address these challenges by identifying the most informative features and facilitating the automated search for more compact and better-performing models (AutoML). We introduce OutRank, a system for versatile feature ranking and data quality-related anomaly detection. OutRank was built with categorical data in mind, utilizing a variant of mutual information that is normalized with regard to the noise produced by features of the same cardinality. We further extend the similarity measure by incorporating information on feature similarity and combined relevance. The proposed approach’s feasibility is demonstrated by speeding up the state-of-the-art AutoML system on a synthetic data set with no performance loss. Furthermore, we considered a real-life click-through-rate prediction data set where it outperformed strong baselines such as random forest-based approaches. The proposed approach enables exploration of up to 300% larger feature spaces compared to AutoML-only approaches, enabling faster search for better models on off-the-shelf hardware.

    Full text in ACM Digital Library
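
The cardinality-aware ranking described in the abstract above can be illustrated with a small sketch. It scores a categorical feature by its mutual information with the label, minus the average mutual information obtained by random features of the same cardinality. This subtraction is an assumed stand-in for the normalisation the abstract mentions; the actual OutRank formula, and the function names used here, are not taken from the source.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def noise_adjusted_mi(xs, ys, n_noise=20, seed=0):
    """Mutual information minus the mean score of same-cardinality random features.

    Hypothetical illustration: high-cardinality features score well by chance,
    so their raw score is discounted by what pure noise would achieve.
    """
    rng = random.Random(seed)
    cardinality = len(set(xs))
    noise_scores = []
    for _ in range(n_noise):
        random_feature = [rng.randrange(cardinality) for _ in xs]
        noise_scores.append(mutual_information(random_feature, ys))
    return mutual_information(xs, ys) - sum(noise_scores) / n_noise
```

With raw mutual information, a row-identifier feature looks as informative as a genuinely predictive one, since both determine the label on the training sample. After the noise adjustment the identifier is penalised heavily because random features of the same cardinality also score high.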

  • [LBR] Evaluating The Effects of Calibrated Popularity Bias Mitigation: A Field Study
    by Anastasiia Klimashevskaia (MediaFutures, University of Bergen), Mehdi Elahi (MediaFutures, University of Bergen), Dietmar Jannach (University of Klagenfurt), Lars Skjærven (TV 2), Astrid Tessem (TV 2) and Christoph Trattner (MediaFutures, University of Bergen).

    Despite their various proven benefits, Recommender Systems can cause or amplify certain undesired effects. In this paper, we focus on Popularity Bias, i.e., the tendency of a recommender system to favor recommending popular items to the user. Prior research has studied the negative impact of this type of bias on individuals and society as a whole and proposed various approaches to mitigate this in various domains. However, almost all works adopted offline methodologies to evaluate the effectiveness of the proposed approaches. Unfortunately, such offline simulations can potentially be rather simplified and unable to capture the full picture. To contribute to this line of research and given a particular lack of knowledge about how debiasing approaches work not only offline, but online as well, we present in this paper the results of a user study on a national broadcaster movie streaming platform in [country], i.e., [platform], following the A/B testing methodology. We deployed an effective mitigation approach for popularity bias, called Calibrated Popularity (CP), and monitored its performance in comparison to the platform’s existing collaborative filtering recommendation approach as a baseline over a period of almost four months. The results obtained from a large user base interacting in real-time with the recommendations indicate that the evaluated debiasing approach can be effective in addressing popularity bias while still maintaining the level of user interest and engagement.

    Full text in ACM Digital Library

  • [LBR] How Users Ride the Carousel: Exploring the Design of Multi-List Recommender Interfaces From a User Perspective
    by Benedikt Loepp (University of Duisburg-Essen) and Jürgen Ziegler (University of Duisburg-Essen).

    Multi-list interfaces are widely used in recommender systems, especially in industry, showing collections of recommendations, one below the other, with items that have certain commonalities. The composition and order of these “carousels” are usually optimized by simulating user interaction based on probabilistic models learned from item click data. Research that actually involves users is rare, with only few studies investigating general user experience in comparison to conventional recommendation lists. Hence, it is largely unknown how specific design aspects such as carousel type and length influence the individual perception and usage of carousel-based interfaces. This paper seeks to fill this gap through an exploratory user study. The results confirm previous assumptions about user behavior and provide first insights into the differences in decision making in the presence of multiple recommendation carousels.

    Full text in ACM Digital Library

  • [LBR] Leveraging Large Language Models for Sequential Recommendation
    by Jesse Harte (Delivery Hero SE), Wouter Zorgdrager (Delivery Hero SE), Panos Louridas (Athens University of Economics & Business), Asterios Katsifodimos (Delft University of Technology), Dietmar Jannach (University of Klagenfurt) and Marios Fragkoulis (Delivery Hero SE).

    Sequential recommendation problems have received increasing attention in research during the past few years, leading to the inception of a large variety of algorithmic approaches. In this work, we explore how large language models (LLMs), which are nowadays introducing disruptive effects in many AI-based applications, can be used to build or improve sequential recommendation approaches. Specifically, we devise and evaluate three approaches to leverage the power of LLMs in different ways. Our results from experiments on two datasets show that initializing the state-of-the-art sequential recommendation model BERT4Rec with embeddings obtained from an LLM improves NDCG by 15-20% compared to the vanilla BERT4Rec model. Furthermore, we find that a simple approach that leverages LLM embeddings for producing recommendations can provide competitive performance by highlighting semantically related items. We publicly share the code and data of our experiments to ensure reproducibility.

    Full text in ACM Digital Library
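
The abstract above mentions a simple approach that recommends semantically related items from LLM embeddings but does not spell it out. One plausible reading, sketched below purely as an illustration, is nearest-neighbour retrieval: average the embeddings of the session's items and rank the remaining items by cosine similarity. The function names, the averaging step, and the three-dimensional toy vectors are all assumptions and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(session, item_embeddings, top_n=2):
    """Rank unseen items by similarity to the mean embedding of the session items."""
    dim = len(next(iter(item_embeddings.values())))
    # Hypothetical session profile: the plain average of the session's item vectors.
    profile = [
        sum(item_embeddings[item][d] for item in session) / len(session)
        for d in range(dim)
    ]
    candidates = [item for item in item_embeddings if item not in session]
    candidates.sort(key=lambda item: cosine(profile, item_embeddings[item]), reverse=True)
    return candidates[:top_n]


# Toy vectors standing in for LLM embeddings of item descriptions (made up for illustration).
item_embeddings = {
    "pizza margherita": [0.9, 0.1, 0.0],
    "pizza salami": [0.8, 0.2, 0.1],
    "sushi set": [0.1, 0.9, 0.1],
    "cola": [0.2, 0.1, 0.9],
}

print(recommend(["pizza margherita"], item_embeddings, top_n=1))
```

With these toy vectors the closest unseen item to a session containing only "pizza margherita" is "pizza salami"; real LLM embeddings would have hundreds or thousands of dimensions and would typically be searched with an approximate nearest-neighbour index.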

  • LBR Integrating Offline Reinforcement Learning with Transformers for Sequential Recommendation
    by Xumei Xi (Cornell University), Yuke Zhao (Bloomberg LP), Quan Liu (Bloomberg), Liwen Ouyang (Bloomberg) and Yang Wu (Independent Researcher).

    We consider the problem of sequential recommendation, where the current recommendation is made based on past interactions. This recommendation task requires efficient processing of the sequential data and aims to provide recommendations that maximize the long-term reward. To this end, we train a farsighted recommender using an offline RL algorithm whose policy network is initialized from a pre-trained transformer model. The pre-trained model leverages the superb ability of the transformer to process sequential information. Compared to prior works that rely on online interaction via simulation, we focus on implementing a fully offline RL framework that is able to converge in a fast and stable way. Through extensive experiments on public datasets, we show that our method is robust across various recommendation regimes, including e-commerce and movie suggestions. Compared to state-of-the-art supervised learning algorithms, our algorithm yields recommendations of higher quality, demonstrating the clear advantage of combining RL and transformers.

    Full text in ACM Digital Library

  • LBR Learning the True Objectives of Multiple Tasks in Sequential Behavior Modeling
    by Jiawei Zhang (Peking University).

    Multi-task optimization is an emerging research field in recommender systems that focuses on improving the recommendation performance of multiple tasks. Various methods have been proposed in the past to address task weight balancing, gradient conflict resolution, Pareto optimality, etc., yielding promising results in specific contexts. However, when it comes to real-world scenarios involving user sequential behaviors, these methods are not well suited. To address this gap, we propose AcouRec, a novel and effective approach for sequential behavior modeling in multi-task recommender systems inspired by acoustic attenuation. Specifically, AcouRec introduces an impact attenuation mechanism to mitigate the uncertain task interference in multi-task optimization. Extensive experiments on public datasets demonstrate the effectiveness of AcouRec.

    Full text in ACM Digital Library

  • LBR Integrating Item Relevance in Training Loss for Sequential Recommender Systems
    by Andrea Bacciu (Sapienza University of Rome), Federico Siciliano (Sapienza University of Rome), Nicola Tonellotto (University of Pisa) and Fabrizio Silvestri (University of Rome).

    Sequential Recommender Systems (SRSs) are a popular type of recommender system that leverages user history to predict the next item of interest. However, the presence of noise in user interactions, stemming from account sharing, inconsistent preferences, or accidental clicks, can significantly impact the robustness and performance of SRSs, particularly when the entire item set to be predicted is noisy. This situation is more prevalent when only one item is used to train and evaluate the SRSs. To tackle this challenge, we propose a novel approach that addresses the issue of noise in SRSs. First, we propose a sequential multi-relevant future items training objective, leveraging a loss function aware of item relevance, thereby enhancing the models’ robustness against noise in the training data. Additionally, to mitigate the impact of noise at evaluation time, we propose multi-relevant future items evaluation (MRFI-evaluation), aiming to improve overall performance. Our relevance-aware models obtain an improvement of ~1.58% in NDCG@10 and 0.96% in HR@10 under the traditional evaluation protocol, which uses one relevant future item. Under the MRFI-evaluation protocol, which uses multiple future items, the improvement is ~2.82% in NDCG@10 and ~0.64% in HR@10 w.r.t. the best baseline model.

    Full text in ACM Digital Library
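
For readers unfamiliar with the two metrics reported above, the following sketch computes HR@k and binary-relevance NDCG@k for one ranked list, first with a single relevant future item and then with several. It uses the standard textbook definitions; the exact MRFI-evaluation protocol is not described in the abstract, so treating "multiple future items" as a set of equally relevant items is an assumption, as are the function names and toy data.

```python
import math


def hit_rate_at_k(ranked, relevant, k=10):
    """Return 1.0 if at least one relevant item appears in the top-k, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def ndcg_at_k(ranked, relevant, k=10):
    """Binary-relevance NDCG@k: DCG of the ranking divided by the best achievable DCG."""
    dcg = sum(
        1.0 / math.log2(pos + 2)  # positions are 0-based, so rank 1 gets 1 / log2(2)
        for pos, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


ranked = ["a", "b", "c", "d", "e"]   # model output, best first
single_future = {"c"}                # traditional protocol: one held-out next item
multi_future = {"c", "e", "z"}       # assumed multi-item protocol: several future items

print(hit_rate_at_k(ranked, single_future), ndcg_at_k(ranked, single_future))
print(hit_rate_at_k(ranked, multi_future), ndcg_at_k(ranked, multi_future))
```

In the single-item case NDCG depends only on the position of the one held-out item, so one noisy interaction fully determines the score; with several future items the score is averaged over more evidence.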

  • DEM EasyStudy: Framework for Easy Deployment of User Studies on Recommender Systems
    by Patrik Dokoupil (Department of Software Engineering, Charles University) and Ladislav Peska (Faculty of Mathematics and Physics, Charles University, Prague, Czechia).

    Improvements in the recommender systems (RS) domain are not possible without a thorough way to evaluate and compare newly proposed approaches. User studies represent a viable alternative to online and offline evaluation schemes, but despite their numerous benefits, they are only rarely used. One of the main reasons behind this fact is that preparing a user study from scratch involves a lot of extra work on top of a simple algorithm proposal. To simplify this task, we propose EasyStudy, a modular framework built on the credo “Make simple things fast and hard things possible”. It features ready-to-use datasets, preference elicitation methods, incrementally tuned baseline algorithms, study flow plugins, and evaluation metrics. As a result, a simple study comparing several RS can be deployed with just a few clicks, while more complex study designs can still benefit from a range of reusable components, such as preference elicitation. Overall, EasyStudy dramatically decreases the gap between the laboriousness of offline evaluation vs. user studies and, therefore, may contribute towards the more reliable and insightful user-centric evaluation of next-generation RS.

    Full text in ACM Digital Library

  • DEM Localify.org: Locally-focused Music Artist and Event Recommendation
    by Douglas Turnbull (Ithaca College), April Trainor (Ithaca College), Griffin Homan (Ithaca College), Elizabeth Richards (Ithaca College), Kieran Bentley (Ithaca College), Victoria Conrad (Ithaca College), Paul Gagliano (Ithaca College) and Cassandra Raineault (Ithaca College).

    Cities with strong local music scenes enjoy many social and economic benefits. To this end, we are interested in developing a locally-focused artist and event recommendation system called Localify.org that supports and promotes local music scenes. Local artists tend to be relatively obscure and reside in the long tail of the artist popularity distribution. In this demo paper, we describe both the overall system architecture and our core recommender system, which uses artist-artist similarity information as opposed to user-artist preference information. We also discuss the role of popularity bias and how we attempt to ameliorate it in the context of local music recommendation.

    Full text in ACM Digital Library

  • IND An Industrial Framework for Personalized Serendipitous Recommendation in E-commerce
    by Zongyi Wang (JD.com), Yanyan Zou (JD.com), Anyu Dai (JD.com), Linfang Hou (JD.com), Nan Qiao (JD.com), Luobao Zou (JD.com), Mian Ma (JD.com), Zhuoye Ding (JD.com) and Sulong Xu (JD.com).

    Classical recommendation methods typically face the filter bubble problem, where users mostly receive recommendations of items they are already familiar with, leaving them bored and dissatisfied. To alleviate this issue, this applied paper introduces a novel framework for personalized serendipitous recommendation on an e-commerce platform (i.e., JD.com), which presents users with unexpected yet satisfying items that deviate from their prior behaviors, considering both accuracy and novelty. To achieve this goal, it is crucial yet challenging to recognize when a user is willing to receive serendipitous items and how many novel items are expected. To address these two challenges, a two-stage framework is designed. First, a DNN-based scorer is deployed to quantify the novelty degree of a product category based on user behavior history. Then, we resort to a potential outcome framework to decide the optimal timing for recommending serendipitous items to a user and the novelty degree of the recommendation. An online A/B test on the e-commerce recommender platform of JD.com demonstrates that our model achieves significant gains on various metrics: a 0.54% relative increase in impressive depth, 0.8% in average user click count, and 3.23% and 1.38% in the number of novel impressive and clicked items, respectively.

    Full text in ACM Digital Library

  • IND RecQR: Using Recommendation Systems for Query Reformulation to correct unseen errors in spoken dialog systems
    by Manik Bhandari (Amazon.com), Mingxian Wang (Amazon), Oleg Poliannikov (Amazon) and Kanna Shimizu (Amazon).

    As spoken dialog systems like Siri, Alexa and Google Assistant become widespread, it becomes apparent that relying solely on global, one-size-fits-all models of Automatic Speech Recognition (ASR), Natural Language Understanding (NLU) and Entity Resolution (ER) is inadequate for delivering a frictionless customer experience. To address this issue, Query Reformulation (QR) has emerged as a crucial technique for personalizing these systems and reducing customer friction. However, existing QR models, trained on personal rephrases in history, face a critical drawback – they are unable to reformulate unseen queries to unseen targets. To alleviate this, we present RecQR, a novel system based on collaborative filters, designed to reformulate unseen defective requests to target requests that a customer may never have requested in the past. RecQR anticipates a customer’s future requests and rewrites them using state-of-the-art, large-scale collaborative filtering and query reformulation models. Based on experiments, we find that it reduces errors by nearly 40% (relative) on the reformulated utterances.

    Full text in ACM Digital Library

  • IND Scaling Session-Based Transformer Recommendations using Optimized Negative Sampling and Loss Functions
    by Timo Wilm (OTTO (GmbH & Co KG)), Philipp Normann (OTTO (GmbH & Co KG)), Sophie Baumeister (OTTO (GmbH & Co KG)) and Paul-Vincent Kobow (OTTO (GmbH & Co KG)).

    This work introduces TRON, a scalable session-based Transformer Recommender using Optimized Negative-sampling. Motivated by the scalability and performance limitations of prevailing models such as SASRec and GRU4Rec+, TRON integrates top-k negative sampling and listwise loss functions to enhance its recommendation accuracy. Evaluations on relevant large-scale e-commerce datasets show that TRON improves upon the recommendation quality of current methods while maintaining training speeds similar to SASRec. A live A/B test yielded an 18.14% increase in click-through rate over SASRec, highlighting the potential of TRON in practical settings. For further research, we provide access to our source code and an anonymized dataset.

    Full text in ACM Digital Library

  • IND Visual Representation for Capturing Creator Theme in Brand-Creator Marketplace
    by Asnat Greenstein-Messica (Lightricks), Keren Gaiger (Lightricks), Sarel Duanis (Lightricks), Ravid Cohen (Lightricks) and Shaked Zychlinski (Lightricks).

    Providing cold start recommendations in a brand-creator marketplace is challenging, as brands’ preferences extend beyond the mere objects depicted in the creator’s content and encompass the creator’s individual theme, which consistently resonates across the images shared on her social media profile. Furthermore, brands often use textual keywords to describe their campaign’s aesthetic appeal, with which creators must align. To address these challenges, we propose two methods: SAME (Same Account Media Embedding), a novel creator representation employing a Siamese network to capture the unique creator theme, and OAAR (Object-Agnostic Adjective Representation), which enables filtering creators based on textual adjectives that relate to aesthetic qualities through zero-shot learning. These two methods utilize CLIP, a state-of-the-art language-image model, and improve it in addressing the aforementioned challenges.

    Full text in ACM Digital Library

  • IND Unleash the Power of Context: Enhancing Large-Scale Recommender Systems with Context-Based Prediction Models
    by Jan Hartman (Outbrain), Assaf Klein (Outbrain), Davorin Kopič (Outbrain) and Natalia Silberstein (Outbrain).

    In this work, we introduce the notion of Context-Based Prediction Models. A Context-Based Prediction Model determines the probability of a user’s action (such as a click or a conversion) solely by relying on user and contextual features, without considering any specific features of the item itself. We have identified numerous valuable applications for this modeling approach, including training an auxiliary context-based model to estimate click probability and incorporating its prediction as a feature in CTR prediction models. Our experiments indicate that this enhancement brings significant improvements in offline and online business metrics while having minimal impact on the cost of serving. Overall, our work offers a simple and scalable, yet powerful approach for enhancing the performance of large-scale commercial recommender systems, with broad implications for the field of personalized recommendations.

    Full text in ACM Digital Library

  • IND LightSAGE: Graph Neural Networks for Large Scale Item Retrieval in Shopee’s Advertisement Recommendation
    by Dang Minh Nguyen (Shopee, SEA Group), Chenfei Wang (Shopee, SEA Group), Yan Shen (Shopee, SEA Group) and Yifan Zeng (Shopee, SEA Group).

    Graph Neural Network (GNN) is the trending solution for item retrieval in recommendation problems. Most recent reports, however, focus heavily on new model architectures. This may leave some gaps when applying GNN in the industrial setup, where, besides the model, constructing the graph and handling data sparsity also play critical roles in the overall success of the project. In this work, we report how we apply GNN for large-scale e-commerce item retrieval at Shopee. We detail our simple yet novel and impactful techniques in graph construction, modeling, and handling data skewness. Specifically, we construct high-quality item graphs by combining strong-signal user behaviors with a high-precision collaborative filtering (CF) algorithm. We then develop a new GNN architecture named LightSAGE to produce high-quality item embeddings for vector search. Finally, we develop multiple strategies to handle cold-start and long-tail items, which are critical in an advertisement (ads) system. Our models bring improvements in offline evaluations and online A/B tests, and are deployed to the main traffic of Shopee’s Recommendation Advertisement system.

    Full text in ACM Digital Library

  • IND Loss Harmonizing for Multi-Scenario CTR Prediction
    by Congcong Liu (JD.com), Liang Shi (JD.com), Pei Wang (JD.com), Fei Teng (JD.com), Xue Jiang (JD.com), Changping Peng (JD.com), Zhangang Lin (JD.com) and Jingping Shao (JD.com).

    Large-scale industrial systems often include multiple scenarios to satisfy diverse user needs. The common approach of using one model per scenario does not scale well and is not suitable for minor scenarios with limited samples. A solution is to train a model on all scenarios, which can introduce domination and bias from the main scenario. MMoE-like structures have been proposed for multi-scenario prediction, but they do not explicitly address the issue of gradient unbalancing. This work proposes an adaptive loss harmonizing (ALH) algorithm for multi-scenario CTR prediction. It balances training by dynamically adjusting the learning speed, resulting in improved prediction performance. Experiments conducted on a real production dataset and a rigorous A/B test prove the superiority of our method.

    Full text in ACM Digital Library

  • IND Personalised Recommendations for the BBC iPlayer: Initial approach and current challenges
    by Benjamin R. Clark (British Broadcasting Corporation), Kristine Grivcova (British Broadcasting Corporation), Polina Proutskova (British Broadcasting Corporation) and Duncan M. Walker (British Broadcasting Corporation).

    BBC iPlayer is one of the most important digital products of the BBC, offering live and on-demand television for audiences in the UK with over 10 million weekly active users. The BBC’s role as a public service broadcaster, broadcasting over traditional linear channels as well as online, presents a number of challenges for a recommender system. In addition to having substantially different objectives to a commercial service, we show that the diverse content offered by the BBC, including news and sport, factual, drama and live events, leads to a catalogue with a diversity of consumption patterns, depending on genre. Our research shows that simple models represent strong baselines in this system. We discuss our initial attempts to improve upon these baselines, and conclude with our current challenges.

    Full text in ACM Digital Library

  • IND MCM: A Multi-task Pre-trained Customer Model for Personalization
    by Rui Luo (Amazon), Tianxin Wang (Amazon), Jingyuan Deng (Amazon) and Peng Wan (Amazon).

    Personalization plays a critical role in helping customers discover the products and contents they prefer for e-commerce stores. Personalized recommendations differ in contents, target customers, and UI. However, they require a common core capability – the ability to deeply understand customers’ preferences and shopping intents. In this paper, we introduce the MLCM (Multi-task Large pre-trained Customer Model), a large pre-trained BERT-based multi-task customer model with 10 million trainable parameters for e-commerce stores. This model aims to empower all personalization projects by providing commonly used preference scores for recommendations, customer embeddings for transfer learning, and a pre-trained model for fine-tuning. In this work, we improve the SOTA BERT4Rec framework to handle heterogeneous customer signals and multi-task training, and introduce a new data augmentation method that is suitable for recommendation tasks. Experimental results show that MLCM outperforms the original BERT4Rec by 17% on preference prediction tasks. Additionally, we demonstrate that the model can be easily fine-tuned to assist a specific recommendation task. For instance, after fine-tuning MLCM for an incentive based recommendation project, performance improves by 60% on the conversion prediction task and 25% on the click-through prediction task compared to the production baseline model.

    Full text in ACM Digital Library

  • IND Track Mix Generation on Music Streaming Services using Transformers
    by Walid Bendada (Deezer Research), Théo Bontempelli (Deezer Research), Mathieu Morlon (Deezer Research), Benjamin Chapus (Deezer Research), Thibault Cador (Deezer Research), Thomas Bouabça (Deezer Research) and Guillaume Salha-Galvan (Deezer Research).

    This paper introduces Track Mix, a personalized playlist generation system released in 2022 on the music streaming service Deezer. Track Mix automatically generates “mix” playlists inspired by initial music tracks, allowing users to discover music similar to their favorite content. To generate these mixes, we consider a Transformer model trained on millions of track sequences from user playlists. In light of the growing popularity of Transformers in recent years, we analyze the advantages, drawbacks, and technical challenges of using such a model for mix generation on the service, compared to a more traditional collaborative filtering approach. Since its release, Track Mix has been generating playlists for millions of users daily, enhancing their music discovery experience on Deezer.

    Full text in ACM Digital Library

  • DS Sequential Recommendation Models: A Graph-based Perspective
    by Andreas Peintner (University of Innsbruck).

    Recommender systems (RecSys) traditionally leverage the users’ rich interaction data with the system, but ignore the sequential dependency of items. Sequential recommender systems aim to predict the next item the user will interact with (e.g., click on, purchase, or listen to) based on the preceding interactions of the user. Current state-of-the-art approaches focus on transformer-based architectures and graph neural networks. Specifically, the modeling of sequences as graphs has been shown to be a promising approach to introduce a structured bias into the recommendation learning framework. In this work, we will outline our research exploring different applications of graphs in sequential recommendation.

    Full text in ACM Digital Library

  • DS Exploring Unlearning Methods to Ensure the Privacy, Security, and Usability of Recommender Systems
    by Jens Leysen (University of Antwerp).

    Machine learning algorithms have proven highly effective in analyzing large amounts of data and identifying complex patterns and relationships. One application of machine learning that has received significant attention in recent years is recommender systems, which are algorithms that analyze user behavior and other data to suggest items or content that a user may be interested in. However useful, these systems may unintentionally retain sensitive, outdated, or faulty information, posing a risk to user privacy and system security and limiting a system’s usability. In this research proposal, we aim to address these challenges by investigating methods for machine “unlearning”, which would allow information to be efficiently “forgotten” or “unlearned” from machine learning models. The main objective of this proposal is to develop the foundation for future machine unlearning methods. We first evaluate current unlearning methods and explore novel adversarial attacks on these methods’ verifiability, efficiency, and accuracy to gain new insights and further develop the theory of machine unlearning. Using our gathered insights, we seek to create novel unlearning methods that are verifiable, efficient, and limit unnecessary accuracy degradation. Through this research, we seek to make significant contributions to the theoretical foundations of machine unlearning while also developing unlearning methods that can be applied to real-world problems.

    Full text in ACM Digital Library

  • DS Complementary Product Recommendation for Long-tail Products
    by Rastislav Papso (Kempelen Institute of Intelligent Technologies).

    Identifying complementary relations between products plays a key role in e-commerce Recommender Systems (RS). Existing methods in Complementary Product Recommendation (CPR), however, focus only on identifying complementary relations in huge and data-rich catalogs, while none of them considers real-world scenarios of small and medium e-commerce platforms with a limited number of interactions. In this paper, we discuss our research proposal that addresses the problem of identifying complementary relations in such sparse settings. To overcome the data sparsity problem, we propose to first learn complementary relations in large and data-rich catalogs and then transfer the learned knowledge to small and scarce ones. To be able to map individual products across different catalogs and thus transfer learned relations between them, we propose to create a Product Universal Embedding Space (PUES) using textual and visual product meta-data, which serves as a common ground for the products from arbitrary catalogs.

    Full text in ACM Digital Library

  • DS Knowledge-Aware Recommender Systems based on Multi-Modal Information Sources
    by Giuseppe Spillo (University of Bari ‘Aldo Moro’).

    The last few years saw a growing interest in Knowledge-Aware Recommender Systems (KARSs), given their capability in encoding and exploiting several data sources, both structured (such as knowledge graphs) and unstructured (such as plain text); indeed, several pieces of research show the competitiveness of these models. Nowadays, many state-of-the-art KARS models use deep learning, enabling them to exploit large amounts of information, including knowledge graphs (KGs), user reviews, plain text, and multimedia content (pictures, audio, videos). In my Ph.D., I will explore and study techniques for designing KARSs leveraging embeddings derived from multi-modal information sources; the models I will design will aim at providing fair, accurate, and explainable recommendations.

    Full text in ACM Digital Library

  • DS Explainable Graph Neural Network Recommenders; Challenges and Opportunities
    by Amir Reza Mohammadi (Universität Innsbruck).

    Graph Neural Networks (GNNs) have demonstrated significant potential in recommendation tasks by effectively capturing intricate connections among users, items, and their associated features. Given the escalating demand for interpretability, current research endeavors in the domain of GNNs for Recommender Systems (RecSys) necessitate the development of explainer methodologies to elucidate the decision-making process underlying GNN-based recommendations. In this work, we aim to present our research focused on techniques to extend beyond the existing approaches for addressing interpretability in GNN-based RecSys.

    Full text in ACM Digital Library
