Poster Session O1: Short Papers

Session A: 21:00-23:00
Session B: 8:00-10:00

  • ADER: Adaptively Distilled Exemplar Replay Towards Continual Learning for Session-based Recommendation
    by Fei Mi (LIA EPFL), Xiaoyu Lin (École polytechnique fédérale de Lausanne), Boi Faltings (LIA EPFL)

    Session-based recommendation has received growing attention recently due to the increasing privacy concern. Despite the recent success of neural session-based recommenders, they are typically developed in an offline manner using a static dataset. However, recommendation requires continual adaptation to take into account new and obsolete items and users, and requires “continual learning” in real-life applications. In this case, the recommender is updated continually and periodically with new data that arrives in each update cycle, and the updated model needs to provide recommendations for user activities before the next model update. A major challenge for continual learning with neural models is catastrophic forgetting, in which a continually trained model forgets user preference patterns it has learned before. To deal with this challenge, we propose a method called Adaptively Distilled Exemplar Replay (ADER) by periodically replaying previous training samples (i.e., exemplars) to the current model with an adaptive distillation loss. Experiments are conducted based on the state-of-the-art SASRec model using two widely used datasets to benchmark ADER with several well-known continual learning techniques. We empirically demonstrate that ADER consistently outperforms other baselines, and it even outperforms the method using all historical data at every update cycle. This result reveals that ADER is a promising solution to mitigate the catastrophic forgetting issue towards building more realistic and scalable session-based recommenders.

  • Adaptive Pointwise-Pairwise Learning-to-Rank for Content-based Personalized Recommendation
    by Yagmur Gizem Cinar (LIG University of Grenoble Alpes), Jean-Michel Renders (Naver Labs Europe)

    This paper extends the standard pointwise and pairwise paradigms for learning-to-rank in the context of personalized recommendation, by considering these two approaches as two extremes of a continuum of possible strategies. It basically consists of a surrogate loss that models how to select and combine these two approaches adaptively, depending on the context (query or user, pair of items, etc.). In other words, given a training instance, which is typically a triplet (a query/user and two items with different preferences or relevance grades), the strategy adaptively determines whether it is better to focus on the “most preferred” item (pointwise – positive instance), on the “less preferred” one (pointwise – negative instance) or on the pair (pairwise), or on anything else in between these 3 extreme alternatives. We formulate this adaptive strategy as minimizing a particular loss function that generalizes simultaneously the traditional pointwise and pairwise loss functions (negative log-likelihood) through a mixture coefficient. This coefficient is formulated as a learnable function of the features associated to the triplet. Experimental results on several real-world news recommendation datasets show clear improvements over several pointwise, pairwise, and listwise approaches.
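    The idea of blending pointwise and pairwise losses through a learnable mixture coefficient can be illustrated with a minimal sketch. This is not the loss from the paper: it assumes a simple convex combination of the two negative log-likelihoods, with the coefficient given by a logistic function of hypothetical triplet features. The names `mixture_coefficient`, `adaptive_loss`, `weights` and `bias` are illustrative only, and the sketch collapses the positive and negative pointwise extremes into a single pointwise term.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def pointwise_nll(score, label):
    """Negative log-likelihood of a single item under a logistic model."""
    p = sigmoid(score)
    return -math.log(p) if label == 1 else -math.log(1.0 - p)


def pairwise_nll(score_pos, score_neg):
    """Negative log-likelihood that the preferred item outranks the other."""
    return -math.log(sigmoid(score_pos - score_neg))


def mixture_coefficient(triplet_features, weights, bias=0.0):
    """Hypothetical learnable coefficient in (0, 1), computed from triplet features."""
    z = sum(w * x for w, x in zip(weights, triplet_features)) + bias
    return sigmoid(z)


def adaptive_loss(score_pos, score_neg, triplet_features, weights, bias=0.0):
    """Assumed form: alpha * pairwise + (1 - alpha) * pointwise."""
    alpha = mixture_coefficient(triplet_features, weights, bias)
    pointwise = pointwise_nll(score_pos, 1) + pointwise_nll(score_neg, 0)
    pairwise = pairwise_nll(score_pos, score_neg)
    return alpha * pairwise + (1.0 - alpha) * pointwise
```

    In this sketch a coefficient near 1 recovers the purely pairwise loss and a coefficient near 0 the purely pointwise one, so training `weights` lets the model move between the two per triplet.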

  • Carousel Personalization in Music Streaming Apps with Contextual Bandits
    by Walid Bendada (Deezer), Guillaume Salha (Deezer), Théo Bontempelli (Deezer)

    Media services providers, such as music streaming platforms, frequently leverage swipeable carousels to recommend personalized content to their users. However, selecting the most relevant items (albums, artists, playlists…) to display in these carousels is a challenging task, as items are numerous and as users have different preferences. In this paper, we model carousel personalization as a contextual multi-armed bandit problem with multiple plays, stochastic arm display and delayed batch feedback. We empirically show the effectiveness of our framework at capturing characteristics of real-world carousels by addressing a large-scale playlist recommendation task on a global music streaming mobile app. Along with this paper, we publicly release industrial data from our experiments, as well as an open-source environment to simulate comparable carousel personalization learning problems.

  • Deconfounding User Satisfaction Estimation from Response Rate Bias
    by Konstantina Christakopoulou (Google), Madeleine Traverse (Google), Trevor Potter (Google), Emma Marriott (Waymo), Daniel Li (Google), Chris Haulk (Google), Ed Chi (Google), Minmin Chen (Google)

    Improving user satisfaction is at the forefront of industrial recommender systems. While significant progress has been made by utilizing logged implicit data of user-item interactions (i.e., clicks, dwell/watch time, and other user engagement signals), there has been a recent surge of interest in measuring and modeling user satisfaction, as provided by orthogonal data sources. Such data sources typically originate from responses to user satisfaction surveys, which explicitly ask users to rate their experience with the system and/or specific items they have consumed in the recent past. This data can be valuable for measuring and modeling the degree to which a user has had a satisfactory experience on the recommendation platform, since what users do (engagement) does not always align with what users say they want (satisfaction as measured by surveys).
    We focus on a large-scale industrial system trained on user survey responses to predict user satisfaction. The predictions of the satisfaction model for each user-item pair, combined with the predictions of the other models (e.g., engagement-focused ones), are fed into the ranking component of a real-world recommender system in deciding items to present to the user. It is therefore imperative that the satisfaction model does an equally good job on imputing user satisfaction across slices of users and items, as it would directly impact which items a user is exposed to. However, the data used for training satisfaction models is biased in that users are more likely to respond to a survey when they will respond that they are more satisfied. When the satisfaction survey responses in slices of data with high response rate follow a different distribution than those with low response rate, response rate becomes a confounding factor for user satisfaction estimation.
    We find positive correlation between response rate and ratings in a large-scale survey dataset collected in our case study. To address this inherent response rate bias in the satisfaction data, we propose an inverse propensity weighting approach within a multi-task learning framework. We extend a simple feed-forward neural network architecture predicting user satisfaction to a shared-bottom multi-task learning architecture with two tasks: the user satisfaction estimation task, and the response rate estimation task. We concurrently train these two tasks, and use the inverse of the predictions of the response rate task as loss weights for the satisfaction task to address the response rate bias. We showcase that by doing this, (i) we can accurately model whether a user will respond to a survey, (ii) we improve the user satisfaction estimation error for the data slices with lower response rate while not hurting slices with higher response rate, and (iii) we demonstrate in live A/B experiments that applying the resulting satisfaction predictions to rank recommendations translates to higher user satisfaction.
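    The weighting step described above, using the inverse of the predicted response rate as the loss weight for the satisfaction task, can be sketched as follows. This is an illustration only, not the production system: the squared-error loss, the clipping floor `min_prob`, and the normalisation by the sum of weights are assumptions added here, and the function names are hypothetical.

```python
def ipw_weights(response_probs, min_prob=0.05):
    """Inverse propensity weights; min_prob is an assumed floor that limits extreme weights."""
    return [1.0 / max(p, min_prob) for p in response_probs]


def weighted_satisfaction_loss(predicted, observed, response_probs, min_prob=0.05):
    """Squared error on satisfaction, weighted by inverse predicted response rate."""
    weights = ipw_weights(response_probs, min_prob)
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    weighted_sum = sum(w * e for w, e in zip(weights, squared_errors))
    return weighted_sum / sum(weights)


# Two survey responses: the second comes from a slice with a low response rate,
# so its error counts for more than it would in an unweighted average.
loss = weighted_satisfaction_loss(
    predicted=[4.0, 2.0],
    observed=[5.0, 4.0],
    response_probs=[0.5, 0.1],
)
```

    In this toy case the unweighted mean squared error would be 2.5, while the weighted version is pulled towards the larger error of the low-response-rate example.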

  • Deep Bayesian Bandits: Exploring in Online Personalized Recommendations
    by Dalin Guo (UC San Diego), Sofia Ira Ktena (Twitter), Pranay Kumar Myana (Twitter), Ferenc Huszar (Twitter), Wenzhe Shi (Twitter), Alykhan Tejani (Twitter), Michael Kneier (Twitter), Sourav Das (Twitter)

    Recommender systems trained in a continuous learning fashion are plagued by the feedback loop problem, also known as algorithmic bias. This causes a newly trained model to act greedily and favor items that have already been engaged by users. This behavior is particularly harmful in personalised ads recommendations, as it can also cause new campaigns to remain unexplored. Exploration aims to address this limitation by providing new information about the environment, which encompasses user preference, and can lead to higher long-term reward. In this work, we formulate a display advertising recommender as a contextual bandit and implement exploration techniques that require sampling from the posterior distribution of click-through-rates in a computationally tractable manner. Traditional large-scale deep learning models do not provide uncertainty estimates by default. We approximate these uncertainty measurements of the predictions by employing a bootstrapped model with multiple heads and dropout units. We benchmark a number of different models in an offline simulation environment using a publicly available dataset of user-ads engagements. We test our proposed deep Bayesian bandits algorithm in the offline simulation and online AB setting with large-scale production traffic, where we demonstrate a positive gain of our exploration model.

  • Explainable Recommendations via Attentive Multi-Persona Collaborative Filtering
    by Oren Barkan (Microsoft), Yonatan Fuchs (Tel Aviv University), Avi Caciularu (Tel Aviv University), Noam Koenigstein (Tel Aviv University)

    Two main challenges in recommender systems are modeling users with heterogeneous taste, and providing explainable recommendations. In this paper, we propose the neural Attentive Multi-Persona Collaborative Filtering (AMP-CF) model as a unified solution for both problems. AMP-CF breaks down the user to several latent ‘personas’ (profiles) that identify and discern the different tastes and inclinations of the user. Then, the revealed personas are used to generate and explain the final recommendation list for the user. AMP-CF models users as an attentive mixture of personas, enabling a dynamic user representation that changes based on the item under consideration. We demonstrate AMP-CF on five collaborative filtering datasets from the domains of movies, music, video games and social networks. As an additional contribution, we propose a novel evaluation scheme for comparing the different items in a recommendation list based on the distance from the underlying distribution of “tastes” in the user’s historical items. Experimental results show that AMP-CF is competitive with other state-of-the-art models. Finally, we provide qualitative results to showcase the ability of AMP-CF to explain its recommendations.

  • Exploring Longitudinal Effects of Session-based Recommendations
    by Andres Ferraro (Universitat Pompeu Fabra), Dietmar Jannach (AAU Klagenfurt), Xavier Serra (Universitat Pompeu Fabra)

    Session-based recommendation is a problem setting where the task of a recommender system is to make suitable item suggestions based only on a few observed user interactions in an ongoing session. The lack of long-term preference information about individual users in such settings usually results in a limited level of personalization, where a small set of popular items may be recommended to many users. This repeated exposure of such a subset of the items through the recommendations may in turn lead to a reinforcement effect over time, and to a system which is not able to help users discover new content anymore to the desirable extent.
    In this work, we investigate such potential longitudinal effects of session-based recommendations in a simulation-based approach. Specifically, we analyze to what extent algorithms of different types may lead to concentration effects over time. Our experiments in the music domain reveal that all investigated algorithms, both neural and heuristic ones, may lead to lower item coverage and to a higher concentration on a subset of the items. Additional simulation experiments however also indicate that relatively simple re-ranking strategies, e.g., by avoiding too many repeated recommendations in the music domain, may help to deal with this problem.

  • Fit to Run: Personalised Recommendations for Marathon Training
    by Jakim Berndsen (University College Dublin), Barry Smyth (University College Dublin), Aonghus Lawlor (University College Dublin)

    Training for the marathon is a complex problem. In order to run an optimal time, runners must find the right workload for their current abilities and identify the correct balance between the hard work and rest throughout their training programmes. We propose a recommender system that will help guide runners through the weeks leading up to the marathon. Using a large sample of marathon training data (8730 runners), we generate user profiles that capture both a runner’s current fitness and training levels, and leverage this information to generate tailored recommendations for future weeks of training. We investigate patterns of successful runners to determine how best to schedule recommendations and training to allow for improvement in fitness levels alongside adequate rest.

  • Free Lunch! Retrospective Uplift Modeling for Dynamic Promotions Recommendation within ROI Constraints
    by Dmitri Goldenberg, Javier Albert, Lucas Bernardi, Pablo Estevez

    Promotions and discounts have become key components of modern e-commerce platforms. For online travel platforms (OTPs), popular promotions include room upgrades, free meals and transportation services. By offering these promotions, customers can get more value for their money, while both the OTP and its travel partners may grow their loyal customer base. However, the promotions usually incur a cost that, if uncontrolled, can become unsustainable. Consequently, for a promotion to be viable, its associated costs must be balanced by incremental revenue within set financial constraints. Personalized treatment assignment can be used to satisfy such constraints. This paper introduces a novel uplift modeling technique, relying on the Knapsack Problem formulation, that dynamically optimizes the incremental treatment’s outcome subject to the required Return on Investment (ROI) constraints. The technique leverages Retrospective Estimation, a modeling approach that relies solely on data from positive outcome examples. The method also addresses training data bias, long term effects, and seasonality challenges via online-dynamic calibration. This approach was tested via offline experiments and online randomized controlled trials at – a leading OTP with millions of customers worldwide, resulting in a significant increase in the target outcome while staying within the required financial constraints and outperforming other approaches.

  • History-Augmented Collaborative Filtering for Financial Recommendations
    by Baptiste Barreau (BNP Paribas CIB), Laurent Carlier (BNP Paribas CIB)

    In many businesses, and particularly in finance, the behavior of a client might drastically change over time. It is consequently crucial for recommender systems used in such environments to be able to adapt to these changes. In this study, we propose a novel collaborative filtering algorithm that captures the temporal context of a user-item interaction through the users’ and items’ recent interaction histories to provide dynamic recommendations. The algorithm, designed with issues specific to the financial world in mind, uses a custom neural network architecture that tackles the non-stationarity of users’ and items’ behaviors. The performance and properties of the algorithm are monitored in a series of experiments on a G10 bond request for quotation proprietary database from BNP Paribas Corporate and Institutional Banking.

  • Improving One-class Recommendation with Multi-tasking on Various Preference Intensities
    by Chu-Jen Shao (National Taiwan University), Hao-Ming Fu (National Taiwan University), Pu-Jen Cheng (National Taiwan University)

    In the one-class recommendation problem, recommendations must be made based on users’ implicit feedback, which is inferred from their action and inaction. Existing works obtain representations of users and items by encoding positive and negative interactions observed from training data. However, these efforts assume that all positive signals from implicit feedback reflect a fixed preference intensity, which is not realistic. Consequently, representations learned with these methods usually fail to capture informative entity features that reflect various preference intensities.
    In this paper, we propose a multi-tasking framework taking various preference intensities of each signal from implicit feedback into consideration. Representations of entities are required to satisfy the objective of each subtask simultaneously, making them more robust and generalizable. Furthermore, we incorporate attentive graph convolutional layers to explore high-order relationships in the user-item bipartite graph and dynamically capture the latent tendencies of users toward the items they interact with. Experimental results show that our method performs better than state-of-the-art methods by a large margin on three large-scale real-world benchmark datasets.

  • Interpretable Contextual Team-aware Item Recommendation: Application in Multiplayer Online Battle Arena Games
    by Andrés Villa (Pontificia Universidad Católica de Chile), Vladimir Araujo (Pontificia Universidad Católica de Chile), Francisca Cattan (Pontificia Universidad Católica de Chile), Denis Parra (Pontificia Universidad Católica de Chile)

    The video game industry has adopted recommendation systems to boost users’ interest with a focus on game sales. Other exciting applications within video games are those that help the player make decisions that would maximize their playing experience, which is a desirable feature in real-time strategy video games such as Multiplayer Online Battle Arena (MOBA) games like DotA and LoL. Among these tasks, the recommendation of items is challenging, given both the contextual nature of the game and how it exposes the dependence on the formation of each team. Existing works on this topic do not take advantage of all the available contextual match data and dismiss potentially valuable information. To address this problem we develop TTIR, a contextual recommender model derived from the Transformer neural architecture that suggests a set of items to every team member, based on the contexts of teams and roles that describe the match. TTIR outperforms several approaches and provides interpretable recommendations through visualization of attention weights. Our evaluation indicates that both the Transformer architecture and the contextual information are essential to get the best results for this item recommendation task. Furthermore, a preliminary user survey indicates the usefulness of attention weights for explaining recommendations as well as ideas for future work. The code and dataset are available at

  • Long-tail Session-based Recommendation
    by Siyi Liu (University of Electronic Science and Technology of China), Yujia Zheng (University of Electronic Science and Technology of China)

    Session-based recommendation focuses on the prediction of user actions based on anonymous sessions and is necessary when historical user data is lacking. However, none of the existing session-based recommendation methods explicitly takes long-tail recommendation into consideration, which plays an important role in improving the diversity of recommendations and producing serendipity. As long-tailed item distributions are prevalent in session-based recommendation scenarios (e.g., e-commerce, music, and TV program recommendations), more attention should be paid to long-tail session-based recommendation. In this paper, we propose a novel network architecture, namely TailNet, to improve long-tail recommendation performance, while maintaining competitive accuracy compared with other methods. We start by classifying items into short-head (popular) and long-tail (niche) items based on click frequency. Then a novel preference mechanism is proposed and applied in TailNet to determine user preference between the two types of items, so as to softly adjust and personalize recommendations. Extensive experiments on two real-world datasets verify the superiority of our method compared with state-of-the-art works.
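    The first step mentioned above, splitting items into short-head and long-tail sets by click frequency, can be sketched as below. The abstract does not give the split rule, so the `head_fraction` cut-off (the top 20% of items by clicks) is an assumption made for illustration, as are the function name and the tie-breaking by item identifier.

```python
from collections import Counter


def split_head_tail(sessions, head_fraction=0.2):
    """Split items into short-head and long-tail sets by click count.

    head_fraction is an assumed cut-off, not a value taken from the paper.
    """
    counts = Counter(item for session in sessions for item in session)
    # Most-clicked first; ties are broken by item id so the split is deterministic.
    ranked = sorted(counts, key=lambda item: (-counts[item], item))
    n_head = max(1, round(len(ranked) * head_fraction)) if ranked else 0
    return set(ranked[:n_head]), set(ranked[n_head:])


# Toy click sessions: item "a" is clicked three times, "b" twice, the rest once.
sessions = [["a", "b"], ["a", "c"], ["a", "b", "d"], ["e"]]
head, tail = split_head_tail(sessions)
```

    With these toy sessions and the assumed 20% cut-off, only the most-clicked item lands in the short-head set and the remaining four items form the long tail.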

  • MEANTIME: Mixture of Attention Mechanisms with Multi-temporal Embeddings for Sequential Recommendation
    by Sung Min Cho (Seoul National University), Eunhyeok Park (POSTECH), Sungjoo Yoo (Seoul National University)

    Recently, self-attention based models have achieved state-of-the-art performance in the sequential recommendation task. Following the custom from language processing, most of these models rely on a simple positional embedding to exploit the sequential nature of the user’s history. However, the current approaches have some limitations. First, sequential recommendation is different from language processing in that timestamp information is available. Previous models have not made good use of it to extract additional contextual information. Second, using a simple embedding scheme can lead to an information bottleneck since the same embedding has to represent all possible contextual biases. Third, since previous models use the same positional embedding in each attention head, they can wastefully learn overlapping patterns. To address these limitations, we propose MEANTIME (MixturE of AtteNTIon mechanisms with Multi-temporal Embeddings) which employs multiple types of temporal embeddings designed to capture various patterns from the user’s behavior sequence, and an attention structure that fully leverages such diversity. Experiments on real-world data show that our proposed method outperforms current state-of-the-art sequential recommendation methods, and we provide an extensive ablation study to analyze how the model gains from the diverse positional information.

  • Personality Bias of Music Recommendation Algorithms
    by Alessandro Benedetto Melchiorre (Johannes Kepler University Linz), Eva Zangerle (University of Innsbruck), Markus Schedl (Johannes Kepler University Linz)

    Recommender systems, like other tools that make use of machine learning, are known to create or increase certain biases. Earlier work has already unveiled different performance of recommender systems for different user groups, depending on gender, age, country, and consumption behavior. In this work, we study user bias in terms of another aspect, i.e., users’ personality. We investigate to which extent state-of-the-art recommendation algorithms yield different accuracy scores depending on the users’ personality traits. We focus on the music domain and create a dataset of Twitter users’ music consumption behavior and personality traits, measuring the latter in terms of the OCEAN model. Investigating recall@K and NDCG@K of the recommendation algorithms SLIM, embarrassingly shallow autoencoders for sparse data (EASE), and variational autoencoders for collaborative filtering (Mult-VAE) on this dataset, we find several significant differences in performance between user groups scoring high vs. groups scoring low on several personality traits.

  • Providing Explainable Race-Time Predictions and Training Plan Recommendations to Marathon Runners
    by Ciara Feely (University College Dublin), Brian Caulfield (University College Dublin), Aonghus Lawlor (University College Dublin), Barry Smyth (University College Dublin)

    Millions of people participate in marathon events every year, typically devoting at least 12-16 weeks to building their endurance and fitness so that they can safely complete these gruelling 42.2km races. Most runners follow a training plan that is tailored to their expected finish-time (e.g. sub-4 hours or 4-5 hours), and these plans will prescribe a complex mixture of training sessions to help them achieve these times. However, such plans cannot adapt to the individual needs (fitness levels, changing goals, personal preferences) of runners, providing only broad training guidance rather than more personalised support. The development of wearable sensors and mobile fitness applications facilitates the collection of a large amount of training data from runners. In this paper, we propose a recommender system that utilizes such training data to deliver more personalised training advice to runners, using ideas from case-based reasoning to reuse and adapt the training habits of similar runners. Explainability plays a significant role in this type of system, and we also describe how the predictions and recommendation advice can be presented to runners. An initial off-line evaluation is presented based on a large-scale, real-world dataset.

  • Reducing Energy Waste in Households Through Real-Time Recommendations
    by Janhavi Dahihande (San Jose State University), Akshay Anil Jaiswal (San Jose State University), Akshay Anil Pagar (San Jose State University), Ajinkya Thakare (San Jose State University), Magdalini Eirinaki (San Jose State University), Iraklis Varlamis (San Jose State University)

    The energy consumption of households has steadily increased over the last couple of decades. Research suggests that user behavior is the most influential factor in the energy waste of a household. Thus, there’s a need for helping consumers change their behavior to make it more energy efficient and environment friendly. In this work we propose a real-time recommender system that assists consumers in improving their household’s energy usage. By monitoring the power demand of each appliance in the household, the system detects the device status (on/off) at any moment, and using pattern mining creates a household profile comprising energy consumption patterns for different periods of the day. An intuitive UI allows users to set energy consumption goals and preferences on the appliances they’d like to save energy from. Based on the household profile, the user’s preferences and the actual power demand the system generates personalized real-time recommendations on which appliances should be turned off at a moment. We employ the UK-DALE (UK Domestic Appliance-Level Electricity) dataset to model and evaluate the entire process, from data preprocessing and transformation of the appliance power demand input to various pattern mining algorithms used to generate appliance usage profiles and recommendations, showing that even small changes in appliance usage behavior can lead to energy savings between 2-17%.

  • Using Conceptual Incongruity as a Basis for Making Recommendations
    by Tushar Shandhilya (IIT Kanpur), Nisheeth Srivastava (IIT Kanpur)

    We evaluate the possibility of using within-item measures of meta-data similarity to improve recommendation rankings along psychologically salient dimensions of incongruity and creativity. Our approach contrasts with recently developed methods at introducing diversity into recommendations which rely on across-item measurements of dissimilarity, while sharing several formal and algorithmic elements. We show that semantic distance based operationalizations of psychological constructs show substantial correlation with empirical data. We further show that incongruity predicts variability in satisfaction as measured by movie ratings in a large corpus. Empirical results from a two month-long user study demonstrate that incongruity-based recommendations attract considerably more interaction from users, and users expressed significantly greater satisfaction given these recommendations. Based on these observations, we propose that using incongruity to diversify recommendations may be useful in expanding recommendation repertoires along interesting psychological dimensions, complementing relevance-based search.
