Session 10: Applications-Driven Advances

Date: Thursday 14:00 – 15:30 CET
Chair: Özlem Özgöbek (Norwegian University of Science and Technology)

  • [PA] Debiased Off-Policy Evaluation for Recommendation Systems
    by Yusuke Narita (Yale University, United States), Shota Yasui (AILab CyberAgent, Inc., Japan), and Kohei Yata (Department of Economics, Yale University, United States)

    Efficient methods to evaluate new algorithms are critical for improving interactive bandit and reinforcement learning systems such as recommendation systems. A/B tests are reliable, but are time- and money-consuming, and entail a risk of failure. In this paper, we develop an alternative method, which predicts the performance of algorithms given historical data that may have been generated by a different algorithm. Our estimator has the property that its prediction converges in probability to the true performance of a counterfactual algorithm at a rate of √N, as the sample size N increases. We also show a correct way to estimate the variance of our prediction, thus allowing the analyst to quantify the uncertainty in the prediction. These properties hold even when the analyst does not know which among a large number of potentially important state variables are actually important. We validate our method by a simulation experiment about reinforcement learning. We finally apply it to improve advertisement design by a major advertisement company. We find that our method produces smaller mean squared errors than state-of-the-art methods.

    Full text in ACM Digital Library
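
As a rough illustration of the off-policy evaluation setting described in the abstract above, the following minimal sketch computes a plain inverse-propensity-weighted (IPW) estimate of a new policy's value from logged data, together with a standard error. This is the textbook IPW estimator, not the debiased estimator proposed in the paper. The synthetic logging setup and all names (`ipw_estimate`, `simulate_logs`, `always_show_ad_1`) are hypothetical and chosen only for this example.

```python
import math
import random


def ipw_estimate(logs, target_policy):
    """Estimate the value of target_policy from logged interactions.

    logs: list of (context, action, reward, logging_prob) tuples, where
          logging_prob is the probability the logging policy chose `action`.
    target_policy(context, action): probability the new policy picks `action`.
    Returns (estimated value, standard error of the estimate).
    """
    weighted = [target_policy(x, a) / p * r for x, a, r, p in logs]
    n = len(weighted)
    mean = sum(weighted) / n
    variance = sum((w - mean) ** 2 for w in weighted) / (n - 1)
    return mean, math.sqrt(variance / n)


def simulate_logs(n, seed=0):
    """Synthetic logs (assumption): two ads shown uniformly at random.

    Ad 1 is clicked with probability 0.7 and ad 0 with probability 0.3.
    """
    rng = random.Random(seed)
    logs = []
    for _ in range(n):
        context = rng.random()   # unused by this toy reward model
        action = rng.randrange(2)  # uniform logging policy, probability 0.5
        click_prob = 0.7 if action == 1 else 0.3
        reward = 1.0 if rng.random() < click_prob else 0.0
        logs.append((context, action, reward, 0.5))
    return logs


def always_show_ad_1(context, action):
    """Counterfactual policy to evaluate: always show ad 1."""
    return 1.0 if action == 1 else 0.0


logs = simulate_logs(20000)
value, stderr = ipw_estimate(logs, always_show_ad_1)
print(f"estimated value: {value:.3f} +/- {stderr:.3f}")
```

In this toy setup the true value of the evaluated policy is 0.7, so the estimate should land close to that figure, and the standard error shrinks in proportion to 1/√N as more logged interactions are used.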

  • [IN] Boosting Local Recommendations With Partially Trained Global Model
    by Yuxi Zhang (Salesforce, United States) and Kexin Xie (Salesforce, United States)

    Building recommendation systems for enterprise software poses many unique challenges that differ from those of consumer-facing systems. When applied to different organizations, the data used to power those recommendation systems vary substantially in both quality and quantity due to differences in their operational practices, marketing strategies, and targeted audiences. For Salesforce, a cloud provider of such a system with data across many different organizations, it naturally makes sense to pool data from different organizations to build a model that combines all values from different brands. However, this raises multiple issues: how do we make sure that a model trained with pooled data can still capture customer-specific characteristics, and how do we design the system to handle those data responsibly and ethically, that is, respecting contractual agreements with our clients, legal and compliance requirements, and the privacy of all the consumers? In this proposal, we present a framework that not only utilizes enriched user-level data across organizations, but also boosts business-specific characteristics in generating personal recommendations. We will also walk through key privacy considerations when designing such a system.

    Full text in ACM Digital Library

  • [PA] Follow the guides: disentangling human and algorithmic curation in online music consumption
    by Quentin Villermet (Centre Marc Bloch), Jérémie Poiroux (Centre Marc Bloch & CNRS), Manuel Moussallam (Deezer Research), Thomas Louail (CNRS), and Camille Roth (Centre Marc Bloch & CNRS)

    The role of recommendation systems in the diversity of content consumption on platforms is a much-debated issue. The quantitative state of the art often overlooks the existence of individual attitudes toward guidance, and eventually of different categories of users in this regard. Focusing on the case of music streaming, we analyze the complete listening history of about 9k users over one year and demonstrate that there is no blanket answer to the intertwinement of recommendation use and consumption diversity: it depends on users. First we compute for each user the relative importance of different access modes within their listening history, introducing a trichotomy distinguishing so-called ‘organic’ use from algorithmic and editorial guidance. We thereby identify four categories of users. We then focus on two scales related to content diversity, both in terms of dispersion – how much users consume the same content repeatedly – and popularity – how popular is the content they consume. We show that the two types of recommendation offered by music platforms – algorithmic and editorial – may drive the consumption of more or less diverse content in opposite directions, depending also strongly on the type of users. Finally, we compare users’ streaming histories with the music programming of a selection of popular French radio stations during the same period. While radio programs are usually more tilted toward repetition than users’ listening histories, they often program more songs from less popular artists. On the whole, our results highlight the nontrivial effects of platform-mediated recommendation on consumption, and lead us to speak of ‘filter niches’ rather than ‘filter bubbles’. They hint at further ramifications for the study and design of recommendation systems.

    Full text in ACM Digital Library

  • [PA] Recommendation on Live-Streaming Platforms: Dynamic Availability and Repeat Consumption
    by Jérémie Rappaz (EPFL, Switzerland), Julian McAuley (UC San Diego, United States), and Karl Aberer (LSIR EPFL, Switzerland)

    Live-streaming platforms broadcast user-generated video in real time. Recommendation on these platforms shares similarities with traditional settings, such as a large volume of heterogeneous content and highly skewed interaction distributions. However, several challenges must be overcome to adapt recommendation algorithms to live-streaming platforms. First, content availability is dynamic, which restricts users to choosing from only a subset of items at any given time; during training and inference we must carefully handle this factor in order to properly account for such signals, since ‘non-interactions’ reflect availability as much as implicit preference. Second, streamers are fundamentally different from ‘items’ in traditional settings: repeat consumption of specific channels plays a significant role, though the content itself is fundamentally ephemeral.
    In this work, we study recommendation in this setting of a dynamically evolving set of available items. We propose LiveRec, a self-attentive model that personalizes item ranking based on both historical interactions and current availability. We also show that carefully modelling repeat consumption plays a significant role in model performance. To validate our approach, and to inspire further research on this setting, we release a dataset containing 475M user interactions on Twitch over a 43-day period. We evaluate our approach on a recommendation task and show our method to outperform various strong baselines in ranking the currently available content.

    Full text in ACM Digital Library
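
To make the dynamic-availability constraint from the abstract above concrete, here is a minimal sketch of ranking restricted to the items that are live at request time. It is not the LiveRec model: the scores are placeholder numbers standing in for any model's output, and the names (`rank_available`, `scores`, `live_now`) are hypothetical.

```python
def rank_available(scores, available, k=3):
    """Return the top-k item ids by score, considering only available items.

    scores: dict mapping item id -> model score (higher is better).
    available: set of item ids that are live right now.
    """
    candidates = [item for item in scores if item in available]
    candidates.sort(key=lambda item: scores[item], reverse=True)
    return candidates[:k]


# Placeholder scores for four channels; channel "a" scores highest
# but is offline, so it must not appear in the recommendations.
scores = {"a": 0.9, "b": 0.8, "c": 0.4, "d": 0.1}
live_now = {"b", "c", "d"}

top = rank_available(scores, live_now, k=2)
print(top)
```

The same idea applies when building training examples: an item the user did not interact with while it was offline says nothing about preference, so negatives would be drawn only from items that were available at that moment.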
