- Learning to Represent Human Motives for Goal-directed Web Browsing
by Jyun-Yu Jiang (University of California, Los Angeles, United States), Chia-Jung Lee (Microsoft, United States), Longqi Yang (Microsoft, United States), Bahareh Sarrafzadeh (Microsoft, United States), Brent Hecht (Northwestern University, United States), and Jaime Teevan (Microsoft, United States)
Motives or goals are recognized in the psychology literature as the most fundamental drives that explain and predict why people do what they do, including when they browse the web. Although they provide enormous value, these higher-ordered goals are often unobserved, and little is known about how to leverage such goals to assist people’s browsing activities. This paper proposes a new approach to address this problem, realized through a novel neural framework, Goal-directed Web Browsing (GoWeB). We adopt a psychologically-sound taxonomy of higher-ordered goals and learn to build their representations in a structure-preserving manner. We then incorporate the resulting representations to enhance common activities people perform on the web. Experiments on large-scale data from the Microsoft Edge web browser show that GoWeB significantly outperforms competitive baselines for in-session web page recommendation, re-visitation classification, and goal-based web page grouping. A follow-up analysis further characterizes how the variety of human motives is reflected in the differences observed in behavioral patterns.
Full text in ACM Digital Library
|
- Debiased Off-Policy Evaluation for Recommendation Systems
by Yusuke Narita (Yale University, United States), Shota Yasui (AILab, CyberAgent, Inc., Japan), and Kohei Yata (Department of Economics, Yale University, United States)
Efficient methods to evaluate new algorithms are critical for improving interactive bandit and reinforcement learning systems such as recommendation systems. A/B tests are reliable, but are time- and money-consuming, and entail a risk of failure. In this paper, we develop an alternative method, which predicts the performance of algorithms given historical data that may have been generated by a different algorithm. Our estimator has the property that its prediction converges in probability to the true performance of a counterfactual algorithm at a rate of √N, as the sample size N increases. We also show a correct way to estimate the variance of our prediction, thus allowing the analyst to quantify the uncertainty in the prediction. These properties hold even when the analyst does not know which among a large number of potentially important state variables are actually important. We validate our method with a simulation experiment on reinforcement learning. We finally apply it to improve advertisement design at a major advertisement company. We find that our method produces smaller mean squared errors than state-of-the-art methods.
Full text in ACM Digital Library
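The paper’s debiased estimator goes beyond plain importance weighting, but the core idea of off-policy evaluation from logged data can be illustrated with a minimal inverse-propensity-weighted sketch (the function name and simplifications here are illustrative assumptions, not the authors’ actual method):

```python
import numpy as np

def ipw_value_estimate(rewards, logging_propensities, target_policy_probs):
    """Inverse-propensity-weighted estimate of a target policy's value
    from logs collected by a different (logging) policy.

    rewards: observed rewards for logged actions
    logging_propensities: probability the logging policy chose each action
    target_policy_probs: probability the target policy would choose it
    """
    weights = target_policy_probs / logging_propensities
    weighted_rewards = weights * rewards
    estimate = np.mean(weighted_rewards)
    # The sample variance of the weighted terms quantifies uncertainty;
    # the standard error shrinks at the parametric sqrt(N) rate.
    std_error = np.std(weighted_rewards, ddof=1) / np.sqrt(len(rewards))
    return estimate, std_error
```

When the logging and target policies agree, the weights are all 1 and the estimate reduces to the plain sample mean of rewards, with its usual standard error.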
|
- Boosting Local Recommendations With Partially Trained Global Model
by Yuxi Zhang (Salesforce, United States) and Kexin Xie (Salesforce, United States)
Building recommendation systems for enterprise software poses many unique challenges that differ from consumer-facing systems. When applied to different organizations, the data used to power those recommendation systems vary substantially in both quality and quantity due to differences in their operational practices, marketing strategies, and targeted audiences. At Salesforce, as a cloud provider of such a system with data across many different organizations, it naturally makes sense to pool data from different organizations to build a model that combines the value of all brands. However, multiple issues arise: how do we make sure a model trained on pooled data can still capture customer-specific characteristics, and how do we design the system to handle those data responsibly and ethically, i.e., respecting contractual agreements with our clients, legal and compliance requirements, and the privacy of all consumers? In this proposal, we present a framework that not only utilizes enriched user-level data across organizations, but also boosts business-specific characteristics in generating personal recommendations. We also walk through key privacy considerations in designing such a system.
Full text in ACM Digital Library
|
- Follow the guides: disentangling human and algorithmic curation in online music consumption
by Quentin Villermet (Centre Marc Bloch), Jérémie Poiroux (Centre Marc Bloch & CNRS), Manuel Moussallam (Deezer Research), Thomas Louail (CNRS), and Camille Roth (Centre Marc Bloch & CNRS)
The role of recommendation systems in the diversity of content consumption on platforms is a much-debated issue. The quantitative state of the art often overlooks the existence of individual attitudes toward guidance, and thus of different categories of users in this regard. Focusing on the case of music streaming, we analyze the complete listening history of about 9k users over one year and demonstrate that there is no blanket answer to the intertwinement of recommendation use and consumption diversity: it depends on users. First, we compute for each user the relative importance of different access modes within their listening history, introducing a trichotomy distinguishing so-called ‘organic’ use from algorithmic and editorial guidance. We thereby identify four categories of users. We then focus on two scales related to content diversity, both in terms of dispersion – how much users consume the same content repeatedly – and popularity – how popular the content they consume is. We show that the two types of recommendation offered by music platforms – algorithmic and editorial – may drive the consumption of more or less diverse content in opposite directions, again depending strongly on the type of user. Finally, we compare users’ streaming histories with the music programming of a selection of popular French radio stations during the same period. While radio programs are usually more tilted toward repetition than users’ listening histories, they often program more songs from less popular artists. On the whole, our results highlight the nontrivial effects of platform-mediated recommendation on consumption, and lead us to speak of ‘filter niches’ rather than ‘filter bubbles’. They hint at further ramifications for the study and design of recommendation systems.
Full text in ACM Digital Library
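As a rough illustration of the two diversity scales described above, dispersion and popularity might be operationalized as follows (a hypothetical sketch under our own simplifying assumptions, not the authors’ exact measures):

```python
from collections import Counter
import math

def dispersion_entropy(plays):
    """Shannon entropy of a user's play distribution over items:
    low entropy means repetitive listening, high entropy means
    listening dispersed across many items."""
    counts = Counter(plays)
    total = len(plays)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def mean_popularity(plays, popularity):
    """Average global popularity of the items a user consumed,
    where `popularity` maps each item to, e.g., its platform-wide
    play count."""
    return sum(popularity[item] for item in plays) / len(plays)
```

A user who replays one track exclusively scores zero entropy, while a user splitting plays evenly across items scores the maximum; comparing mean popularity across the four user categories would then separate mainstream-leaning from niche-leaning consumption.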
|
- Recommendation on Live-Streaming Platforms: Dynamic Availability and Repeat Consumption
by Jérémie Rappaz (EPFL, Switzerland), Julian McAuley (UC San Diego, United States), and Karl Aberer (LSIR, EPFL, Switzerland)
Live-streaming platforms broadcast user-generated video in real-time. Recommendation on these platforms shares similarities with traditional settings, such as a large volume of heterogeneous content and highly skewed interaction distributions. However, several challenges must be overcome to adapt recommendation algorithms to live-streaming platforms. First, content availability is dynamic, restricting users to choosing from only a subset of items at any given time; during training and inference we must carefully handle this factor, since ‘non-interactions’ reflect availability as much as implicit preference. Streamers are also fundamentally different from ‘items’ in traditional settings: repeat consumption of specific channels plays a significant role, even though the content itself is fundamentally ephemeral. In this work, we study recommendation in this setting of a dynamically evolving set of available items. We propose LiveRec, a self-attentive model that personalizes item ranking based on both historical interactions and current availability. We also show that carefully modelling repeat consumption plays a significant role in model performance. To validate our approach, and to inspire further research on this setting, we release a dataset containing 475M user interactions on Twitch over a 43-day period. We evaluate our approach on a recommendation task and show that our method outperforms various strong baselines in ranking the currently available content.
Full text in ACM Digital Library
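The availability constraint central to this setting can be illustrated with a minimal masking step at ranking time (a hypothetical sketch; LiveRec itself is a self-attentive model, which is not shown here):

```python
import numpy as np

def rank_available(scores, available):
    """Rank only currently-available channels, masking the rest.

    scores: model scores for every item in the catalogue
    available: boolean mask of which items are live right now
    """
    # Unavailable items get -inf so they can never be recommended.
    masked = np.where(available, scores, -np.inf)
    order = np.argsort(-masked)  # indices sorted by descending score
    return [i for i in order if available[i]]
```

The same mask matters during training: a user who did not watch an offline channel expressed no preference about it, so treating that ‘non-interaction’ as a negative signal would bias the model.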
|