Paper Session 1: Beyond Accuracy
Date: Saturday, Sept 17, 2016, 10:40-12:20
Location: Kresge Auditorium
Chair: Joe Konstan
- PPF Recommendations with a Purpose
by Dietmar Jannach, Gediminas Adomavicius
The purpose of recommenders is often summarized as “help the users find relevant items”, and the predominant operationalization of this goal has been to focus on the ability to numerically estimate the users’ preferences for unseen items or to provide users with item lists ranked in accordance to the estimated preferences. This dominant, albeit narrow, view of the recommendation problem has been tremendously helpful in advancing research in different ways, e.g., through the establishment of standardized evaluation procedures and metrics. In reality, recommender systems can serve a variety of purposes from the point of view of both consumers and providers. Most of the purposes, however, are significantly underexplored, even though many of them are arguably more aligned with the real-world expectations for recommenders than our current predominant paradigm. Therefore, it is important to revisit our conceptualizations of the potential goals of recommenders and their operationalization as research problems. In this paper, we discuss a framework of recommendation goals and purposes and highlight possible future directions and challenges related to the operationalization of such alternative problem formulations.
- PPF Recommender Systems for Self-Actualization
by Bart P. Knijnenburg, Saadhika Sivakumar, Daricia Wilkinson
Every day, we are confronted with an abundance of decisions that require us to choose from a seemingly endless number of choice options. Recommender systems are supposed to help us deal with this formidable task, but some scholars claim that these systems instead put us inside a “Filter Bubble” that severely limits our perspectives. This paper presents a new direction for recommender systems research with the main goal of supporting users in developing, exploring, and understanding their unique personal preferences.
- LP BPN A Coverage-Based Approach to Recommendation Diversity On Similarity Graph
by Shameem A Puthiya Parambath, Nicolas Usunier, Yves Grandvalet
We consider the problem of generating diverse, personalized recommendations such that a small set of recommended items covers a broad range of the user’s interests. We represent items in a similarity graph, and we formulate the relevance/diversity trade-off as finding a small set of unrated items that best covers a subset of items positively rated by the user. In contrast to previous approaches, our method does not rely on an explicit trade-off between a relevance objective and a diversity objective, as the estimations of relevance and diversity are implicit in the coverage criterion. We show on several benchmark datasets that our approach compares favorably to the state-of-the-art diversification methods according to various relevance and diversity measures.
- SP A Scalable Approach for Periodical Personalized Recommendations
by Zhen Qin, Ish Rishabh, John Carnahan
We develop a highly scalable and effective contextual bandit approach towards periodical personalized recommendations. The online bootstrapping-based technique provides a principled basis for UCB-type exploration-exploitation algorithms; it handles arbitrarily sized datasets, is well suited to learning ever-evolving user preference drift from streaming data, and is essentially parameter-free. We further introduce techniques to handle arbitrarily sized feature spaces using feature hashing, leverage existing state-of-the-art machine learning via learning reduction, and increase cache hits by managing bootstrapped models in memory effectively. The resulting model trains on millions of examples and billions of features within minutes on a single personal computer. It shows persistent performance in both offline and online evaluation. We observe around 10% click-through rate (CTR) and conversion lift over a collaborative filtering approach in real-world A/B testing across more than 40 million users on the major Ticketmaster email recommendation product.
- SP Multi-Word Generative Query Recommendation Using Topic Modeling
by Matthew Mitsui, Chirag Shah
Query recommendation predominantly relies on search logs to use existing queries for recommendation, typically calculating query similarity metrics or transition probabilities from the log. While effective, such recommendations are limited to the queries, words, and phrases in the log. They hence do not recommend potentially useful, entirely novel queries. Recent query recommendation methods have proposed generating queries on a topical or thematic level, though current approaches are limited to generating single words. We propose a hybrid method for constructing multi-word queries in this generative sense. It uses Latent Dirichlet Allocation to generate a topic for exploration and skip-gram modeling to generate queries from the topic. According to additional evaluation metrics we present, our model improves diversity and has some room for improving relevance, yet offers an interesting avenue for query recommendation.
- SP Contrasting Offline and Online Results when Evaluating Recommendation Algorithms
by Marco Rossetti, Fabio Stella, Markus Zanker
Most evaluations of novel algorithmic contributions assess their accuracy in predicting what was withheld in an offline evaluation scenario. However, doubts have been raised about whether standard offline evaluation practices are appropriate for selecting the best algorithm for field deployment. The goal of this work is therefore to compare the offline and the online evaluation methodology with the same study participants, i.e., a within-users experimental design. This paper presents empirical evidence that the ranking of algorithms based on offline accuracy measurements clearly contradicts the results from the online study with the same set of users. Thus the external validity of the most commonly applied evaluation methodology is not guaranteed.
- SP BPN Adaptive, Personalized Diversity for Visual Discovery
by Choon Hui Teo, Houssam Nassif, Daniel Hill, Sriram Srinivasan, Mitchell Goodman, Vijai Mohan, S. V. N. Vishwanathan
Search queries are appropriate when users have explicit intent, but they perform poorly when the intent is difficult to express or if the user is simply looking to be inspired. Visual browsing systems allow e-commerce platforms to address these scenarios while offering the user an engaging shopping experience. Here we explore extensions in the direction of adaptive personalization and item diversification within Stream, a new form of visual browsing and discovery by Amazon. Our system presents the user with a diverse set of interesting items while adapting to user interactions. Our solution consists of three components: (1) a Bayesian regression model for scoring the relevance of items while leveraging uncertainty, (2) a submodular diversification framework that re-ranks the top scoring items based on category, and (3) personalized category preferences learned from the user’s behavior. When tested on live traffic, our algorithms show a strong lift in click-through rate and session duration.
- SP BPN Intent-Aware Diversification Using a Constrained PLSA
by Jacek Wasilewski, Neil Hurley
The intent-aware diversification framework was introduced initially in information retrieval and adapted to the context of recommender systems in the work of Vargas et al. The framework considers a set of aspects associated with items to be recommended. For instance, aspects may correspond to genres in movie recommendations. The framework depends on an input aspect model consisting of item selection or relevance probabilities, given an aspect, and user intents, in the form of probabilities that the user is interested in each aspect. In this paper, we examine a number of input aspect models and evaluate the impact that different models have on the framework. In particular, we propose a constrained PLSA model that allows for interpretable output, in terms of known aspects, while achieving greater performance than the explicit co-occurrence counting method used in previous work. We evaluate the proposed models using a well-known MovieLens dataset for which item genres are available.




















