Paper Session 4: Recommendations in Advertising, Promotions, Intent and Search

Date: Tuesday, Sept 17, 2019, 16:00-17:30
Location: Auditorium
Chair: Robin Burke

  • [LP] A Comparison of Calibrated and Intent-Aware Recommendations
    by Mesut Kaya, Derek Bridge

    Calibrated and intent-aware recommendation are recent approaches to recommendation that have apparent similarities. Both try, to a certain extent, to cover the user’s interests, as revealed by her user profile. In this paper, we compare them in detail. On two datasets, we show the extent to which intent-aware recommendations are calibrated and the extent to which calibrated recommendations are diverse. We consider two ways of defining a user’s interests, one based on item features, the other based on subprofiles of the user’s profile. We find that defining interests in terms of subprofiles results in the highest precision and the best relevance/diversity trade-off. Along the way, we define a new version of calibrated recommendation and three new evaluation metrics.
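
    A minimal sketch of the calibration measure such comparisons typically rest on (a Steck-style KL divergence between the interest distribution of the user's profile and that of the recommendation list); the function names and the smoothing constant are illustrative assumptions, not the paper's exact formulation:

      from collections import Counter
      from math import log

      def interest_distribution(items, item_features):
          """Normalized distribution over features (e.g. genres) of an item list."""
          counts = Counter(f for i in items for f in item_features[i])
          total = sum(counts.values())
          return {f: c / total for f, c in counts.items()}

      def miscalibration(profile, recommended, item_features, alpha=0.01):
          """KL(p || q): 0 means the recommendations mirror the profile's interests."""
          p = interest_distribution(profile, item_features)
          q = interest_distribution(recommended, item_features)
          # Smooth q towards p so the divergence stays finite when q(f) = 0.
          return sum(pf * log(pf / ((1 - alpha) * q.get(f, 0.0) + alpha * pf))
                     for f, pf in p.items())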

  • [LP] LORE: A Large-Scale Offer Recommendation Engine with Eligibility and Capacity Constraints
    by Rahul Makhijani, Shreya Chakrabarti, Dale Struble, Yi Liu

    Businesses, such as Amazon, department store chains, home furnishing store chains, Uber, and Lyft, frequently offer deals, product discounts and incentives to drive sales, increase new product acceptance and engage with users. In order to appeal to diverse user groups, these businesses typically design more than one promotion offer but market different ones to different users. For instance, Uber offers a percentage discount on rides to some users and a low fixed price to others. In this paper, we propose solutions to optimally recommend promotions and items to maximize user conversion constrained simultaneously by user eligibility and item or offer capacity (limited quantity of items or offers). We achieve this through an offer recommendation model based on Min-Cost Flow network optimization, which enables us to satisfy the constraints within the optimization itself and solve the problem in polynomial time. We present two approaches that can be used in various settings: a single-period solution and sequential time-period offering. We evaluate these approaches against competing methods using offline counterfactual evaluation. We also discuss three practical aspects that may affect the online performance of constrained optimization: capacity determination, traffic arrival patterns, and clustering for large-scale settings.
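
    A minimal sketch of a min-cost-flow offer assignment in the spirit described above, using networkx; the integer cost scaling and the zero-cost "no offer" fallback arc (which keeps the problem feasible once capacities run out) are illustrative assumptions, not the paper's exact construction:

      import networkx as nx

      def assign_offers(users, offers, eligible, p_convert, capacity):
          """eligible: set of (user, offer) pairs; p_convert[(u, o)]: estimated
          conversion probability; capacity[o]: max recipients of offer o.
          User and offer node names are assumed distinct."""
          G = nx.DiGraph()
          G.add_node("src", demand=-len(users))
          G.add_node("snk", demand=len(users))
          for u in users:
              G.add_edge("src", u, capacity=1, weight=0)
              G.add_edge(u, "snk", capacity=1, weight=0)  # fallback: no offer
              for o in offers:
                  if (u, o) in eligible:
                      # Negative cost = reward; scaled to ints for network simplex.
                      G.add_edge(u, o, capacity=1,
                                 weight=-int(10000 * p_convert[u, o]))
          for o in offers:
              G.add_edge(o, "snk", capacity=capacity[o], weight=0)
          flow = nx.min_cost_flow(G)
          return {u: o for u in users for o in offers if flow[u].get(o, 0) == 1}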

  • [LP] Domain Adaptation in Display Advertising: An Application for Partner Cold-Start
    by Karan Aggarwal, Pranjul Yadav, S. Sathiya Keerthi

    Digital advertising connects partners (sellers) to potentially interested online users. Within the digital advertising domain, there are multiple platforms, e.g., user re-targeting and prospecting. Partners usually start with re-targeting campaigns and later employ prospecting campaigns to reach an untapped customer base. There are two major challenges involved with prospecting. The first challenge is the successful on-boarding of a new partner on the prospecting platform, referred to as the partner cold-start problem. The second challenge revolves around the ability to leverage large amounts of re-targeting data for the partner cold-start problem. In this work, we study domain adaptation for the partner cold-start problem. To this end, we propose two domain adaptation techniques, SDA-DANN and SDA-Ranking, which extend existing domain adaptation techniques for partner cold-start by incorporating sub-domain similarities (product-category-level information). Through rigorous experiments, we demonstrate that our method SDA-DANN outperforms baseline domain adaptation techniques on a real-world dataset obtained from a major online advertiser. Furthermore, we show that our proposed technique SDA-Ranking outperforms baseline methods for low-CTR partners.
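
    SDA-DANN builds on DANN-style adversarial domain adaptation. A minimal sketch of its standard building block, a gradient-reversal layer in PyTorch; the layer sizes and the fixed lambda are illustrative assumptions, and the sub-domain extension is not shown:

      import torch

      class GradReverse(torch.autograd.Function):
          """Identity on the forward pass; flips (and scales) gradients on the
          backward pass, pushing the features towards domain invariance."""
          @staticmethod
          def forward(ctx, x, lam):
              ctx.lam = lam
              return x.view_as(x)

          @staticmethod
          def backward(ctx, grad_output):
              return -ctx.lam * grad_output, None

      features = torch.nn.Sequential(torch.nn.Linear(64, 32), torch.nn.ReLU())
      label_head = torch.nn.Linear(32, 1)   # CTR / conversion prediction
      domain_head = torch.nn.Linear(32, 1)  # re-targeting vs. prospecting

      x = torch.randn(8, 64)
      h = features(x)
      y_pred = label_head(h)                           # ordinary label loss
      d_pred = domain_head(GradReverse.apply(h, 1.0))  # adversarial domain loss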

  • [LP] Addressing Delayed Feedback for Continuous Training with Neural Networks in CTR prediction
    by Sofia Ira Ktena, Alykhan Tejani, Lucas Theis, Pranay Kumar Myana, Deepak Dilipkumar, Ferenc Huszár, Steven Yoo, Wenzhe Shi

    One of the challenges in display advertising is that the distribution of features and click-through rate (CTR) can exhibit large shifts over time due to seasonality, changes to ad campaigns and other factors. The predominant strategy to keep up with these shifts is to train predictive models continuously, on fresh data, in order to prevent them from becoming stale. However, in many ad systems positive labels are only observed after a possibly long and random delay. These delayed labels pose a challenge to data freshness in continuous training: fresh data may not have complete label information at the time they are ingested by the training algorithm. Naive strategies that consider any data point a negative example until a positive label becomes available tend to underestimate CTR, resulting in inferior user experience and suboptimal performance for advertisers. The focus of this paper is to identify the best combination of loss functions and models that enable large-scale learning from a continuous stream of data in the presence of delayed labels. In this work, we compare 5 different loss functions, 3 of them applied to this problem for the first time. We benchmark their performance in offline settings on both public and proprietary datasets in conjunction with shallow and deep model architectures. We also discuss the engineering cost associated with implementing each loss function in a production environment. Finally, we carried out online experiments with the top-performing methods, in order to validate their performance in a continuous training scheme. When trained offline on 668 million in-house data points, our proposed methods outperform the previous state-of-the-art by 3% in relative cross entropy (RCE). During online experiments, we observed a 55% gain in revenue per thousand requests (RPMq) against the naive log loss.
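
    A minimal sketch of one importance-weighting strategy of this kind, fake-negative weighting: every example is ingested as a negative immediately and re-ingested as a positive once the click arrives, and the log loss is reweighted with the model's own gradient-stopped prediction. The weights below are a reconstruction for illustration, not necessarily the paper's exact production formulation:

      import torch

      def fn_weighted_logloss(logits, labels):
          """labels: 1.0 for (late-arriving) positives, 0.0 for provisional
          negatives ingested before their label could have been observed."""
          f = torch.sigmoid(logits)
          p = f.detach()  # stop-gradient: weights are treated as constants
          pos = labels * (1.0 + p) * torch.log(f)
          neg = (1.0 - labels) * (1.0 - p) * (1.0 + p) * torch.log(1.0 - f)
          return -(pos + neg).mean()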

  • [SP] Ghosting: Contextualized Inline Query Completion in Large Scale Retail Search
    by Lakshmi Ramachandran, Uma Murthy

    Query auto-completion presents a ranked list of queries as suggestions for a user-entered prefix. Ghosting is the process of auto-completing a search recommendation by highlighting the suggested text inline within the search box. We propose the use of a behavior-based recommendation model along with customer search context to ghost on high-confidence queries. We tested ghosting on a retail production system, on over 140 million search sessions. We found that session-context-based ghosting significantly increased the acceptance of offered suggestions by 6.18%, reduced misspellings among searches by 4.42%, and improved net sales by 0.14%.
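
    A minimal sketch of the thresholding logic behind ghosting: complete inline only when the top-ranked suggestion extends the typed prefix and clears a confidence bar. The threshold value and the scoring interface are illustrative assumptions:

      def ghost(prefix, ranked_suggestions, threshold=0.9):
          """ranked_suggestions: list of (query, confidence), best first.
          Returns the inline completion to highlight, or None to stay silent."""
          if not ranked_suggestions:
              return None
          query, confidence = ranked_suggestions[0]
          if confidence >= threshold and query.startswith(prefix) and query != prefix:
              return query[len(prefix):]  # only the highlighted remainder
          return None

      # e.g. ghost("running sh", [("running shoes", 0.97)]) -> "oes"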

  • [LP] FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction
    by Tongwen Huang, Zhiqi Zhang, Junlin Zhang

    Advertising and feed ranking are essential to many Internet companies such as Facebook and Sina Weibo. Among many real-world advertising and feed ranking systems, click-through rate (CTR) prediction plays a central role. There are many proposed models in this field, such as logistic regression, tree-based models, factorization machine based models and deep learning based CTR models. However, many current works calculate the feature interactions in a simple way, such as a Hadamard product or inner product, and they care less about the importance of features. In this paper, a new model named FiBiNET, an abbreviation for Feature Importance and Bilinear feature Interaction NETwork, is proposed to dynamically learn feature importance and fine-grained feature interactions. On the one hand, FiBiNET can dynamically learn the importance of features via the Squeeze-Excitation network (SENET) mechanism; on the other hand, it is able to effectively learn feature interactions via a bilinear function. We conduct extensive experiments on two real-world datasets and show that our shallow model outperforms other shallow models such as the factorization machine (FM) and the field-aware factorization machine (FFM). To improve performance further, we combine a classical deep neural network (DNN) component with the shallow model to form a deep model. The deep FiBiNET consistently outperforms other state-of-the-art deep models such as DeepFM and the extreme deep factorization machine (xDeepFM).
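
    A minimal sketch of the two ingredients the abstract names: a SENET-style re-weighting of field embeddings followed by a bilinear pairwise interaction with a shared kernel. Field count, embedding size and the reduction ratio are illustrative assumptions:

      import itertools
      import torch

      fields, dim, reduction = 10, 16, 3
      E = torch.randn(32, fields, dim)         # a batch of field embeddings

      # Squeeze-and-excite over fields: learn per-field importance weights.
      squeeze = E.mean(dim=2)                  # (batch, fields)
      w1 = torch.nn.Linear(fields, fields // reduction)
      w2 = torch.nn.Linear(fields // reduction, fields)
      weights = torch.relu(w2(torch.relu(w1(squeeze))))
      E_senet = E * weights.unsqueeze(2)       # importance-scaled embeddings

      # Bilinear interaction (shared W): p_ij = (e_i W) * e_j, elementwise.
      W = torch.nn.Parameter(torch.randn(dim, dim) * 0.01)
      pairs = [(E_senet[:, i] @ W) * E_senet[:, j]
               for i, j in itertools.combinations(range(fields), 2)]
      interactions = torch.stack(pairs, dim=1)  # (batch, n_pairs, dim) -> DNN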
