RecSys 2024 — Session 6: Multi-task Learning - RecSys

Session 6: Multi-task Learning

Date: Wednesday October 16, 8:30 AM – 09:30 AM (GMT+2)
Room: Petruzzelli Theater
Session Chair: Fedelucio Narducci

RES 🕓15Touch the Core: Exploring Task Dependence Among Hybrid Targets for Recommendation
by Xing Tang (Tencent), Yang Qiao (Tencent), Fuyuan Lyu (McGill University), Dugang Liu (Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ)) and Xiuqiang He (Tencent)

As user behaviors become complicated on business platforms, online recommendations focus more on how to touch the core conversions, which are highly related to the interests of platforms. These core conversions are usually continuous targets, such as watch time, revenue, and so on, whose predictions can be enhanced by previous discrete conversion actions. Therefore, multi-task learning (MTL) can be adopted as the paradigm to learn these hybrid targets. However, existing works mainly emphasize investigating the sequential dependence among discrete conversion actions, which neglects the complexity of dependence between discrete conversions and the final continuous conversion. Moreover, simultaneously optimizing hybrid tasks with stronger task dependence will suffer from volatile issues where the core regression task might have a larger influence on other tasks. In this paper, we study the MTL problem with hybrid targets for the first time and propose the model named Hybrid Targets Learning Network (HTLNet) to explore task dependence and enhance optimization. Specifically, we introduce label embedding for each task to explicitly transfer the label information among these tasks, which can effectively explore logical task dependence. We also further design the gradient adjustment regime between the final regression task and other classification tasks to enhance the optimization. Extensive experiments on two offline public datasets and one real-world industrial dataset are conducted to validate the effectiveness of HTLNet. Moreover, online A/B tests on the financial recommender system also show that our model has improved significantly. Our implementation is available here.

Full text in ACM Digital Library
RES 🕓15Bridging Search and Recommendation in Generative Retrieval: Does One Task Help the Other?
by Gustavo Penha (Spotify), Ali Vardasbi (Spotify), Enrico Palumbo (Spotify), Marco De Nadai (Spotify) and Hugues Bouchard (Spotify)

Generative retrieval for search and recommendation is a promising paradigm for retrieving items, offering an alternative to traditional methods that depend on external indexes and nearest-neighbor searches. Instead, generative models directly associate inputs with item IDs. Given the breakthroughs of Large Language Models (LLMs), these generative systems can play a crucial role in centralizing a variety of Information Retrieval (IR) tasks in a single model that performs tasks such as query understanding, retrieval, recommendation, explanation, re-ranking, and response generation. Despite the growing interest in such a unified generative approach for IR systems, the advantages of using a single, multi-task model over multiple specialized models are not well established in the literature. This paper investigates whether and when such a unified approach can outperform task-specific models in the IR tasks of search and recommendation, broadly co-existing in multiple industrial online platforms, such as Spotify, YouTube, and Netflix. Previous work shows that (1) the latent representations of items learned by generative recommenders are biased towards popularity, and (2) content-based and collaborative-filtering-based information can improve an item’s representations. Motivated by this, our study is guided by two hypotheses: [H1] the joint training regularizes the estimation of each item’s popularity, and [H2] the joint training regularizes the item’s latent representations, where search captures content-based aspects of an item and recommendation captures collaborative-filtering aspects. Our extensive experiments with both simulated and real-world data support both [H1] and [H2] as key contributors to the effectiveness improvements observed in the unified search and recommendation generative models over the single-task approaches.

Full text in ACM Digital Library
RES 🕓15Utilizing Non-click Samples via Semi-supervised Learning for Conversion Rate Prediction
by Jiahui Huang (University of Science and Technology of China), Lan Zhang (University of Science and Technology of China), Junhao Wang (University of Science and Technology of China), Shanyang Jiang (University of Science and Technology of China), Dongbo Huang (Tencent), Cheng Ding (Tencent) and Lan Xu (Tencent)

Conversion rate (CVR) prediction is essential in recommender systems, facilitating precise matching between recommended items and users’ preferences. However, the sample selection bias (SSB) and data sparsity (DS) issues pose challenges to accurate prediction. Existing works have proposed the click-through and conversion rate (CTCVR) prediction task which models samples from exposure to “click and conversion” in entire space and incorporates multi-task learning. This approach has shown efficacy in mitigating these challenges. Nevertheless, it intensifies the false negative sample (FNS) problem. To be more specific, the CTCVR task implicitly treats all the CVR labels of non-click samples as negative, overlooking the possibility that some samples might convert if clicked. This oversight can negatively impact CVR model performance, as empirical analysis has confirmed. To this end, we advocate for discarding the CTCVR task and proposing a Non-click samples Improved Semi-supErvised (NISE) method for conversion rate prediction, where the non-click samples are treated as unlabeled. Our approach aims to predict their probabilities of conversion if clicked, utilizing these predictions as pseudo-labels for further model training. This strategy can help alleviate the FNS problem, and direct modeling of the CVR task across the entire space also mitigates the SSB and DS challenges. Additionally, we conduct multi-task learning by introducing an auxiliary click-through rate prediction task, thereby enhancing embedding layer representations. Our approach is applicable to various multi-task architectures. Comprehensive experiments are conducted on both public and production datasets, demonstrating the superiority of our proposed method in mitigating the FNS challenge and improving the CVR estimation. The implementation code is available at https://github.com/Hjh233/NISE.

Full text in ACM Digital Library
RES 🕓15Ranking-Aware Unbiased Post-Click Conversion Rate Estimation via AUC Optimization on Entire Exposure Space
by Yu Liu (Nanjing University;Huawei Technologies Co., Ltd.), Qinglin Jia (Huawei Noah’s Ark Lab), Shuting Shi (Huawei Technologies Co., Ltd.), Chuhan Wu (Huawei Noah’s Ark Lab), Zhaocheng Du (Huawei Noah’s Ark Lab), Zheng Xie (Nanjing University), Ruiming Tang (Huawei Noah’s Ark Lab), Muyu Zhang (Huawei Technologies Co., Ltd.) and Ming Li (Nanjing University)

Estimating the post-click conversion rate (CVR) accurately in ranking systems is crucial in industrial applications. However, this task is often challenged by data sparsity and selection bias, which hinder accurate ranking. Previous approaches to address these challenges have typically focused on either modeling CVR across the entire exposure space which includes all exposure events, or providing unbiased CVR estimation separately. However, the lack of integration between these objectives has limited the overall performance of CVR estimation. Therefore, there is a pressing need for a method that can simultaneously provide unbiased CVR estimates across the entire exposure space. To achieve it, we formulate the CVR estimation task as an Area Under the Curve (AUC) optimization problem and propose the Entire-space Weighted AUC (EWAUC) framework. EWAUC utilizes sample reweighting techniques to handle selection bias and employs pairwise AUC risk, which incorporates more information from limited clicked data, to handle data sparsity. In order to model CVR across the entire exposure space unbiasedly, EWAUC treats the exposure data as both conversion data and non-conversion data to calculate the loss. The properties of AUC risk guarantee the unbiased nature of the entire space modeling. We provide comprehensive theoretical analysis to validate the unbiased nature of our approach. Additionally, extensive experiments conducted on real-world datasets demonstrate that our approach outperforms state-of-the-art methods in terms of ranking performance for the CVR estimation task.

Full text in ACM Digital Library

Back to program

Session 6: Multi-task Learning

RecSys 2024 (Bari)

Sapphire Supporter

Diamond Supporter

Platinum Supporter

Gold Supporter

Silver Supporter

Bronze Supporter

Women in RecSys’s Event Supporter

Challenge Sponsor

Special Supporters

About this site

RecSys 2026