Accepted Contributions

 

List of all long papers accepted for RecSys 2023 (in alphabetical order).

 

  • A Lightweight Method for Modeling Confidence in Recommendations with Learned Beta Distributions
    by Norman Knyazev (Radboud University) and Harrie Oosterhuis (Radboud University).

    Most recommender systems (RecSys) do not provide an indication of confidence in their decisions. Therefore, they do not distinguish between recommendations of which they are certain, and those where they are not. Existing confidence methods for RecSys are either inaccurate heuristics, conceptually complex or very computationally expensive. Consequently, real-world RecSys applications rarely adopt these methods, and thus provide no confidence insights into their behavior. In this work, we propose learned beta distributions (LBD) as a simple and practical recommendation method with an explicit measure of confidence. Our main insight is that beta distributions predict user preferences as probability distributions that naturally model confidence on a closed interval, yet can be implemented with minimal model complexity. Our results show that LBD maintains accuracy competitive with existing methods while also having a significantly stronger correlation between its accuracy and confidence. Furthermore, LBD has higher performance when applied to a high-precision targeted recommendation task. Our work thus shows that confidence in RecSys is possible without sacrificing simplicity or accuracy, and without introducing heavy computational complexity. Thereby, we hope it enables better insight into real-world RecSys and opens the door for novel future applications.

    Full text in ACM Digital Library
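
    As a rough illustration of the idea in the abstract above (not the authors' implementation), a beta distribution over a preference rescaled to [0, 1] gives both a point prediction (its mean) and a spread (its variance) that can serve as a confidence signal. The function name and the example parameter values below are hypothetical; in LBD the distribution parameters would be learned.

```python
def beta_mean_and_variance(alpha: float, beta: float) -> tuple[float, float]:
    """Return the mean and variance of a Beta(alpha, beta) distribution."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    total = alpha + beta
    mean = alpha / total
    variance = (alpha * beta) / (total ** 2 * (total + 1))
    return mean, variance


# Two hypothetical predictions with the same mean but different spread:
# a lower variance can be read as higher confidence in the prediction.
vague_mean, vague_var = beta_mean_and_variance(2.0, 2.0)
sharp_mean, sharp_var = beta_mean_and_variance(20.0, 20.0)
```

    Both calls predict the same preference, but the second distribution is far more concentrated, which is the kind of distinction a single point estimate cannot express.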

  • A Multi-view Graph Contrastive Learning Framework for Cross-Domain Sequential Recommendation
    by Zitao Xu (Shenzhen University), Weike Pan (Shenzhen University) and Zhong Ming (Shenzhen University).

    Sequential recommendation methods play an irreplaceable role in recommender systems, as they can capture users’ dynamic preferences from behavior sequences. Despite their success, these works usually suffer from the sparsity problem that commonly exists in real applications. Cross-domain sequential recommendation aims to alleviate this problem by introducing relatively richer source-domain data. However, most existing methods capture the users’ preferences independently in each domain, which may neglect the item transition patterns across sequences from different domains, i.e., a user’s interaction in one domain may influence his/her next interaction in other domains. Moreover, the data sparsity problem still exists since some items in the target and source domains are interacted with only a limited number of times. To address these issues, in this paper we propose a generic framework named multi-view graph contrastive learning (MGCL). Specifically, we adopt the contrastive mechanism in an intra-domain item representation view and an inter-domain user preference view. The former is to jointly learn the dynamic sequential information in the user sequence graph and the static collaborative information in the cross-domain global graph, while the latter is to capture the complementary information of the user’s preferences from different domains. Extensive empirical studies on three real-world datasets demonstrate that our MGCL significantly outperforms the state-of-the-art methods.

    Full text in ACM Digital Library

  • Adversarial Collaborative Filtering for Free
    by Huiyuan Chen (Visa Research), Xiaoting Li (Visa Research), Vivian Lai (Visa Research), Chin-Chia Michael Yeh (Visa Research), Yujie Fan (Visa Research), Yan Zheng (Visa Research), Mahashweta Das (Visa Research) and Hao Yang (Visa Research).

    Collaborative Filtering (CF) has been successfully applied to help users discover items of interest. Nevertheless, existing CF methods suffer from the noisy data issue, which negatively impacts the quality of personalized recommendation. To tackle this problem, many prior studies leverage the adversarial learning principle to regularize the representations of users and items, which has shown great ability in improving both generalizability and robustness. Generally, those methods learn adversarial perturbations and model parameters using a min-max optimization framework. However, two major limitations remain: 1) Existing methods lack theoretical guarantees of why adding perturbations improves the model generalizability and robustness, since noisy data is naturally different from adversarial attacks; 2) Solving min-max optimization is time-consuming. In addition to updating the model parameters, each iteration requires additional computations to update the perturbations, making these methods not scalable for industry-scale datasets.

    In this paper, we present Sharpness-aware Matrix Factorization (SharpMF), a simple yet effective method that conducts adversarial training without extra computational cost over the base optimizer. To achieve this goal, we first revisit the existing adversarial collaborative filtering and discuss its connection with recent Sharpness-aware Minimization. This analysis shows that adversarial training actually seeks model parameters that lie in neighborhoods having uniformly low loss values, resulting in better generalizability. To reduce the computational overhead, SharpMF introduces a novel trajectory loss to measure sharpness between current weights and past weights. Experimental results on real-world datasets demonstrate that our SharpMF achieves superior performance with almost zero additional computational cost compared to adversarial training.

    Full text in ACM Digital Library

  • Alleviating the Long-Tail Problem in Conversational Recommender Systems
    by Zhipeng Zhao (Singapore Management University), Kun Zhou (School of Information, Renmin University of China), Xiaolei Wang (Gaoling School of Artificial Intelligence, Renmin University of China), Wayne Xin Zhao (Gaoling School of Artificial Intelligence, Renmin University of China), Fan Pan (Poisson Lab, Huawei), Zhao Cao (Poisson Lab, Huawei) and Ji-Rong Wen (Gaoling School of Artificial Intelligence, Renmin University of China).

    Conversational recommender systems (CRS) aim to provide the recommendation service via natural language conversations. To develop an effective CRS, high-quality CRS datasets are crucial. However, existing CRS datasets suffer from the long-tail issue, i.e., a large proportion of items are rarely (or even never) mentioned in the conversations, which are called long-tail items. As a result, the CRSs trained on these datasets tend to recommend frequent items, and the diversity of the recommended items would be largely reduced, making users more likely to get bored.

    To address this issue, this paper presents LOT-CRS, a novel framework that focuses on simulating and utilizing a balanced CRS dataset (i.e., covering all the items evenly) for improving LOng-Tail recommendation performance of CRSs. In our approach, we design two pre-training tasks to enhance the understanding of simulated conversation for long-tail items, and adopt retrieval-augmented fine-tuning with label smoothness strategy to further improve the recommendation of long-tail items. Extensive experiments on two public CRS datasets have demonstrated the effectiveness and extensibility of our approach, especially on long-tail recommendation. All the experimental codes will be released after the review period.

    Full text in ACM Digital Library

  • Augmented Negative Sampling for Collaborative Filtering
    by Yuhan Zhao (Harbin Engineering University), Rui Chen (Harbin Engineering University), Riwei Lai (Harbin Engineering University), Qilong Han (Harbin Engineering University), Hongtao Song (Harbin Engineering University) and Li Chen (Hong Kong Baptist University).

    Negative sampling is essential for implicit-feedback-based collaborative filtering, where it is used to construct negative signals from massive unlabeled data to guide supervised learning. The state-of-the-art idea is to utilize hard negative samples that carry more useful information to form a better decision boundary. To balance efficiency and effectiveness, the vast majority of existing methods follow the two-pass approach, in which the first pass samples a fixed number of unobserved items by a simple static distribution and then the second pass selects the final negative items using a more sophisticated negative sampling strategy. However, selecting negative samples from the original items in a dataset is inherently limited due to the limited available choices, and thus may not be able to contrast positive samples well. In this paper, we confirm this observation via carefully designed experiments and introduce two major limitations of existing solutions: ambiguous trap and information discrimination.

    Our response to such limitations is to introduce “augmented” negative samples that may not exist in the original dataset. This direction renders a substantial technical challenge because constructing unconstrained negative samples may introduce excessive noise that eventually distorts the decision boundary. To this end, we introduce a novel generic augmented negative sampling (ANS) paradigm and provide a concrete instantiation. First, we disentangle the hard and easy factors of negative items. Next, we generate new candidate negative samples by augmenting only the easy factors in a regulated manner: the direction and magnitude of the augmentation are carefully calibrated. Finally, we design an advanced negative sampling strategy to identify the final augmented negative samples, which considers not only the score used in existing methods but also a new metric called augmentation gain. Extensive experiments on five real-world datasets demonstrate that our method significantly outperforms state-of-the-art baselines. Our code is publicly available at https://anonymous.4open.science/r/ANS-Recbole-B070/.

    Full text in ACM Digital Library

  • AutoOpt: Automatic Hyperparameter Scheduling and Optimization for Deep Click-through Rate Prediction
    by Yujun Li (Noah’s Ark Lab), Xing Tang (Noah’s Ark Lab), Bo Chen (Noah’s Ark Lab), Yimin Huang (Noah’s Ark Lab), Ruiming Tang (Noah’s Ark Lab) and Zhenguo Li (Noah’s Ark Lab).

    Click-through Rate (CTR) prediction is essential for commercial recommender systems. Recently, to improve the prediction accuracy, plenty of deep learning-based CTR models have been proposed, which are sensitive to hyperparameters and difficult to optimize well. General hyperparameter optimization methods fix these hyperparameters across the entire model training and repeat them multiple times. This trial-and-error process not only leads to suboptimal performance but also requires non-trivial computation efforts. In this paper, we propose an automatic hyperparameter scheduling and optimization method for deep CTR models, AutoOpt, making the optimization process more stable and efficient. Specifically, the whole training regime is first divided into several consecutive stages, where a data-efficient model is learned to model the relation between model states and prediction performance. To optimize the stage-wise hyperparameters, AutoOpt uses the global and local scheduling modules to propose proper hyperparameters for the next stage based on the training in the current stage. Extensive experiments on three public benchmarks are conducted to validate the effectiveness of AutoOpt. Moreover, AutoOpt has been deployed onto an advertising platform and a music platform, where online A/B tests also demonstrate superior improvement.

    Full text in ACM Digital Library

  • BVAE: Behavior-aware Variational Autoencoder for Multi-Behavior Multi-Task Recommendation
    by Qianzhen Rao (Shenzhen University), Yang Liu (Shenzhen University), Weike Pan (Shenzhen University) and Zhong Ming (Shenzhen University).

    A practical recommender system should be able to handle heterogeneous behavioral feedback as inputs and have the ability to produce multi-task outputs. Although heterogeneous one-class collaborative filtering (HOCCF) and multi-task learning (MTL) methods have been well studied, there is still a lack of targeted methods for their combined field, i.e., Multi-behavior Multi-task Recommendation (MMR). To fill the gap, we propose a novel recommendation framework called Behavior-aware Variational AutoEncoder (BVAE), which improves the parameter sharing and loss minimization method with the VAE structure to address the MMR problem. Specifically, our BVAE includes behavior-aware semi-encoders and decoders, and a target feature fusion network with a global feature filtering network, while using standard deviation to weight the loss. These modules generate the behavior-aware recommended item list by constructing better semantic feature vectors for users, i.e., from dual perspectives of behavioral preference and global interaction. In addition, we optimize our BVAE in terms of adaptability and robustness, i.e., it is concise and flexible enough to consume any number of behaviors with different distributions. Extensive empirical studies on two real and widely used datasets confirm the validity of our design and show that our BVAE can outperform the state-of-the-art related baseline methods under multiple evaluation metrics.

    Full text in ACM Digital Library

  • Contrastive Learning with Frequency-Domain Interest Trends for Sequential Recommendation
    by Yichi Zhang (Harbin Engineering University), Guisheng Yin (Harbin Engineering University) and Yuxin Dong (Harbin Engineering University).

    Recently, contrastive learning for sequential recommendation has demonstrated its powerful ability to learn high-quality user representations. However, constructing augmented samples in the time domain poses challenges due to various reasons, such as fast-evolving trends, interest shifts, and system factors. Furthermore, the F-principle indicates that deep learning preferentially fits the low-frequency part, resulting in poor performance on high-frequency tasks. The complexity of time series and the low-frequency preference limit the utility of sequence encoders. To address these challenges, we need to construct augmented samples from the frequency domain, thus improving the ability to accommodate events of different frequency sizes. To this end, we propose a novel Contrastive Learning with Frequency-Domain Interest Trends for Sequential Recommendation (CFIT4SRec). We treat the embedding representations of historical interactions as “images” and introduce the second-order Fourier transform to construct augmented samples. The components of different frequency sizes reflect the interest trends between attributes and their surroundings in the hidden space. We introduce three data augmentation operations to accommodate events of different frequency sizes: low-pass augmentation, high-pass augmentation, and band-stop augmentation. Extensive experiments on four public benchmark datasets demonstrate the superiority of CFIT4SRec over the state-of-the-art baselines. The implementation code is available at https://github.com/zhangyichi1Z/CFIT4SRec.

    Full text in ACM Digital Library

  • Correcting for Interference in Experiments: A Case Study at Douyin
    by Vivek Farias (MIT), Hao Li (Bytedance), Tianyi Peng (MIT), Xinyuyang Ren (Bytedance), Huawei Zhang (Bytedance) and Andrew Zheng (MIT).

    Interference is a ubiquitous problem in experiments conducted on two-sided content marketplaces, such as Douyin (China’s analog of TikTok). In many cases, creators are the natural unit of experimentation, but creators interfere with each other through competition for viewers’ limited time and attention. “Naive” estimators currently used in practice simply ignore the interference, but in doing so incur bias on the order of the treatment effect. We formalize the problem of inference in such experiments as one of policy evaluation. Off-policy estimators, while unbiased, are impractically high variance. We introduce a novel Monte-Carlo estimator, based on “Differences-in-Qs” (DQ) techniques, which achieves bias which is second-order in the treatment effect, while remaining sample-efficient to estimate. On the theoretical side, our contribution is to develop a generalized theory of Taylor expansions for policy evaluation, which extends DQ theory to all major MDP formulations. On the practical side, we implement our estimator on Douyin’s experimentation platform, and in the process develop DQ into a truly “plug-and-play” estimator for interference in real-world settings: one which provides robust, low-bias, low-variance treatment effect estimates; admits computationally cheap, asymptotically exact uncertainty quantification; and reduces MSE by 99% compared to the best existing alternatives in our applications.

    Full text in ACM Digital Library

  • Data-free Knowledge Distillation for Reusing Recommendation Models
    by Cheng Wang (Huazhong University of Science and Technology), Jiacheng Sun (Huawei Noah’s Ark Lab), Zhenhua Dong (Huawei Noah’s Ark Lab), Jieming Zhu (Huawei Noah’s Ark Lab), Zhenguo Li (Huawei Noah’s Ark Lab), Ruixuan Li (Huazhong University of Science and Technology) and Rui Zhang (ruizhang.info).

    A common practice to keep the freshness of an offline Recommender System (RS) is to train models that fit the user’s most recent behaviours while directly replacing the outdated historical model. However, much feature engineering and many computing resources are used to train these historical models, yet they are underutilized in the downstream RS model training. In this paper, to turn these historical models into treasures, we introduce a model inversed data synthesis framework, which can recover training data information from the historical model and use it for knowledge transfer. This framework synthesizes a new form of data from the historical model. Specifically, we ‘invert’ an off-the-shelf pretrained model to synthesize binary class user-item pairs beginning from random noise without requiring any additional information from the training dataset. To synthesize new data from a pretrained model, we update the input from random float initialization rather than one- or multi-hot vectors. An additional statistical regularization is added to further improve the quality of the synthetic data inverted from the deep model with batch normalization. The experimental results show that our framework can generalize across different types of models. We can efficiently train different types of classical Click-Through-Rate (CTR) prediction models from scratch with significantly less inversed synthetic data (2 orders of magnitude). Moreover, our framework can also work well in knowledge transfer scenarios such as continual updating and data-free knowledge distillation.

    Full text in ACM Digital Library

  • Deep Situation-Aware Interaction Network for Click-Through Rate Prediction
    by Yimin Lv (Institute of Software, Chinese Academy of Sciences), Shuli Wang (Meituan), Beihong Jin (Institute of Software, Chinese Academy of Sciences), Yisong Yu (Institute of Software, Chinese Academy of Sciences), Yapeng Zhang (Meituan), Jian Dong (Meituan), Yongkang Wang (Meituan), Xingxing Wang (Meituan) and Dong Wang (Meituan).

    User behavior sequence modeling plays a significant role in Click-Through Rate (CTR) prediction on e-commerce platforms. Besides the interacted items, user behaviors contain rich interaction information, such as the behavior type, time, location, etc. However, so far, the information related to user behaviors has not yet been fully exploited. In this paper, we propose the concept of a situation and situational features for distinguishing interaction behaviors and then design a CTR model named Deep Situation-Aware Interaction Network (DSAIN). DSAIN first adopts the reparameterization trick to reduce noise in the original user behavior sequences. Then it learns the embeddings of situational features by feature embedding parameterization and tri-directional correlation fusion. Finally, it obtains the embedding of behavior sequence via heterogeneous situation aggregation. We conduct extensive offline experiments on three real-world datasets. Experimental results demonstrate the superiority of the proposed DSAIN model. More importantly, DSAIN has increased the CTR by 2.70%, the CPM by 2.62%, and the GMV by 2.16% in the online A/B test. Now, DSAIN has been deployed on the Meituan food delivery platform and serves the main traffic of the Meituan takeout app. Our source code is available at https://github.com/W-void/DSAIN

    Full text in ACM Digital Library

  • Disentangling Motives behind Item Consumption and Social Connection for Mutually-enhanced Joint Prediction
    by Youchen Sun (Nanyang Technological University), Zhu Sun (A*STAR), Xiao Sha (Nanyang Technological University), Jie Zhang (Nanyang Technological University) and Yew Soon Ong (Nanyang Technological University).

    Item consumption and social connection, as common user behaviors in many web applications, have been extensively studied. However, most current works separately perform either item or social link prediction tasks, possibly with the help of the other as an auxiliary signal. Moreover, they merely consider the behaviors in a holistic manner yet neglect the multi-faceted motives behind them (e.g., watching movies to kill time or with friends; connecting with others as friends or colleagues). To fill the gap, we propose to disentangle the multi-faceted motives in each network, defined respectively by the two types of behaviors, for mutually-enhanced joint prediction (DMJP). Specifically, we first learn the disentangled user representations driven by multi-faceted motives in both networks. Thereafter, the mutual influence of the two networks is subtly discriminated at the facet-to-facet level. The fine-grained mutual influence, proven to be asymmetric, is then exploited to help refine user representations in both networks, with the goal of achieving a mutually-enhanced joint item and social link prediction. Empirical studies on three public datasets showcase the superiority of DMJP against state-of-the-art (SOTA) methods on both tasks.

    Full text in ACM Digital Library

  • Distribution-based Learnable Filters with Side Information for Sequential Recommendation
    by Haibo Liu (School of Cyber Security and Computer, Hebei University), Zhixiang Deng (School of Cyber Security and Computer, Hebei University), Liang Wang (School of Cyber Security and Computer, Hebei University), Jinjia Peng (School of Cyber Security and Computer, Hebei University) and Shi Feng (School of Computer Science & Engineering, Northeastern University).

    Sequential Recommendation aims to predict the next item by mining the dynamic preference from users’ previous interactions. However, most methods represent each item as a single fixed vector, which is incapable of capturing the uncertainty of item-item transitions that result from time-dependent and multifarious interests of users. Besides, they fail to effectively exploit side information that helps to better express user preferences. Finally, the noise in a user’s access sequence, which is due to accidental clicks, can interfere with the next item prediction and lead to lower recommendation performance. To deal with these issues, we propose DLFS-Rec, a novel model that combines Distribution-based Learnable Filters with Side information for sequential Recommendation. Specifically, items and their side information are represented by stochastic Gaussian distributions, which are described by mean and covariance embeddings, and then the corresponding embeddings are fused to generate a final representation for each item. To attenuate noise, stacked learnable filter layers are applied to smooth the fused embeddings. The similarities between the distributions inferred from the last filter layer and candidates are measured by 2-Wasserstein distance to generate the recommendation list. Extensive experiments on four public real-world datasets demonstrate the superiority of the proposed model over state-of-the-art baselines, especially on cold-start users and items.

    Full text in ACM Digital Library
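
    For readers unfamiliar with the 2-Wasserstein distance mentioned in the abstract above, here is a minimal sketch of its closed form for two Gaussians. It assumes diagonal covariances given as per-dimension standard deviations, which is an assumption of this example and not something the abstract states; the function name is hypothetical and this is not the DLFS-Rec code.

```python
import math


def wasserstein2_diag(mean_a, std_a, mean_b, std_b):
    """2-Wasserstein distance between N(mean_a, diag(std_a^2)) and
    N(mean_b, diag(std_b^2)).

    For diagonal covariances the squared distance reduces to
    ||mean_a - mean_b||^2 + ||std_a - std_b||^2.
    """
    if not (len(mean_a) == len(std_a) == len(mean_b) == len(std_b)):
        raise ValueError("all inputs must have the same dimension")
    mean_term = sum((ma - mb) ** 2 for ma, mb in zip(mean_a, mean_b))
    std_term = sum((sa - sb) ** 2 for sa, sb in zip(std_a, std_b))
    return math.sqrt(mean_term + std_term)


# Same spread, means 5 apart: the distance is the Euclidean gap of the means.
shifted = wasserstein2_diag([0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [1.0, 1.0])
# Same mean, different spread: the distance is driven by the std difference.
widened = wasserstein2_diag([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [2.0, 1.0])
```

    Unlike a dot product between fixed vectors, this distance also reacts to differences in uncertainty, as the second call shows.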

  • Domain Disentanglement with Interpolative Data Augmentation for Dual-Target Cross-Domain Recommendation
    by Jiajie Zhu (Macquarie University), Yan Wang (Macquarie University), Feng Zhu (Ant Group) and Zhu Sun (Macquarie University).

    The conventional single-target Cross-Domain Recommendation (CDR) aims to improve the recommendation performance on a sparser target domain by transferring the knowledge from a source domain that contains relatively richer information. By contrast, in recent years, dual-target CDR has been proposed to improve the recommendation performance on both domains simultaneously. However, to this end, there are two challenges in dual-target CDR: (1) how to generate both relevant and diverse augmented user representations, and (2) how to effectively decouple domain-independent information from domain-specific information, in addition to domain-shared information, to capture comprehensive user preferences. To address the above two challenges, we propose a Disentanglement-based framework with Interpolative Data Augmentation for dual-target Cross-Domain Recommendation, called DIDA-CDR. In DIDA-CDR, we first propose an interpolative data augmentation approach that generates both relevant and diverse augmented user representations to augment the sparser domain and explore potential user preferences. We then propose a disentanglement module to effectively decouple domain-specific and domain-independent information to capture comprehensive user preferences. Both steps significantly contribute to capturing more comprehensive user preferences, thereby improving the recommendation performance on each domain. Extensive experiments conducted on five real-world datasets show the significant superiority of DIDA-CDR over the state-of-the-art methods.

    Full text in ACM Digital Library

  • DREAM: Decoupled Representation via Extraction Attention Module and Supervised Contrastive Learning for Cross-Domain Sequential Recommender
    by Xiaoxin Ye (School of Computer Science and Engineering, UNSW), Yun Li (School of Computer Science and Engineering, UNSW) and Lina Yao (CSIRO Data61, School of Computer Science and Engineering UNSW).

    Cross-Domain Sequential Recommendation (CDSR) aims to generate accurate predictions for future interactions by leveraging users’ cross-domain historical interactions. One major challenge of CDSR is how to jointly learn the single- and cross-domain user preferences efficiently. To enhance the target domain’s performance, most existing solutions start by learning the single-domain user preferences within each domain and then transferring the acquired knowledge from the rich domain to the target domain. However, this approach ignores the inter-sequence item relationship and also limits the opportunities for target domain knowledge to enhance the rich domain performance. Moreover, it also ignores the information within the cross-domain sequence. Despite cross-domain sequences being generally noisy and hard to learn directly, they contain valuable user behavior patterns with great potential to enhance performance. Another key challenge of CDSR is data sparsity, which also exists in other recommendation system problems. In the real world, the data distribution of the recommendation system is highly skewed towards popular products, especially on large-scale datasets with millions of users and items. One more challenge is the class imbalance problem, inherited from the Sequential Recommendation problem. Generally, each sample only has one positive and thousands of negative samples. To address the above problems together, an innovative Decoupled Representation via Extraction Attention Module (DREAM) is proposed for CDSR to simultaneously learn single- and cross-domain user preference via decoupled representations. A novel Supervised Contrastive Learning framework is introduced to model the inter-sequence relationship as well as address the data sparsity via data augmentations. DREAM also leverages Focal Loss to put more weight on misclassified samples to address the class-imbalance problem, with another uplift on the overall model performance. Extensive experiments have been conducted on two cross-domain recommendation datasets, demonstrating that DREAM outperforms various SOTA cross-domain recommendation algorithms, achieving up to a 75% uplift in Movie-Book Scenarios.

    Full text in ACM Digital Library
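
    The Focal Loss named in the abstract above can be sketched as follows. This is the standard single-example form, -(1 - p)^gamma * log(p), where p is the probability the model assigns to the true class; it is not DREAM's training code, and the function name and the gamma value of 2.0 are illustrative assumptions.

```python
import math


def focal_loss(prob_true_class: float, gamma: float = 2.0) -> float:
    """Focal loss for one example.

    prob_true_class is the predicted probability of the correct class.
    The factor (1 - p)^gamma shrinks the loss of well-classified examples,
    so misclassified ones dominate the total. gamma = 0 gives plain
    cross-entropy.
    """
    if not 0.0 < prob_true_class <= 1.0:
        raise ValueError("prob_true_class must be in (0, 1]")
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    modulating_factor = (1.0 - prob_true_class) ** gamma
    return -modulating_factor * math.log(prob_true_class)


easy_example = focal_loss(0.9)            # confidently correct prediction
hard_example = focal_loss(0.1)            # badly misclassified prediction
cross_entropy = focal_loss(0.5, gamma=0)  # reduces to -log(0.5)
```

    With one positive against thousands of negatives, most negatives are easy; down-weighting them this way keeps them from swamping the gradient.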

  • Equivariant Contrastive Learning for Sequential Recommendation
    by Peilin Zhou (HKUST (Guangzhou)), Jingqi Gao (Upstage), Yueqi Xie (HKUST), Qichen Ye (Peking University), Yining Hua (Harvard Medical School), Jaeboum Kim (The Hong Kong University of Science and Technology, Upstage), Shoujin Wang (Data Science Institute, University of Technology Sydney) and Sunghun Kim (The Hong Kong University of Science and Technology).

    Contrastive learning (CL) benefits the training of sequential recommendation models with informative self-supervision signals. Existing solutions apply general sequential data augmentation strategies to generate positive pairs and encourage their representations to be invariant. However, due to the inherent properties of user behavior sequences, some augmentation strategies, such as item substitution, can lead to changes in user intent. Learning indiscriminately invariant representations for all augmentation strategies might be sub-optimal. Therefore, we propose Equivariant Contrastive Learning for Sequential Recommendation (ECL-SR), which endows SR models with great discriminative power, making the learned user behavior representations sensitive to invasive augmentations (e.g., item substitution) and insensitive to mild augmentations (e.g., feature-level dropout masking). In detail, we use the conditional discriminator to capture differences in behavior due to item substitution, which encourages the user behavior encoder to be equivariant to invasive augmentations. Comprehensive experiments on four benchmark datasets show that the proposed ECL-SR framework achieves competitive performance compared to state-of-the-art SR models. The source code will be released.

    Full text in ACM Digital Library

  • Exploring False Hard Negative Sample in Cross-Domain Recommendation
    by Haokai Ma (Shandong University), Ruobing Xie (WeChat, Tencent), Lei Meng (School of Software, Shandong University), Xin Chen (Tencent), Xu Zhang (WeChat Search Application Department, Tencent Inc.), Leyu Lin (WeChat Search Application Department, Tencent) and Jie Zhou (WeChat, Tencent).

    Negative Sampling in recommendation aims to capture informative negative instances for the sparse user-item interactions to improve the performance. Conventional negative sampling methods tend to select informative hard negative samples (HNS) besides the default random samples. However, these hard negative sampling methods usually struggle with false hard negative samples (FHNS), which arise when a user-item interaction has not been observed yet and is picked as a negative sample, while the user will actually interact with this item once exposed to it. Such FHNS issues may seriously confuse the model training, while most conventional hard negative sampling methods do not systematically explore and distinguish FHNS from HNS. To address this issue, we propose a novel model-agnostic Real Hard Negative Sampling (RealHNS) framework specifically for cross-domain recommendation (CDR), which aims to discover the false and refine the real from all HNS via both general and cross-domain real hard negative sample selectors. For the general part, we conduct the coarse-grained and fine-grained real HNS selectors sequentially, armed with a dynamic item-based FHNS filter to find high-quality HNS. For the cross-domain part, we further design a new cross-domain HNS for alleviating negative transfer in CDR and discover its corresponding FHNS via a dynamic user-based FHNS filter to keep its power. We conduct experiments on four datasets based on three representative model-agnostic hard negative sampling methods, along with extensive model analyses, ablation studies, and universality analyses. The consistent improvements indicate the effectiveness, robustness, and universality of RealHNS, which is also easy to deploy in real-world systems as a plug-and-play strategy. The source code will be released in the future.

    Full text in ACM Digital Library

  • Fast and Examination-agnostic Reciprocal Recommendation in Matching Markets
    by Yoji Tomita (CyberAgent, Inc.), Riku Togashi (CyberAgent, Inc.), Yuriko Hashizume (CyberAgent, Inc.) and Naoto Ohsaka (CyberAgent, Inc.).

    In matching markets such as job posting and online dating platforms, the recommender system plays a critical role in the success of the platform. Unlike standard recommender systems that suggest items to users, reciprocal recommender systems (RRSs) that suggest other users must take into account the mutual interests of users. In addition, ensuring that recommendation opportunities do not disproportionately favor popular users is essential for the total number of matches and for fairness among users. Existing recommendation methods in matching markets, however, face computational challenges on large-scale platforms and depend on specific examination functions in the position-based model (PBM). In this paper, we introduce the reciprocal recommendation method based on the matching with transferable utility (TU matching) model in the context of ranking recommendations in matching markets and propose a fast and examination-model-free algorithm. Furthermore, we evaluate our approach on experiments with synthetic data and real-world data from an online dating platform in Japan. Our method performs better than or as well as existing methods in terms of the number of total matches and works well even in a large-scale dataset for which one existing method does not work.

    Full text in ACM Digital Library

  • Full Index Deep Retrieval: End-to-End User and Item Structures for Cold-start and Long-tail Item Recommendation
    by Zhen Gong (Shanghai Jiao Tong University), Xin Wu (Bytedance Inc.), Lei Chen (Bytedance Inc.), Zhenzhe Zheng (Shanghai Jiao Tong University), Shengjie Wang (Bytedance Inc.), Anran Xu (Shanghai Jiao Tong University), Chong Wang (Bytedance Inc.) and Fan Wu (Shanghai Jiao Tong University).

    End-to-end retrieval models, such as Tree-based Models (TDM) and Deep Retrieval (DR), have attracted a lot of attention, but they are flawed in cold-start and long-tail item recommendation scenarios. Specifically, DR learns a compact indexing structure, enabling efficient and accurate retrieval for large recommendation systems. However, it is discovered that DR largely fails on retrieving cold-start and long-tail items. This is because DR only utilizes user-item interaction data, which is rare and often noisy for cold-start and long-tail items. And the end-to-end retrieval models are unable to make use of the rich item content features. To address this issue while maintaining the efficiency of DR indexing structure, we propose Full Index Deep Retrieval (FIDR) that learns indices for the full corpus items, including cold-start and long-tail items. In addition to the original structure in DR (called User Structure in FIDR) that learns with user-item interaction data (e.g., clicks), we add an Item Structure to embed items directly based on item content features (e.g., categories). With joint efforts of User Structure and Item Structure, FIDR makes cold-start items retrievable and also improves the recommendation quality of long-tail items. To our best knowledge, FIDR is the first to solve the cold-start and long-tail recommendation problem for the end-to-end retrieval models. Through extensive experiments on three real-world datasets, we demonstrate that FIDR can effectively recommend cold-start and long-tail items and largely promote overall recommendation performance without sacrificing inference efficiency. According to the experiments, the recall of FIDR is improved by 8.8% ~ 11.9%, while the inference of FIDR is as efficient as DR.

    Full text in ACM Digital Library

  • Generative Learning Plan Recommendation for Employees: A Performance-aware Reinforcement Learning Approach
    by Zhi Zheng (University of Science and Technology of China), Ying Sun (The Hong Kong University of Science and Technology (Guangzhou)), Xin Song (Baidu), Hengshu Zhu (BOSS Zhipin) and Hui Xiong (The Hong Kong University of Science and Technology (Guangzhou)).

    With the rapid development of enterprise Learning Management Systems (LMS), more and more companies are trying to build enterprise training and course learning platforms for promoting the career development of employees. Indeed, through course learning, many employees have the opportunity to improve their knowledge and skills. For these systems, a major issue is how to recommend learning plans, i.e., a set of courses arranged in the order they should be learned, that can help employees improve their work performance. Existing studies mainly focus on recommending courses that users are most likely to click on by capturing their learning preferences. However, the learning preference of employees may not be the right fit for their career development, and thus it may not necessarily mean their work performance can be improved accordingly. Furthermore, how to capture the mutual correlation and sequential effects between courses, and ensure the rationality of the generated results, is also a major challenge. To this end, in this paper, we propose the Generative Learning plAn recommenDation (GLAD) framework, which can generate personalized learning plans for employees to help them improve their work performance. Specifically, we first design a performance predictor and a rationality discriminator, which have the same transformer-based model architecture, but with totally different parameters and functionalities. In particular, the performance predictor is trained for predicting the work performance of employees based on their work profiles and historical learning records, while the rationality discriminator aims to evaluate the rationality of the generated results. Then, we design a learning plan generator based on the gated transformer and the cross-attention mechanism for learning plan generation. 
    We calculate the weighted sum of the output from the performance predictor and the rationality discriminator as the reward, and we use Self-Critical Sequence Training (SCST) based policy gradient methods to train the generator following the Generative Adversarial Network (GAN) paradigm. Finally, extensive experiments on real-world data clearly validate the effectiveness of our GLAD framework compared with state-of-the-art baseline methods and reveal some interesting findings for talent management.

    Full text in ACM Digital Library
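
    The GLAD abstract above describes a reward formed as a weighted sum of two model outputs, trained with a self-critical baseline. The following is a minimal sketch of that arithmetic only, not the authors' implementation: the function names, the `weight` parameter and the example scores are hypothetical, and the baseline follows the usual SCST convention of subtracting the reward of the greedily decoded sequence from that of the sampled one.

```python
def combined_reward(performance: float, rationality: float, weight: float = 0.5) -> float:
    """Weighted sum of the two scores; `weight` is a hypothetical mixing knob."""
    return weight * performance + (1.0 - weight) * rationality


def self_critical_advantage(sampled_reward: float, greedy_reward: float) -> float:
    """SCST-style advantage: reward of the sampled plan minus reward of the greedy plan."""
    return sampled_reward - greedy_reward


# Hypothetical scores for a sampled learning plan and a greedily decoded one.
sampled = combined_reward(performance=0.75, rationality=0.5, weight=0.5)
greedy = combined_reward(performance=0.5, rationality=0.5, weight=0.5)
advantage = self_critical_advantage(sampled, greedy)
print(sampled, greedy, advantage)
```

    A positive advantage would increase the probability of the sampled plan under a policy gradient update, and a negative one would decrease it.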

  • Goal-Oriented Multi-Modal Interactive Recommendation with Verbal and Non-Verbal Relevance Feedback
    by Yaxiong Wu (University of Glasgow), Craig Macdonald (University of Glasgow) and Iadh Ounis (University of Glasgow).

    Interactive recommendation enables users to provide verbal and non-verbal relevance feedback (such as natural-language critiques and likes/dislikes) when viewing a ranked list of recommendations (such as images of fashion products) to guide the recommender system towards their desired items (i.e. goals) across multiple interaction turns. The multi-modal interactive recommendation (MMIR) task has been successfully formulated with deep reinforcement learning (DRL) algorithms by simulating the interactions between an environment (i.e. a user) and an agent (i.e. a recommender system). However, it is typically challenging and unstable to optimise the agent to improve the recommendation quality associated with implicit learning of multi-modal representations in an end-to-end fashion in DRL. This is known as the coupling of policy optimisation and representation learning. To address this coupling issue, we propose a novel goal-oriented multi-modal interactive recommendation model (GOMMIR) that uses both verbal and non-verbal relevance feedback to effectively incorporate the users’ preferences over time. Specifically, our GOMMIR model employs a multi-task learning approach to explicitly learn the multi-modal representations using a multi-modal composition network when optimising the recommendation agent. Moreover, we formulate the MMIR task using goal-oriented reinforcement learning and enhance the optimisation objective by leveraging non-verbal relevance feedback for hard negative sampling and providing extra goal-oriented rewards to effectively optimise the recommendation agent. Following previous work, we train and evaluate our GOMMIR model by using user simulators that can generate natural-language feedback about the recommendations as a surrogate for real human users. Experiments conducted on four well-known fashion datasets demonstrate that our proposed GOMMIR model yields significant improvements in comparison to the existing state-of-the-art baseline models.

    Full text in ACM Digital Library

  • Going Beyond Local: Global Graph-Enhanced Personalized News Recommendations
    by Boming Yang (The University of Tokyo), Dairui Liu (University College Dublin), Toyotaro Suzumura (The University of Tokyo), Ruihai Dong (University College Dublin) and Irene Li (The University of Tokyo).

    Precisely recommending candidate news articles to users has always been a core challenge for personalized news recommendation systems. Most recent work primarily focuses on using advanced natural language processing (NLP) techniques to extract semantic information from rich textual data, employing content-based methods derived from locally viewed historical clicked news. However, this approach lacks a global perspective, failing to account for users’ hidden motivations and behaviors beyond semantic information. To address this challenge, we propose a novel model called GLORY (Global-LOcal news Recommendation sYstem), which combines global news representations learned from other users with local news representations to enhance personalized recommendation systems. We accomplish this by constructing a Global Clicked News Encoder, which includes a global news graph and employs gated graph neural networks to fuse news representations, thereby enriching clicked news representations. Similarly, we extend this approach to a Global Candidate News Encoder, utilizing a global entity graph and candidate news fusion to enhance candidate news representation. Evaluation results on two public news datasets demonstrate that our method outperforms existing approaches. Furthermore, our model offers more diverse recommendations.

    Full text in ACM Digital Library

  • Gradient Matching for Categorical Data Distillation in CTR Prediction
    by Cheng Wang (School of Cyber Science and Engineering, Huazhong University of Science and Technology, Wuhan), Jiacheng Sun (Huawei Noah’s Ark Lab), Zhenhua Dong (Huawei Noah’s Ark Lab), Ruixuan Li (School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan) and Rui Zhang (ruizhang.info).

    The hardware and energy cost of training a click-through rate (CTR) model is prohibitive. A recent promising direction for reducing such costs is data distillation with gradient matching, which aims to synthesize a small distilled dataset to guide the model to a similar parameter space as those trained on real data. However, there are two main challenges to implementing such a method in the recommendation field: (1) The categorical recommendation data are high-dimensional and sparse one- or multi-hot data, which block the gradient flow and render backpropagation-based data distillation invalid. (2) The data distillation process with gradient matching is computationally expensive due to the bi-level optimization. To this end, we investigate efficient data distillation tailored for recommendation data with plenty of side information, where we reformulate the discrete data into a dense and continuous format. Then, we further introduce a one-step gradient matching scheme, which performs gradient matching for only a single step to overcome the inefficient training process. The overall proposed method is called Categorical data distillation with Gradient Matching (CGM), which is capable of distilling a large dataset into a small set of informative synthetic data for training CTR models from scratch. Experimental results show that our proposed method not only outperforms the state-of-the-art coreset selection and data distillation methods but also has remarkable cross-architecture performance. Moreover, we explore the application of CGM on continual updating and mitigate the effect of different random seeds on the training results.

    Full text in ACM Digital Library

  • gSASRec: Reducing Overconfidence in Sequential Recommendation Trained with Negative Sampling
    by Aleksandr V. Petrov (University of Glasgow) and Craig Macdonald (University of Glasgow).

    Large catalogue size is one of the central challenges in training recommendation models: a large number of items makes it infeasible to compute scores for all items during training, forcing models to deploy negative sampling. However, negative sampling increases the proportion of positive interactions in the training data. Therefore, models trained with negative sampling tend to overestimate the probabilities of positive interactions, a phenomenon we call overconfidence. While the absolute values of the predicted scores/probabilities are unimportant for ranking retrieved recommendations, overconfident models may fail to estimate nuanced differences in the top-ranked items, resulting in degraded performance. This paper shows that overconfidence explains why the popular SASRec model underperforms when compared to BERT4Rec (contrary to the BERT4Rec authors’ attribution to the bi-directional attention mechanism). We propose a novel Generalised Binary Cross-Entropy Loss function (gBCE) to mitigate overconfidence and prove theoretically that it does so. We further propose the gSASRec model, an improvement over SASRec that deploys an increased number of negatives and gBCE loss. We show through detailed experiments on three datasets that gSASRec does not exhibit the overconfidence problem. As a result, gSASRec can outperform BERT4Rec (e.g., +9.47% NDCG on MovieLens-1M), while requiring less training time (e.g., -73% training time on MovieLens-1M). Moreover, in contrast to BERT4Rec, gSASRec is suitable for large datasets that contain more than 1 million items.

    Full text in ACM Digital Library
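
    The gSASRec abstract above observes that negative sampling raises the share of positives seen during training. The following is a small worked example of that effect, assuming one positive item per training step and `num_negatives` sampled negatives; the function names and the numbers are illustrative and do not come from the paper.

```python
def positive_share_full(catalogue_size: int) -> float:
    """Share of positives when one positive is scored against the whole catalogue."""
    return 1.0 / catalogue_size


def positive_share_sampled(num_negatives: int) -> float:
    """Share of positives when one positive is paired with `num_negatives` sampled negatives."""
    return 1.0 / (num_negatives + 1)


def inflation_factor(catalogue_size: int, num_negatives: int) -> float:
    """How many times more frequent positives become under negative sampling."""
    return catalogue_size / (num_negatives + 1)


# Hypothetical setting: 1,000 items in the catalogue, 1 sampled negative per positive.
print(positive_share_full(1000))     # share of positives without sampling
print(positive_share_sampled(1))     # share of positives with sampling
print(inflation_factor(1000, 1))     # ratio between the two shares
```

    In this hypothetical setting, positives make up half of the training signal instead of one in a thousand, which is the kind of shift the abstract links to overestimated probabilities.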

  • How Should We Measure Filter Bubbles? A Regression Model and Evidence for Online News
    by Lien Michiels (UAntwerpen), Jorre Vannieuwenhuyze (Statistiek Vlaanderen), Jens Leysen (University of Antwerp), Robin Verachtert (Froomle NV), Annelien Smets (imec-SMIT, Vrije Universiteit Brussel) and Bart Goethals (University of Antwerp).

    News media play an important role in democratic societies. Central to fulfilling this role is the premise that users should be exposed to diverse news. However, news recommender systems are gaining popularity on news websites, which has sparked concerns over filter bubbles. Editors, policy-makers and scholars are worried that news recommender systems may expose users to less diverse content over time. To the best of our knowledge, this hypothesis has not been tested in a longitudinal observational study of real users that interact with a real news website. Such observational studies require the use of research methods that are robust and can account for the many covariates that may influence the diversity of recommendations at any given time. In this work, we propose an analysis model to study whether the variety of articles recommended to a user decreases over time, in observational studies of real news websites with real users. Further, we present results from two case studies using aggregated and anonymized data that were collected by two western European news websites employing a collaborative filtering-based news recommender system to serve (personalized) recommendations to their users. Through these case studies we validate empirically that our modeling assumptions are sound and supported by the data, and that our model obtains more reliable and interpretable results than analysis methods used in prior empirical work on filter bubbles. Our case studies provide evidence of a small decrease in the topic variety of a user’s recommendations in the first weeks after they sign up, but no evidence of a decrease in political variety.

    Full text in ACM Digital Library

  • Incentivizing Exploration in Linear Contextual Bandits under Information Gap
    by Huazheng Wang (Oregon State University), Haifeng Xu (University of Chicago), Chuanhao Li (University of Virginia), Zhiyuan Liu (University of Colorado, Boulder) and Hongning Wang (University of Virginia).

    Contextual bandit algorithms have been popularly used to address interactive recommendation, where the users are assumed to be cooperative to explore all recommendations from a system. In this paper, we relax this strong assumption and study the problem of incentivized exploration with myopic users, where the users are only interested in recommendations with their currently highest estimated reward. As a result, in order to obtain long-term optimality, the system needs to offer compensation to incentivize the users to take the exploratory recommendations. We consider a new and practically motivated setting where the context features employed by the user are more informative than those used by the system: for example, features based on users’ private information are not accessible by the system. We develop an effective solution for incentivized exploration under such an information gap, and prove that the method achieves a sublinear rate in both regret and compensation. We theoretically and empirically analyze the added compensation due to the information gap, compared with the case where the system has access to the same context features as the user does, i.e., without information gap. Moreover, we also provide a compensation lower bound of this problem.

    Full text in ACM Digital Library
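
    The incentivized-exploration abstract above says that a myopic user prefers the arm with the highest estimated reward and must be compensated to try another one. The following is a minimal sketch of one common way to quantify that, assuming the compensation equals the gap between the user's best estimate and the user's estimate for the recommended arm. This is an assumption for illustration, not the paper's algorithm, and all names and numbers are hypothetical.

```python
from typing import Sequence


def myopic_choice(user_estimates: Sequence[float]) -> int:
    """Index of the arm a myopic user would pick unprompted."""
    return max(range(len(user_estimates)), key=lambda arm: user_estimates[arm])


def compensation(user_estimates: Sequence[float], recommended_arm: int) -> float:
    """Gap between the user's best estimate and the estimate of the recommended arm."""
    best = max(user_estimates)
    return max(0.0, best - user_estimates[recommended_arm])


# Hypothetical user-side reward estimates for three arms.
estimates = [0.5, 0.75, 0.25]
print(myopic_choice(estimates))                    # arm the user prefers
print(compensation(estimates, recommended_arm=2))  # payment needed to explore arm 2
print(compensation(estimates, recommended_arm=1))  # no payment for the preferred arm
```

    Under the information gap described in the abstract, the system would not observe `estimates` directly, so it would have to bound this gap rather than compute it exactly.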

  • InTune: Reinforcement Learning-based Data Pipeline Optimization for Deep Recommendation Models
    by Kabir Nagrecha (University of California, San Diego), Lingyi Liu (Netflix, Inc.), Pablo Delgado (Netflix, Inc.) and Prasanna Padmanabhan (Netflix, Inc.).

    Deep learning-based recommendation models (DLRMs) have become an essential component of many modern recommender systems. Several companies are now building large compute clusters reserved only for DLRM training, driving new interest in cost- and time-saving optimizations. The systems challenges faced in this setting are unique; while typical deep learning (DL) training jobs are dominated by model execution times, the most important factor in DLRM training performance is often online data ingestion.

    In this paper, we explore the unique characteristics of this data ingestion problem and provide insights into the specific bottlenecks and challenges of the DLRM training pipeline at scale. We study real-world DLRM data processing pipelines taken from our compute cluster to both observe the performance impacts of online ingestion and to identify shortfalls in existing data pipeline optimizers. We find that current tooling either yields sub-optimal performance, frequent crashes, or else requires impractical cluster re-organization to adopt. Our studies lead us to design and build a new solution for data pipeline optimization, InTune. InTune employs a reinforcement learning (RL) agent to learn how to distribute CPU resources across a DLRM data pipeline to more effectively parallelize data-loading and improve throughput. Our experiments show that InTune can build an optimized data pipeline configuration within only a few minutes, and can easily be integrated into existing training workflows. By exploiting the responsiveness and adaptability of RL, InTune achieves significantly higher online data ingestion rates than existing optimizers, thus reducing idle times in model execution and increasing efficiency. We apply InTune to our real-world cluster, and find that it increases data ingestion throughput by as much as 2.29X versus current state-of-the-art data pipeline optimizers while also improving both CPU & GPU utilization.

    Full text in ACM Digital Library

  • KGTORe: Tailored Recommendations through Knowledge-aware GNN Models
    by Alberto Carlo Maria Mancino (Politecnico di Bari), Antonio Ferrara (Politecnico di Bari), Salvatore Bufi (Polytechnic University of Bari), Daniele Malitesta (Polytechnic University of Bari), Tommaso Di Noia (Polytechnic University of Bari) and Eugenio Di Sciascio (Polytechnic University of Bari).

    Knowledge graphs (KGs) have been proven to be a powerful source of side information to enhance the performance of recommendation algorithms. Their graph-based structure paves the way for the adoption of graph-aware learning models such as Graph Neural Networks (GNNs). In this respect, state-of-the-art models achieve good performance and interpretability via user-level combinations of intents leading users to their choices. Unfortunately, such results often come from an end-to-end learning that considers a combination of the whole set of features contained in the KG without any analysis of the user decisions. In this paper, we introduce KGTORe, a GNN-based model that exploits the KG to learn latent representations for the semantic features, and consequently, interpret the user decisions as a personal distillation of the item feature representations. Differently from previous models, KGTORe does not need to process the whole KG at training time but relies on a selection of the most discriminative features for the users, thus resulting in improved performance and personalization. Experimental results on three well-known datasets show that KGTORe achieves remarkable accuracy performance and several ablation studies demonstrate the effectiveness of its components.

    Full text in ACM Digital Library

  • Knowledge-based Multiple Adaptive Spaces Fusion for Recommendation
    by Meng Yuan (Institute of Artificial Intelligence, Beihang University, Beijing 100191, China), Fuzhen Zhuang (Institute of Artificial Intelligence, Beihang University, Beijing 100191, China), Zhao Zhang (University of Chinese Academy of Sciences, Beijing 100191, China), Deqing Wang (School of Computer Science and Engineering, Beihang University, Beijing 100191, China) and Jin Dong (Beijing Academy of Blockchain and Edge Computing).

    Since Knowledge Graphs (KGs) contain rich semantic information, recently there has been an influx of KG-enhanced recommendation methods. Most existing methods are designed entirely in Euclidean space without considering curvature. However, recent studies have revealed that a tremendous amount of graph-structured data exhibits highly non-Euclidean properties. Motivated by these observations, in this work, we propose a knowledge-based multiple adaptive spaces fusion method for recommendation, namely MCKG. Unlike existing methods that solely adopt a specific manifold, we introduce the unified space that is compatible with hyperbolic, Euclidean and spherical spaces. Furthermore, we fuse the multiple unified spaces via attention to obtain high-quality embeddings for better knowledge propagation. In addition, we propose a geometry-aware optimization strategy which enables the pull and push processes to benefit from both hyperbolic and spherical spaces. Specifically, in hyperbolic space, we set smaller margins in the area near the origin, which is conducive to distinguishing between highly similar positive items and negative ones. At the same time, we set larger margins in the area far from the origin to ensure the model has sufficient error tolerance. A similar approach also applies to spherical spaces. Extensive experiments on three real-world datasets demonstrate that MCKG achieves a significant improvement over state-of-the-art recommendation methods. Further ablation experiments verify the importance of multi-space fusion and the geometry-aware optimization strategy, justifying the rationality and effectiveness of MCKG.

    Full text in ACM Digital Library

  • Masked and Swapped Sequence Modeling for Next Novel Basket Recommendation in Grocery Shopping
    by Ming Li (University of Amsterdam), Mozhdeh Ariannezhad (University of Amsterdam), Andrew Yates (University of Amsterdam) and Maarten de Rijke (University of Amsterdam).

    Next basket recommendation (NBR) is the task of predicting the next set of items based on a sequence of already purchased baskets. It is a recommendation task that has been widely studied, especially in the context of grocery shopping. In NBR, it is useful to distinguish between repeat items, i.e., items that a user has consumed before, and explore items, i.e., items that a user has not consumed before. Most NBR work either ignores this distinction or focuses on repeat items.

    We formulate the next novel basket recommendation (NNBR) task, i.e., the task of recommending a basket that only consists of novel items, which is valuable for both real-world application and NBR evaluation. We evaluate how existing NBR methods perform on the NNBR task and find that, so far, limited progress has been made w.r.t. the NNBR task. To address the NNBR task, we propose a simple bi-directional transformer basket recommendation model (BTBR), which is focused on directly modeling item-to-item correlations within and across baskets instead of learning complex basket representations. To properly train BTBR, we propose and investigate several masking strategies and training objectives: (i) item-level random masking, (ii) item-level select masking, (iii) basket-level all masking, (iv) item basket-level explore masking, and (v) joint masking. In addition, an item-basket swapping strategy is proposed to enrich the item interactions within the same baskets.

    We conduct extensive experiments on three open datasets with various characteristics. The results demonstrate the effectiveness of BTBR and our masking and swapping strategies for the NNBR task. BTBR with a properly selected masking and swapping strategy can substantially improve the NNBR performance.

    Full text in ACM Digital Library
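
    The next-basket abstract above distinguishes repeat items (consumed before) from explore items (not consumed before). The following is a minimal sketch of that split for a single user; the function name and the example baskets are hypothetical, and the sketch does not reproduce the BTBR model or its masking strategies.

```python
from typing import Iterable, List, Tuple


def split_repeat_explore(
    history: Iterable[Iterable[str]], basket: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Split a basket into items seen in past baskets and items that are new to the user."""
    seen = set()
    for past_basket in history:
        seen.update(past_basket)
    repeat_items, explore_items = [], []
    for item in basket:
        (repeat_items if item in seen else explore_items).append(item)
    return repeat_items, explore_items


# Hypothetical purchase history and next basket for one user.
history = [["milk", "bread"], ["milk", "eggs"]]
next_basket = ["milk", "tofu", "eggs", "kiwi"]
repeat_items, explore_items = split_repeat_explore(history, next_basket)
print(repeat_items)   # items the user has bought before
print(explore_items)  # novel items, the target of the NNBR task
```

    Evaluating a recommender only on `explore_items` corresponds to the novel-item setting that the abstract calls NNBR.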

  • Multi-Relational Contrastive Learning for Recommendation
    by Wei Wei (University of Hong Kong), Lianghao Xia (University of Hong Kong) and Chao Huang (University of Hong Kong).

    Dynamic behavior modeling has become a crucial task for personalized recommender systems that aim to learn users’ time-evolving preferences on online platforms. However, many recommendation models rely on a single type of behavior learning, which significantly limits their ability to represent user-item relationships in real-life applications where interactions between users and items often come in multiple types (e.g., click, tag-as-favorite, review, and purchase). To offer better recommendations, this paper proposes the Evolving Graph Contrastive Memory Network (EGCM) to model dynamic interaction heterogeneity. Firstly, we develop a multi-relational graph encoder to capture short-term preference heterogeneity and preserve the dedicated relation semantics for different types of user-item interactions. Additionally, we design a dynamic cross-relational memory network that enables EGCM to capture users’ long-term multi-behavior preferences and the underlying evolving cross-type behavior dependencies over time. To obtain robust and informative user representations with both commonality and diversity across multi-behavior interactions, we design a multi-relational contrastive learning paradigm with heterogeneous short- and long-term interest modeling. We further provide theoretical analyses to support the modeling of commonality and diversity from the perspective of enhancing model optimization. Experiments on several real-world datasets demonstrate the superiority of our recommender system over various state-of-the-art baselines.

    Full text in ACM Digital Library

  • Multi-task Item-attribute Graph Pre-training for Strict Cold-start Item Recommendation
    by Yuwei Cao (University of Illinois at Chicago), Liangwei Yang (University of Illinois Chicago), Chen Wang (University of Illinois Chicago), Zhiwei Liu (Salesforce Inc.), Hao Peng (Beihang University), Chenyu You (Yale University) and Philip Yu (University of Illinois Chicago).

    Recommendation systems suffer in the strict cold-start (SCS) scenario, where the user-item interactions are entirely unavailable. The well-established, dominating identity (ID)-based approaches completely fail to work. Cold-start recommenders, on the other hand, leverage item contents (brand, title, descriptions, etc.) to map the new items to the existing ones. However, the existing SCS recommenders explore item contents in coarse-grained manners that introduce noise or information loss. Moreover, informative data sources other than item contents, such as users’ purchase sequences and review texts, are largely ignored. In this work, we explore the role of the fine-grained item attributes in bridging the gaps between the existing and the SCS items and pre-train a knowledgeable item-attribute graph for SCS item recommendation. Our proposed framework, ColdGPT, models item-attribute correlations into an item-attribute graph by extracting fine-grained attributes from item contents. ColdGPT then transfers knowledge into the item-attribute graph from various available data sources, i.e., item contents, historical purchase sequences, and review texts of the existing items, via multi-task learning. To facilitate the positive transfer, ColdGPT designs specific submodules according to the natural forms of the data sources and proposes to coordinate the multiple pre-training tasks via unified alignment-and-uniformity losses. Our pre-trained item-attribute graph acts as an implicit, extendable item embedding matrix, which enables the SCS item embeddings to be easily acquired by inserting these items into the item-attribute graph and propagating their attributes’ embeddings. We carefully process three public datasets, i.e., Yelp, Amazon-home, and Amazon-sports, to guarantee the SCS setting for evaluation. 
    Extensive experiments show that ColdGPT consistently outperforms the existing SCS recommenders by large margins and even surpasses models that are pre-trained on 75 to 224 times more cross-domain data on two out of four datasets. Our code and pre-processed datasets for SCS evaluations are publicly available to help future SCS studies.

    Full text in ACM Digital Library

  • Online Matching: A Real-time Bandit System for Large-scale Recommendations
    by Xinyang Yi (Google), Shao-Chuan Wang (Google), Ruining He (Google), Hariharan Chandrasekaran (Google), Charles Wu (Google), Lukasz Heldt (Google), Lichan Hong (Google), Minmin Chen (Google) and Ed Chi (Google).

    The last decade has witnessed many successes of deep learning-based models for industry-scale recommender systems. These models are typically trained offline in a batch manner. While being effective in capturing users’ past interactions with recommendation platforms, batch learning suffers from long model-update latency and is vulnerable to system biases, making it hard to adapt to distribution shift and explore new items or user interests. Although online learning-based approaches (e.g., multi-armed bandits) have demonstrated promising theoretical results in tackling these challenges, their practical real-time implementation in large-scale recommender systems remains limited. First, the scalability of online approaches in serving massive online traffic while ensuring timely updates of bandit parameters poses a significant challenge. Additionally, exploring uncertainty in recommender systems can easily result in unfavorable user experience, highlighting the need for devising intricate strategies that effectively balance the trade-off between exploitation and exploration. In this paper, we introduce Online Matching: a scalable closed-loop bandit system learning from users’ direct feedback on items in real time. We present a hybrid offline + online approach for constructing this system, accompanied by a comprehensive exposition of the end-to-end system architecture. We propose Diag-LinUCB, a novel extension of the LinUCB algorithm, to enable distributed updates of bandit parameters in a scalable and timely manner. We conduct live experiments in YouTube and show that Online Matching is able to enhance the capabilities of fresh content discovery and item exploration in the present platform.

    Full text in ACM Digital Library

  • Pairwise Intent Graph Embedding Learning for Context-Aware Recommendation
    by Dugang Liu (Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ)), Yuhao Wu (Shenzhen University), Weixin Li (Shenzhen University), Xiaolian Zhang (Huawei 2012 Lab), Hao Wang (Huawei 2012 Lab), Qinjuan Yang (Huawei 2012 Lab) and Zhong Ming (College of Computer Science and Software Engineering, Shenzhen University).

    Although knowledge graphs have shown their effectiveness in mitigating data sparsity in many recommendation tasks, they remain underutilized in context-aware recommender systems (CARS) with the specific sparsity challenges associated with the contextual features, i.e., feature sparsity and interaction sparsity. To bridge this gap, in this paper, we propose a novel pairwise intent graph embedding learning (PING) framework to efficiently integrate knowledge graphs into CARS. Specifically, our PING contains three modules: 1) a graph construction module is used to obtain a pairwise intent graph (PIG) containing nodes for users, items, entities and enhanced intent, where enhanced intent nodes are generated by applying user intent fusion (UIF) on relational intent and contextual intent, and the two sub-intents are derived from the semantic information and contextual information, respectively; 2) a pairwise intent joint graph convolution module is used to obtain the refined embeddings of all the features by executing a customized convolution strategy on PIG, where each enhanced intent node acts as a hub to efficiently propagate information among different features and between all the features and the knowledge graph; 3) a recommendation module with the refined embeddings is used to replace the randomly initialized embeddings of downstream recommendation models to improve model performance. Finally, we conduct extensive experiments on three public datasets to verify the effectiveness and compatibility of our PING.

    Full text in ACM Digital Library

  • Reciprocal Sequential Recommendation
    by Bowen Zheng (Renmin University of China), Yupeng Hou (Renmin University of China), Wayne Xin Zhao (Renmin University of China), Yang Song (BOSS Zhipin) and Hengshu Zhu (BOSS Zhipin).

    Reciprocal recommender systems (RRS), which consider a two-way matching between two parties, have been widely applied in online platforms such as online dating and recruitment. Existing RRS models mainly capture static user preferences, neglecting evolving user tastes and the dynamic matching relation between the two parties. Although dynamic user modeling has been well-studied in sequential recommender systems, existing solutions are developed in a user-oriented manner. Therefore, it is non-trivial to adapt sequential recommendation algorithms to reciprocal recommendation. In this paper, we formulate RRS as a distinctive sequence matching task, and further propose a new approach ReSeq for RRS, which is short for Reciprocal Sequential recommendation. To capture dual-perspective matching, we propose to learn fine-grained sequence similarities with a co-attention mechanism across different time steps. Further, to improve the inference efficiency, we introduce the self-distillation technique to distill knowledge from the fine-grained matching module into the more efficient student module. In the deployment stage, only the efficient student module is used, greatly speeding up the similarity computation. Extensive experiments on five real-world datasets from two scenarios demonstrate the effectiveness and efficiency of the proposed method. Our code is available at https://anonymous.4open.science/r/ReSeq/.

    Full text in ACM Digital Library

  • Rethinking Multi-Interest Learning for Candidate Matching in Recommender Systems
    by Yueqi Xie (HKUST), Jingqi Gao (Upstage), Peilin Zhou (HKUST (gz)), Qichen Ye (Peking University), Yining Hua (Massachusetts Institute of Technology), Jae Boum Kim (Hong Kong University of Science and Technology), Fangzhao Wu (MSRA) and Sunghun Kim (Hong Kong University of Science and Technology).

    Existing research efforts for multi-interest candidate matching in recommender systems mainly focus on improving model architecture or incorporating additional information, neglecting the importance of training schemes. This work revisits the training framework and uncovers two major problems hindering the expressiveness of learned multi-interest representations. First, the current training objective (i.e., uniformly sampled softmax) fails to effectively train discriminative representations in a multi-interest learning scenario due to the severe increase in easy negative samples. Second, a routing collapse problem is observed where each learned interest may collapse to express information only from a single item, resulting in information loss. To address these issues, we propose the REMI framework, consisting of an Interest-aware Hard Negative mining strategy (IHN) and a Routing Regularization (RR) method. IHN emphasizes interest-aware hard negatives by proposing an ideal sampling distribution and developing a Monte-Carlo strategy for efficient approximation. RR prevents routing collapse by introducing a novel regularization term on the item-to-interest routing matrices. These two components enhance the learned multi-interest representations from both the optimization objective and the composition information. REMI is a general framework that can be readily applied to various existing multi-interest candidate matching methods. Experiments on three real-world datasets show our method can significantly improve state-of-the-art methods with easy implementation and negligible computational overhead. The source code is available at https://anonymous.4open.science/r/ReMIRec-B64C/.

    Full text in ACM Digital Library

  • SPARE: Shortest Path Global Item Relations for Efficient Session-based Recommendation
    by Andreas Peintner (Universität Innsbruck), Amir Reza Mohammadi (Universität Innsbruck) and Eva Zangerle (Universität Innsbruck).

    Session-based recommendation aims to predict the next item based on a set of anonymous sessions. Capturing user intent from a short interaction sequence imposes a variety of challenges since no user profiles are available and interaction data is naturally sparse. Recent approaches relying on graph neural networks (GNNs) for session-based recommendation use global item relations to explore collaborative information from different sessions. These methods capture the topological structure of the graph and rely on multi-hop information aggregation in GNNs to exchange information along edges. Consequently, graph-based models suffer from noisy item relations in the training data and introduce high complexity for large item catalogs. We propose to explicitly model the multi-hop information aggregation mechanism over multiple layers via shortest-path edges based on knowledge from the sequential recommendation domain. Our approach does not require multiple layers to exchange information and ignores unreliable item-item relations. Furthermore, to address inherent data sparsity, we are the first to apply supervised contrastive learning by mining data-driven positive and hard negative item samples from the training data. Extensive experiments on three different datasets show that the proposed approach outperforms almost all of the state-of-the-art methods.

    Full text in ACM Digital Library

  • STAN: Stage-Adaptive Network for Multi-Task Recommendation by Learning User Lifecycle-Based Representation
    by Wanda Li (Tsinghua University), Wenhao Zheng (Shopee Company), Xuanji Xiao (Shopee Company) and Suhang Wang (Penn State University).

    Recommendation systems play a vital role in many online platforms, with their primary objective being to satisfy and retain users. As directly optimizing user retention is challenging, multiple evaluation metrics are often employed. Existing methods generally formulate the optimization of these evaluation metrics as a multi-task learning problem, but often overlook the fact that user preferences for different tasks are personalized and change over time. Identifying and tracking the evolution of user preferences can lead to better user retention. To address this issue, we introduce the concept of “user lifecycle,” consisting of multiple stages characterized by users’ varying preferences for different tasks. We propose a novel Stage-Adaptive Network (STAN) framework for modeling user lifecycle stages. STAN first identifies latent user lifecycle stages based on learned user preferences, and then employs the stage representation to enhance multi-task learning performance. Our experimental results using both public and industrial datasets demonstrate that the proposed model significantly improves multi-task prediction performance compared to state-of-the-art methods, highlighting the importance of considering user lifecycle stages in recommendation systems. Furthermore, online A/B testing reveals that our model outperforms the existing model, achieving a significant improvement of 3.05% in staytime per user and 0.88% in CVR. These results indicate that our approach effectively improves the overall efficiency of the multi-task recommendation system.

    Full text in ACM Digital Library

  • STRec: Sparse Transformer for Sequential Recommendations
    by Chengxi Li (City University of Hong Kong), Xiangyu Zhao (City University of Hong Kong), Yejing Wang (City University of Hong Kong), Qidong Liu (Xi’an Jiaotong University, City University of Hong Kong), Wanyu Wang (City University of Hong Kong), Yiqi Wang (Michigan State University), Lixin Zou (Wuhan University), Wenqi Fan (The Hong Kong Polytechnic University) and Qing Li (The Hong Kong Polytechnic University).

    With the rapid evolution of transformer architectures, an increasing number of researchers are exploring their application in sequential recommender systems (SRSs). Compared with earlier SRS models, transformer-based models achieve promising performance on SRS tasks. Existing transformer-based SRS frameworks, however, retain the vanilla attention mechanism, which calculates the attention scores between all item-item pairs in each layer, i.e., item interactions. Consequently, redundant item interactions may slow inference and cause high memory costs for the model. In this paper, we first identify the sparse information phenomenon in transformer-based SRS scenarios and propose an efficient model, i.e., Sparse Transformer sequential Recommendation model (STRec). First, we devise a cross-attention-based sparse transformer for efficient sequential recommendation. Then, a novel sampling strategy is derived to preserve the necessary interactions. Extensive experimental results validate the effectiveness of our framework, which can outperform state-of-the-art accuracy while reducing inference time by 54% and memory cost by 70%. In addition, we provide extensive additional experiments to further investigate the properties of our framework. Our code is available to ease reproducibility.

    Full text in ACM Digital Library

  • Task Aware Feature Extraction Framework for Sequential Dependence Multi-Task Learning
    by Xuewen Tao (Mybank, Ant Group), Mingming Ha (School of Automation and Electrical Engineering, University of Science and Technology Beijing; Mybank, Ant Group), Qiongxu Ma (Mybank, Ant Group), Hongwei Cheng (Mybank, Ant Group), Wenfang Lin (Mybank, Ant Group) and Xiaobo Guo (Institute of Information Science, Beijing Jiaotong University; Mybank, Ant Group).

    In online recommendation, financial services, etc., the most common application of multi-task learning (MTL) is multi-step conversion estimation. A core property of multi-step conversion is the sequential dependence among tasks. Most existing works focus far more on the specific post-view click-through rate (CTR) and post-click conversion rate (CVR) estimations, neglecting the generalization of sequential dependence multi-task learning (SDMTL). Moreover, the performance of the SDMTL framework deteriorates because of the interference derived from implicitly conflicting information passing between adjacent tasks. In this paper, a systematic learning paradigm of the SDMTL problem is established for the first time, which can transform the SDMTL problem into a general MTL problem and is applicable to more general multi-step conversion scenarios with longer conversion paths or stronger task dependence. Also, the distribution dependence between adjacent task spaces is illustrated from a theoretical point of view. In addition, an SDMTL architecture, named Task Aware Feature Extraction (TAFE), is developed to enable dynamic task representation learning from a sample-wise view. TAFE selectively reconstructs the implicit shared information corresponding to each sample case and performs explicit task-specific extraction under dependence constraints. Extensive experiments on offline public and real-world industrial datasets, and online A/B implementations, demonstrate the effectiveness and applicability of the proposed theoretical and implementation frameworks.

    Full text in ACM Digital Library

  • Towards Robust Fairness-aware Recommendation
    by Hao Yang (Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China), Zhining Liu (Ant Group), Zeyu Zhang (Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China), Chenyi Zhuang (Ant Group) and Xu Chen (Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China).

    Due to the progressive advancement of trustworthy machine learning algorithms, fairness in recommender systems is attracting increasing attention and is often considered from the perspective of users. Conventional fairness-aware recommendation models make the assumption that user preferences remain the same between the training set and the testing set. However, this assumption disagrees with reality, where user preferences can shift in the testing set due to natural spatial or temporal heterogeneity. It is concerning that conventional fairness-aware models may be unaware of such distribution shifts, leading to a sharp decline in model performance. To address the distribution shift problem, we propose a robust fairness-aware recommendation framework based on the Distributionally Robust Optimization (DRO) technique. Specifically, we assign learnable weights to each sample to approximate the distribution that leads to the worst-case model performance, and then optimize the fairness-aware recommendation model to improve the worst-case performance in terms of both fairness and recommendation accuracy. By iteratively updating the weights and the model parameters, our framework can be robust to unseen testing sets. To ease the learning difficulty of DRO, we use a hard clustering technique to reduce the number of learnable sample weights. To optimize our framework in a fully differentiable manner, we soften the above clustering strategy. Empirically, we conduct extensive experiments based on four real-world datasets to verify the effectiveness of our proposed framework. To benefit the research community, we have released our project at https://anonyrobfair.github.io/.

    Full text in ACM Digital Library

  • Trending Now: Modeling Trend Recommendations
    by Hao Ding (AWS AI Labs), Branislav Kveton (AWS AI Labs), Yifei Ma (AWS AI Labs), Youngsuk Park (AWS AI Labs), Venkataramana Kini (AWS AI Labs), Yupeng Gu (AWS AI Labs), Ravi Divvela (AWS AI Labs), Fei Wang (AWS AI Labs), Anoop Deoras (AWS AI Labs) and Hao Wang (AWS AI Labs).

    Modern recommender systems usually include separate recommendation carousels such as ‘trending now’ to list trending items and further boost their popularity, thereby attracting active users. Though widely useful, such ‘trending now’ carousels typically generate item lists based on simple heuristics, e.g., the number of interactions within a time interval, and therefore still leave much room for improvement. This paper aims to systematically study this under-explored but important problem from the new perspective of time series forecasting. We first provide a set of rigorous definitions related to item trendiness with associated evaluation protocols, and then propose a deep latent variable model, dubbed Trend Recommender (TrendRec), to forecast items’ future trends and generate trending item lists. Experiments on real-world datasets from various domains show that our TrendRec significantly outperforms the baselines, verifying our model’s effectiveness.

    Full text in ACM Digital Library

  • Two-sided Calibration for Quality-aware Responsible Recommendation
    by Chenyang Wang (Tsinghua University), Yankai Liu (China Mobile Research), Yuanqing Yu (Tsinghua University), Weizhi Ma (Tsinghua University), Min Zhang (Tsinghua University), Yiqun Liu (Tsinghua University), Haitao Zeng (China Mobile Research), Junlan Feng (China Mobile Research) and Chao Deng (China Mobile Research).

    Calibration in recommender systems, which has gained increasing attention recently, ensures that the user’s interest distribution over groups of items is reflected with the corresponding proportions in the recommendation. For example, a user who watched 80 entertainment videos and 20 knowledge videos is expected to receive recommendations comprising about 80% entertainment and 20% knowledge videos as well. However, with the increasing calls for quality-aware responsible recommendation, it has become inadequate to just match users’ historical behaviors, which could still lead to undesired effects at the system level (e.g., overwhelming clickbait). In this paper, we envision the two-sided calibration task that not only matches the users’ past interest distribution (user-level calibration) but also guarantees an overall target exposure distribution of different item groups (system-level calibration). The target group exposure distribution can be explicitly pursued by users, platform owners, and even the law (e.g., the platform owners expect about 50% knowledge video recommendation on the whole). To support this scenario, we propose a post-processing method named PCT. PCT first solves for personalized calibration targets that minimize the changes in users’ historical interest distributions while ensuring the overall target group exposure distribution. Then, PCT reranks the original recommendation lists according to the personalized calibration targets to generate both relevant and two-sided calibrated recommendations. Extensive experiments demonstrate the superior performance of the proposed method compared to calibrated and fairness-aware recommendation approaches. We also release a new dataset with item quality annotations to support further studies about quality-aware responsible recommendation.

    Full text in ACM Digital Library

  • Uncovering User Interest from Biased and Noised Watch Time in Video Recommendation
    by Haiyuan Zhao (Renmin University of China), Lei Zhang (Renmin University of China), Jun Xu (Renmin University of China), Guohao Cai (Huawei Noah’s Ark Lab), Zhenhua Dong (Huawei Noah’s Ark Lab) and Ji-Rong Wen (Renmin University of China).

    In micro-video recommendation scenarios, watch time is commonly adopted as an indicator of users’ interest. However, watch time is not only determined by the matching of users’ interests but is also affected by other factors. These factors are mainly twofold: on the one hand, users tend to spend more time on appealing videos as the duration (i.e., video length) grows, which is named duration bias; on the other hand, it takes people a period of time to judge whether they like a video, which is named noisy watching. Consequently, the existence of duration bias and noisy watching makes watch time an inadequate label for training a reliable recommendation model. Moreover, current methods focus only on the duration bias and ignore the duration noise, so they do not really uncover the user interest from watch time. In this study, we first analyze the generation mechanism of users’ watch time from a unified causal viewpoint. Unlike current methods, which only notice the duration bias in watch time, we consider the watch time as a mixture of the user’s actual interest, the duration-biased watch time, and the noisy watch time. To mitigate both the duration bias and noisy watching, we propose Debiased and Denoised watch time Correction (D²Co), which can be divided into two steps: first, we employ a duration-wise Gaussian Mixture Model plus a frequency-weighted moving average for estimating the bias and noise terms; then, we utilize a sensitivity-controlled correction function to separate the user interest from the watch time, which is robust to the estimation error of the bias and noise terms. The experiments on two public video recommendation datasets indicate the effectiveness of the proposed method.

    Full text in ACM Digital Library

  • Understanding and Modeling Passive-Negative Feedback for Short-video Sequential Recommendation
    by Yunzhu Pan (UESTC), Chen Gao (Tsinghua University), Yang Song (Kuaishou Inc.), Kun Gai (Unaffiliated), Depeng Jin (Department of Electronic Engineering, Tsinghua University) and Yong Li (Tsinghua University).

    Sequential recommendation is one of the most important tasks in recommender systems, which aims to recommend the next item a user will interact with, taking historical behaviors as input. Traditional sequential recommendation mainly considers the collected positive feedback such as click, purchase, etc. However, in short-video platforms such as TikTok, video viewing behavior may not always represent positive feedback. Specifically, the videos are played automatically, and users passively receive the recommended videos. In this new scenario, users passively express negative feedback by skipping over videos they do not like, which provides valuable information about their preferences. Different from the negative feedback studied in traditional recommender systems, this passive-negative feedback can reflect users’ interests and serve as an important supervision signal in extracting users’ preferences. Therefore, it is essential to carefully design and utilize it in this novel recommendation scenario. In this work, we first conduct analyses based on a large-scale real-world short-video behavior dataset and illustrate the significance of leveraging passive feedback. We then propose a novel method that deploys the sub-interest encoder, which incorporates positive feedback and passive-negative feedback as supervision signals to learn the user’s current active sub-interest. Moreover, we introduce an adaptive fusion layer to integrate various sub-interests effectively. To enhance the robustness of our model, we then introduce a multi-task learning module to simultaneously optimize two kinds of feedback: passive-negative feedback and traditional randomly-sampled negative feedback. The experiments on two large-scale datasets verify that the proposed method can significantly outperform state-of-the-art approaches. The code and collected datasets are anonymously released at https://anonymous.4open.science/r/SINE-2047/ to benefit the community.

    Full text in ACM Digital Library

  • What We Evaluate When We Evaluate Recommender Systems: Understanding Recommender Systems’ Performance using Item Response Theory
    by Yang Liu (University of Helsinki), Alan Medlar (University of Helsinki) and Dorota Glowacka (University of Helsinki).

    Current practices in offline evaluation use rank-based metrics to measure the quality of recommendation lists. This approach has practical benefits as it centers assessment on the output of the recommender system and, therefore, measures performance from the perspective of end-users. However, this methodology neglects how recommender systems more broadly model user preferences, which is not captured by only considering the top-n recommendations. In this article, we use item response theory (IRT), a family of latent variable models used in psychometric assessment, to gain a comprehensive understanding of offline evaluation. We used IRT to jointly estimate the latent abilities of 51 recommendation algorithms and the characteristics of 3 commonly used benchmark data sets. For all data sets, the latent abilities estimated by IRT suggest that higher scores from traditional rank-based metrics do not reflect improvements in modeling user preferences. Furthermore, we show the top-n recommendations with the most discriminatory power are biased towards lower difficulty items, leaving much room for improvement. Lastly, we highlight the role of popularity in evaluation by investigating how user engagement and item popularity influence recommendation difficulty.

    Full text in ACM Digital Library

  • When Fairness meets Bias: a Debiased Framework for Fairness aware Top-N Recommendation
    by Jiakai Tang (Gaoling School of Artificial Intelligence, Renmin University of China), Shiqi Shen (Wechat, Tencent, Beijing), Zhipeng Wang (Wechat, Tencent, Beijing), Zhi Gong (Wechat, Tencent, Beijing), Jingsen Zhang (Gaoling School of Artificial Intelligence, Renmin University of China) and Xu Chen (Gaoling School of Artificial Intelligence, Renmin University of China).

    Fairness in the recommendation domain has recently attracted increasing attention due to growing concerns about algorithmic discrimination and ethics. While recent years have witnessed many promising fairness aware recommender models, an important problem has been largely ignored, that is, the fairness can be biased due to the user personalized selection tendencies or the non-uniform item exposure probabilities. To study this problem, in this paper, we formally define a novel task named unbiased fairness aware Top-N recommendation. To solve this task, we first define an ideal loss function based on all the user-item pairs. Considering that, in real-world datasets, only a small number of user-item interactions can be observed, we then approximate the above ideal loss with a more tractable objective based on the inverse propensity score (IPS). Since recommendation datasets can be noisy and quite sparse, which makes it difficult to estimate the IPS accurately, we propose to optimize the objective in an IPS range instead of at a specific point, which improves the model’s fault tolerance. In order to make our model more applicable to the commonly studied Top-N recommendation, we soften the ranking metrics such as Precision, Hit-Ratio and NDCG to derive a fully differentiable framework. We conduct extensive experiments to demonstrate the effectiveness of our model based on four real-world datasets.

    Full text in ACM Digital Library
