Session 11: Sequential Recommendation 2

Date: Thursday September 21, 4:05 PM – 5:25 PM (GMT+8)
Room: Hall 406CX
Session Chair: Robin Burke
Parallel with: Session 12: Evaluation

  • RES gSASRec: Reducing Overconfidence in Sequential Recommendation Trained with Negative Sampling
    by Aleksandr V. Petrov (University of Glasgow) and Craig Macdonald (University of Glasgow).

    Large catalogue size is one of the central challenges in training recommendation models: a large number of items makes it infeasible to compute scores for all items during training, forcing models to deploy negative sampling. However, negative sampling increases the proportion of positive interactions in the training data. Therefore, models trained with negative sampling tend to overestimate the probabilities of positive interactions, a phenomenon we call overconfidence. While the absolute values of the predicted scores/probabilities are unimportant for ranking retrieved recommendations, overconfident models may fail to estimate nuanced differences in the top-ranked items, resulting in degraded performance. This paper shows that overconfidence explains why the popular SASRec model underperforms when compared to BERT4Rec (contrary to the BERT4Rec authors’ attribution to the bi-directional attention mechanism). We propose a novel Generalised Binary Cross-Entropy Loss function (gBCE) and theoretically prove that it can mitigate overconfidence. We further propose the gSASRec model, an improvement over SASRec that deploys an increased number of negatives and gBCE loss. We show through detailed experiments on three datasets that gSASRec does not exhibit the overconfidence problem. As a result, gSASRec can outperform BERT4Rec (e.g. +9.47% NDCG on MovieLens-1M), while requiring less training time (e.g. -73% training time on MovieLens-1M). Moreover, in contrast to BERT4Rec, gSASRec is suitable for large datasets that contain more than 1 million items.

    Full text in ACM Digital Library
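
    The abstract's point that negative sampling inflates the share of positives can be illustrated with a little arithmetic. The sketch below assumes one positive and k sampled negatives per training step. The function names are hypothetical, and the loss is only an illustrative form in which the positive probability is raised to a power beta (beta = 1 gives plain binary cross-entropy). The abstract does not state the gBCE formula, so this should not be read as the authors' exact definition.

```python
import math


def positive_share(num_negatives: int) -> float:
    """Share of positives among the items scored in one training step.

    Assumption: each step scores 1 positive and `num_negatives` sampled negatives.
    """
    return 1.0 / (1 + num_negatives)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def powered_positive_bce(pos_score: float, neg_scores: list, beta: float = 1.0) -> float:
    """Illustrative loss (assumed form, not taken from the abstract).

    Binary cross-entropy over one positive and the sampled negatives, with the
    positive probability raised to the power `beta`:
    log(sigmoid(s)^beta) = beta * log(sigmoid(s)).
    beta = 1.0 recovers plain binary cross-entropy.
    """
    pos_term = beta * math.log(sigmoid(pos_score))
    neg_term = sum(math.log(1.0 - sigmoid(s)) for s in neg_scores)
    return -(pos_term + neg_term) / (1 + len(neg_scores))


if __name__ == "__main__":
    catalogue_size = 1_000_000  # hypothetical catalogue
    # Scoring every item: one positive among the whole catalogue.
    print("share with full scoring:", 1.0 / catalogue_size)
    # Scoring one positive plus a handful of sampled negatives.
    for k in (1, 16, 256):
        print(f"share with {k} negatives:", positive_share(k))
```

    With a single sampled negative, half of the scored items are positives, whereas under full scoring of the hypothetical catalogue the share is one in a million. Choosing beta below 1 in the illustrative loss weakens the pull on the positive score, which is one intuition for how such a loss could counteract overconfidence.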

  • RES Equivariant Contrastive Learning for Sequential Recommendation
    by Peilin Zhou (HKUST (Guangzhou)), Jingqi Gao (Upstage), Yueqi Xie (HKUST), Qichen Ye (Peking University), Yining Hua (Harvard Medical School), Jaeboum Kim (The Hong Kong University of Science and Technology, Upstage), Shoujin Wang (Data Science Institute, University of Technology Sydney) and Sunghun Kim (The Hong Kong University of Science and Technology).

    Contrastive learning (CL) benefits the training of sequential recommendation models with informative self-supervision signals. Existing solutions apply general sequential data augmentation strategies to generate positive pairs and encourage their representations to be invariant. However, due to the inherent properties of user behavior sequences, some augmentation strategies, such as item substitution, can lead to changes in user intent. Learning indiscriminately invariant representations for all augmentation strategies might therefore be sub-optimal. We propose Equivariant Contrastive Learning for Sequential Recommendation (ECL-SR), which endows SR models with great discriminative power, making the learned user behavior representations sensitive to invasive augmentations (e.g., item substitution) and insensitive to mild augmentations (e.g., feature-level dropout masking). In detail, we use a conditional discriminator to capture differences in behavior due to item substitution, which encourages the user behavior encoder to be equivariant to invasive augmentations. Comprehensive experiments on four benchmark datasets show that the proposed ECL-SR framework achieves competitive performance compared to state-of-the-art SR models. The source code will be released.

    Full text in ACM Digital Library
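
    The invariance objective that the abstract attributes to existing solutions can be sketched as follows. The abstract does not name a specific loss, so InfoNCE is assumed here as a common choice. The item-substitution helper and all names and parameters are hypothetical illustrations, not the ECL-SR implementation.

```python
import numpy as np


def info_nce(view_a: np.ndarray, view_b: np.ndarray, temperature: float = 0.1) -> float:
    """InfoNCE loss for a batch of representation pairs.

    Row i of `view_a` and row i of `view_b` form a positive pair; every other
    row of `view_b` acts as a negative for row i of `view_a`.
    """
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))


def substitute_items(sequence, catalogue_size: int, rate: float, rng):
    """Invasive augmentation (hypothetical helper): replace a random fraction
    of the items in a sequence with items drawn uniformly from the catalogue.

    Returns the augmented sequence and the boolean mask of replaced positions.
    """
    seq = np.array(sequence)
    mask = rng.random(len(seq)) < rate
    seq[mask] = rng.integers(0, catalogue_size, size=mask.sum())
    return seq, mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reps = rng.normal(size=(4, 8))
    # A mild augmentation is imitated by adding small noise to the representations.
    mild = reps + 0.01 * rng.normal(size=reps.shape)
    print("loss for mildly perturbed views:", info_nce(reps, mild))
    print("substituted sequence:", substitute_items([3, 7, 7, 1, 9], 100, 0.4, rng)[0])
```

    Minimising this loss pulls the two views of each sequence together regardless of how they were produced, which is why applying it to substituted sequences may erase intent changes that the abstract argues should stay visible.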

  • RES Contrastive Learning with Frequency-Domain Interest Trends for Sequential Recommendation
    by Yichi Zhang (Harbin Engineering University), Guisheng Yin (Harbin Engineering University) and Yuxin Dong (Harbin Engineering University).

    Recently, contrastive learning for sequential recommendation has demonstrated its powerful ability to learn high-quality user representations. However, constructing augmented samples in the time domain poses challenges due to various reasons, such as fast-evolving trends, interest shifts, and system factors. Furthermore, the F-principle indicates that deep learning preferentially fits the low-frequency part, resulting in poor performance on high-frequency tasks. The complexity of time series and the low-frequency preference limit the utility of sequence encoders. To address these challenges, we need to construct augmented samples from the frequency domain, thus improving the ability to accommodate events of different frequency sizes. To this end, we propose a novel Contrastive Learning with Frequency-Domain Interest Trends for Sequential Recommendation (CFIT4SRec). We treat the embedding representations of historical interactions as “images” and introduce the second-order Fourier transform to construct augmented samples. The components of different frequency sizes reflect the interest trends between attributes and their surroundings in the hidden space. We introduce three data augmentation operations to accommodate events of different frequency sizes: low-pass augmentation, high-pass augmentation, and band-stop augmentation. Extensive experiments on four public benchmark datasets demonstrate the superiority of CFIT4SRec over the state-of-the-art baselines. The implementation code is available at

    Full text in ACM Digital Library
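
    The three augmentation operations named in the abstract can be sketched with a two-dimensional FFT over a (sequence length × embedding dimension) matrix. This is a minimal sketch under stated assumptions: the abstract's "second-order Fourier transform" is read as a 2-D transform, and the radial masks, cutoff values and function names are hypothetical rather than taken from CFIT4SRec.

```python
import numpy as np


def radial_distance(shape):
    """Distance of every frequency bin from the centre of a shifted 2-D spectrum."""
    rows, cols = shape
    r = np.arange(rows) - rows // 2
    c = np.arange(cols) - cols // 2
    return np.sqrt(r[:, None] ** 2 + c[None, :] ** 2)


def frequency_augment(embeddings: np.ndarray, mode: str,
                      low: float = 2.0, high: float = 4.0) -> np.ndarray:
    """Filter an embedding matrix in the frequency domain.

    mode = "low_pass":  keep frequencies with radius <= low
    mode = "high_pass": keep frequencies with radius >  low
    mode = "band_stop": drop frequencies with low < radius <= high
    The cutoffs `low` and `high` are illustrative assumptions.
    """
    # fftshift moves the zero-frequency bin to index n // 2 on each axis.
    spectrum = np.fft.fftshift(np.fft.fft2(embeddings))
    dist = radial_distance(embeddings.shape)
    if mode == "low_pass":
        keep = dist <= low
    elif mode == "high_pass":
        keep = dist > low
    elif mode == "band_stop":
        keep = (dist <= low) | (dist > high)
    else:
        raise ValueError(f"unknown mode: {mode}")
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * keep))
    return filtered.real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq_embeddings = rng.normal(size=(8, 16))  # 8 interactions, 16-dim embeddings
    for mode in ("low_pass", "high_pass", "band_stop"):
        print(mode, frequency_augment(seq_embeddings, mode).shape)
```

    Because the low-pass and high-pass masks are complementary for the same cutoff, the two filtered views sum back to the original matrix, which is a convenient sanity check for any implementation along these lines.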

  • RES Task Aware Feature Extraction Framework for Sequential Dependence Multi-Task Learning
    by Xuewen Tao (Mybank, Ant Group), Mingming Ha (School of Automation and Electrical Engineering, University of Science and Technology Beijing; Mybank, Ant Group), Qiongxu Ma (Mybank, Ant Group), Hongwei Cheng (Mybank, Ant Group), Wenfang Lin (Mybank, Ant Group) and Xiaobo Guo (Institute of Information Science, Beijing Jiaotong University; Mybank, Ant Group).

    In online recommendation, financial services, etc., the most common application of multi-task learning (MTL) is multi-step conversion estimation. A core property of multi-step conversion is the sequential dependence among tasks. Most existing works focus on the specific post-view click-through rate (CTR) and post-click conversion rate (CVR) estimations and neglect the generalization of sequential dependence multi-task learning (SDMTL). Besides, the performance of the SDMTL framework is also degraded by the interference derived from implicitly conflicting information passing between adjacent tasks. In this paper, a systematic learning paradigm of the SDMTL problem is established for the first time, which can transform the SDMTL problem into a general MTL problem and is applicable to more general multi-step conversion scenarios with longer conversion paths or stronger task dependence. Also, the distribution dependence between adjacent task spaces is illustrated from a theoretical point of view. In addition, an SDMTL architecture, named Task Aware Feature Extraction (TAFE), is developed to enable dynamic task representation learning from a sample-wise view. TAFE selectively reconstructs the implicit shared information corresponding to each sample case and performs explicit task-specific extraction under dependence constraints. Extensive experiments on offline public and real-world industrial datasets, and online A/B implementations demonstrate the effectiveness and applicability of the proposed theoretical and implementation frameworks.

    Full text in ACM Digital Library
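
    The sequential dependence among conversion steps mentioned in the abstract can be illustrated with the standard chain rule of probability. This is a minimal sketch that assumes each step can only happen after the previous one. The function name and the example rates are hypothetical, and this is not the TAFE architecture.

```python
from itertools import accumulate
from operator import mul


def funnel_probabilities(conditional_rates):
    """Probability of reaching each step of a conversion funnel.

    `conditional_rates[i]` is the probability of step i given that step i - 1
    happened (for step 0: given the impression). The result is the running
    product, i.e. the probability of reaching each step from the impression.
    """
    return list(accumulate(conditional_rates, mul))


if __name__ == "__main__":
    # Hypothetical rates: post-view CTR, post-click CVR, post-conversion repeat rate.
    rates = [0.10, 0.05, 0.50]
    for step, prob in enumerate(funnel_probabilities(rates)):
        print(f"step {step}: probability from impression = {prob:.6f}")
```

    The running product shows why errors in an early task propagate to every later task, and why longer conversion paths make the dependence harder to model.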
