Session 5: Language and Knowledge

Date: Tuesday 16:00 – 17:30 CET
Chair: Tomislav Duricic (Know-Center GmbH and Technische Universität Graz)

  • Transformers4Rec: Bridging the Gap between NLP and Sequential / Session-Based Recommendation
    by Gabriel de Souza Pereira Moreira (NVIDIA, Brazil), Sara Rabhi (NVIDIA, Canada), Jeong Min Lee (Facebook AI, United States), Ronay Ak (NVIDIA, United States), and Even Oldridge (NVIDIA, Canada)

    Much of the recent progress in sequential and session-based recommendation has been driven by improvements in model architecture and pretraining techniques originating in the field of Natural Language Processing. Transformer architectures in particular have facilitated building higher-capacity models and provided data augmentation and training techniques which demonstrably improve the effectiveness of sequential recommendation. But with a thousandfold more research going on in NLP, the application of transformers for recommendation understandably lags behind. To remedy this, we introduce Transformers4Rec, an open-source library built upon HuggingFace’s Transformers library with a similar goal of opening up the advances of NLP-based Transformers to the recommender system community and making these advancements immediately accessible for the tasks of sequential and session-based recommendation. Like its core dependency, Transformers4Rec is designed to be extensible by researchers, simple for practitioners, and fast and robust in industrial deployments.
    In order to demonstrate the usefulness of the library and the applicability of Transformer architectures in next-click prediction for user sessions, where sequence lengths are much shorter than those commonly found in NLP, we have leveraged Transformers4Rec to win two recent session-based recommendation competitions. In addition, we present in this paper the first comprehensive empirical analysis comparing many Transformer architectures and training approaches for the task of session-based recommendation. We demonstrate that the best Transformer architectures have superior performance across two e-commerce datasets while performing similarly to the baselines on two news datasets. We further evaluate in isolation the effectiveness of the different training techniques used in causal language modeling, masked language modeling, permutation language modeling and replacement token detection for a single Transformer architecture, XLNet. We establish that training XLNet with replacement token detection performs well across all datasets. Finally, we explore techniques to include side information such as item and user context features in order to establish best practices and show that the inclusion of side information uniformly improves recommendation performance. Transformers4Rec library is available at

    Full text in ACM Digital Library

  • Sparse Feature Factorization for Recommender Systems with Knowledge Graphs
    by Vito Walter Anelli (Polytechnic University of Bari, Italy), Tommaso Di Noia (Polytechnic University of Bari, Italy), Eugenio Di Sciascio (Politecnico di Bari, Italy), Antonio Ferrara (Politecnico di Bari, Italy), and Alberto Carlo Maria Mancino (Politecnico di Bari, Italy)

    Deep Learning and factorization-based collaborative filtering recommendation models have undoubtedly dominated the scene of recommender systems in recent years. However, despite their outstanding performance, these methods require a training time proportional to the size of the embeddings, and this time increases further when side information is also considered for the computation of the recommendation list. In fact, with a large number of high-quality features, the resulting models are more complex and difficult to train. This paper addresses this problem by presenting KGFlex: a sparse factorization approach that grants an even greater degree of expressiveness. To achieve this result, KGFlex analyzes the historical data to understand the dimensions the user decisions depend on (e.g., movie direction, musical genre, nationality of book writer). KGFlex represents each item feature as an embedding and models user-item interactions as a factorized entropy-driven combination of the item attributes relevant to the user. KGFlex facilitates the training process by letting users update only those relevant features on which they base their decisions. In other words, the user-item prediction is mediated by the user’s personal view that considers only relevant features. An extensive experimental evaluation shows the approach’s effectiveness, considering the recommendation results’ accuracy, diversity, and induced bias. The public implementation of KGFlex is available at

    Full text in ACM Digital Library

  • Towards Source-Aligned Variational Models for Cross-Domain Recommendation
    by Aghiles Salah (Rakuten Institute of Technology, France), Thanh Binh Tran (School of Computing and Information Systems, Singapore Management University, Singapore), and Hady Lauw (School of Computing and Information Systems, Singapore Management University, Singapore)

    Data sparsity is a long-standing challenge in recommender systems. Among existing approaches to alleviate this problem, cross-domain recommendation consists in leveraging knowledge from a source domain or category (e.g., Movies) to improve item recommendation in a target domain (e.g., Books). In this work, we advocate a probabilistic approach to cross-domain recommendation and rely on variational autoencoders (VAEs) as our latent variable models. More precisely, we assume that we have access to a VAE trained on the source domain that we seek to leverage to improve preference modeling in the target domain. To this end, we propose a model which jointly learns to fit the target observations and to align its hidden space with the source latent space. Since we model the latent spaces by the variational posteriors, we operate at this level, and in particular, we investigate two approaches, namely rigid and soft alignments. In the former scenario, the variational model in the target domain is set equal to the source variational model. That is, we only learn a generative model in the target domain. In the soft-alignment scenario, the target VAE has its own variational model, which is encouraged to resemble its source counterpart. We analyze the proposed objectives theoretically and conduct extensive experiments to illustrate the benefit of our contribution. Empirical results on six real-world datasets show that the proposed models outperform several comparable cross-domain recommendation models.

    Full text in ACM Digital Library
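
The Transformers4Rec abstract compares training approaches borrowed from NLP, including causal and masked language modeling. As a rough illustration of how those two objectives might turn one user session into a training example, here is a minimal sketch in plain Python. It does not use the Transformers4Rec API; the function names, the `[MASK]` placeholder, and the masking probability are all assumptions made for this example.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token, not taken from the library


def causal_lm_example(session):
    """Causal LM: each position predicts the next item in the session."""
    return session[:-1], session[1:]


def masked_lm_example(session, mask_prob=0.3, rng=None):
    """Masked LM: hide some items and predict them from both sides.

    Targets are None at positions that were left visible.
    """
    rng = rng or random.Random(0)
    inputs, targets = [], []
    for item in session:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            targets.append(item)
        else:
            inputs.append(item)
            targets.append(None)
    # Make sure a non-empty session has at least one position to predict.
    if session and all(t is None for t in targets):
        inputs[-1], targets[-1] = MASK, session[-1]
    return inputs, targets


if __name__ == "__main__":
    clicks = ["item_12", "item_7", "item_31", "item_7"]
    print(causal_lm_example(clicks))
    print(masked_lm_example(clicks))
```

With causal modeling the last target is the next click, which matches the next-click prediction task directly; masked modeling lets the model see items on both sides of a hidden position during training.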
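
The KGFlex abstract describes an entropy-driven selection of the item features that matter to each user. One common way to score a feature with entropy is information gain over a user's liked and disliked items. The sketch below assumes that criterion for illustration only; the abstract does not spell out the exact formula, and the data layout and function names are hypothetical.

```python
import math


def entropy(pos, neg):
    """Binary Shannon entropy (in bits) of a liked / not-liked split."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def label_entropy(interactions):
    """Entropy of the liked flag over a list of (features, liked) pairs."""
    pos = sum(1 for _, liked in interactions if liked)
    return entropy(pos, len(interactions) - pos)


def information_gain(interactions, feature):
    """How much knowing `feature` reduces uncertainty about the user's taste.

    interactions: list of (set_of_item_features, liked_bool) for one user.
    """
    n = len(interactions)
    if n == 0:
        return 0.0
    with_f = [x for x in interactions if feature in x[0]]
    without_f = [x for x in interactions if feature not in x[0]]
    conditional = sum(
        len(part) / n * label_entropy(part) for part in (with_f, without_f)
    )
    return label_entropy(interactions) - conditional


if __name__ == "__main__":
    history = [
        ({"genre:jazz", "country:IT"}, True),
        ({"genre:jazz", "country:US"}, True),
        ({"genre:rock", "country:IT"}, False),
        ({"genre:rock", "country:US"}, False),
    ]
    for f in ("genre:jazz", "country:IT"):
        print(f, information_gain(history, f))
```

In this toy history the genre feature separates liked from disliked items perfectly, while the country feature carries no signal, so a per-user model under this assumption would keep only the former.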
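
The soft-alignment idea in the cross-domain abstract, where the target posterior is encouraged to resemble the source posterior, can be illustrated with a divergence penalty between two Gaussian posteriors. The sketch below assumes diagonal Gaussians and a KL term added to the target loss with a weight; the direction of the KL, the weighting, and all names are assumptions for this example, not the paper's stated objective.

```python
import math


def kl_diag_gaussians(mu_t, sigma_t, mu_s, sigma_s):
    """KL( N(mu_t, sigma_t^2) || N(mu_s, sigma_s^2) ) for diagonal Gaussians."""
    return sum(
        math.log(ss / st) + (st ** 2 + (mt - ms) ** 2) / (2 * ss ** 2) - 0.5
        for mt, st, ms, ss in zip(mu_t, sigma_t, mu_s, sigma_s)
    )


def soft_aligned_loss(neg_elbo_target, mu_t, sigma_t, mu_s, sigma_s, weight=1.0):
    """Target-domain loss plus a penalty pulling the target posterior
    toward the (fixed) source posterior. `weight` is a hypothetical knob."""
    return neg_elbo_target + weight * kl_diag_gaussians(mu_t, sigma_t, mu_s, sigma_s)


if __name__ == "__main__":
    # Identical posteriors incur no penalty; a shifted mean does.
    print(kl_diag_gaussians([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0]))
    print(soft_aligned_loss(2.0, [1.0], [1.0], [0.0], [1.0], weight=2.0))
```

Under this reading, rigid alignment corresponds to skipping the penalty and reusing the source posterior parameters directly, so only the generative part is learned in the target domain.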
