Session 3: Representation Meets Recommendation & Search

Date: Tuesday September 23, 16:30–18:00 (GMT+2)
Session Chair: Alejandro Bellogin

  • [IND] Decoupled Entity Representation Learning for Pinterest Ads Ranking
    by Jie Liu, Yinrui Li, Jiankai Sun, Kungang Li, Han Sun, Sihan Wang, Huasen Wu, Siyuan Gao, Paulo Soares, Nan Li, Zhifang Liu, Haoyang Li, Siping Ji, Ling Leng, Prathibha Deshikachar

    In this paper, we introduce a novel framework following an upstream-downstream paradigm to construct user and item (Pin) embeddings from diverse data sources, which are essential for Pinterest to deliver personalized Pins and ads effectively. Our upstream models are trained on extensive data sources featuring varied signals, using complex architectures to capture intricate relationships between users and Pins on Pinterest. To keep the upstream models scalable, entity embeddings are learned and regularly refreshed offline rather than computed in real time, allowing asynchronous interaction between the upstream and downstream models. These embeddings are then integrated as input features in numerous downstream tasks, including ad retrieval and ranking models for CTR and CVR prediction. We demonstrate that our framework achieves notable performance improvements in both offline and online settings across various downstream tasks. The framework has been deployed in Pinterest’s production ad ranking systems, resulting in significant gains in online metrics.

    Full text in ACM Digital Library
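
The decoupling described above — upstream models that refresh embeddings in batch, downstream models that consume them as frozen input features — can be sketched minimally. The class name, store layout, and zero-vector fallback below are illustrative assumptions, not Pinterest's implementation:

```python
import numpy as np

class EmbeddingStore:
    """Key-value store of entity embeddings, refreshed in batch by an
    upstream model rather than computed at serving time (hypothetical sketch)."""

    def __init__(self, dim: int):
        self.dim = dim
        self._table: dict[str, np.ndarray] = {}

    def refresh(self, embeddings: dict[str, np.ndarray]) -> None:
        # Upstream job: periodically overwrite entries with freshly trained embeddings.
        self._table.update(embeddings)

    def lookup(self, entity_id: str) -> np.ndarray:
        # Downstream model: read a frozen embedding as an input feature;
        # unseen entities fall back to a zero vector.
        return self._table.get(entity_id, np.zeros(self.dim))

store = EmbeddingStore(dim=4)
store.refresh({"user_1": np.ones(4), "pin_9": np.arange(4.0)})

# A downstream CTR/CVR model would consume these as concatenated input features.
features = np.concatenate([store.lookup("user_1"), store.lookup("pin_9")])
print(features.shape)  # (8,)
```

Because the store is refreshed asynchronously, upstream retraining never blocks downstream serving — the trade-off is that downstream models see slightly stale embeddings between refreshes.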

  • [RES] Determinants of Users’ Chance-Seeking Behavior in Search-Based Recommendation
    by Yuki Ninomiya, Yutaro Sone, Kazuhisa Miwa, Yuichiro Sumi, Ryosuke Nakanishi, Eiji Mitsuda, Koji Sato, Tadashi Odashima

    Serendipity has emerged as a promising strategy to counter overspecialization in retrieval and recommendation systems. While prior studies focus on algorithmic approaches, few have examined users’ desire for chance. This study investigates psychological determinants of chance seeking through two experiments. Experiment 1 found that greater goal specificity suppresses chance seeking. Experiment 2 showed that extraversion, diversive curiosity, enjoyment of ambiguity, and maximization enhance chance seeking, whereas neuroticism and specific curiosity reduce it. These findings suggest that users actively regulate the degree of chance in response to their goal and individual characteristics. The results indicate the importance of considering users’ chance seeking when designing serendipitous recommendation systems.

    Full text in ACM Digital Library

  • [IND] Enhancing Embedding Representation Stability in Recommendation Systems with Semantic ID
    by Carolina Zheng, Minhui Huang, Dmitrii Pedchenko, Kaushik Rangadurai, Siyu Wang, Fan Xia, Gaby Nahum, Jie Lei, Yang Yang, Tao Liu, Zutian Luo, Xiaohan Wei, Dinesh Ramasamy, Jiyan Yang, Yiping Han, Lin Yang, Hangjun Xu, Rong Jin, Shuang Yang

    The exponential growth of online content has posed significant challenges to ID-based models in industrial recommendation systems, ranging from extremely high cardinality and dynamically growing ID space, to highly skewed engagement distributions, to prediction instability as a result of natural ID life cycles. This paper examines these challenges and introduces Semantic ID prefix-ngram, a novel token parameterization technique that significantly improves the performance of the original Semantic ID. Semantic ID prefix-ngram creates semantically meaningful collisions by hierarchically clustering items based on their content embeddings, as opposed to random assignments. Through extensive experimentation, we demonstrate that Semantic ID prefix-ngram not only addresses embedding instability but also significantly improves tail ID modeling and mitigates representation shifts. We report our experience of integrating Semantic ID into Meta’s production Ads Ranking system, leading to notable performance gains.

    Full text in ACM Digital Library
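
The prefix-ngram idea — hierarchical codes whose shared prefixes create semantically meaningful collisions — can be sketched as follows. The residual-quantization-style code assignment and the token format here are illustrative assumptions, not the paper's exact clustering procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

def assign_codes(emb: np.ndarray, codebooks: list[np.ndarray]) -> list[int]:
    """Hierarchical quantization sketch: at each level pick the nearest
    centroid and recurse on the residual (illustrative, not the paper's
    exact clustering)."""
    codes, residual = [], emb.copy()
    for book in codebooks:
        idx = int(np.argmin(np.linalg.norm(book - residual, axis=1)))
        codes.append(idx)
        residual = residual - book[idx]
    return codes

def prefix_ngram_tokens(codes: list[int]) -> list[str]:
    # Each prefix of the code sequence becomes one token, so items in the
    # same coarse cluster share their leading tokens (meaningful collisions).
    return ["sid:" + "-".join(map(str, codes[:k])) for k in range(1, len(codes) + 1)]

codebooks = [rng.normal(size=(8, 16)) for _ in range(3)]  # 3 levels, 8 centroids each
item = rng.normal(size=16)
tokens = prefix_ngram_tokens(assign_codes(item, codebooks))
print(tokens)  # e.g. a coarse token, then progressively finer prefixes
```

The key property is that a new or tail item immediately shares coarse-prefix tokens with semantically similar head items, so its embedding table rows are warm-started by collisions rather than learned from scratch.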

  • [RES] GenSAR: Unifying Balanced Search and Recommendation with Generative Retrieval
    by Teng Shi, Jun Xu, Xiao Zhang, Xiaoxue Zang, Kai Zheng, Yang Song, Enyun Yu

    Many commercial platforms provide both search and recommendation (S&R) services to meet different user needs. This creates an opportunity for joint modeling of S&R. Although many joint S&R studies have demonstrated the advantages of integrating S&R, they have also identified a trade-off between the two tasks. That is, when recommendation performance improves, search performance may decline, or vice versa. This trade-off stems from the different information requirements: search prioritizes the semantic relevance between the queries and the items, while recommendation heavily relies on the collaborative relationship between users and items. To balance semantic and collaborative information and mitigate this trade-off, two main challenges arise: (1) How to incorporate both semantic and collaborative information in item representations. (2) How to train the model to understand the different information requirements of S&R. The recent rise of generative retrieval based on Large Language Models (LLMs) for S&R offers a potential solution. Generative retrieval represents each item as an identifier, allowing us to assign multiple identifiers to each item to capture both semantic and collaborative information. Additionally, generative retrieval formulates both S&R as sequence-to-sequence tasks, enabling us to unify different tasks through varied prompts, thereby helping the model better understand the requirements of each task. Based on this, we propose GenSAR, a method that unifies balanced S&R through generative retrieval. We design joint S&R identifiers and training tasks to address the above challenges, mitigate the trade-off between S&R, and further improve both tasks. Experimental results on a public dataset and a commercial dataset validate the effectiveness of GenSAR.

    Full text in ACM Digital Library

  • [IND] Orthogonal Low Rank Embedding Stabilization
    by Kevin Zielnicki, Ko-Jen Hsiao

    The instability of embedding spaces across model retraining cycles presents significant challenges to downstream applications using user or item embeddings derived from recommendation systems as input features. This paper introduces a novel orthogonal low-rank transformation methodology designed to stabilize the user/item embedding space, ensuring consistent embedding dimensions across retraining sessions. Our approach leverages a combination of efficient low-rank singular value decomposition and orthogonal Procrustes transformation to map embeddings into a standardized space. This transformation is computationally efficient, lossless, and lightweight, preserving the dot product and inference quality while reducing operational burdens. Unlike existing methods that modify training objectives or embedding structures, our approach maintains the integrity of the primary model application and can be seamlessly integrated with other stabilization techniques.

    Full text in ACM Digital Library
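
The Procrustes step of such a pipeline can be sketched with NumPy. This shows only the orthogonal alignment on synthetic data — not the paper's low-rank SVD stage or production setup:

```python
import numpy as np

def procrustes_align(new: np.ndarray, ref: np.ndarray) -> np.ndarray:
    """Orthogonal Procrustes: find the orthogonal map R minimizing
    ||new @ R - ref||_F and return the aligned embeddings. Because R is
    orthogonal, dot products between rows of `new` are preserved exactly."""
    u, _, vt = np.linalg.svd(new.T @ ref)
    return new @ (u @ vt)

rng = np.random.default_rng(1)
ref = rng.normal(size=(100, 8))                  # embeddings from the previous run
q = np.linalg.qr(rng.normal(size=(8, 8)))[0]     # an arbitrary orthogonal transform
new = ref @ q                                    # "retrained" embeddings: same geometry, rotated axes

aligned = procrustes_align(new, ref)
print(np.allclose(aligned, ref))                      # True: the transform is recovered
print(np.allclose(new @ new.T, aligned @ aligned.T))  # True: dot products preserved
```

This is why the transformation is lossless for retrieval: downstream consumers see a stable coordinate system across retrains, while within-run similarity scores are mathematically unchanged.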

  • [REPR] Rethinking the Privacy of Text Embeddings: A Reproducibility Study of “Text Embeddings Reveal (Almost) As Much As Text”
    by Dominykas Seputis, Yongkang Li, Karsten Langerak, Serghei Mihailov

    Text embeddings are fundamental to many natural language processing (NLP) tasks, extensively applied in domains such as recommendation systems and information retrieval (IR). Traditionally, transmitting embeddings instead of raw text has been seen as privacy-preserving. However, recent methods such as Vec2Text challenge this assumption by demonstrating that controlled decoding can successfully reconstruct original texts from black-box embeddings. The unexpectedly strong results reported by Vec2Text motivated us to conduct further verification, particularly considering the typically non-intuitive and opaque structure of high-dimensional embedding spaces. In this work, we reproduce the Vec2Text framework and evaluate it from two perspectives: (1) validating the original claims, and (2) extending the study through targeted experiments. First, we successfully replicate the original key results in both in-domain and out-of-domain settings, with only minor discrepancies arising due to missing artifacts, such as model checkpoints and dataset splits. Furthermore, we extend the study by conducting a parameter sensitivity analysis, evaluating the feasibility of reconstructing sensitive inputs (e.g., passwords), and exploring embedding quantization as a lightweight privacy defense. Our results show that Vec2Text is effective under ideal conditions, capable of reconstructing even password-like sequences that lack clear semantics. However, we identify key limitations, including its sensitivity to input sequence length. We also find that Gaussian noise and quantization techniques can mitigate the privacy risks posed by Vec2Text, with quantization offering a simpler and more widely applicable solution. Our findings emphasize the need for caution in using text embeddings and highlight the importance of further research into robust defense mechanisms for NLP systems. Our code and experiment results are available at https://github.com/dqmis/vec2text-repro.

    Full text in ACM Digital Library
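
The two defenses the study evaluates — Gaussian noise and embedding quantization — are simple to sketch. The uniform quantizer and the parameter choices below are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

def quantize(emb: np.ndarray, levels: int = 16) -> np.ndarray:
    """Uniform quantization to `levels` bins over the embedding's range —
    a lossy transform that destroys the exact coordinates an inversion
    attack like Vec2Text relies on (illustrative sketch)."""
    lo, hi = emb.min(), emb.max()
    step = (hi - lo) / (levels - 1)
    return lo + np.round((emb - lo) / step) * step

def add_noise(emb: np.ndarray, sigma: float = 0.01, seed: int = 0) -> np.ndarray:
    # Gaussian-noise defense: perturb the embedding before sharing it.
    return emb + np.random.default_rng(seed).normal(scale=sigma, size=emb.shape)

emb = np.random.default_rng(2).normal(size=32)
q, n = quantize(emb), add_noise(emb)

# Both defenses keep the embedding useful (high cosine similarity to the
# original) while perturbing the exact values inversion depends on.
cos = lambda a, b: float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print(round(cos(emb, q), 3), round(cos(emb, n), 3))
```

The appeal of quantization noted in the abstract is visible here: unlike noise injection, it is deterministic, needs no per-request randomness, and can double as a storage/bandwidth optimization.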

  • [IND] Scaling Retrieval for Web-Scale Recommenders: Lessons from Inverted Indexes to Embedding Search
    by Yuchin Juan, Jianqiang Shen, Shaobo Zhang, Qianqi Shen, Caleb Johnson, Luke Simon, Liangjie Hong, Wenjing Zhang

    Web-scale search and recommendation systems depend on efficient retrieval to manage massive datasets and user traffic. This paper chronicles our evolutionary path in building the retrieval layer at LinkedIn, progressing from a CPU-based inverted index system to a GPU-accelerated embedding-based retrieval system. Initially anchored by traditional term-based retrieval, we enhanced relevance and productivity through learning-to-retrieve approaches by generating mappings among inferred attributes. As these early efforts encountered limitations in inferring and matching attributes at scale, we transitioned to embedding-based retrieval for greater flexibility and performance, but found that existing infrastructure couldn’t support large-scale production needs. This led us to develop a GPU-based retrieval system designed for high performance, flexible modeling, and multi-objective business optimization. We present the infrastructure innovations, optimizations, and key lessons learned throughout this transition, offering practical insights for building scalable, flexible retrieval systems.

    Full text in ACM Digital Library

  • [IND] The Future is Sparse: Embedding Compression for Scalable Retrieval in Recommender Systems
    by Petr Kasalický, Martin Spišák, Vojtěch Vančura, Daniel Bohuněk, Rodrigo Alves, Pavel Kordík

    Industry-scale recommender systems face a core challenge: representing entities with high cardinality, such as users or items, using dense embeddings that must be accessible during both training and inference. However, as embedding sizes grow, memory constraints make storage and access increasingly difficult. We describe a lightweight, learnable embedding compression technique that projects dense embeddings into a high-dimensional, sparsely activated space. Designed for retrieval tasks, our method reduces memory requirements while preserving retrieval performance, enabling scalable deployment under strict resource constraints. Our results demonstrate that leveraging sparsity is a promising approach for improving the efficiency of large-scale recommenders. We release our code at https://github.com/recombee/CompresSAE.

    Full text in ACM Digital Library
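
The general recipe — project a dense embedding into a wider space and keep only the top-k activations — can be sketched as follows. The random projection and the (index, value) storage format are illustrative assumptions; in the paper the compression is learned:

```python
import numpy as np

def sparse_compress(emb: np.ndarray, proj: np.ndarray, k: int = 8):
    """Map a dense embedding into a wide, non-negative activation space and
    keep only the top-k entries, so each entity is stored as k
    (index, value) pairs instead of a full dense vector (illustrative)."""
    z = np.maximum(proj @ emb, 0.0)   # wide, ReLU-style activations
    idx = np.argsort(z)[-k:]          # indices of the k largest activations
    return idx, z[idx]

def sparse_dot(a, b) -> float:
    # Retrieval score between two sparse codes: dot product over shared indices.
    (ia, va), (ib, vb) = a, b
    common, ai, bi = np.intersect1d(ia, ib, return_indices=True)
    return float(va[ai] @ vb[bi])

rng = np.random.default_rng(3)
proj = rng.normal(size=(1024, 64))    # learned in practice; random here
item, user = rng.normal(size=64), rng.normal(size=64)
score = sparse_dot(sparse_compress(item, proj), sparse_compress(user, proj))
print(score >= 0.0)  # True: ReLU activations make all scores non-negative
```

The memory win comes from storing k index/value pairs per entity instead of the full dense vector, and sparse dot products also map naturally onto inverted-index retrieval infrastructure.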

  • [IND] You Say Search, I Say Recs: A Scalable Agentic Approach to Query Understanding and Exploratory Search at Spotify
    by Enrico Palumbo, Marcus Isaksson, Alexandre Tamborrino, Maria Movin, Catalin Dincu, Ali Vardasbi, Lev Nikeshkin, Oksana Gorobets, Anders Nyman, Poppy Newdick, Hugues Bouchard, Paul Bennett, Mounia Lalmas, Dani Doro, Christine Doig Cardet, Ziad Sultan

    On online content platforms, users often aim to explore the catalog and discover new, personalized content through exploratory searches—such as “new releases for me.” Traditional search systems, which prioritize lexical and semantic matching over personalized retrieval, have historically struggled to support this type of intent. In contrast, recommendation services that leverage user-item and item-item signals tend to be more effective for addressing exploratory queries. Agentic technologies offer a promising opportunity to enhance exploratory search by harnessing large language models (LLMs) to interpret complex query intents and route them to the most suitable downstream services. However, deploying such agentic systems at scale remains a significant challenge. In this paper, we present a scalable agentic approach to query understanding and exploratory search at Spotify. Our system combines an LLM router, post-training adaptation techniques, search and recommendation APIs, and specialized sub-agents to interpret user intent and deliver personalized results at scale. We outline the high-level system design and share key experimental results. By addressing the limitations of conventional search, our approach yields substantial improvements across several exploratory use cases, including discovering similar artists (+115%), broad podcast searches (+15%), new music releases (+91%), and broad music searches (+25%).

    Full text in ACM Digital Library
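
The routing pattern — classify query intent, then dispatch to the most suitable downstream service — can be sketched as follows. The keyword classifier and the handler names are hypothetical stand-ins for the paper's LLM router and production search/recommendation APIs:

```python
# Minimal sketch of intent routing; all names are hypothetical. In the
# paper's system the classifier is an LLM with post-training adaptation;
# a keyword stub stands in here so the routing logic is runnable.

EXPLORATORY_CUES = ("for me", "new releases", "similar", "discover")

def classify_intent(query: str) -> str:
    q = query.lower()
    return "exploratory" if any(cue in q for cue in EXPLORATORY_CUES) else "navigational"

def route(query: str) -> str:
    # Exploratory intents go to personalized recommendation sub-agents;
    # navigational ones fall back to lexical/semantic search.
    handlers = {
        "exploratory": lambda q: f"recs_api({q!r})",
        "navigational": lambda q: f"search_api({q!r})",
    }
    return handlers[classify_intent(query)](query)

print(route("new releases for me"))     # recs_api('new releases for me')
print(route("taylor swift anti-hero"))  # search_api('taylor swift anti-hero')
```

Keeping the router as a thin dispatch layer over existing search and recommendation APIs is what makes the approach scalable: the expensive LLM call decides *where* a query goes, while retrieval itself stays in the optimized production services.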



This event is supported by the Capital City of Prague