RecSys 2025 - Session 6 - RecSys

Session 6: Recommender Systems in the Wild: Domains and Society

Date: Wednesday September 24, 14:00–15:20 (GMT+2)
Session Chair: Julia Neidhardt

[REPR] A Reproducibility Study of Product-side Fairness in Bundle Recommendation
by Huy-Son Nguyen, Yuanna Liu, Masoud Mansoury, Mohammad Aliannejadi, Alan Hanjalic, Maarten de Rijke

Recommender systems are known to exhibit fairness issues, particularly on the product side, where products and their associated suppliers receive unequal exposure in recommended results. While this problem has been widely studied in traditional recommendation settings, its implications for bundle recommendation (BR) remain largely unexplored. This emerging task introduces additional complexity: recommendations are generated at the bundle level, yet user satisfaction and product (or supplier) exposure depend on both the bundle and the individual items it contains. Existing fairness frameworks and metrics designed for traditional recommender systems may not directly translate to this multi-layered setting. In this paper, we conduct a comprehensive reproducibility study of product-side fairness in BR across three real-world datasets using four state-of-the-art BR methods. We analyze exposure disparities at both the bundle and item levels using multiple fairness metrics, uncovering important patterns. Our results show that exposure patterns differ notably between bundles and items, revealing the need for fairness interventions that go beyond bundle-level assumptions. We also find that fairness assessments vary considerably depending on the metric used, reinforcing the need for multi-faceted evaluation. Furthermore, user behavior plays a critical role: when users interact more frequently with bundles than with individual items, BR systems tend to yield fairer exposure distributions across both levels. Overall, our findings offer actionable insights for building fairer bundle recommender systems and establish a vital foundation for future research in this emerging domain.
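
The contrast between bundle-level and item-level exposure described above can be illustrated with a small sketch. It is not the study's evaluation code: the abstract does not name its metrics, so the Gini coefficient, the unweighted exposure count (one unit per appearance, ignoring rank position), and the names `gini` and `exposure_gini` are all assumptions made for illustration.

```python
from collections import Counter


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def exposure_gini(rec_lists, bundle_items, all_bundles, all_items):
    """Return (bundle-level Gini, item-level Gini) of exposure.

    Assumption: every recommended bundle gives one unit of exposure to
    itself and to each item it contains, regardless of rank position.
    """
    bundle_exp, item_exp = Counter(), Counter()
    for recs in rec_lists:
        for bundle in recs:
            bundle_exp[bundle] += 1
            for item in bundle_items[bundle]:
                item_exp[item] += 1
    return (
        gini([bundle_exp[b] for b in all_bundles]),
        gini([item_exp[i] for i in all_items]),
    )


# Hypothetical toy catalogue: item i1 is shared by two bundles.
bundle_items = {"b1": ["i1", "i2"], "b2": ["i1", "i3"], "b3": ["i4"]}
rec_lists = [["b1", "b2"], ["b1", "b2"]]
bundle_gini, item_gini = exposure_gini(
    rec_lists, bundle_items, ["b1", "b2", "b3"], ["i1", "i2", "i3", "i4"]
)
```

In this toy case the two levels already disagree, because an item shared by several bundles accumulates exposure from each of them.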

[RES] Affect-aware Cross-Domain Recommendation for Art Therapy via Music Preference Elicitation
by Bereket A. Yilma, Luis A. Leiva

Art Therapy (AT) is an established practice that facilitates emotional processing and recovery through creative expression. Recently, Visual Art Recommender Systems (VA RecSys) have emerged to support AT, demonstrating their potential by personalizing therapeutic artwork recommendations. Nonetheless, current VA RecSys rely on visual stimuli for user modeling, limiting their ability to capture the full spectrum of emotional responses during preference elicitation. Previous studies have shown that music stimuli elicit unique affective reflections, presenting an opportunity for cross-domain recommendation (CDR) to enhance personalization in AT. Since CDR has not yet been explored in this context, we propose a family of CDR methods for AT based on music-driven preference elicitation. A large-scale study with 200 users demonstrates the efficacy of music-driven preference elicitation, outperforming the classic visual-only elicitation approach. Our source code, data, and models are available at https://github.com/ArtAICare/Affect-aware-CDR.

[RES] Breaking Knowledge Boundaries: Cognitive Distillation-enhanced Cross-Behavior Course Recommendation Model
by Ruoyu Li, Yangtao Zhou, Chenzhang Li, Hua Chu, Jianan Li, Yuhan Bian

Online Course Recommendation (CR) stands as a promising educational strategy within online education platforms, with the goal of providing personalized learning experiences for learners and enhancing their learning efficiency. Existing CR methods focus on modeling learners’ learning needs from their historical course interactions by adopting general recommendation techniques, but fail to consider the shifts in course preferences caused by cognitive states. While Cognitive Diagnosis (CD) techniques are adept at tracking the evolution of cognitive states by mining learner-exercise interactions and can benefit the CR task, it is non-trivial to integrate CD and CR properly due to several challenges, including accurate diagnosis, divergent task objectives, and inconsistent data magnitude. To address these challenges, we propose a Cognitive Distillation-enhanced Cross-Behavior Course Recommendation model (C3Rec), which aims to transfer the knowledge of learners’ cognitive states to enhance the CR task. Specifically, for accurate diagnosis, we introduce a dual-granularity cognitive diagnosis module to capture learner representations at both coarse and fine granularities, thereby achieving a comprehensive construction of learners’ cognitive states. For divergent task objectives, we design a cross-behavior course recommendation module to jointly profile the dynamic course preferences from two temporally interleaved learning behaviors, achieving seamless semantic alignment between the two tasks. For inconsistent data magnitude, we introduce a triple-stage distillation mechanism to exploit cognitive state features as prior knowledge, enhancing the CR task by further profiling learners’ course preferences. Experimental comparisons with multiple state-of-the-art methods on two real-world educational datasets demonstrate the effectiveness of our model.

[IND] Cross-Batch Aggregation for Streaming Learning from Label Proportions in Industrial-Scale Recommendation Systems
by Jonathan Valverde, Tiansheng Yao, Xiang Li, Yuan Gao, Yin Zhang, Andrew Evdokimov, Adam Kraft, Samuel Ieong, Jerry Zhang, Ed H. Chi, Derek Zhiyuan Cheng, Ruoxi Wang

Recent controls over user data have diluted user signals essential to train industrial recommendation systems, replacing traditional event-level labels with aggregated item-level labels. Fitting these noisy aggregates into the event-level paradigm used by industrial recommendation systems causes models to be biased and miscalibrated, hurting critical business metrics. Learning from Label Proportions (LLP), a framework where instance-level prediction models are trained from aggregated signals, offers a principled solution to this problem, as long as all samples from an aggregate are present within the same training batch. Unfortunately, industry-scale recommender systems impose infrastructure constraints that violate this critical assumption because (1) they are trained in a sequential streaming framework that spreads aggregates across batches, (2) aggregates often exceed the size of a single batch, and (3) label noise makes it difficult to identify the time boundaries that correspond to the aggregated label. To address these issues, we propose a novel technique called Cross-Batch Aggregate (XBA) Loss to adapt LLP to the streaming setting. We design the loss to have a gradient that mimics the true aggregated loss gradient, approximating the distribution of the aggregate by using cumulative statistics across each aggregate. This enables (1) optimizing for model calibration and (2) learning a conversion model from the aggregate signals. We have deployed this technique to a Google Ads system impacted by conversion signal loss due to privacy constraints, delivering significant improvements on model calibration (48.8% reduction in online bias), advertiser value, and business metrics. Our key contribution is the extension of LLP to the streaming setting, providing a practical solution that bridges the gap between LLP research and industrial applications.
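
The bookkeeping problem behind the streaming setting can be sketched as follows. This is not the XBA loss, whose exact form the abstract does not give; it only illustrates, under assumed names (`CrossBatchStats`, `calibration_gap`), how per-aggregate statistics might be accumulated when an aggregate's samples arrive spread over several batches.

```python
from collections import defaultdict


class CrossBatchStats:
    """Running per-aggregate prediction statistics across streaming batches.

    Hypothetical helper for illustration; it tracks cumulative sums only
    and computes no loss or gradient.
    """

    def __init__(self):
        self.pred_sum = defaultdict(float)
        self.count = defaultdict(int)

    def update(self, batch):
        """batch: iterable of (aggregate_id, predicted_probability) pairs."""
        for agg_id, prob in batch:
            self.pred_sum[agg_id] += prob
            self.count[agg_id] += 1

    def calibration_gap(self, agg_id, label_proportion):
        """Mean prediction seen so far minus the aggregate's label proportion."""
        n = self.count[agg_id]
        if n == 0:
            return 0.0
        return self.pred_sum[agg_id] / n - label_proportion


stats = CrossBatchStats()
stats.update([("a", 0.2), ("b", 0.5)])  # first batch holds one sample of "a"
stats.update([("a", 0.4), ("a", 0.6)])  # the rest of "a" arrives later
gap_a = stats.calibration_gap("a", label_proportion=0.3)
```

A positive gap means the model over-predicts for that aggregate so far; a batch-local LLP loss could not compute this quantity, because no single batch contains all of the aggregate's samples.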

[IND] Emotion Vector-Based Fine-Tuning of Large Language Models for Age-Aware Teenage Book Recommendations
by Kate Hill, Yiu-Kai Ng, Joey Sherrill

Reading is a vital skill for teenagers. As the National Institute of Child Health and Human Development puts it, “Reading is the single most important skill necessary for a happy, productive, and successful life.” Yet, teens and their parents often struggle to find engaging books amid an overwhelming number of options. Moreover, existing book recommender systems rely heavily on user data such as profiles, reviews, or browsing behavior, information that is often restricted for minors due to privacy laws. To address this, we propose a privacy-conscious, teenage book recommender system that analyzes the emotional content of books using the NRC Emotion Intensity Lexicon (NRC-EIL). By extracting emotion vectors from book descriptions, we capture each book’s emotional tone and intensity. Our system then uses patterns in emotional preferences across age groups to recommend books that align with teen readers’ developmental and emotional needs. While LLMs can make content-based book recommendations for teenagers as well, they still face challenges like training bias, limited sensitivity to age-specific nuances, and lack of transparency. By integrating our emotion vector approach, we fine-tune LLMs to better detect age-relevant emotional cues, enhancing their ability to suggest meaningful and appropriate content for teen audiences. Experimental results confirm that fine-tuning LLMs with our emotion vector approach significantly enhances their ability to generate accurate, age-appropriate book recommendations for teenagers.
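
As a rough illustration of extracting an emotion vector from a book description, the sketch below uses a tiny made-up lexicon. The words, intensity values, emotion subset, and length normalization are all assumptions; the real NRC-EIL entries and the authors' aggregation scheme are not given in the abstract.

```python
# Hypothetical toy lexicon; NOT actual NRC-EIL entries or scores.
EMOTIONS = ("anger", "fear", "joy", "sadness")
TOY_LEXICON = {
    "haunted": {"fear": 0.8},
    "grief": {"sadness": 0.9},
    "triumph": {"joy": 0.8},
    "betrayal": {"anger": 0.7, "sadness": 0.5},
}


def emotion_vector(description, lexicon=TOY_LEXICON):
    """Sum word-level intensities per emotion, normalized by token count."""
    tokens = [t.strip(".,;:!?\"'()").lower() for t in description.split()]
    totals = {emotion: 0.0 for emotion in EMOTIONS}
    for token in tokens:
        for emotion, score in lexicon.get(token, {}).items():
            totals[emotion] += score
    n = max(len(tokens), 1)  # avoid dividing by zero on empty text
    return [totals[emotion] / n for emotion in EMOTIONS]


vec = emotion_vector("Grief and betrayal.")
```

Vectors like this could then be compared across age groups or attached to fine-tuning examples, which is the role the abstract assigns to them.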

[RES] Integrating Individual and Group Fairness for Recommender Systems through Social Choice
by Amanda Aird, Elena Štefancová, Anas Buhayh, Cassidy All, Martin Homola, Nicholas Mattei, Robin Burke

Fairness in recommender systems is a complex concept, involving multiple definitions, different parties for whom fairness is sought, and various scopes over which fairness might be measured. Researchers seeking fairness-aware systems have derived a variety of solutions, usually highly tailored to specific choices along each of these dimensions, and typically aimed at tackling a single fairness concern, i.e., a single definition for a specific stakeholder group and measurement scope. However, in practical contexts, there is a multiplicity of fairness concerns within a given recommendation application, and solutions limited to a single dimension are therefore less useful. We explore a general solution to recommender system fairness using social choice methods to integrate multiple heterogeneous definitions. In this paper, we extend group-fairness results from prior research to provider-side individual fairness, demonstrating in multiple datasets that both individual and group fairness objectives can be integrated and optimized jointly. We identify both synergies and tensions among different objectives, with individual fairness correlated with group fairness for some groups and anti-correlated for others.

[RES] Mapping Stakeholder Needs to Multi-Sided Fairness in Candidate Recommendation for Algorithmic Hiring
by Mesut Kaya, Toine Bogers

Even before the enactment of the EU AI Act, candidate or job recommendation for algorithmic hiring (semi-automatically matching CVs to job postings) was used as an example of a high-risk application where unfair treatment could result in serious harms to job seekers. Recommending candidates to jobs or jobs to candidates, however, is also a fitting example of a multi-stakeholder recommendation problem. In such multi-stakeholder systems, the end user is not the only party whose interests should be considered when generating recommendations. In addition to job seekers, other parties, such as recruiters, the organizations behind the job postings, and the recruitment agency itself, also have a stake and deserve to have their perspectives included in the design of relevant fairness metrics. Nevertheless, past analyses of fairness in algorithmic hiring have been restricted to single-side fairness, ignoring the perspectives of the other stakeholders. In this paper, we address this gap and present a multi-stakeholder approach to fairness in a candidate recommender system that recommends relevant candidate CVs to human recruiters in a human-in-the-loop algorithmic hiring scenario. We conducted semi-structured interviews with 40 different stakeholders (job seekers, companies, recruiters, and other job portal employees). We used these interviews to explore their lived experiences of unfairness in hiring and to co-design definitions of fairness as well as metrics that might capture these experiences. Finally, we attempt to reconcile and map these different (and sometimes conflicting) perspectives and definitions to existing (categories of) fairness metrics that are relevant for our candidate recommendation scenario.

[RES] You Don’t Bring Me Flowers: Mitigating Unwanted Recommendations Through Conformal Risk Control
by Giovanni De Toni, Erasmo Purificato, Emilia Gomez, Andrea Passerini, Bruno Lepri, Cristian Consonni

Recommenders are significantly shaping online information consumption. While effective at personalizing content, these systems increasingly face criticism for propagating irrelevant, unwanted, and even harmful recommendations. Such content degrades user satisfaction and contributes to significant societal issues, including misinformation, radicalization, and erosion of user trust. Although platforms offer mechanisms to mitigate exposure to undesired content, these mechanisms are often insufficiently effective and slow to adapt to users’ feedback. This paper introduces an intuitive, model-agnostic, and distribution-free method that uses conformal risk control to provably bound unwanted content in personalized recommendations by leveraging simple binary feedback on items. We also address a limitation of traditional conformal risk control approaches, i.e., the fact that the recommender can provide a smaller set of recommended items, by leveraging implicit feedback on consumed items to expand the recommendation set while ensuring robust risk mitigation. Our experimental evaluation on data coming from a popular online video-sharing platform demonstrates that our approach ensures an effective and controllable reduction of unwanted recommendations with minimal effort. The source code is available here: https://github.com/geektoni/mitigating-harm-recsys.
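
For readers unfamiliar with conformal risk control, a minimal sketch of the generic calibration step is given below. It is not the authors' implementation (see their repository for that). The filtering rule, the loss definition, and the names `crc_threshold`, `cal_users`, and `grid` are assumptions. The sketch assumes each candidate item has a predicted "unwanted" score and that binary feedback is available on a calibration set of users.

```python
def crc_threshold(cal_users, alpha, k, grid):
    """Pick the largest score threshold whose adjusted empirical risk is <= alpha.

    cal_users: one list per calibration user of (score, is_unwanted) pairs
               for that user's k candidate items.
    An item is kept only if score < threshold. The per-user loss is the
    share of the k slots filled by kept unwanted items, so it lies in
    [0, 1] and can only grow as the threshold grows.
    """
    n = len(cal_users)
    best = None
    for t in sorted(grid):
        losses = [
            sum(1 for score, unwanted in items if score < t and unwanted) / k
            for items in cal_users
        ]
        risk = sum(losses) / n
        # Finite-sample adjusted risk for a loss bounded by 1.
        if (n / (n + 1)) * risk + 1 / (n + 1) <= alpha:
            best = t
        else:
            break  # the risk is monotone, so larger thresholds also fail
    return best


# Hypothetical calibration data: 4 users, 2 candidate items each.
cal_users = [[(0.1, False), (0.9, True)] for _ in range(4)]
chosen = crc_threshold(cal_users, alpha=0.25, k=2, grid=[0.5, 1.0])
```

Filtering by the chosen threshold can shrink the recommendation set, which is the limitation the abstract says the authors address by using implicit feedback to expand the set again. A return value of `None` means no threshold on the grid meets the target risk with the available calibration data.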

RecSys 2025 (Prague)
