- IN: Imbalanced Data Sparsity as Source of Unfair Bias in Collaborative Filtering
by Aditya Joshi (SEEK, Australia), Chin Lin Wong (SEEK, Malaysia), Diego Marinho de Oliveira (SEEK, Australia), Farhad Zafari (SEEK, Australia), Fernando Mourão (SEEK, Australia), Sabir Ribas (SEEK, Australia), Saumya Pandey (SEEK, Australia)
Past work has extensively demonstrated that data sparsity critically impacts the accuracy of collaborative filtering (CF). The proposed talk revisits the relationship between data sparsity and CF from a new perspective, showing that data sparsity also impacts the fairness of recommendations. In particular, data sparsity might lead to unfair bias in domains where the volume of activity strongly correlates with personal characteristics that are protected by law (i.e., protected attributes). This concern is critical for recommender systems (RSs) deployed in domains such as recruitment, where RSs have been reported to automate or facilitate discriminatory behaviour. Our work at SEEK deals with recommender algorithms that recommend jobs to candidates via SEEK’s multiple channels. While this talk focuses on our perspective of the problem in the job recommendation domain, the discussion is relevant to many other domains where recommenders potentially have a social or economic impact on the lives of individuals and groups.
In this talk, we refer to a particular scenario: imbalanced data sparsity. It corresponds to situations where the data is sparser for certain sub-groups of users, resulting in a non-homogeneous distribution over user sub-groups. In practice, data is not sparse at random, and latent factors might cause or explain the underlying distribution of missing data. This motivates the central question of this talk: What happens when these causal factors correlate with protected attributes, thereby reinforcing societal biases?
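To make the notion of imbalanced data sparsity concrete, here is a minimal sketch (not part of the talk) that measures interaction density separately for two user sub-groups; the column names, the `group` attribute, and all numbers are purely illustrative assumptions.

```python
import pandas as pd

# Hypothetical interaction log: one row per (user, job) event. The column
# names and the protected "group" attribute are illustrative assumptions.
interactions = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 3, 3, 4],
    "job_id":  [10, 11, 10, 12, 13, 14, 10],
    "group":   ["A", "A", "A", "B", "B", "B", "B"],
})
n_items = interactions["job_id"].nunique()

# Per-group density: observed interactions divided by the number of possible
# (user, item) pairs for that group's users. A large gap between groups is a
# sign of imbalanced data sparsity.
densities = {}
for group, df in interactions.groupby("group"):
    n_users = df["user_id"].nunique()
    densities[group] = len(df) / (n_users * n_items)
print(densities)  # here: {'A': 0.3, 'B': 0.4}
```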
For the purpose of discussion, we have organised this talk into three parts:
Part I: Why be responsible by design?
Part II: How can unfair bias emerge from CF as a result of data sparsity?
Part III: How can we mitigate unfair bias emerging from imbalanced data sparsity?
Full text in ACM Digital Library
- PA: Countering Popularity Bias by Regularizing Score Differences
by Wondo Rhee (Seoul National University, Korea, Republic of), Sung Min Cho (Seoul National University, Korea, Republic of), Bongwon Suh (Seoul National University, Korea, Republic of)
Recommendation systems often suffer from popularity bias. The issue may arise because the data inherently exhibits a long-tail distribution in item popularity (data bias); on the other hand, recommendation systems can give unfairly higher recommendation scores to popular items even among items a user liked equally, resulting in over-recommendation of popular items (model bias). Whereas the data bias could be alleviated with proper data collection, the model bias needs to be addressed with careful modeling. Our work focuses on the latter. In this study, we propose a novel method to reduce the model bias while maintaining accuracy by directly regularizing the recommendation scores to be equal across items a user preferred. Akin to contrastive learning, we extend the widely used pairwise loss (BPR loss), which maximizes the score differences between preferred and unpreferred items, with a regularization term that minimizes the score differences within preferred and within unpreferred items, respectively, thereby achieving both strong debiasing and high accuracy with no additional training. As a result, our approach avoids the accuracy-debiasing tradeoff often suffered by conventional debiasing methods that collectively adjust the recommendation scores across preferred and unpreferred items. To evaluate the effectiveness of the proposed method, we conducted quantitative and qualitative analyses on a synthetic dataset as well as four benchmark datasets, applying our method to four common recommendation models. The results showed that our method outperformed earlier debiasing methods in terms of accuracy, debiasing performance, and generalizability. We hope that our method helps users enjoy diverse recommendations that promote serendipitous findings. Code is available.
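As a rough illustration of the idea described above, the following PyTorch-style sketch extends a BPR loss with a within-group score-difference regularizer. The function name, tensor shapes, the use of score variance as the regularizer, and the weight `gamma` are assumptions for illustration, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def regularized_bpr_loss(pos_scores: torch.Tensor,
                         neg_scores: torch.Tensor,
                         gamma: float = 0.1) -> torch.Tensor:
    """BPR loss plus a within-group score-difference regularizer.

    pos_scores: scores of items the user interacted with, shape (B, P)
    neg_scores: scores of sampled negative items, shape (B, N)
    gamma: regularization weight (hypothetical default)
    """
    # Standard BPR: push every positive score above every negative score.
    diff = pos_scores.unsqueeze(2) - neg_scores.unsqueeze(1)  # (B, P, N)
    bpr = -F.logsigmoid(diff).mean()

    # Regularizer: keep scores similar *within* the positives and *within*
    # the negatives, so popular positives are not scored far above the rest.
    reg = pos_scores.var(dim=1, unbiased=False).mean() \
        + neg_scores.var(dim=1, unbiased=False).mean()
    return bpr + gamma * reg
```

With `gamma` set to zero this reduces to plain BPR, so the regularization strength can be tuned on a validation set against whichever popularity-bias metric is of interest.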
Full text in ACM Digital Library
- PA: Toward Fair Federated Recommendation Learning: Characterizing the Inter-Dependence of System and Data Heterogeneity
by Kiwan Maeng (Meta, United States; Pennsylvania State University, United States), Haiyu Lu (Meta, United States), Luca Melis (Meta, United States), John Nguyen (Meta, United States), Mike Rabbat (Meta, United States), Carole-Jean Wu (Meta, United States)
Federated learning (FL) is an effective mechanism for preserving data privacy in recommender systems because machine learning model training runs on-device. While prior FL optimizations tackled the data and system heterogeneity challenges faced by FL, they assume the two are independent of each other. This fundamental assumption does not reflect real-world, large-scale recommender systems: data and system heterogeneity are tightly intertwined. This paper takes a data-driven approach to show the inter-dependence of data and system heterogeneity in real-world data and quantifies its impact on overall model quality and fairness.
We design a framework, RF^2, to model the inter-dependence and evaluate its impact on state-of-the-art model optimization techniques for federated recommendation tasks. We demonstrate that the impact on fairness can be severe under realistic heterogeneity scenarios, by up to 15.8–41x compared to the simple setup assumed in most (if not all) prior work. This means that when realistic system-induced data heterogeneity is not properly modeled, the fairness impact of an optimization can be downplayed by up to 41x. These results show that modeling realistic system-induced data heterogeneity is essential to achieving fair federated recommendation learning. We plan to open-source RF^2 to enable future design and evaluation of FL innovations.
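To illustrate what system-induced data heterogeneity can look like, the toy simulation below (an assumption-laden sketch, not RF^2 itself) correlates device tier with both data volume and participation probability, and shows how the data actually seen in training drifts away from the population.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: device tier correlates with both how much data a client
# holds and how often it completes a training round. All numbers are assumptions.
n_clients = 10_000
tier = rng.choice(["low_end", "high_end"], size=n_clients, p=[0.6, 0.4])
data_size = np.where(tier == "low_end",
                     rng.poisson(40, n_clients),   # low-end users: more data (assumed)
                     rng.poisson(20, n_clients))
participation_p = np.where(tier == "low_end", 0.05, 0.30)  # slower devices drop out more

selected = rng.random(n_clients) < participation_p

share_pop = data_size[tier == "low_end"].sum() / data_size.sum()
share_train = data_size[selected & (tier == "low_end")].sum() / data_size[selected].sum()
print(f"low-end share of all data:      {share_pop:.2f}")
print(f"low-end share of training data: {share_train:.2f}")
```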
Full text in ACM Digital Library
- PA: Fairness-aware Federated Matrix Factorization
by Shuchang Liu (Rutgers University, United States), Yingqiang Ge (Rutgers University, United States), Shuyuan Xu (Rutgers University, United States), Yongfeng Zhang (Rutgers University, United States), Amelie Marian (Rutgers University, United States)
Achieving fairness over different user groups in recommender systems is an important problem.
The majority of existing works achieve fairness through constrained optimization that combines the recommendation loss and the fairness constraint.
To achieve fairness, the algorithm usually needs to know each user’s group affiliation, such as gender or race. However, this group information is usually sensitive and requires protection.
In this work, we seek a federated learning solution for the fair recommendation problem and identify the main challenge as an algorithmic conflict between the global fairness objective and the localized federated optimization process.
On one hand, the fairness objective usually requires access to all users’ group information.
On the other hand, federated learning systems confine personal data to each user’s local space.
As a resolution, we propose to communicate group statistics during federated optimization and use differential privacy techniques to avoid exposure of users’ group information when users require privacy protection.
We illustrate the theoretical bounds of the noisy signal used in our method, which aims to enforce privacy without overwhelming the aggregated statistics (see the sketch below).
Empirical results show that federated learning may naturally improve user group fairness and the proposed framework can effectively control this fairness with low communication overheads.
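The sketch below shows one way the privatized group statistics mentioned above could be communicated: each client perturbs a one-hot group indicator with Laplace noise before the server aggregates the reports into group counts. The mechanism, sensitivity value, and epsilon are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_group_report(group_id: int, n_groups: int, epsilon: float) -> np.ndarray:
    """Client-side: report a one-hot group indicator perturbed with Laplace
    noise (the L1 sensitivity of a one-hot vector is 2 when the group changes).
    The mechanism and epsilon here are illustrative assumptions."""
    one_hot = np.zeros(n_groups)
    one_hot[group_id] = 1.0
    return one_hot + rng.laplace(scale=2.0 / epsilon, size=n_groups)

# Server-side: aggregate noisy reports into approximate group counts that can
# feed a fairness term without any client revealing its group directly.
reports = [noisy_group_report(g, n_groups=2, epsilon=1.0)
           for g in rng.integers(0, 2, size=1000)]
print(np.sum(reports, axis=0))  # noisy estimates of the two group sizes
```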
Full text in ACM Digital Library
- PA: Dynamic Global Sensitivity for Differentially Private Contextual Bandits
by Huazheng Wang (Princeton University, United States), David B. Zhao (University of Virginia, United States), Hongning Wang (University of Virginia, United States)
Bandit algorithms have become a reference solution for interactive recommendation. However, as such algorithms directly interact with users to improve recommendations, serious privacy concerns have been raised regarding their practical use. In this work, we propose a differentially private linear contextual bandit algorithm that uses a tree-based mechanism to add Laplace or Gaussian noise to model parameters. Our key insight is that as the model converges during online updates, the global sensitivity of its parameters shrinks over time (hence the name dynamic global sensitivity). Compared with existing solutions, our dynamic global sensitivity analysis allows us to inject less noise to obtain $(\epsilon, \delta)$-differential privacy, reducing the added regret caused by noise injection. We provide a rigorous theoretical analysis of the amount of noise added via dynamic global sensitivity and the corresponding upper regret bound of our proposed algorithm.
Experimental results on both synthetic and real-world datasets confirmed the algorithm’s advantage over existing solutions (see the simplified sketch below).
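As a rough illustration only, the toy loop below mimics the shrinking-sensitivity intuition with a ridge-regression update and noise whose scale is tied to a crude sensitivity proxy; it is not the paper's algorithm, its tree-based mechanism, or its sensitivity analysis, and all constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 5, 1000
lam, sigma_base = 1.0, 1.0

A = lam * np.eye(d)            # accumulated covariance of observed contexts
b = np.zeros(d)
theta_true = rng.normal(size=d)

for t in range(1, T + 1):
    x = rng.normal(size=d)                     # context (stand-in for the chosen arm)
    r = x @ theta_true + 0.1 * rng.normal()    # noisy reward
    A += np.outer(x, x)
    b += r * x
    theta_hat = np.linalg.solve(A, b)

    # Shrinking-sensitivity intuition (assumed stand-in): as A grows, a single
    # interaction perturbs theta_hat less, so less noise is needed for the same
    # privacy level. Here sensitivity is proxied by 1 / lambda_min(A).
    sensitivity = 1.0 / np.linalg.eigvalsh(A)[0]
    theta_private = theta_hat + rng.normal(scale=sigma_base * sensitivity, size=d)

print(f"noise scale after {T} rounds: {sigma_base * sensitivity:.4f}")
```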
Full text in ACM Digital Library
- IN: Challenges in Translating Research to Practice for Evaluating Fairness and Bias in Recommendation Systems
by Lex Beattie (Spotify, United States), Henriette Cramer (Spotify, United States), Dan Taber (Spotify, United States)
Calls to action to implement fairness and bias evaluation in industry systems are increasing rapidly. The research community has attempted to meet these demands by producing ethical principles and guidelines for AI, but few of these documents provide guidance on how to implement these principles in real-world settings. Without readily available standardized and practice-tested approaches for evaluating fairness in recommendation systems, industry practitioners, who are often not experts, may easily run into challenges or implement metrics that are poorly suited to their specific applications. When evaluating recommendations, practitioners are well aware that they should evaluate their systems for unintended algorithmic harms, but the most important, and still unanswered, question is: how? In this talk, we will present practical challenges we encountered in addressing algorithmic responsibility in recommendation systems, which also present research opportunities for the RecSys community. This talk will focus on the steps that need to happen before bias mitigation can even begin.
Full text in ACM Digital Library