Workshop on Bandit and Reinforcement Learning from User Interactions

State-of-the-art recommender systems are notoriously hard to design and improve because of their interactive and dynamic nature: recommendation is a multi-step decision-making process in which a stream of interactions unfolds between the user and the system. Leveraging the reward signals from these interactions to build a scalable, performant recommendation model is a key challenge. Traditionally, the interactions are treated as independent to keep the problem tractable, but further progress will require models that account for the delayed effects of each recommendation and reason about, and plan for, longer-term user satisfaction. To this end, our workshop invites contributions that enable recommender systems to adapt effectively to diverse forms of user feedback and to optimize the quality of each user’s long-term experience.
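As a concrete illustration of the "independent interactions" view mentioned above, here is a minimal toy sketch (not from the workshop itself; all names and click probabilities are invented for illustration) of an epsilon-greedy bandit that learns which item to recommend from click/no-click feedback, treating each interaction as an independent decision rather than planning over a longer horizon:

```python
import random

class EpsilonGreedyRecommender:
    """Toy bandit: each interaction is treated independently,
    in contrast to the longer-horizon RL view the workshop targets."""

    def __init__(self, n_items, epsilon=0.1, seed=0):
        self.rng = random.Random(seed)
        self.epsilon = epsilon
        self.clicks = [0] * n_items   # observed clicks per item
        self.shows = [0] * n_items    # impressions per item

    def recommend(self):
        # Explore a random item with probability epsilon,
        # otherwise exploit the best empirical click-through rate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.shows))
        rates = [c / s if s else 0.0 for c, s in zip(self.clicks, self.shows)]
        return max(range(len(rates)), key=rates.__getitem__)

    def update(self, item, clicked):
        self.shows[item] += 1
        self.clicks[item] += int(clicked)

# Simulated users with hypothetical per-item click probabilities.
true_ctr = [0.05, 0.10, 0.30]
agent = EpsilonGreedyRecommender(n_items=3, seed=42)
for _ in range(5000):
    item = agent.recommend()
    clicked = agent.rng.random() < true_ctr[item]
    agent.update(item, clicked)
```

The sketch optimizes immediate clicks only; capturing the delayed effects of a recommendation would require a stateful, sequential formulation (e.g., an MDP), which is exactly the gap the workshop addresses.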

Organizers:
  • Thorsten Joachims, Cornell University
  • Adith Swaminathan, Deep Learning Technology Center, Microsoft Research
  • Maria Dimakopoulou, Netflix R&D
  • Yves Raimond, Netflix R&D
  • Olivier Koch, Criteo R&D
  • Flavian Vasile, Criteo R&D


Saturday, Sept 26, 2020