Learning to Rank for Information Retrieval from User Interactions
Katja Hofmann, Shimon Whiteson, Anne Schuth and Maarten de Rijke
In this article we give an overview of our recent work on online learning to rank for information retrieval (IR). This work addresses IR from a reinforcement learning (RL) point of view, with the aim of enabling systems that can learn directly from interactions with their users. Learning directly from user interactions is difficult for several reasons. First, user interactions are hard to interpret as feedback for learning because they are usually biased and noisy. Second, the system can only observe feedback on actions (e.g., rankers, documents) actually shown to users, which results in an exploration–exploitation challenge. Third, the amount of feedback, and therefore the quality of learning, is limited by the number of user interactions, so it is important to use the observed data as effectively as possible. Here, we discuss our work on interpreting user feedback using probabilistic interleaved comparisons, and on learning to rank from noisy, relative feedback.
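To make the idea of interleaved comparisons concrete, the sketch below implements team-draft interleaving, a simpler relative of the probabilistic interleaving discussed in the article (the exact probabilistic method differs; this is an illustrative assumption, not the authors' algorithm). Two rankers alternately "draft" documents into a single result list shown to the user, and clicks are credited to whichever ranker contributed the clicked document, yielding a relative preference signal.

```python
import random

def team_draft_interleave(ranking_a, ranking_b, k, rng):
    """Interleave two rankings via team-draft: each round, a coin flip
    decides which ranker picks first; each ranker then contributes its
    highest-ranked document not already in the interleaved list."""
    interleaved, team_a, team_b = [], set(), set()
    while len(interleaved) < k:
        before = len(interleaved)
        teams = [(ranking_a, team_a), (ranking_b, team_b)]
        if rng.random() < 0.5:
            teams.reverse()  # randomize which ranker drafts first
        for ranking, team in teams:
            doc = next((d for d in ranking if d not in interleaved), None)
            if doc is not None and len(interleaved) < k:
                interleaved.append(doc)
                team.add(doc)
        if len(interleaved) == before:
            break  # both rankings exhausted
    return interleaved, team_a, team_b

def infer_preference(clicked_docs, team_a, team_b):
    """Credit each click to the team that contributed the clicked
    document; positive means ranker A is preferred, negative ranker B."""
    wins_a = sum(1 for d in clicked_docs if d in team_a)
    wins_b = sum(1 for d in clicked_docs if d in team_b)
    return wins_a - wins_b
```

Because credit assignment is relative (which ranker won this impression) rather than absolute (how relevant each document is), this style of comparison is more robust to the click bias and noise that the abstract highlights; aggregating preferences over many impressions averages the noise out.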