O Title: Maillard Sampling: Boltzmann Exploration Done Optimally for Interactive Machine Learning
O Speaker: Prof. Kwang-Sung Jun (University of Arizona)
O When: 7/14 (Thu) 11:00 AM
O Where: N1-102
At the heart of interactive machine learning is the ability to decide which action to take next so as to maximize information under given constraints. For example, a recommender system wants to suggest products that not only yield high click-through rates but also inform the system about the user’s preferences, so that it can serve the user better in the long run. For these problems (commonly referred to as ‘bandit’ problems), the PhD dissertation of Maillard (2013) proposed a lesser-known algorithm, which we call Maillard sampling (MS), that can be viewed as a correction to a popular heuristic called Boltzmann exploration. In this talk, we claim that MS is a strong competitor to Thompson sampling, the industry-standard algorithm. We will show that the performance guarantee of MS matches that of Thompson sampling and showcase practical benefits of MS, such as enabling computationally efficient offline evaluation, which gives it the potential to dethrone Thompson sampling in industry.
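For readers unfamiliar with the algorithm, the following is a minimal illustrative sketch, not the speaker's implementation. It assumes the commonly stated form of the MS rule for sub-Gaussian rewards: arm a is drawn with probability proportional to exp(-N_a * gap_a^2 / (2 * sigma^2)), where N_a is the number of pulls of arm a and gap_a is the difference between the best empirical mean and the empirical mean of arm a. The function names, the Gaussian reward simulation, and the default sigma are hypothetical choices made for illustration only.

```python
import math
import random


def maillard_probabilities(means, counts, sigma=1.0):
    """Return sampling probabilities for each arm.

    Assumed rule: p_a is proportional to exp(-N_a * gap_a^2 / (2 * sigma^2)),
    where gap_a is the best empirical mean minus arm a's empirical mean.
    """
    best = max(means)
    weights = [
        math.exp(-n * (best - m) ** 2 / (2.0 * sigma ** 2))
        for m, n in zip(means, counts)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def run_maillard_sampling(true_means, horizon, sigma=1.0, seed=0):
    """Simulate a Gaussian bandit (an assumption) and return pull counts."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(horizon):
        if t < k:
            # Pull every arm once so that each empirical mean is defined.
            arm = t
        else:
            means = [s / n for s, n in zip(sums, counts)]
            probs = maillard_probabilities(means, counts, sigma)
            arm = rng.choices(range(k), weights=probs)[0]
        reward = rng.gauss(true_means[arm], sigma)
        counts[arm] += 1
        sums[arm] += reward
    return counts


if __name__ == "__main__":
    print(run_maillard_sampling([0.9, 0.5, 0.1], horizon=2000))
```

Unlike plain Boltzmann exploration, the exponent here is scaled by each arm's pull count rather than by a single global temperature. Because the sampling probabilities are available in closed form, they can be logged alongside each action; this is presumably what makes the offline evaluation mentioned in the abstract convenient, for example via importance weighting.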