Alexander Terenin, “A short introduction to multi-armed bandits”

Abstract: Multi-armed bandits are a class of sequential decision problems that involve uncertainty. One of their defining characteristics is the explore-exploit tradeoff, which requires balancing the exploitation of information already known against trying different options to gather the new information needed for optimal decisions. In this tutorial, we introduce the problem setting and basic techniques of analysis. We conclude by showing how explore-exploit tradeoffs appear in more general settings, and how the ideas discussed can aid understanding of areas like reinforcement learning.
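The explore-exploit tradeoff described above can be illustrated with a minimal sketch, not taken from the talk itself: an epsilon-greedy agent on a Bernoulli bandit, which explores a random arm with small probability and otherwise exploits the arm with the best empirical mean. All names and parameters here are hypothetical.

```python
import random

def run_epsilon_greedy(true_means, n_rounds=10000, epsilon=0.1, seed=0):
    """Play n_rounds pulls: explore with probability epsilon, else exploit."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms          # number of pulls per arm
    estimates = [0.0] * n_arms     # empirical mean reward per arm
    total_reward = 0.0
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random to gather information.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm that currently looks best.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        # Bernoulli reward with the arm's (unknown to the agent) true mean.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the empirical mean for the pulled arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, total_reward

estimates, total = run_epsilon_greedy([0.3, 0.5, 0.7])
```

With enough rounds, the empirical estimates concentrate near the true means and the agent pulls the best arm most of the time; shrinking epsilon trades less exploration for more exploitation.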