IA318 -- Reinforcement Learning
Telecom ParisTech (2021 - 2022)
Same as last year!
GoodNotes slides available here

IA318 -- Reinforcement Learning
Telecom ParisTech (2020 - 2021)
I am teaching three classes in the Reinforcement Learning module at Telecom ParisTech (also part of the Data AI master's program with Polytechnique), directed by Prof. Thomas Bonald:
Introduction to Multi-Armed Bandits
Contextual Linear Bandits
Monte Carlo Tree Search and introduction to planning
This year I taught online, like everyone these days, and I experimented with handwritten notes using GoodNotes on iPad. The "clean" versions of the notes are below.



Summer School HI! Paris 2021
I will give a tutorial on sequential decision making and present motivating problems in Reinforcement Learning and Marketing. I will introduce the multi-armed bandit model as a way to pose the statistical problem of exploration versus exploitation and show how Thompson Sampling provides an elegant and simple solution. The full notes will be posted here shortly after the class, but you can already find a preview below.
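To give a flavour of the idea, here is a minimal sketch of Thompson Sampling on a Bernoulli bandit. It is not taken from the tutorial notes: the function name, the uniform Beta(1, 1) prior, the arm means and the horizon are all illustrative assumptions. At each round we draw one sample per arm from its Beta posterior, play the arm whose sample is largest, and update that arm's posterior with the observed reward.

```python
import numpy as np


def thompson_sampling_bernoulli(true_means, horizon, seed=0):
    """Beta-Bernoulli Thompson Sampling (illustrative sketch).

    true_means: success probability of each arm (unknown to the learner,
                used here only to simulate rewards).
    Returns the number of times each arm was pulled.
    """
    rng = np.random.default_rng(seed)
    n_arms = len(true_means)
    # Assumed prior: Beta(1, 1), i.e. uniform, on every arm.
    alpha = np.ones(n_arms)
    beta = np.ones(n_arms)
    pulls = np.zeros(n_arms, dtype=int)

    for _ in range(horizon):
        # One posterior sample per arm, then act greedily on the samples.
        samples = rng.beta(alpha, beta)
        arm = int(np.argmax(samples))
        # Simulated Bernoulli reward from the chosen arm.
        reward = float(rng.random() < true_means[arm])
        # Conjugate update of the Beta posterior.
        alpha[arm] += reward
        beta[arm] += 1.0 - reward
        pulls[arm] += 1
    return pulls


# Hypothetical three-armed instance.
pulls = thompson_sampling_bernoulli([0.2, 0.5, 0.8], horizon=2000)
print(pulls)
```

Exploration comes for free from the posterior randomness: arms that have been pulled rarely have wide posteriors and are still occasionally sampled high, while clearly bad arms are played less and less.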
Practical Session: In this lab session, we will learn how to implement Thompson Sampling for linear bandits. The Colab is here :)
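The Colab itself is not reproduced here. As a rough idea of what such an implementation can look like, the sketch below runs Thompson Sampling on a linear bandit with a finite set of arms. It assumes a Gaussian prior N(0, prior_var * I) on the parameter and Gaussian noise with known standard deviation; the function name, arguments and test instance are hypothetical and may differ from the lab.

```python
import numpy as np


def linear_thompson_sampling(theta_star, arms, horizon,
                             noise_std=0.1, prior_var=1.0, seed=0):
    """Thompson Sampling for a linear bandit (illustrative sketch).

    theta_star: true parameter, used only to simulate rewards.
    arms: array of shape (n_arms, d), one feature vector per arm.
    Returns the final posterior mean and the cumulative pseudo-regret.
    """
    rng = np.random.default_rng(seed)
    d = arms.shape[1]
    precision = np.eye(d) / prior_var   # posterior precision matrix
    b = np.zeros(d)                     # sum of x * reward / noise_std**2
    best_value = np.max(arms @ theta_star)
    regret = 0.0

    for _ in range(horizon):
        # Posterior is N(mean, precision^{-1}); sample it via Cholesky.
        mean = np.linalg.solve(precision, b)
        chol = np.linalg.cholesky(precision)
        theta_sample = mean + np.linalg.solve(chol.T, rng.standard_normal(d))
        # Play the arm that is best under the sampled parameter.
        x = arms[int(np.argmax(arms @ theta_sample))]
        reward = x @ theta_star + noise_std * rng.standard_normal()
        # Bayesian linear regression update.
        precision += np.outer(x, x) / noise_std**2
        b += x * reward / noise_std**2
        regret += best_value - x @ theta_star

    return np.linalg.solve(precision, b), regret


# Hypothetical instance: three arms given by the canonical basis of R^3.
theta_star = np.array([0.1, 0.5, 0.9])
arms = np.eye(3)
theta_hat, regret = linear_thompson_sampling(theta_star, arms, horizon=500)
print(theta_hat, regret)
```

Sampling through the Cholesky factor of the precision matrix avoids forming the covariance explicitly; with L L^T equal to the precision, the vector L^{-T} z has covariance equal to the inverse precision when z is standard normal.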
Resources:
A Tutorial on Thompson Sampling. D. Russo et al. https://arxiv.org/abs/1707.02038
Bandit Algorithms. T. Lattimore and C. Szepesvári. https://tor-lattimore.com/downloads/book/book.pdf (large PDF!). See also www.banditalgs.com.
