Deep Reinforcement Learning: Advanced Methods in AI module

For the Advanced Methods in Artificial Intelligence course we created a module of four lectures and practicals covering essential topics in Deep Reinforcement Learning.

Students learned about the specific issues that arise when training RL agents with non-linear function approximation, and implemented a wide range of algorithms during the practicals.

  • Prerequisites: Basic PyTorch knowledge.
  • Instructor: Florin Gogianu.

Syllabus

  1. Introduction to RL. Covers motivation, core concepts, value functions, Bellman equation and TD learning | slides.
     • Practical: Policy evaluation, Q-Learning, SARSA and Expected SARSA in a tabular setting. Used and modified with the permission of Diana Borsa. | notebook. (A tabular Q-Learning sketch follows this list.)
  2. Approximate methods. MDP definition, introduction to approximate solution methods, geometrical intuition, non-linear function approximation, the deadly triad, Deep Q-Networks, action overestimation, maximization bias and Double Q-Learning, Dueling DQN, Prioritized Experience Replay, Distributional RL, Rainbow, Auxiliary Tasks | slides.
     • Practical: Implement DQN, Double-DQN and Dueling DQN on a variety of MiniGrid environments | notebook. (A DQN/Double-DQN target sketch follows this list.)
  3. Policy Gradient Methods. Motivation, intuitions, the policy gradient theorem, REINFORCE, baselines, the advantage function, generalized advantage estimation, Asynchronous Actor-Critic, an optimization perspective and an introduction to TRPO | slides.
     • Practical: Implement REINFORCE, a value-function baseline, Advantage Actor-Critic and Generalized Advantage Actor-Critic on a range of discrete-action environments | notebook. (A REINFORCE-with-baseline sketch follows this list.)
  4. Advanced Policy Gradient Methods. TRPO in detail, PPO, IMPALA, importance sampling, V-Trace and a special section on RL applications in machine learning | slides. (A PPO clipped-objective sketch follows this list.)
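
The first practical starts from the tabular update rules introduced in lecture 1. Below is a minimal sketch of the Q-Learning update, not the course notebook itself; the state/action counts, step size and exploration rate are illustrative placeholders for a small gridworld-style environment.

```python
import numpy as np

# Illustrative sizes and hyper-parameters, not the notebook's values.
NUM_STATES, NUM_ACTIONS = 16, 4
GAMMA, ALPHA, EPSILON = 0.99, 0.1, 0.1

Q = np.zeros((NUM_STATES, NUM_ACTIONS))
rng = np.random.default_rng(0)

def epsilon_greedy(state):
    # Behaviour policy: explore with probability EPSILON, act greedily otherwise.
    if rng.random() < EPSILON:
        return int(rng.integers(NUM_ACTIONS))
    return int(Q[state].argmax())

def q_learning_update(state, action, reward, next_state, done):
    # The TD target bootstraps on the greedy action in the next state,
    # which makes Q-Learning off-policy (SARSA would instead use the
    # action the behaviour policy actually takes next).
    target = reward + (0.0 if done else GAMMA * Q[next_state].max())
    Q[state, action] += ALPHA * (target - Q[state, action])
```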
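For the second practical, the central piece is how the TD target is built. The sketch below contrasts the vanilla DQN target with the Double-DQN variant covered in the lecture; `online_net` and `target_net` are assumed to be any PyTorch modules mapping observation batches to Q-values, and the batch layout is an assumption, not the notebook's API.

```python
import torch

@torch.no_grad()
def td_targets(online_net, target_net, rewards, next_obs, done,
               gamma=0.99, double=True):
    next_q_target = target_net(next_obs)  # shape [B, num_actions]
    if double:
        # Double-DQN: the online network selects the action, the frozen
        # target network evaluates it, reducing the maximization bias.
        next_actions = online_net(next_obs).argmax(dim=1, keepdim=True)
        next_q = next_q_target.gather(1, next_actions).squeeze(1)
    else:
        # Vanilla DQN: max over the target network's own estimates.
        next_q = next_q_target.max(dim=1).values
    return rewards + gamma * (1.0 - done.float()) * next_q

def dqn_loss(online_net, target_net, batch, gamma=0.99, double=True):
    # `actions` is assumed to be a LongTensor of shape [B].
    obs, actions, rewards, next_obs, done = batch
    q = online_net(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    targets = td_targets(online_net, target_net, rewards, next_obs,
                         done, gamma, double)
    return torch.nn.functional.smooth_l1_loss(q, targets)
```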
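The third practical builds from REINFORCE up to advantage-based actor-critics. A minimal sketch of REINFORCE with a learned state-value baseline, assuming one complete episode of per-step log-probabilities, critic outputs and rewards has been collected elsewhere:

```python
import torch

def reinforce_loss(log_probs, values, rewards, gamma=0.99):
    # Monte-Carlo returns, computed backwards over one finished episode.
    returns, G = [], 0.0
    for r in reversed(rewards):
        G = r + gamma * G
        returns.append(G)
    returns = torch.tensor(list(reversed(returns)))

    # `values` is assumed to be a list of [1]-shaped critic outputs,
    # `log_probs` a list of scalar log pi(a_t | s_t) tensors.
    values = torch.stack(values).squeeze(-1)
    log_probs = torch.stack(log_probs)

    # Subtracting the state-value baseline lowers variance without
    # biasing the gradient; detach so the baseline itself is not
    # pushed around by the policy term.
    advantages = returns - values.detach()
    policy_loss = -(log_probs * advantages).mean()
    value_loss = torch.nn.functional.mse_loss(values, returns)
    return policy_loss + 0.5 * value_loss
```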
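Finally, the clipped surrogate objective is the core of PPO from the last lecture. A sketch under the assumption that per-transition log-probabilities and advantage estimates have already been computed over a batch of collected data:

```python
import torch

def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    # Importance ratio between the current policy and the policy that
    # collected the data.
    ratio = torch.exp(new_log_probs - old_log_probs.detach())
    # Clipping removes the incentive to push the ratio outside
    # [1 - eps, 1 + eps]: a simple first-order proxy for TRPO's
    # trust-region constraint.
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()
```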

Administrative details

  • When: October - November 2020, on Monday evenings
  • Where: Faculty of Automatic Control and Computer Science, Politehnica University of Bucharest
  • Lectures: [lecture 1](https://floringogianu.github.io/courses/01_introduction/), lecture 2, lecture 3, lecture 4