AI/ML Seminar Series: Roy Fox (1/10/2022)

UCI AI/ML Seminar Series

Roy Fox
Assistant Professor, Department of Computer Science
University of California, Irvine

Curiously effective ensemble and double-oracle reinforcement-learning methods

Ensemble methods for reinforcement learning have gained attention in recent years, due to their ability to represent model uncertainty and use it to guide exploration and to reduce value estimation bias. We present MeanQ, a very simple ensemble method with improved performance, and show how it reduces estimation variance enough to operate without a stabilizing target network. Curiously, MeanQ is theoretically *almost* equivalent to a non-ensemble state-of-the-art method that it significantly outperforms, raising questions about the interaction between uncertainty estimation, representation, and resampling.

In adversarial environments, where a second agent attempts to minimize the first’s rewards, double-oracle (DO) methods grow a population of policies
