# Playing Atari with Deep Reinforcement Learning

@article{Mnih2013PlayingAW, title={Playing Atari with Deep Reinforcement Learning}, author={Volodymyr Mnih and Koray Kavukcuoglu and David Silver and Alex Graves and Ioannis Antonoglou and Daan Wierstra and Martin A. Riedmiller}, journal={ArXiv}, year={2013}, volume={abs/1312.5602} }

We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is a convolutional neural network, trained with a variant of Q-learning, whose input is raw pixels and whose output is a value function estimating future rewards. We apply our method to seven Atari 2600 games from the Arcade Learning Environment, with no adjustment of the architecture or learning algorithm. We find that it…
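The abstract mentions a value function trained with a variant of Q-learning but does not spell out the update. As a rough illustration only, the sketch below shows the textbook one-step Q-learning target and a simple step toward it; it is not the paper's exact training procedure. The function names, the discount factor `gamma`, and the step size `lr` are assumptions made for this example.

```python
def q_learning_target(reward, next_q_values, done, gamma=0.99):
    """Textbook one-step Q-learning target: r + gamma * max_a' Q(s', a').

    `next_q_values` holds the estimated values of each action in the next
    state. The name and the default `gamma` are illustrative assumptions.
    """
    if done:
        # No future reward is available after a terminal transition.
        return reward
    return reward + gamma * max(next_q_values)


def td_update(q_value, target, lr=0.1):
    """Move the current estimate a fraction `lr` of the way toward the target."""
    return q_value + lr * (target - q_value)


if __name__ == "__main__":
    target = q_learning_target(reward=1.0, next_q_values=[0.5, 2.0], done=False, gamma=0.9)
    print(target)
    print(td_update(q_value=0.0, target=target, lr=0.5))
```

In a deep variant, the table lookup behind `q_value` would be replaced by a network that maps pixels to one value per action, and the step toward the target would become a gradient step on the squared difference.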

#### 5,972 Citations

Distributed Deep Q-Learning

- Computer Science
- ArXiv
- 2015

We propose a distributed deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is based on the deep…

Deep Reinforcement Learning With Macro-Actions

- Computer Science
- ArXiv
- 2016

This paper focuses on macro-actions, and evaluates these on different Atari 2600 games, where they yield significant improvements in learning speed and can even achieve better scores than DQN.

Learning to play SLITHER.IO with deep reinforcement learning

- 2019

This project uses deep reinforcement learning to train an agent to play the massively multiplayer online game SLITHER.IO. We collect raw image inputs from sample gameplay via an OpenAI Universe…

Chrome Dino Run using Reinforcement Learning

- Computer Science
- ArXiv
- 2020

This paper uses two popular temporal difference approaches, Deep Q-Learning and Expected SARSA, and also implements a Double DQN model to train the agent. It compares the scores with respect to episodes and the convergence of the algorithms with respect to timesteps.

Deep Reinforcement Learning with Regularized Convolutional Neural Fitted Q Iteration

- 2016

We review the deep reinforcement learning setting, in which an agent receiving high-dimensional input from an environment learns a control policy without supervision using multilayer neural networks.…

Transferring Deep Reinforcement Learning with Adversarial Objective and Augmentation

- Computer Science, Mathematics
- ArXiv
- 2018

This approach enables the agents to generalize knowledge from a single source task, and boost the learning progress with a semisupervised learning method when facing a new task.

Deep Q-learning using redundant outputs in visual doom

- Computer Science
- 2016 IEEE Conference on Computational Intelligence and Games (CIG)
- 2016

This paper proposes using redundant outputs to adapt training progress in deep reinforcement learning, and compares the method with general ε-greedy on the ViZDoom platform.

Deep Reinforcement Learning for Flappy Bird

- 2015

Reinforcement learning is essential for applications where there is no single correct way to solve a problem. In this project, we show that deep reinforcement learning is very effective at learning…

Reinforcement Learning and Video Games

- Mathematics, Computer Science
- ArXiv
- 2019

Batch normalization is a method for addressing internal covariate shift in deep neural networks, and this study shows that it has a positive influence on reinforcement learning.

Deep reinforcement learning boosted by external knowledge

- Computer Science
- SAC
- 2018

A new architecture to combine external knowledge and deep reinforcement learning using only visual input is presented, augmenting image input by adding environment feature information and combining two sources of decision.

#### References

SHOWING 1-10 OF 37 REFERENCES

Deep auto-encoder neural networks in reinforcement learning

- Computer Science
- The 2010 International Joint Conference on Neural Networks (IJCNN)
- 2010

A framework for combining the training of deep auto-encoders (for learning compact feature spaces) with recently proposed batch-mode RL algorithms (for learning policies) is proposed, with an emphasis on data efficiency and on studying the properties of the feature spaces automatically constructed by the deep auto-encoder neural networks.

Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method

- Computer Science
- ECML
- 2005

NFQ, an algorithm for efficient and effective training of a Q-value function represented by a multi-layer perceptron, is introduced, and it is shown empirically that reasonably few interactions with the plant are needed to generate control policies of high quality.

Actor-Critic Reinforcement Learning with Energy-Based Policies

- Computer Science
- EWRL
- 2012

This work introduces the first sound and efficient algorithm for training energy-based policies, based on an actor-critic architecture, that is computationally efficient, converges close to a local optimum, and outperforms Sallans and Hinton (2004) in several high-dimensional domains.

Reinforcement learning for robots using neural networks

- Computer Science
- 1992

This dissertation concludes that it is possible to build artificial agents that can acquire complex control policies effectively by reinforcement learning, enabling its application to complex robot-learning problems.

Learning multiple layers of representation

- Medicine, Psychology
- Trends in Cognitive Sciences
- 2007

The limitations of backpropagation learning can now be overcome by using multilayer neural networks that contain top-down connections and training them to generate sensory data rather than to classify it.

Reinforcement Learning with Factored States and Actions

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2004

A novel method is presented for approximating the value function and selecting good actions for Markov decision processes with large state and action spaces, and it is shown that the product of experts approximation can be used to solve large problems.

Bayesian Learning of Recursively Factored Environments

- Mathematics, Computer Science
- ICML
- 2013

This paper introduces the class of recursively decomposable factorizations, and shows how exact Bayesian inference can be used to efficiently guarantee predictive performance close to the best factorization in this class.

Temporal Difference Learning and TD-Gammon

- Computer Science
- J. Int. Comput. Games Assoc.
- 1995

TD-GAMMON is a neural network that trains itself to be an evaluation function for the game of backgammon by playing against itself and learning from the outcome.

Reinforcement Learning: An Introduction

- Computer Science
- IEEE Transactions on Neural Networks
- 2005

This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.

A Neuroevolution Approach to General Atari Game Playing

- Computer Science
- IEEE Transactions on Computational Intelligence and AI in Games
- 2014

Results suggest that neuroevolution is a promising approach to general video game playing (GVGP), achieving state-of-the-art results and even surpassing human high scores on three games.