Implementation of Q-Learning and Backpropagation in an Agent Playing the Flappy Bird Game

Ardiansyah Ardiansyah, Ednawati Rainarli

Abstract


This paper shows how to combine Q-learning and backpropagation for an agent that learns to play the Flappy Bird game. The two methods are combined to predict the value function of each action, a technique known as value-function approximation. Value-function approximation is used to reduce both the learning time and the number of weights stored in memory. Previous studies that used only regular reinforcement learning required a longer learning time and stored more weights in memory. The artificial neural network (ANN) architecture used in this study assigns a separate ANN to each action. The results show that, compared to regular Q-learning alone, combining Q-learning and backpropagation reduces the agent's learning time for Flappy Bird by up to 92% and the number of weights stored in memory by up to 94%. Despite these reductions, Q-learning combined with backpropagation plays Flappy Bird as well as regular Q-learning does.
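The idea of approximating the value function with one ANN per action can be illustrated with a minimal sketch. This is not the authors' implementation: the class names `TinyNet` and `NeuralQAgent`, the single tanh hidden layer, the hidden-layer size, and the values of `gamma`, `lr`, and `epsilon` are all assumptions chosen for illustration. Each action owns a small network that maps a state vector to one Q-value, and each network is trained by backpropagation toward the standard Q-learning target r + gamma * max Q(s', a').

```python
import math
import random


class TinyNet:
    """One-hidden-layer network mapping a state vector to a single Q-value.

    Hypothetical helper: layer sizes and the tanh activation are assumptions.
    """

    def __init__(self, n_inputs, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0
        self.hidden = [0.0] * n_hidden

    def forward(self, state):
        # Hidden activations are cached for the backward pass.
        self.hidden = [math.tanh(sum(w * x for w, x in zip(row, state)) + b)
                       for row, b in zip(self.w1, self.b1)]
        return sum(w * h for w, h in zip(self.w2, self.hidden)) + self.b2

    def train(self, state, target, lr):
        """One backpropagation step on the squared error (Q - target)^2 / 2."""
        error = self.forward(state) - target
        for j, h in enumerate(self.hidden):
            # Gradient w.r.t. the hidden pre-activation, using w2 before its update.
            grad_hidden = error * self.w2[j] * (1.0 - h * h)
            self.w2[j] -= lr * error * h
            self.b1[j] -= lr * grad_hidden
            for i, x in enumerate(state):
                self.w1[j][i] -= lr * grad_hidden * x
        self.b2 -= lr * error
        return error


class NeuralQAgent:
    """Q-learning agent with one TinyNet per action (assumed hyperparameters)."""

    def __init__(self, n_inputs, n_actions, n_hidden=4, gamma=0.9, lr=0.05, seed=0):
        self.rng = random.Random(seed)
        self.nets = [TinyNet(n_inputs, n_hidden, self.rng) for _ in range(n_actions)]
        self.gamma = gamma
        self.lr = lr

    def q_values(self, state):
        return [net.forward(state) for net in self.nets]

    def act(self, state, epsilon=0.1):
        # Epsilon-greedy action selection.
        if self.rng.random() < epsilon:
            return self.rng.randrange(len(self.nets))
        q = self.q_values(state)
        return q.index(max(q))

    def update(self, state, action, reward, next_state, done):
        # Q-learning target: r for terminal steps, else r + gamma * max_a' Q(s', a').
        if done:
            target = reward
        else:
            target = reward + self.gamma * max(self.q_values(next_state))
        # Only the network of the action actually taken is trained.
        return self.nets[action].train(state, target, self.lr)
```

In a Flappy Bird setting, the state might be a short vector such as the horizontal and vertical distance to the next pipe gap, and the two actions would be "flap" and "do nothing"; these features are assumed here, not taken from the paper. The memory saving the abstract describes comes from storing only the network weights rather than one table entry per state-action pair.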





DOI: http://dx.doi.org/10.22146/jnteti.v6i1.287



Copyright (c) 2017 Jurnal Nasional Teknik Elektro dan Teknologi Informasi (JNTETI)

JNTETI (Jurnal Nasional Teknik Elektro dan Teknologi Informasi)

Departemen Teknik Elektro dan Teknologi Informasi, Fakultas Teknik Universitas Gadjah Mada