APPLICATION OF DEEP REINFORCEMENT LEARNING TO A SELF-DRIVING CAR IN THE TORCS SIMULATED ENVIRONMENT

Ђорђе Марјановић

doi:10.24867/06BE40Marjanovic

Keywords: deep reinforcement learning, autonomous vehicles, DDPG

Abstract

This paper describes a system in which deep reinforcement learning is applied to an autonomous vehicle in a simulated environment. The agent is trained with the Deep Deterministic Policy Gradient (DDPG) algorithm, and the environment is the 3D racing video game TORCS. After more than 200 training episodes, the agent managed to complete a full lap without leaving the track. The DDPG algorithm proved very successful in environments with continuous actions.
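As a rough illustration of two ingredients commonly associated with DDPG, the sketch below computes the critic's bootstrapped target from target networks and applies a soft (Polyak) update to target parameters. It is not the paper's implementation: the function names, the toy actor and critic, and the values of `GAMMA` and `TAU` are assumptions chosen for illustration only.

```python
from typing import Callable, List, Sequence

# Hyperparameters are assumed values for illustration, not taken from the paper.
GAMMA = 0.99   # discount factor
TAU = 0.001    # soft-update rate for the target networks

State = Sequence[float]
Action = Sequence[float]


def td_target(
    reward: float,
    next_state: State,
    done: bool,
    target_actor: Callable[[State], Action],
    target_critic: Callable[[State, Action], float],
    gamma: float = GAMMA,
) -> float:
    """Critic target: y = r + gamma * Q'(s', mu'(s')), or just r at episode end."""
    if done:
        return reward
    next_action = target_actor(next_state)  # deterministic, continuous action
    return reward + gamma * target_critic(next_state, next_action)


def soft_update(
    target_params: List[float],
    online_params: List[float],
    tau: float = TAU,
) -> List[float]:
    """Move each target parameter a small step toward its online counterpart."""
    return [
        tau * w_online + (1.0 - tau) * w_target
        for w_target, w_online in zip(target_params, online_params)
    ]


# Hypothetical stand-ins for the target networks (e.g. steering in [-1, 1]).
def toy_actor(state: State) -> Action:
    return [0.5]


def toy_critic(state: State, action: Action) -> float:
    return 2.0
```

With the toy stand-ins, `td_target(1.0, [0.0], False, toy_actor, toy_critic, gamma=0.5)` evaluates `1.0 + 0.5 * 2.0`, and `soft_update([0.0], [1.0], tau=0.5)` moves the single target parameter halfway toward the online one. In a real agent the actor and critic would be neural networks and the parameters would be their weight tensors.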
