A COMPARISON OF CONVOLUTIONAL NEURAL NETWORKS AND VISION TRANSFORMERS AS MODELS FOR LEARNING TO PLAY COMPUTER GAMES

Adrien Dudon, Oisin Cawley

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

The Convolutional Neural Network (CNN) architecture, coupled with the Double Deep Q-Network (Double DQN) algorithm, has been used extensively to solve complex video game environments. However, Vision Transformer (ViT) architectures have since demonstrated superior performance on various tasks previously dominated by CNNs. This research seeks to replicate the study conducted by Meng et al. and to assess whether the Swin Transformer, a variant of ViT, can learn to play video games effectively and achieve results comparable to a CNN in fewer training steps. The findings show that the Swin Transformer performs notably well; however, with a limited number of training steps, in contrast to Meng et al., the CNN architecture outperforms it. The CNN architecture is also more computationally efficient: it requires less computing power, runs well on older hardware, and consumes a reasonable amount of memory. To surpass CNN performance, the Swin Transformer requires a substantial number of training steps, which supports Meng et al.'s study.

Original language: English
Title of host publication: 24th International Conference on Intelligent Games and Simulation, GAME-ON 2023
Editors: Joseph Kehoe, Philip Bourke, Oisin Cawley
Publisher: EUROSIS
Pages: 5-9
Number of pages: 5
ISBN (Electronic): 9789492859273
Publication status: Published - 2023
Event: 24th International Conference on Intelligent Games and Simulation, GAME-ON 2023 - Carlow, Ireland
Duration: 06 Sep 2023 to 08 Sep 2023

Publication series

Name: 24th International Conference on Intelligent Games and Simulation, GAME-ON 2023

Conference

Conference: 24th International Conference on Intelligent Games and Simulation, GAME-ON 2023
Country/Territory: Ireland
City: Carlow
Period: 06/09/2023 to 08/09/2023

Keywords

  • Artificial Intelligence
  • Computer Game Programming
  • Convolutional Neural Networks
  • Machine Learning
  • Vision Transformer
