How to use AI to play Sonic the Hedgehog. It's NEAT!

Generation after generation, humans have adapted to become a better fit for our surroundings. We started as primates living in a world of eat or be eaten. Eventually we evolved into who we are today, a reflection of modern society. Through the process of evolution we become smarter. We become better at working with our environment and at getting what we need.

The concept of learning through evolution can also be applied to artificial intelligence. We can train AIs to perform certain tasks using NEAT, NeuroEvolution of Augmenting Topologies. Put simply, NEAT is an algorithm that takes a batch of AIs (genomes) attempting to perform a given task. The best-performing AIs "breed" to create the next generation. This process continues until we have a generation that is capable of completing what it needs to.

NEAT is amazing because it eliminates the need for pre-existing data to train our AIs. Using NEAT and OpenAI's Gym Retro, I trained an AI to play Sonic the Hedgehog for the SEGA Genesis. Let's learn how!

A NEAT neural network (Python implementation)

GitHub repository: Vedant-Gupta523/sonicNEAT (github.com)

Note: All of the code in this article and in the repo above is a slightly modified version of Lucas Thompson's Sonic AI Bot Using Open-AI and NEAT YouTube tutorials and code.

Understanding OpenAI Gym

If you aren't already familiar with OpenAI Gym, look through the terminology below. These terms will be used frequently throughout the article.

agent - The AI player. In this case, it will be Sonic.

environment - The agent's complete surroundings. The game environment.

action - Something the agent has the option of doing (i.e. move left, move right, jump, do nothing).

step - Performing one action.

state - A frame of the environment. The current situation the AI is in.

observation - What the AI observes from the environment.

fitness - How well our AI is performing.

done - When the AI has completed its task or can't continue any further.
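
To see how these terms fit together in code, here is a minimal random-agent loop using Gym Retro. This is only a sketch, and it assumes the Sonic ROM has already been installed as described later in this article:

import retro

# A random agent: the agent picks random actions and we observe the new
# state, the reward, and whether we are done (assumes the ROM is set up).
env = retro.make(game="SonicTheHedgehog-Genesis", state="GreenHillZone.Act1")

ob = env.reset()                               # first observation of the environment
done = False
while not done:
    action = env.action_space.sample()         # a random action for the agent
    ob, reward, done, info = env.step(action)  # one step: act, then observe
    env.render()
env.close()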

Installing dependencies

Below are the GitHub links for OpenAI and NEAT, with installation instructions.

OpenAI: https://github.com/openai/retro

NEAT: https://github.com/CodeReclaimers/neat-python

Pip install libraries such as cv2, numpy, pickle, etc.

Importing libraries and setting up the environment

To start, we import all of the modules we will use:

import retro
import numpy as np
import cv2
import neat
import pickle

We will also define our environment, consisting of the game and the state:

env = retro.make(game = "SonicTheHedgehog-Genesis", state = "GreenHillZone.Act1")

To train an AI to play Sonic the Hedgehog, you will need the game's ROM (game file). The simplest way to get it is to buy the game on Steam for $5. You could also find free downloads of the ROM online, but that is illegal, so don't do it.

In the OpenAI repository, under retro/retro/data/stable/, you will find a folder for Sonic the Hedgehog Genesis. Place the game's ROM here and make sure it is named rom.md. This folder also contains the .state files. You can pick one and set the state parameter equal to it. I chose GreenHillZone Act 1 since it is the very first level of the game.
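
As an optional sanity check, you can ask Gym Retro whether it can see the game and which states are available. list_games and list_states live in retro.data; exact availability may vary slightly between gym-retro versions:

import retro

# Should print True once rom.md is in the right folder
print("SonicTheHedgehog-Genesis" in retro.data.list_games())

# Lists the .state files you can pass as the state parameter
print(retro.data.list_states("SonicTheHedgehog-Genesis"))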

Understanding data.json and scenario.json

In the Sonic the Hedgehog folder, you will find these two files:

data.json

{ "info": { "act":  "address": 16776721, "type": ", "level_end_bonus":  "address": 16775126, "type": ", "lives":  "address": 16776722, "type": ", "rings": { "address": 16776736, "type": ">u2" }, "score": { "address": 16776742, "type": ">u4" }, "screen_x": { "address": 16774912, "type": ">u2" }, "screen_x_end": { "address": 16774954, "type": ">u2" }, "screen_y": { "address": 16774916, "type": ">u2" }, "x": { "address": 16764936, "type": ">i2" }, "y": { "address": 16764940, "type": ">u2" }, "zone": u1"  } }

scenario.json

{ "done": { "variables": { "lives": { "op": "zero" } } }, "reward": { "variables": { "x": { "reward": 10.0 } } } }

Both of these files contain important information about the game and its training.

As it sounds, the data.json file contains information/data on different game-specific variables (i.e. Sonic’s x-position, the number of lives he has, etc.).

The scenario.json file allows us to perform actions in sync with the values of the data variables. For example, we can reward Sonic 10.0 every time his x-position increases. We could also set our done condition to true when Sonic’s lives hit 0.

Understanding NEAT feedforward configuration

The config-feedforward file can be found in my GitHub repository linked above. It acts like a settings menu to set up our training. To point out a few simple settings:

fitness_threshold = 10000  # How fit we want Sonic to become
pop_size = 20              # How many Sonics per generation
num_inputs = 1120          # Number of inputs into our model
num_outputs = 12           # 12 buttons on Genesis controller

There are tons of settings you can experiment with to see how they affect your AI’s training! To learn more about NEAT and the different settings in the feedforward configuration, I would highly recommend reading the documentation here.

Putting it all together: Creating the Training File

Setting up configuration

Our feedforward configuration is defined and stored in the variable config.

config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction, neat.DefaultSpeciesSet, neat.DefaultStagnation, 'config-feedforward')

Creating a function to evaluate each genome

We start by creating the function, eval_genomes, which will evaluate our genomes (a genome could be compared to 1 Sonic in a population of Sonics). For each genome, we reset the environment and take a random action:

for genome_id, genome in genomes:
    ob = env.reset()
    ac = env.action_space.sample()

We will also record the game environment’s length, width, and number of color channels. We divide the length and width by 8.

inx, iny, inc = env.observation_space.shape
inx = int(inx/8)
iny = int(iny/8)

We create a recurrent neural network (RNN) using the NEAT library and input the genome and our chosen configuration.

net = neat.nn.recurrent.RecurrentNetwork.create(genome, config)

Finally, we define a few variables: current_max_fitness (the highest fitness in the current population), fitness_current (the current fitness of the genome), frame (the frame count), counter (to count the number of steps our agent takes), xpos (the x-position of Sonic), and done (whether or not we have reached our fitness goal).

current_max_fitness = 0
fitness_current = 0
frame = 0
counter = 0
xpos = 0
done = False

While we have not reached our done requirement, we need to run the environment, increment our frame counter, and shape our observation to mimic that of the game (still for each genome).

env.render()
frame += 1
ob = cv2.resize(ob, (inx, iny))
ob = cv2.cvtColor(ob, cv2.COLOR_BGR2GRAY)
ob = np.reshape(ob, (inx, iny))

We will take our observation and put it in a one-dimensional array, so that our RNN can understand it. We receive our output by feeding this array to our RNN.

imgarray = []
imgarray = np.ndarray.flatten(ob)
nnOutput = net.activate(imgarray)

Using the output from the RNN our AI takes a step. From this step we can extract fresh information: a new observation, a reward, whether or not we have reached our done requirement, and information on variables in our data.json (info).

ob, rew, done, info = env.step(nnOutput)

At this point we need to evaluate our genome’s fitness and whether or not it has met the done requirement.

We look at our “x” variable from data.json and check if it has surpassed the length of the level. If it has, we will increase our fitness by our fitness threshold signifying we are done.

xpos = info['x']
if xpos >= 10000:
    fitness_current += 10000
    done = True

Otherwise, we will increase our current fitness by the reward we earned from performing the step. We also check if we have a new highest fitness and adjust the value of our current_max_fitness accordingly.

fitness_current += rew
if fitness_current > current_max_fitness:
    current_max_fitness = fitness_current
    counter = 0
else:
    counter += 1

Lastly, we check if we are done or if our genome has taken 250 steps. If so, we print information on the genome which was simulated. Otherwise we keep looping until one of the two requirements has been satisfied.

if done or counter == 250:
    done = True
    print(genome_id, fitness_current)

genome.fitness = fitness_current

Defining the population, printing training stats, and more

The absolute last thing we need to do is define our population, print out statistics from our training, save checkpoints (in case you want to pause and resume training), and pickle our winning genome.

p = neat.Population(config)
p.add_reporter(neat.StdOutReporter(True))
stats = neat.StatisticsReporter()
p.add_reporter(stats)
p.add_reporter(neat.Checkpointer(1))

winner = p.run(eval_genomes)

with open('winner.pkl', 'wb') as output:
    pickle.dump(winner, output, 1)
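
As a small illustration of why the checkpoints and the pickle are useful (the file names here are just examples), you can later resume training from a saved checkpoint or rebuild the winning network from winner.pkl:

# Resume training from a saved checkpoint (neat.Checkpointer writes files
# named like 'neat-checkpoint-4'; use whichever generation you stopped at)
p = neat.Checkpointer.restore_checkpoint('neat-checkpoint-4')
winner = p.run(eval_genomes)

# Rebuild the trained network from the pickled winner
with open('winner.pkl', 'rb') as f:
    winner = pickle.load(f)
net = neat.nn.recurrent.RecurrentNetwork.create(winner, config)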

All that’s left is the matter of running the program and watching Sonic slowly learn how to beat the level!

To see all of the code put together, check out the Training.py file in my GitHub repository.
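
For orientation, here is a condensed sketch of how the snippets above fit together inside eval_genomes. It simply strings together the pieces shown in this article; the Training.py in the repository may differ in small details:

def eval_genomes(genomes, config):
    for genome_id, genome in genomes:
        ob = env.reset()
        ac = env.action_space.sample()

        # Downscaled observation dimensions
        inx, iny, inc = env.observation_space.shape
        inx = int(inx/8)
        iny = int(iny/8)

        net = neat.nn.recurrent.RecurrentNetwork.create(genome, config)

        current_max_fitness = 0
        fitness_current = 0
        frame = 0
        counter = 0
        xpos = 0
        done = False

        while not done:
            env.render()
            frame += 1

            # Shrink and grayscale the frame, then flatten it for the network
            ob = cv2.resize(ob, (inx, iny))
            ob = cv2.cvtColor(ob, cv2.COLOR_BGR2GRAY)
            ob = np.reshape(ob, (inx, iny))
            imgarray = np.ndarray.flatten(ob)

            nnOutput = net.activate(imgarray)
            ob, rew, done, info = env.step(nnOutput)

            # Big bonus (and done) for reaching the end of the level
            xpos = info['x']
            if xpos >= 10000:
                fitness_current += 10000
                done = True

            fitness_current += rew
            if fitness_current > current_max_fitness:
                current_max_fitness = fitness_current
                counter = 0
            else:
                counter += 1

            # Stop if the level is done or Sonic has made no progress for 250 steps
            if done or counter == 250:
                done = True
                print(genome_id, fitness_current)

            genome.fitness = fitness_current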

Bonus: Parallel Training

If you have a multi-core CPU, you can run multiple training simulations at once, dramatically increasing the rate at which you can train your AI! Although I will not go through the specifics of how to do this in this article, I highly suggest you check the sonicTraning.py implementation in my GitHub repository.
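
For reference, one common way to parallelize with neat-python (not necessarily how sonicTraning.py does it) is the library's built-in ParallelEvaluator, which evaluates one genome per worker process; each worker then creates its own environment. A minimal sketch with a deliberately simplified fitness loop:

import multiprocessing

import cv2
import neat
import numpy as np
import retro

def eval_single_genome(genome, config):
    # Each worker process builds its own emulator instance
    env = retro.make(game="SonicTheHedgehog-Genesis", state="GreenHillZone.Act1")
    net = neat.nn.recurrent.RecurrentNetwork.create(genome, config)
    inx, iny, inc = env.observation_space.shape
    inx, iny = int(inx/8), int(iny/8)

    ob = env.reset()
    fitness = 0
    counter = 0
    done = False
    while not done and counter < 250:
        # Shrink, grayscale, flatten, then act on the network's output
        ob = cv2.cvtColor(cv2.resize(ob, (inx, iny)), cv2.COLOR_BGR2GRAY)
        ob, rew, done, info = env.step(net.activate(np.ndarray.flatten(ob)))
        fitness += rew
        counter = 0 if rew > 0 else counter + 1  # reset the stuck-counter on progress
    env.close()
    return fitness

if __name__ == '__main__':
    config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                         neat.DefaultSpeciesSet, neat.DefaultStagnation,
                         'config-feedforward')
    p = neat.Population(config)
    p.add_reporter(neat.StdOutReporter(True))
    pe = neat.ParallelEvaluator(multiprocessing.cpu_count(), eval_single_genome)
    winner = p.run(pe.evaluate)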

Conclusion

That’s all there is to it! With a few adjustments, this framework is applicable to any game for the NES, SNES, SEGA Genesis, and more. If you have any questions or you just want to say hello, feel free to email me at vedantgupta523[at]gmail[dot]com.

Also, be sure to check out Lucas Thompson's Sonic AI Bot Using Open-AI and NEAT YouTube tutorials and code to see what originally inspired this article.

Key Takeaways

  1. NeuroEvolution of Augmenting Topologies (NEAT) is an algorithm used to train AI to perform certain tasks. It is modeled on genetic evolution.
  2. NEAT eliminates the need for pre-existing data when training AI.
  3. The process of implementing OpenAI and NEAT using Python can be applied to train an AI to play any game.