Google DeepMind says it's created a "multi-world" AI agent that can follow your natural-language instructions to perform a range of tasks for you in a variety of 3D virtual environments. That is, it'll do the grindy bits of video games for you.
It's a return to DeepMind's roots; the company started out developing artificial intelligence by teaching it to play old arcade games like Pong and Breakout, after which it quickly conquered the world's best humans at games like Go, chess, Stratego, Shogi, StarCraft II and others. Now merged with Google Brain, DeepMind has had its business suit on in recent years, focusing on things like its AlphaFold protein structure predictor and its GNoME crystalline materials discovery technology.
But now with the SIMA project, DeepMind's AIs are getting a chance to blow off some steam and play some video games again. The new model has been trained and tested on nine 3D open-world games including No Man's Sky, Teardown and Goat Simulator, but the idea here is to make a generalized AI agent that can take over the controls in pretty much any 3D game and do things in response to natural-language commands.
The learning process simply had SIMA watch the video and audio outputs of the game, as well as the keyboard and mouse commands of a human user, while listening to that user taking orders from another person. It's been trained and evaluated on about 600 basic skills, and can currently perform short, single-step tasks lasting about ten seconds – although DeepMind says it'll soon expand to tackle larger assignments that include "high-level strategic planning and multiple sub-tasks."
The generalized aspect is working well – SIMA performed nearly as well in games it wasn't specifically trained on as in ones it was, indicating that it has indeed learned a general ability to jump into a 3D game, suss out what's going on and get to work.
What's it for? Well, turns out that a lot of the video games we spend our hard-earned money on are a lot like work, forcing us to grind for coins, or upgrade points, or fancy shields or all manner of other bunkum. SIMA will soon be able to take the controls and grind its little heart out while you're sleeping, or off at work, and you'll come back to a bounty of... whatever kind of riches it is you've told it to go get. Your castle will be built. Your resources will be gathered. Your game will be primed for the fun bit.
Which is hilarious; AIs aren't just coming for our jobs and our artistic creative outlets, we're going to have to fight these bastards for the PS5 controller soon too. It's also a darkly funny waste of resources; all the coding that's gone into your video game, plus all the vaunted high-power chips and electricity it takes to train AI models, pitted against each other in a giant, useless loop of zero-value busywork so you can skip the boring bits.
Indeed, why the hell are there boring bits anyway? Torches have been lit and pitchforks hefted for less.
There is a grander purpose, of course; AI models are learning to navigate the physical world and do useful work, while embodied in all manner of robots, humanoid or otherwise. These machines "see" the world through video and sensor feeds, and while their control schemes are a lot more complex than a typical gamer's mouse-plus-WASD inputs, there's one other giant similarity: they'll be told verbally what to do, and they'll have to figure out a high-level plan, assemble resources, and execute things step by step.
So in that sense, the SIMA video game agent may be a building block toward real-world robotic handling of complex, boring tasks that humans don't want to do. And there's no shortage of those.
Source: Google DeepMind