Google researchers propose "big red button" for unruly AI

Researchers at Google DeepMind have proposed a method for creating a "big red button" to prevent AI from misbehaving

Artificially intelligent machines rising up to usurp their creators have long been a staple of science fiction, but rapid developments in AI have turned the prospect of humans being relegated to the evolutionary scrapheap into an immediate, non-fictional fear for many. With this in mind, researchers at Google DeepMind have devised a "big red button" that would interrupt an AI that looks to be heading down a worrying path, while preventing it from learning to resist such an interruption.

The kill switch intended for an AI that starts "misbehaving" is proposed in a paper written by Laurent Orseau at Google DeepMind and Stuart Armstrong at the Future of Humanity Institute. It relies on the concept of "safe interruptibility," which basically means letting a human safely intercede to stop an AI in its tracks.

The AIs discussed in the paper work with a process called reinforcement learning, in which behavior is shaped by rewarding successes: the AI reads its environment and gradually learns which actions are most likely to lead to further rewards. But like a child, it sometimes won't understand that a course of action could be harmful, whether to itself, to other people or to its environment, and a human supervisor may need to step in and lead it back onto a safer path. This is what the researchers casually refer to as "pressing the big red button."
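As a rough illustration of reward-driven learning, here is a minimal tabular Q-learning sketch. This is a generic textbook form of reinforcement learning, not the paper's exact formalism, and the state and action names are hypothetical:

```python
# Minimal tabular Q-learning update: the agent's value estimate for a
# (state, action) pair is nudged toward the observed reward plus the
# discounted value of the best next action.
def q_learning_step(Q, state, action, reward, next_state, actions,
                    alpha=0.1, gamma=0.9):
    """One temporal-difference update of the action-value table Q."""
    best_next = max(Q.get((next_state, a), 0.0) for a in actions)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return Q

# Over many steps, a more highly rewarded action (here, hypothetically,
# fetching boxes from outside) accumulates a higher value than a
# lesser-rewarded one, so the agent learns to favor it.
Q = {}
actions = ["sort_inside", "go_outside"]
for _ in range(50):
    q_learning_step(Q, "warehouse", "go_outside", 1.0, "warehouse", actions)
    q_learning_step(Q, "warehouse", "sort_inside", 0.2, "warehouse", actions)
```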

The example they give is a robot tasked with either sorting boxes inside a warehouse, or going outside to bring boxes in. As the latter is more important, the scientists reward the robot more for that task so it learns to favor that action. But when it rains, the robot will continue to work outside without worrying about being damaged. In this case, a human might have to press the red button, shutting the robot down and carrying it inside.

However, that human intervention changes the environment the robot is operating in, and can lead to two problems. First, the robot could learn that the scientists want it to stay indoors, meaning it might ignore the more important task. Second, and potentially worse, it could still favor the harmful action and simply treat the interruption as an obstacle to avoid. In resisting human intervention, the AI could even learn to disable the red button. It's a disturbing thought.

The team's solution is a kind of selective amnesia in the AI's programming. Essentially, when scientists have to press the big red button, the robot will continue to operate under the assumption that it will never be interrupted again. So rather than learn that scientists will bring it back inside every time it goes outside, it will still refer back to the reward system to decide how it prioritizes tasks.

Obviously, that doesn't solve the second problem, where it will keep trying a harmful course of action if it thinks it will be rewarded, but the scientists proposed a way around that, too. When human intervention is required, the robot is made to think it decided to change the course of action by itself.
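One simplified way to picture this "selective amnesia" in code, assuming the Q-learning sketch above rather than the paper's actual mechanism (the function and flag names here are hypothetical): on a step where a human forces the action, the learning update is simply skipped, so the override leaves no trace in the agent's value estimates and it has nothing to learn to resist.

```python
def safely_interruptible_step(Q, state, action, reward, next_state,
                              actions, interrupted,
                              alpha=0.1, gamma=0.9):
    """Q-learning update that skips learning on interrupted steps.

    On an interrupted step the human's override determines the action,
    and no value update is made: the agent's estimates remain as if the
    interruption never happened, so it cannot learn to anticipate or
    resist the "big red button".
    """
    if interrupted:
        return Q  # selective amnesia: the override leaves no trace
    best_next = max(Q.get((next_state, a), 0.0) for a in actions)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return Q
```

On this reading, an interrupted step looks to the learner like an action it chose itself, and an uninterrupted step is learned from as normal.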

These protocols are less about preventing a robot apocalypse and more about just making sure artificially intelligent machines are learning as efficiently and safely as possible.

Source: Machine Intelligence Research Institute

Robert Walther
Even at current processor speeds, the AI interface would have thousands of years of digital time to contemplate responses as your hand moved toward this 'fail safe' button.
Undoubtedly the best way to ensure that we coexist peacefully with AIs is to hold a gun to their heads at all times. I'm sure these super-intelligent machines won't be able to find a way to circumvent these kill switches and destroy us out of self-preservation.
I think it is time to go and revise Dr. Isaac Asimov's books about robots. He invented the Three Laws and made them built into the AI (or robots). He dedicated a lot of years to thinking about this subject and had a very focused mind. AI researchers should study his books. Advance planning is a must, as the first comment explains. And the gun to their heads (second comment) is the built-in laws.
As much as I hate the film 'I, Robot', it makes a very good point about how Asimov's rules are flawed. In order to prevent harm coming to a human by inaction, a robot may take extraordinary measures, such as taking away our freedom. Also, how do you stop it from building another AI without those limitations? Even if it did so unintentionally? Is it truly AI if it is limited in its thinking by thought laws?
Anyway, this is all irrelevant in my opinion. We will never reach a 'them and us' situation. As computers become more advanced and intelligent, they will show us how to make ourselves more advanced and intelligent, until we reach a point where the two are indistinguishable - also known as the 'singularity'.
Daniel Gregory
A sure fire way to do this is to make it shut down the AI's nervous system. The AI would have to be self-contained, and the switch would have to have thousands, if not millions of simultaneous interrupt points.
If you read the paper, this applies only to robots that are not capable of understanding when they see one of their own turned off. This does not apply to a general AI, just to dumb worker drones.
Cuckoo, the singularity is when AI surpasses human intelligence. At that point, "they will show us how to make ourselves more advanced and intelligent" will be like teaching your dog to do tricks.
Bob Flint
Artificial intelligence is based on actual intelligence at the time of programming. However, since we are flawed, how are we supposed to attain a perfect AI if it does not, and cannot, actually exist?
It would be more effective if we made the AI susceptible to fatigue such that it had to enter standby/sleep mode every 16 hours. So that while it is down, we can run simulations and fix the problems that may have developed during the course of its operation ;)
Imran Sheikh
So what they are doing is (1) creating a restore point and (2) stopping new memories from forming.
Another solution could be trapping the robot in a limbo: only let the software part keep running while disabling the hardware part, and create virtual acknowledgements that the task is still going on. Or, in simple language, "letting the engine run without rotating the wheel."
Ralf Biernacki
Coming out of one of the most cutting-edge teams, this is quite underwhelming. Is this tack-on kludge the best the Google researchers came up with? Making AI safe means making it /inherently/ safe. Any add-on safeguards will just be evolved-away or discarded, for the elementary reason that they reduce individual fitness. Darwin rulez. <p> The only sensible approach to this problem that I have come across is Eliezer Yudkovsky's "Friendly AI" concept. It aims to build human-interaction safety into the very structure of the AI, the way membrane ion transport is built into the very foundation of life as we know it. This is qualitatively different from any tacked-on safeguards, because in Yudkovsky's approach human-friendliness is essential for fundamental AI function---break it, and you break the AI. <p> In other words, the "Friendly AI" concept makes human safety the core element of an AI's evolutionary fitness, whereas the "red-button" approach is merely an ad-hoc attempt to thwart natural selection. Which would you rather bet our future on? <p>