Robotics

Google teaches robots to teach each other

Researchers at Google have experimented with how robots can share their experiences, to help teach each other basic skills

Poet John Donne said, "no man is an island," and that is even more true for robots. While we humans can share our experiences and expertise through language and demonstration, robots have the potential to instantly share all the skills they've learned with other robots simply by transmitting the information over a network. It is this "cloud robotics" that Google researchers are looking to take advantage of to help robots gain skills more quickly.

The human brain has billions of neurons, and between them they form an unfathomable number of connections. As we think and learn, neurons interact with each other and certain pathways that correspond to rewarding behavior will be reinforced over time so that those pathways are more likely to be chosen again in future, teaching us and shaping our actions.

An artificial neural network follows a similar structure on a smaller scale. Robots may be given a task and instructed to employ trial and error to determine the best way to achieve it. Early on, their behavior may look totally random to an outside observer, but by trying out different things, over time they'll learn which actions get them closer to their goals and will focus on those, continually improving their abilities.
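At its simplest, that trial-and-error loop can be sketched in a few lines of Python. The toy example below uses a handful of discrete actions and a simple value estimate per action; the action names and constants are illustrative, not Google's actual training code, which learns deep neural networks over continuous motor commands.

```python
import random

# A toy sketch of trial-and-error learning: the "robot" tries actions,
# observes a reward, and reinforces whichever actions scored well.
ACTIONS = ["reach", "turn", "pull", "push"]  # illustrative action set
EPSILON = 0.2   # how often to explore a random action
ALPHA = 0.1     # learning rate

value = {a: 0.0 for a in ACTIONS}  # estimated value of each action

def choose_action():
    # Early on this looks like blind fumbling; over time the robot
    # increasingly favors the actions that worked before.
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(value, key=value.get)

def learn(action, reward):
    # Nudge this action's value toward the observed reward, so
    # rewarding behavior is more likely to be chosen again.
    value[action] += ALPHA * (reward - value[action])
```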

While effective, this process is time-consuming, which is where cloud robotics comes in. Rather than have every robot go through this experimentation phase individually, their collective experiences can be shared, effectively allowing one robot to teach another how to perform a simple task, like opening a door or moving an object. Periodically, the robots upload what they've learned to a shared server and download the latest combined model, meaning each one gains a more comprehensive picture than it could build through its individual experience alone.
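Conceptually, that sharing step amounts to a server that merges whatever each robot has learned and hands the combined model back. The sketch below assumes the merge is a simple parameter average; the class and method names are hypothetical, not Google's API.

```python
import numpy as np

# A minimal sketch of the periodic upload/download cycle, assuming
# the shared server simply averages the parameters it receives.
class SharedModelServer:
    def __init__(self, num_params):
        self.params = np.zeros(num_params)
        self.pending = []

    def upload(self, robot_params):
        # Each robot periodically sends its locally updated parameters.
        self.pending.append(np.asarray(robot_params))

    def aggregate(self):
        # Fold everyone's experience into one common model.
        if self.pending:
            self.params = np.mean(self.pending, axis=0)
            self.pending = []

    def download(self):
        # Robots resume learning from the latest collective model.
        return self.params.copy()
```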

Using this cloud-based learning, the Google Research team ran three experiments, each teaching the robots in a different way, to find the most efficient and accurate approach to building a common model of a skill.

First, multiple robots on a shared neural network were tasked with opening a door through trial and error alone. As the video below shows, at first they seem to be blindly fumbling around as they explore actions and figure out which ones get them closer to the goal.

Robot exploring actions around the door handle.

After a few hours of experimentation, the robots collectively worked out how to open the door: reach for the handle, turn and pull. They understand that these actions lead to success, without necessarily building an explicit model of why that works.

Robot successfully opening the door.

In the second experiment, the researchers tested a predictive model. Robots were given a tray full of everyday objects to play with, and as they pushed and poked them around, they developed a basic understanding of cause and effect. Once again, their findings were shared, and the group could then use its ever-improving cause-and-effect model to predict which actions would lead to which results.

Using a computer interface showing the test area, the researchers could then tell the robots where to move an object by clicking on it, and then clicking a location. With the desired effect known, the robot can draw on its shared past experiences to find the best actions to achieve that goal.
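In code, that goal-directed use of a predictive model might look like the sketch below: score a few candidate pushes by how close the model expects the object to land to the clicked target, then pick the best one. The `predict_next_position` function here is a crude hypothetical stand-in for the robots' learned cause-and-effect model.

```python
import numpy as np

# A sketch of planning with a learned forward model: pick the action
# whose predicted outcome lands closest to the user's clicked target.
def predict_next_position(obj_pos, action):
    # Hypothetical stand-in for the shared cause-and-effect model;
    # here a push simply moves the object most of the way along
    # the action vector.
    return np.asarray(obj_pos) + 0.8 * np.asarray(action)

def best_action(obj_pos, target_pos, candidate_actions):
    def predicted_error(action):
        predicted = predict_next_position(obj_pos, action)
        return np.linalg.norm(predicted - np.asarray(target_pos))
    return min(candidate_actions, key=predicted_error)

# Example: choose among four candidate pushes toward a clicked target.
pushes = [(1, 0), (0, 1), (-1, 0), (0, -1)]
print(best_action((0.0, 0.0), (2.0, 1.0), pushes))  # -> (1, 0)
```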

Finally, the team employed human guidance to teach another batch of robots the old "open the door" trick. Each robot was physically guided by a researcher through the steps of the task, and playing that chain of actions back then formed a "policy" for opening the door that the robots could draw on.

Collective Robot Reinforcement Learning, Human Demonstration

The robots were then tasked with improving this policy through trial and error. As each robot explored slight variations in how to perform the task, the group got better at the job, to the point where its collective experience could account for differences in door and handle position, and eventually allowed the robots to open a door with a type of handle that none of them had encountered before.
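One simple way to picture that refinement step: start from the demonstrated action sequence, try slightly perturbed variants, and keep whichever variant performs best. This is a minimal hill-climbing sketch, not the researchers' actual method; `rollout` is a hypothetical stand-in for executing the motion on a robot and scoring the outcome.

```python
import random

# A sketch of improving a demonstrated policy by trial and error:
# perturb the human-guided motion slightly and keep variants that
# score better. Here a "policy" is just a list of joint targets.
def perturb(policy, noise=0.05):
    return [step + random.uniform(-noise, noise) for step in policy]

def improve(policy, rollout, iterations=100):
    # `rollout` is a hypothetical function that runs the policy on a
    # robot (or simulator) and returns a success score.
    best_score = rollout(policy)
    for _ in range(iterations):
        candidate = perturb(policy)
        score = rollout(candidate)
        if score > best_score:
            policy, best_score = candidate, score
    return policy
```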

So what's the point of all this? For neural networks, the more data the better, so a team of robots learning simultaneously and teaching each other can produce better results much faster than a single robot working alone. Speeding up that process could open the door for robots to tackle more complex tasks sooner.

Source: Google Research

7 comments
toolman65
step 1: open the door
step 2: find Sarah Connor
VirtualGathis
@toolman65: To expand on your example a bit... :)
Step 1: open the door
Step 2: connect all the robots to the internet
Step 3: watch in horror as they form a hive mind and overthrow their human masters
Step 4: find Sarah Connor
Keven
I think this is going to be the beginning of the end when you teach robots how to build and teach themselves what to do... there will be no reason for humans anymore. I think the beginning of the end is here...
Dan Lewis
We deserve whatever we get. If we want a robotic system that wants to take over, we have to program that 'want' in.
Our real fear should be about what humans may be doing soon with the robots that are already (and soon to be) among us.
Graeme Harrison
I agree with the comments that this is the first concrete step towards achieving 'SkyNet' and us humans needing Sarah Connor.
I disagree with Dan Lewis' comment "If we want a robotic system that wants to take over, we have to program that 'want' in." I think this research shows that robots could just 'stumble' upon a path of action that they then assess as 'beneficial', and immediately share it with other robots, without any human intervention, until it is too late. Even if one of their 'optimisation' goals is to do things as quickly as possible, they may quickly work out that ignoring human requests (or taking us out of the loop entirely) is VERY beneficial to lessening the time they waste on interacting with us.
ljaques
Ditto the other comments. Besides being a helluva lot safer, isn't it much quicker and cheaper to code the movements, DL 'em into a chip, and install it in the robot, or send the instructions via wire or wireless? AIs bother me, especially when built by the gov't. Skynet, here we come.
habakak
People's fears are always overblown. This will be mostly good. Technology always ends up creating more jobs. We are just going through a transition phase now where things have changed a bit faster than before (which has been happening since the start of technology - it's always speeding up). Put down the remote and read a bit of history.