If robots are going to become part of our everyday lives, they'll need to learn to work with everyday things. That means being able to read instruction manuals and figure out how to use new machines. That's the plan of researchers at Cornell University, who have programmed a robot barista that can not only make a latte, but also figure out how to use an unfamiliar espresso maker.

Developing robots is more than a matter of mechanical engineering. There's also the problem of how to teach them how to carry out tasks. For research and factory robots, this can be done by programming them directly, or by guiding them through their paces using controllers like keyboards, joysticks, or Waldos. However, if robots are to work in more human environments like homes, offices, shops, and restaurants, they need to be able to learn by themselves.

Led by Ashutosh Saxena, assistant professor of computer science, the Cornell team set about developing a robot barista that can use its experience and that of other robots, along with text materials, to deduce how to use an unfamiliar machine. If the robot barista has operated three other coffee machines, reasons Saxena, then it should be able to figure out a fourth.

Part of the answer is an online database of mechanical movements produced by other robots and a crowdsourcing project where members of the public provide lessons to the robot barista on how to move. The idea is that when a person learns how to use a new machine, it isn't a matter of learning all over again how to use a knob, button, or lever every time a new one is encountered. Knowing how to use one tells us how to use similar ones.

So it is with the robot barista. It uses movements it learned from working a different coffee machine or any of 116 devices, in tasks such as flushing a toilet, putting a cup under a soda vending machine, or opening a water tap. These movements can then be adapted for the new machine.
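To make the idea of reusing movements concrete, here is a minimal sketch in Python. It is not the Cornell team's code: the part library, the three-number "shape" descriptors, the waypoints, and the function names are all invented for illustration, and a simple nearest-match lookup stands in for whatever the real system does.

```python
import math

# Hypothetical library of parts the robot has already operated. Each entry
# has a crude shape descriptor (width, depth, height in metres) and a recorded
# trajectory of waypoints relative to the part. All values are made up.
KNOWN_PARTS = [
    {
        "name": "toilet flush handle",
        "shape": (0.12, 0.02, 0.02),
        "trajectory": [(0.0, 0.0, 0.05), (0.0, 0.0, 0.0), (0.0, -0.03, -0.04)],
    },
    {
        "name": "water tap knob",
        "shape": (0.04, 0.04, 0.03),
        "trajectory": [(0.0, 0.0, 0.05), (0.0, 0.0, 0.0), (0.01, 0.01, 0.0)],
    },
]


def closest_known_part(shape):
    """Return the known part whose shape descriptor is nearest to `shape`."""
    return min(KNOWN_PARTS, key=lambda part: math.dist(part["shape"], shape))


def adapt_trajectory(trajectory, new_position):
    """Shift part-relative waypoints so they start from the new part's position."""
    return [
        tuple(w + p for w, p in zip(waypoint, new_position))
        for waypoint in trajectory
    ]


def plan_for_new_part(shape, position):
    """Borrow the motion of the most similar known part and move it into place."""
    match = closest_known_part(shape)
    return match["name"], adapt_trajectory(match["trajectory"], position)


# A long, thin part on an unfamiliar machine borrows the flush-handle motion.
name, plan = plan_for_new_part((0.11, 0.02, 0.03), (1.0, 2.0, 3.0))
print(name, plan)
```

In this toy version, "adapting" a movement only means shifting it to the new part's position. A real system would also have to account for the part's orientation, size, and the force needed to move it.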

According to Cornell, this was a big step, but not enough to get the job done. As anyone who has tried to work a new machine can tell you, the instructions are often a big help. So, the robot barista was programmed with a deep learning algorithm that allows it to take its experience and combine it with written instructions, which it can download and parse for information on how to work a new coffee machine by breaking the task down into small steps.

One reason why deep learning is such an important part of the Cornell solution is that natural language suffers from "noise." The robot needs to know the linguistic difference between a knob and switch, or a handle and a lever, then match these up against the proper movements to work them. This involves moving through information in layers from the general to specific in terms of both the instructions and the movements.
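As a rough illustration of what matching words to movements involves, the Python sketch below breaks instruction text into steps and maps each one to an action and a part. It is a deliberate simplification: a hand-written lookup table stands in for the deep learning model described above, and the vocabulary and function names are assumptions made for this example.

```python
# Toy vocabulary, invented for illustration. Several words can name the same
# kind of part or action, which is one source of the "noise" in instructions.
PART_WORDS = {
    "knob": "knob",
    "dial": "knob",
    "switch": "switch",
    "button": "button",
    "handle": "handle",
    "lever": "lever",
}
ACTION_WORDS = {
    "turn": "rotate",
    "rotate": "rotate",
    "twist": "rotate",
    "push": "press",
    "press": "press",
    "pull": "pull",
    "lift": "pull",
}


def parse_step(sentence):
    """Return (action, part) for one sentence; None where nothing matched."""
    words = [w.strip(".,;:!").lower() for w in sentence.split()]
    action = next((ACTION_WORDS[w] for w in words if w in ACTION_WORDS), None)
    part = next((PART_WORDS[w] for w in words if w in PART_WORDS), None)
    return action, part


def parse_instructions(text):
    """Split instructions on full stops and parse each non-empty sentence."""
    return [parse_step(s) for s in text.split(".") if s.strip()]


steps = parse_instructions("Turn the dial clockwise. Then press the button.")
print(steps)
```

A fixed table like this fails as soon as a manual uses a word that is not in it, which is one reason a learned model is attractive for the job.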

In addition, the robot barista needs to be able to identify objects by shape rather than location, since the position of the controls can vary widely from machine to machine. It does this using a 3D camera and laser rangefinder that build a "point cloud" of coordinates, which allows it to identify objects and plan the trajectory of movements to work the controls.
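The short Python sketch below shows why shape is a more useful cue than location. It treats a point cloud as a plain list of (x, y, z) coordinates and computes a bounding-box size that stays the same wherever the object sits. The sample points and function names are assumptions made for this example, not details from the Cornell system.

```python
def bounding_box_extents(points):
    """Width, depth, and height of the axis-aligned box around the points."""
    xs, ys, zs = zip(*points)
    return (max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))


def centroid(points):
    """Average position of the points, usable as a target for the gripper."""
    n = len(points)
    return tuple(sum(axis) / n for axis in zip(*points))


def shift(points, offset):
    """Move every point by the same offset, as if the part sat elsewhere."""
    return [tuple(c + o for c, o in zip(p, offset)) for p in points]


# A made-up cluster of points standing in for one control on a machine.
lever = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 1.0, 0.0), (0.0, 1.0, 0.5)]
same_lever_elsewhere = shift(lever, (10.0, 20.0, 30.0))

# The location changes, but the shape descriptor does not.
print(centroid(lever), centroid(same_lever_elsewhere))
print(bounding_box_extents(lever), bounding_box_extents(same_lever_elsewhere))
```

A bounding box is about the crudest shape description possible, and it changes if the part is rotated. It is used here only to show that a descriptor can be independent of where the part is located.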

The Cornell team says that the robot barista has a 60 percent accuracy score on new machines, and that part of the reason it isn't higher is that the visual system has trouble with the complex light patterns thrown off by shiny surfaces.

According to the team, the next step in developing the robot barista is to incorporate a sense of touch for tactile feedback and visual monitoring to keep it from bumping into things. In addition, the robot will need to be programmed to use trial and error as part of its learning process – just like a novice human barista.

The Cornell team's findings will be presented as a paper (PDF) and demonstration at the 2015 Robotics: Science and Systems conference in Rome on July 16.

The video below shows the robot barista making a latte.
