
Figure's humanoid can now watch, learn and perform tasks autonomously

[Image: Figure's 01 humanoid robot demonstrates its new watch-and-learn capabilities by autonomously using a coffee machine. Perhaps not the most spectacular demo, but this could open the floodgates to a massive acceleration of general-purpose robotics]

Figure's Brett Adcock claimed a "ChatGPT moment" for humanoid robotics over the weekend. Now we know what he means: the robot can watch humans doing tasks, build its own understanding of how to do them, and start performing them entirely autonomously.

General-purpose humanoid robots will need to handle all sorts of jobs. They'll need to understand all the tools, devices, objects, techniques and objectives we humans use to get things done, and they'll need to be as flexible and adaptable as we are across an enormous range of dynamic working environments.

They're not going to be useful if they need a team of programmers telling them how to do every new job; they need to be able to watch and learn. Multimodal AIs capable of watching and interpreting video, then driving robotics to replicate what they see, have been taking revolutionary strides in recent months – as evidenced by Toyota's incredible "large behavior model" demonstration in September.
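Figure hasn't published the technical details of its system, but the broad recipe the field uses is imitation learning: pair video observations with recorded actions and train a policy to reproduce them. Here's a deliberately minimal behavior-cloning sketch in PyTorch – the network size, the 7-dimensional action vector and the synthetic stand-in data are illustrative assumptions, not Figure's (or Toyota's) actual architecture:

```python
# Minimal behavior-cloning sketch: learn a mapping from video frames to
# robot actions using demonstration data. This illustrates the general
# "watch and learn" recipe only; Figure's real method is unpublished.
import torch
import torch.nn as nn

class FramePolicy(nn.Module):
    """Maps a single RGB frame to a joint-velocity command (assumed 7-DOF)."""
    def __init__(self, action_dim: int = 7):
        super().__init__()
        self.encoder = nn.Sequential(            # tiny CNN image encoder
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, action_dim)    # regress an action vector

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(frames))

# Synthetic data standing in for "10 hours of video": frames paired
# with the expert actions recorded at the same timesteps.
frames = torch.randn(256, 3, 64, 64)
actions = torch.randn(256, 7)

policy = FramePolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
for epoch in range(10):                          # supervised imitation loop
    pred = policy(frames)
    loss = nn.functional.mse_loss(pred, actions)
    opt.zero_grad(); loss.backward(); opt.step()
```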

But Toyota is using bench-based robot arms in a research center. Figure, like Tesla, Agility and a growing number of other companies, is laser-focused on self-sufficient, full-body humanoids that can theoretically go into any workplace and eventually learn to take over any human task. And these are not research programs; these companies want products out in the market yesterday, starting to pay their way and getting useful work done.

[Image: Figure's 01 was walking within 12 months of development – a record, Adcock believes]

Adcock told us he hoped to have the 01 robot deployed and demonstrating useful work around Figure's own premises by the end of 2023 – and while that doesn't seem to have transpired at this point, a watch-and-learn capability in a humanoid is indeed big news.

The demonstration in question, mind you, is not Earth-shatteringly impressive: the Figure robot is shown operating a Keurig coffee machine, with a cup already in it. It responds to a verbal command, opens the top hatch, pops a coffee pod in, closes the hatch, presses the button, and lets the guy who asked for the coffee grab the full cup out of the machine himself. Check it out:

So yes, it's fair to say the human and the Keurig machine are still doing some heavy lifting here – but that's not the point. The point is, the Figure robot took 10 hours to study video, and can now do a thing by itself. It's added a new autonomous action to its library, transferrable to any other Figure robot running on the same system via swarm learning.
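Figure hasn't detailed how that fleet-sharing works either, but the concept is straightforward to sketch: a shared skill library that any robot can publish a newly learned task to, and that every other robot on the system can immediately execute from. The classes and method names below are hypothetical stand-ins for illustration, not Figure's actual API:

```python
# Illustrative sketch of a fleet-wide skill library: one robot learns a
# task, every robot subscribed to the same library can then execute it.
from dataclasses import dataclass, field

@dataclass
class Skill:
    name: str
    steps: list[str]                  # learned action sequence (simplified)

@dataclass
class FleetSkillLibrary:
    skills: dict[str, Skill] = field(default_factory=dict)

    def publish(self, skill: Skill) -> None:
        """One robot uploads a newly learned skill for the whole fleet."""
        self.skills[skill.name] = skill

@dataclass
class Robot:
    library: FleetSkillLibrary

    def learn_from_video(self, name: str, steps: list[str]) -> None:
        # stand-in for the ~10 hours of video study described above
        self.library.publish(Skill(name, steps))

    def perform(self, name: str) -> None:
        for step in self.library.skills[name].steps:
            print(f"executing: {step}")

library = FleetSkillLibrary()
teacher, student = Robot(library), Robot(library)
teacher.learn_from_video("make_coffee",
    ["open hatch", "insert pod", "close hatch", "press brew button"])
student.perform("make_coffee")        # this robot never saw the video
```

The point of that design is that the 10-hour learning cost is paid once per task, while execution comes free for every robot running on the same system.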

If that learning process is robust across a broad range of different tasks, then there's no reason why we shouldn't start seeing a new video like this every other day, as the 01 learns to do everything from peeling bananas, to putting pages in a ring binder, to screwing jar lids on and off, to using spanners, drills, angle grinders and screwdrivers.

It shouldn't be long before it can go find a cup in the kitchen, check that the Keurig's plugged in and has plenty of water in it, make the damn press-button coffee, and bring it to your desk without spilling it – a complex task combining its walking capabilities with a large language model's ability to break things down into actionable steps.
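That last part – turning "bring me a coffee" into executable steps – is where the language model earns its keep. Below is a hedged sketch of the pattern, with the LLM call stubbed out to a canned plan so it runs offline; call_llm and the skill names are invented for illustration, not anything Figure has confirmed:

```python
# Hedged sketch of LLM-style task decomposition: a high-level request is
# broken into steps that the robot's library of learned skills can execute.
def call_llm(prompt: str) -> list[str]:
    """Stub for an LLM planner; a real system would query a model here."""
    return [
        "walk to kitchen",
        "locate cup and place under Keurig",
        "check water reservoir level",
        "insert pod and press brew button",
        "carry full cup to desk without spilling",
    ]

# The set of actions this robot has already learned (assumed).
KNOWN_SKILLS = {
    "walk to kitchen",
    "locate cup and place under Keurig",
    "check water reservoir level",
    "insert pod and press brew button",
    "carry full cup to desk without spilling",
}

def plan_and_execute(request: str) -> None:
    steps = call_llm(f"Break this task into robot-executable steps: {request}")
    for step in steps:
        if step not in KNOWN_SKILLS:   # gate plans on skills actually learned
            raise RuntimeError(f"no learned skill for: {step}")
        print(f"executing: {step}")

plan_and_execute("bring me a coffee")
```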

So don't get hung up on the coffee; watch this space. If Figure's robot really knows how to watch and learn now, we're going to feel a serious jolt of acceleration in the wild frontier of commercial humanoid robotics as 2024 starts to get underway. And even if Figure is overselling its capabilities – not that any tech startup would dream of doing such a thing – it ain't gonna be long, and there's a couple dozen other teams manically racing to ship robots with these capabilities. This is happening.

Make no mistake: humanoid robots stand to be an absolutely revolutionary technology once they're deployed at scale, capable of fundamentally changing the world in ways not even Adcock and the other leaders in this field can predict. The meteoric rise of GPT and other language model AIs has made it clear that human intelligence won't be all that special for very long, and the parallel rise of the humanoids is expressly aimed at putting an end to human labor.

Things are happening right now that would've been absolutely unthinkable even five years ago. We appear to be right at the tipping point of a technological and societal upheaval bigger than the agricultural or industrial revolutions – one that could unlock a world of unimaginable ease and plenty, and/or relegate 95% of humans to the status of zoo animals or house plants.

How are you feeling about all this, folks? Personally, I'm a little wigged out. My eyebrows can only go so high, and they've been there for a good while now. I'm getting new forehead wrinkles.

Source: Figure

12 comments
jzj
Large Language Model (LLM) computing (i.e., ChatGPT) + Large Behavioral Model (LBM) (i.e., computing for learning robotics) = 90% human capabilities for most things and 1000% improvement over human capabilities for those things.
rlseifer
This is not just worrying, it's a 3-alarm fire. A lot of low-level workers will be eliminated from the labor force by this evolution, and eventually many not-so-low-level workers as well.
BlueOak
Yah, not enough noise being made about risks and limits. You don’t have to watch a sci-fi movie to have concerns.
CraigAllenCorson
I wonder how long it will be until robots such as these are granted full human rights?
Wavmakr
The absolute key phrase of this story is "transferrable to any other Figure robot running on the same system via swarm learning". Only one robot needs to learn a task, a map, how to fire a weapon, and immediately an entire army of robots instantaneously has the same exact knowledge. Not only that, with their problem solving abilities, any particular task or action improved upon by a single unit, will be virally absorbed by the entire army in real time. Something to think about.
doliver
It is watching videos. What happens when it sees that you drink the coffee after you make it? How will it respond to its first cup of Espresso?!!
Nelson
Supposedly there must be more and more of us to pay for the retirement of older workers, while AI, robotics and automation are going to make more and more of us obsolete. It does not make sense.
Cymon Curcumin
It didn’t even place the cup under the machine, it didn’t fish the used pod out and dispose of it, it didn’t even put the finished cup out on the counter for the human. It did the easiest parts of the easiest form of coffee making. The Tesla Optimus can impressively transfer an egg into an egg cooker but can’t peel the egg afterwards.

I’m not dismissing the potential of this as a new way of programming robots that can yield significant breakthroughs, but we are not talking about replacing humans. This is still an issue of picking out tasks that can be automated and redesigning more complex tasks so the automated part doesn’t interfere with the non-automated part. That is going to be a lot of re-engineering, and it won’t be adopted and implemented quickly. Nothing to do with technological productivity is adopted and implemented quickly. Businesses still run on DOS-based software from the late ’80s, and electronic shelf tags are still rare despite having been available for years. These bots are too slow to run and find out the price of a product when the product doesn’t scan at the cash register – something that always seems to increase profoundly whenever a new inventory system is introduced. They’ve yet to perform any work that is economically competitive with human labour – not one dollar/hour’s worth.

Would you pay someone to slowly load a Keurig and stand there? Would you pay someone to make you a Keurig coffee in the first place? If you couldn’t do it, you would need care far exceeding anything this thing could give.

They will not be adopted fast enough to solve the declining global birth rate. When such bots are able to do work that is worth more than what a completely untrained, unskilled, uneducated human living in a developing country with no minimum wage is willing to work for without being enslaved, then they will start to be a tiny factor economically. They are still far from achieving that.
Scott
Game Over. Absolutely incredible.
ANTIcarrot
I would assume that part of the 'learning' process will eventually involve an evaluation of whether a full android is needed for a given task, and/or suggesting a simpler model of robot if that would be cheaper.