OpenAI's new text-to-3D system: The dawn of voice-controlled CAD
Like DALL-E for 3D models, Shap-E can turn natural language descriptions into 3D assets

OpenAI is making rapid advances in a new text-to-3D object system it's been working on. The Shap-E AI, available as an open-source download, can generate 3D assets straight from a text description, or build them from supplied images.

Back in December last year – or about a billion years ago in the AI timeframe – OpenAI released its Point-E system, capable of taking a text prompt and using it to build rudimentary 3D models in the form of point clouds.

Now, however, the company has released Shap-E, a new system that's much faster, and capable of building its models as "implicit functions" – mathematical formulas that can be rendered either as textured meshes or as neural radiance fields (NeRFs), a type of 3D model developed from 2D images using machine learning.
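
For the curious, here's a rough idea of what an "implicit function" is. The sketch below is not Shap-E's code, and the function names are made up for illustration. It assumes the simplest possible case, a hand-written formula for a sphere, whereas Shap-E generates the parameters of far more complex neural functions. The principle is the same, though: the shape is stored as a formula you can query at any point in space, and a renderer samples that formula to produce something visible.

```python
import math


def sphere_sdf(x, y, z, radius=1.0):
    """Hypothetical implicit function: signed distance to a sphere at the origin.

    Negative inside the shape, zero on the surface, positive outside.
    """
    return math.sqrt(x * x + y * y + z * z) - radius


def sample_occupancy(sdf, resolution=16, extent=1.5):
    """Query the implicit function on a regular grid.

    Returns the set of (i, j, k) grid indices whose sample point lies
    inside or on the shape. A mesh extractor or renderer would start
    from samples like these.
    """
    step = 2 * extent / (resolution - 1)
    inside = set()
    for i in range(resolution):
        for j in range(resolution):
            for k in range(resolution):
                x = -extent + i * step
                y = -extent + j * step
                z = -extent + k * step
                if sdf(x, y, z) <= 0.0:
                    inside.add((i, j, k))
    return inside


voxels = sample_occupancy(sphere_sdf)
print(f"{len(voxels)} of {16 ** 3} grid points fall inside the sphere")
```

Because the shape lives in the formula rather than in a fixed list of points, the same function can be sampled at whatever resolution a downstream application needs.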

The tech is pretty nerdy, but the ambition here is really interesting. These 3D models are designed to work with downstream applications, so let's speculate a little on what's going on here.

OpenAI's earlier Point-E system generated 3D point clouds like these

If you can talk to a computer and have it generate 3D models in response to natural language, then you'll be able to talk to a GPT-like AI that's capable of acting as a CAD designer. That means you might be able to create designs for products, parts, buildings, sculptures and whatnot without touching a mouse, making edits verbally or through some other means.

It means that video games might be able to generate items flexibly on the fly in response to a player's words or actions. "Ho, blacksmith! Make me a 10-foot sword with a rounded, bell-shaped end and two large, circular guards at the hilt!"

It's an early step toward verbally programmed 3D visual effects, and a potential way to generate anything from an outfit, to a home, to a companion in VR/AR applications.

And of course, it'll eventually interface with 3D printing, meaning that shapes conjured up by these AIs will most certainly be turning up in the real world once the output quality improves. Indeed, the way things are going, you probably won't have to interact with this kind of system directly; you'll talk to your language model-based AI assistant, and it'll prompt the 3D-maker AI for you, because it'll write better prompts than you.

Which of course means you're looking at the birth of the tech that's going to give LLMs like ChatGPT the ability to manufacture objects in the real world of their own volition, should such a volition arise in an AI with sufficient access. A slightly sobering thought, if you're on the "these things are going to murder us all" train.

But on the other hand, the idea of super-intelligent AIs building squads of exterminator robots to eradicate biological life is unlikely to become a problem. That's an idea so dumb a human could come up with it; they'll figure out much more effective methods. Sleep tight!

Source: OpenAI

3 comments
stevendkaplan
…this sounds like the beginning of Skynet’s ability to design Terminators
paul314
The next big step will (or won't) be mechanisms. "Computer, design me a mechanical cuckoo clock."
Daishi
Something interesting about AI killing off humans through unexpected ways is that we are not equipped to identify how they would go about it. We would almost need an AI trained to look for ways they could do that to help police against them, but the way of training such an AI would be through "gain of function" training inside a simulation, where one AI would be trying to do exactly that, and such a thing would involve the risk of that AI escaping the simulation.