The beautiful, hilarious surrealism of early text-to-video AIs

Will Smith eating spaghetti in a video generated by ModelScope, with a faint Shutterstock logo hinting at the source of its training data

A new creative AI system called ModelScope is now pumping out short videos in response to text prompts. The early results are wonderfully bizarre and thoroughly memeworthy – but it's immediately clear how immensely powerful these tools will become.

Developed by Alibaba's DAMO Vision Intelligence Lab and available to try on Huggingface, ModelScope is a "multi-stage text-to-video diffusion model" that takes plain English text prompts, attempts to work out what you're hoping to see, then generates and de-noises a short video for you. You can play with it online through a very simple interface. It's very early days for this sort of thing, which makes this the perfect time to marvel at both its incredible capabilities and its bizarre misunderstandings of the world.
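For a rough feel of what "generates and de-noises" means, here is a toy sketch in Python. It is not ModelScope's code, and every name in it (embed_prompt, denoise_step, generate_video, mean_abs_error) is hypothetical. A real diffusion model uses a trained neural network to predict and strip out noise; this stand-in simply nudges random "frames" a little closer to a vector derived from the prompt on each pass, which is enough to show the loop's overall shape.

```python
import random


def embed_prompt(prompt, size):
    """Hypothetical stand-in for a text encoder: maps a prompt to a fixed vector."""
    rng = random.Random(prompt)  # seeding with the string makes this deterministic
    return [rng.uniform(-1.0, 1.0) for _ in range(size)]


def denoise_step(frame, target, strength):
    """Move each value a fraction of the way from its noisy state toward the target."""
    return [x + strength * (t - x) for x, t in zip(frame, target)]


def generate_video(prompt, num_frames=4, frame_size=8, steps=20, strength=0.2, seed=0):
    """Start from pure noise and repeatedly denoise every frame toward the prompt vector."""
    rng = random.Random(seed)
    target = embed_prompt(prompt, frame_size)
    frames = [[rng.gauss(0.0, 1.0) for _ in range(frame_size)] for _ in range(num_frames)]
    for _ in range(steps):
        frames = [denoise_step(frame, target, strength) for frame in frames]
    return frames, target


def mean_abs_error(frames, target):
    """Average distance between the frames and the prompt vector."""
    total = sum(abs(x - t) for frame in frames for x, t in zip(frame, target))
    return total / (len(frames) * len(target))


if __name__ == "__main__":
    prompt = "Will Smith eating spaghetti"
    noisy, target = generate_video(prompt, steps=0)
    clean, _ = generate_video(prompt, steps=20)
    print("error before denoising:", round(mean_abs_error(noisy, target), 4))
    print("error after denoising: ", round(mean_abs_error(clean, target), 4))
```

The more passes the loop makes, the closer the frames get to what the prompt asked for. In a real system, the quality of the result depends on how well the trained network has learned what the world looks like, which is where the spaghetti goes wrong.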

By far the most popular use of this tech at the moment seems to be making celebrities eat things, and it's easy to see why.

Like other generative AI systems, this one has been trained on a large dataset of existing human-created video, which raises some interesting legal questions about IP owned by large copyright holders.

"The fundamental problem with generative AI and deep fakes in all of these new AI systems is that the training data that is being used is not owned by the deep fakers," says Hyperreal founder and CEO Remington Scott. "And the copyright holders aren't getting paid. It's a fundamental problem that is going to become really big in IP. Soon, people will be training AIs on all the Avatar movies, then building whole new stories using AI. That's not gonna fly. We saw how bad Napster was for the music industry; this is Napster 2.0 for the whole IP industry."

"We're in the Wild West right now, but watch how it's gonna play out," he continues. "One studio is going to take somebody to court and say 'open up the training data, let's see what you trained that on.' And if they didn't use that studio's material, every other studio will be watching to say 'ah, but you used mine.'"

Fascinating stuff. Watch how quickly this technology evolves; if image and text generation are any indication, things are about to go asymptotic.

Source: Huggingface

1 comment
Smokey_Bear
That's funny stuff, obviously it has a long ways to go, but with AI, that might just mean a couple years.