To make virtual worlds feel more immersive, artists need to fill them with buildings, rocks, trees, and other objects. Creating and placing all those objects by hand quickly drives up development time and cost. But now, researchers at Nvidia have taught artificial intelligence systems how to generate and detail new virtual cityscapes, by training neural networks on real video footage.

Unlike traditional algorithms that need to be programmed with specific instructions, neural networks are more like organic brains, learning from "experience" over time. That experience is fed into the system as large data sets, and the system can then use the rules it has learned to generate its own content. In recent years, that approach has been used to apply different art styles to videos, create short videos out of still photos based on predictions of what would happen next, and generate extra in-between frames to turn any clip into slow motion.

In the new work, the team ran the PyTorch deep learning framework on Nvidia Tesla V100 GPUs on a DGX-1, and trained the neural network on thousands of videos from the Cityscapes and Apolloscapes datasets. The researchers then built the basics of a virtual city using Unreal Engine 4, marking out the general outline and placement of things like buildings, trees and cars. Based on everything it has learned, the neural network then fills in the blanks, including fine detail, color, lighting and texture.
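
As a rough illustration of the kind of input described above, the outline of a scene can be written as a grid of class labels, one per pixel. Networks that turn a labelled layout into an image commonly receive that grid as one binary mask per object class. The sketch below shows only that encoding step in plain Python. The class list, the tiny layout and the function name `one_hot_layout` are all made up for illustration, and the article does not say whether Nvidia's system encodes its input exactly this way.

```python
# Hypothetical class list, loosely based on the objects the article mentions.
CLASSES = ["road", "building", "tree", "car"]


def one_hot_layout(label_map, num_classes):
    """Turn an H x W grid of class indices into num_classes binary H x W masks."""
    height = len(label_map)
    width = len(label_map[0])
    # One all-zero mask per class.
    masks = [[[0] * width for _ in range(height)] for _ in range(num_classes)]
    for y, row in enumerate(label_map):
        if len(row) != width:
            raise ValueError("all rows of the layout must have the same width")
        for x, class_id in enumerate(row):
            if not 0 <= class_id < num_classes:
                raise ValueError(f"unknown class index {class_id}")
            # Mark this pixel in the mask that belongs to its class.
            masks[class_id][y][x] = 1
    return masks


# A made-up 3 x 4 scene outline: buildings top left, trees top right,
# road along the bottom with one car on it.
layout = [
    [1, 1, 2, 2],
    [1, 1, 2, 0],
    [0, 3, 0, 0],
]

masks = one_hot_layout(layout, len(CLASSES))
for name, mask in zip(CLASSES, masks):
    print(name, mask)
```

In a real pipeline, masks like these would be stacked into a tensor and passed to the trained network, which would output the detailed, textured frame.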

The end result is a virtual world for animation or video games that can be generated far faster than having human artists build it from scratch. Of course, if specific things need to be added, removed or edited by hand, that can be done afterwards. The technique isn't designed to remove the creative human element from the process, but rather to speed up its more tedious parts.

"Neural networks, specifically generative models, are going to change the way graphics are created," says Bryan Catanzaro, lead researcher on the team. "One of the main obstacles developers face when creating virtual worlds, whether for game development, telepresence, or other applications, is that creating the content is expensive. This method allows artists and developers to create at a much lower cost, by using AI that learns from the real world."

A demo of the system, which includes a simple driving game through an AI-generated cityscape, will be presented at the NeurIPS conference in Montreal this week.

The team describes the work in the video below, and the paper is available online (PDF).

Source: Nvidia
