DeepHand motion tracking enters the VR arms race

June 24, 2016

Researchers at Purdue University have developed a system that uses a depth-sensing camera and a deep-learning network to translate physical hand movements into VR environments.

The deep-learning network converts the hand position into numbers, then draws from a database of over 2.5 million poses to determine the best match to display.

The VR arms race is in full flight, and hands seem to be the next front line in the battle, whether that's fought with infrared cameras, electromagnetic sensors, handheld controllers like the PlayStation Move, Oculus Touch or those bundled with the HTC Vive, or even full-body exoskeletons. Now, a team at Purdue University is capturing hand movements with depth-sensing cameras and using a deep-learning network to understand millions of possible hand positions, so they can be displayed accurately in the virtual world.

The system, which the researchers call DeepHand, works much like the Leap Motion controller but makes use of a deep-learning "convolutional neural network." The camera reads the position and angle of various points on the hands and passes them to a specialized algorithm. That algorithm quickly scans a database of over 2.5 million poses and chooses the best match, which is then represented visually in VR.

"We identify key angles in the hand, and we look at how these angles change, and these configurations are represented by a set of numbers," says one of the paper's authors, doctoral student Ayan Sinha.
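
To make the idea of "a set of numbers" concrete, here is a minimal Python sketch of how a hand pose might be reduced to a list of joint angles. It is an illustration only, not the researchers' code: the functions `joint_angle` and `pose_vector`, the choice of one angle per interior finger joint, and the sample coordinates are all assumptions made for this example.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at point b, formed by the segments b->a and b->c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(y * y for y in v2))
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

def pose_vector(finger_joints):
    """Flatten a hand into a list of angles, one per interior joint of each finger."""
    angles = []
    for joints in finger_joints:
        # Slide a window of three consecutive joints along the finger.
        for a, b, c in zip(joints, joints[1:], joints[2:]):
            angles.append(joint_angle(a, b, c))
    return angles

# Hypothetical 3D joint positions (arbitrary units) for two fingers.
straight_finger = [(0, 0, 0), (0, 1, 0), (0, 2, 0), (0, 3, 0)]
bent_finger = [(1, 0, 0), (1, 1, 0), (1, 1, 1), (1, 0, 1)]

hand_vector = pose_vector([straight_finger, bent_finger])
print(hand_vector)
```

In this toy case the straight finger yields angles of about 180 degrees and the bent finger about 90 degrees, so the whole pose collapses into a short list of numbers that can be compared with other poses.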

To ensure that the virtual hand movements are displayed as fast as possible, the program can predict which configurations the hands are likely to morph into next by identifying the "spatial nearest neighbors" within the database. The algorithm is also able to figure out and display the position of parts of the hand that the camera may be unable to directly see, based on the orientation of adjacent areas.
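
The lookup described above can be pictured as a nearest-neighbor search over pose vectors. The Python sketch below is a simplified stand-in, assuming a brute-force Euclidean comparison over a four-entry toy database; the article does not detail how DeepHand actually searches its 2.5 million poses, and the names `nearest_poses`, `pose_db` and the angle values are hypothetical.

```python
import heapq
import math

def nearest_poses(query, database, k=3):
    """Return the k closest (distance, name) pairs by Euclidean distance."""
    scored = ((math.dist(query, vec), name) for name, vec in database.items())
    return heapq.nsmallest(k, scored)

# Hypothetical toy database: pose name -> vector of joint angles in degrees.
pose_db = {
    "open_palm": [180, 180, 180, 180],
    "fist": [90, 90, 90, 90],
    "point": [180, 180, 90, 90],
    "pinch": [150, 120, 150, 120],
}

# A noisy reading that should sit closest to the "point" pose.
observed = [175, 178, 95, 88]

matches = nearest_poses(observed, pose_db, k=2)
best_name = matches[0][1]
print(best_name, [name for _, name in matches])
```

The closest entry stands in for the best match to display, while the runners-up play the role of neighboring poses the hand could plausibly move into next. A brute-force scan like this would be far too slow at the scale of millions of poses, which is why a real system would need a faster search structure.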

Because DeepHand is a deep-learning network, the researchers first had to train it to recognize gestures by feeding the database into it. Doing so required some pretty powerful processing, but once that heavy lifting is out of the way, the system can run on a standard computer.

The DeepHand research paper by Ayan Sinha, Chiho Choi and Karthik Ramani is available online.

Source: Purdue University