Amazing optical microphone can separate multiple instruments from afar

Researchers Mark Sheinin (left) and Dorian Chan with their remarkable optical microphone, which is capable of constructing a high-fidelity sound signal from single instruments playing in groups

Researchers at Carnegie Mellon University have presented some remarkable audio from a new optical microphone system that uses cameras to see and reconstruct sonic vibrations. It can even cleanly separate a single instrument playing in a group.

This kind of sonic isolation is extremely difficult even for high-end audio microphones, so to be able to achieve it using nothing but two cameras and a laser? It feels a bit like black magic. But the results, which you can see in the video embedded at the end of this piece, are stunning.

Here's the basic theory: sound is nothing but a series of pressure waves that travel through the air. Anything that makes sound is simply vibrating to create those pressure waves. An optical microphone is basically a camera system designed to monitor and interpret vibrations on the surface of a sound source – or even objects placed near a sound source, which vibrate in sympathy with the sound waves in the air around them.

The Carnegie Mellon team's system shines a laser on the vibrating surface, creating a precise speckle pattern that distorts as the sound source vibrates. A pair of cameras record the changes in the speckle pattern at 63 frames per second, and a software algorithm is used to analyze the speckle pattern variations in the footage across the two cameras, and reconstruct an audio signal.
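As a rough illustration of what "tracking variation in a speckle pattern" can mean, here is a minimal sketch that estimates how far a one-dimensional pattern has slid between two frames. It is a generic brute-force correlation search, not the team's algorithm (which the article does not describe); the function name `estimate_shift` and the synthetic random pattern are assumptions made purely for the example.

```python
import random

def estimate_shift(reference: list[float], shifted: list[float], max_shift: int) -> int:
    """Integer s that best aligns shifted[i + s] with reference[i]."""
    # Remove the mean brightness so only the speckle texture is compared.
    ref_mean = sum(reference) / len(reference)
    shf_mean = sum(shifted) / len(shifted)
    ref = [v - ref_mean for v in reference]
    shf = [v - shf_mean for v in shifted]

    best_shift, best_score = 0, float("-inf")
    for s in range(-max_shift, max_shift + 1):
        # Only compare indices where the two rows overlap for this shift.
        lo = max(0, -s)
        hi = min(len(ref), len(shf) - s)
        score = sum(ref[i] * shf[i + s] for i in range(lo, hi)) / (hi - lo)
        if score > best_score:
            best_shift, best_score = s, score
    return best_shift

# Synthetic stand-in for one row of speckle intensities (hypothetical data).
rng = random.Random(0)
pattern = [rng.random() for _ in range(120)]
frame_a = pattern[10:110]  # row as seen in the first frame
frame_b = pattern[7:107]   # same row after the pattern slid by 3 pixels

print(estimate_shift(frame_a, frame_b, max_shift=5))
```

A real system would need sub-pixel precision and two-dimensional shifts; the point here is only that a displacement of the pattern can be turned into a number, frame after frame, and that sequence of numbers is the raw material for an audio signal.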

A laser places a speckle pattern on the sound source, and the cameras track variation in this pattern

The 63-fps frame rate might seem counter-intuitive here. Human hearing can distinguish tones oscillating at somewhere between around 20 and 20,000 cycles per second, so even ignoring all the other challenges, a 63-fps limit on input data would seem to place a 63-Hz upper limit on the sound this device can "see" (strictly, under the Nyquist sampling limit, only about half that).
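To make that apparent ceiling concrete, here is a small sketch of what happens when a tone is sampled only once per frame. It assumes ideal uniform sampling of a pure tone; the helper name `alias_frequency` is illustrative and not from the research.

```python
def alias_frequency(tone_hz: float, sample_rate_hz: float) -> float:
    """Apparent frequency of a pure tone after uniform sampling."""
    # Fold the tone into the band [0, sample_rate / 2].
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

FRAME_RATE_HZ = 63.0  # frame rate quoted in the article

for tone in (20.0, 440.0, 1000.0):
    print(tone, "Hz appears as", alias_frequency(tone, FRAME_RATE_HZ), "Hz")
```

Sampled this way, a 440-Hz concert A would be indistinguishable from a 1-Hz wobble, which is why one sample per frame is not enough to recover music.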

Not so. In fact, this optical mic can read sound up to a remarkable 63,000 Hz thanks to some very clever use of the cameras involved. One camera uses a global shutter, meaning that it reads its entire image sensor at the same time for each frame. The other camera uses a rolling shutter, so it reads its sensor as a series of a thousand consecutive horizontal lines per frame. Each of those lines is captured at a slightly different instant, so 63 frames of roughly 1,000 lines each yield around 63,000 time-staggered readings per second. The rolling shutter image therefore contains high-frequency information, which can be compared back to the global shutter image to account for things like the movement and tilt of a guitar as a musician plays. The software algorithm then assembles a sound out of this information.
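The arithmetic behind the 63,000 figure can be sketched as follows. This assumes each of the roughly 1,000 rows is read at an evenly spaced instant within the frame, with no idle time between frames; the function names are illustrative and not taken from the researchers' software.

```python
FRAME_RATE_HZ = 63      # frames per second, from the article
ROWS_PER_FRAME = 1000   # "a thousand consecutive horizontal lines"

def effective_sample_rate(frame_rate_hz: int, rows_per_frame: int) -> int:
    """Samples per second if every sensor row counts as one time sample."""
    return frame_rate_hz * rows_per_frame

def row_timestamp(frame_index: int, row_index: int,
                  frame_rate_hz: int = FRAME_RATE_HZ,
                  rows_per_frame: int = ROWS_PER_FRAME) -> float:
    """Time in seconds at which a given row is read (assumes even spacing)."""
    total_rows = frame_index * rows_per_frame + row_index
    return total_rows / (frame_rate_hz * rows_per_frame)

print(effective_sample_rate(FRAME_RATE_HZ, ROWS_PER_FRAME))  # 63000
```

Under these assumptions, consecutive rows are read roughly 16 microseconds apart, which is what lets a 63-fps camera carry information far above 63 Hz.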

"We've invented a new way to see sound," said Mark Sheinin, lead author on the research paper and a post-doctoral research associate at the Illumination and Imaging Laboratory in Carnegie Mellon's Robotics Institute. "It's a new type of camera system, a new imaging device, that is able to see something invisible to the naked eye."

Combining 63-fps video from two cameras, one with a global shutter and one with a rolling shutter, allows the researchers to recover a sound signal at 63,000 Hz

The team has tested this optical mic on guitar and violin, on a speaker cone, on tuning forks, and even on a Doritos bag sitting in front of a speaker and vibrating in sympathy with the ambient sound. They also used it to separate the audio from two guitars playing a duet, and from two speakers, each playing a different song beside each other.

"This system pushes the boundary of what can be done with computer vision," said co-author Matthew O'Toole, an assistant professor in the Robotics Institute. "This is a new mechanism to capture high speed and tiny vibrations, and presents a new area of research."

The team believe this tech could be used for much more than deciphering audio signals; vibration can be a key indicator of mechanical wear as faults develop in machinery, for example.

"If your car starts to make a weird sound, you know it is time to have it looked at," Sheinin said. "Now imagine a factory floor full of machines. Our system allows you to monitor the health of each one by sensing their vibrations with a single stationary camera."

Check out the optical mic setup and hear the results in the video below.

Dual-Shutter Optical Vibration Sensing (CVPR 2022 Oral)

Source: Carnegie Mellon University

2 comments
christopher
I wonder if you could do the same thing with one camera pointed at a piece of tissue paper?
TpPa
finally a use for them that isn't a Government covert mission.