Creating a 3D map of a room could someday be as simple as randomly placing four microphones within the space, then snapping your fingers. Researchers from Switzerland’s EPFL (École Polytechnique Fédérale de Lausanne/Swiss Federal Institute of Technology) have recently done so on a limited scale, and are now excited about the technology’s possible applications.
A team led by Prof. Martin Vetterli developed the technique, in which a computer algorithm analyzes and compares the sounds (such as finger snaps) picked up by four microphones – it doesn’t matter where in the room those microphones are placed.
For each mic, the computer is able to differentiate between a sound as it’s initially received directly from its source, and echoes of that sound picked up even just a fraction of a second later. By analyzing the lag between the initial sound and its echo, and by comparing the length of the lags measured at each microphone (keeping in mind the mics’ locations relative to one another), the computer can accurately calculate how far away the walls are in all directions.
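The arithmetic behind that lag-to-distance step can be sketched in a few lines of Python. This is a simplified illustration, not the EPFL algorithm: it assumes a single flat wall, a speed of sound of 343 m/s, a known distance between the sound source and the microphone, and that both sit equally far from the wall. The function names `echo_path_length` and `wall_distance` are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly 20 °C


def echo_path_length(direct_distance_m, lag_s, c=SPEED_OF_SOUND):
    """Length of the reflected path: the direct path plus the extra
    distance the echo covered during the measured lag."""
    if lag_s < 0:
        raise ValueError("the echo cannot arrive before the direct sound")
    return direct_distance_m + c * lag_s


def wall_distance(direct_distance_m, lag_s, c=SPEED_OF_SOUND):
    """Distance to the wall, assuming (for illustration only) that the
    source and the microphone are equally far from it.

    In that geometry the reflected path is the hypotenuse of a right
    triangle with legs `direct_distance_m` and twice the wall distance.
    """
    path = echo_path_length(direct_distance_m, lag_s, c)
    return 0.5 * math.sqrt(path ** 2 - direct_distance_m ** 2)


# Finger snap right next to the microphone, echo heard 20 ms later.
print(round(wall_distance(0.0, 0.020), 2))  # about 3.43 m
```

A single lag only constrains one path length. Comparing lags across several microphones at known relative positions is what lets the directions to the walls be worked out as well.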
The algorithm can also tell the difference between echoes rebounding off walls for the first or second time, and can establish a unique echo “signature” for each wall.
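One textbook way to reason about first and second bounces is the image-source model from room acoustics, in which each echo is treated as if it came from a mirror image of the source behind the wall. The sketch below uses that model for illustration only, since the article does not describe the researchers' implementation. It works in 2D with two hypothetical parallel walls at x = 0 and x = 5 m, and every name and number in it is an assumption.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def mirror_across_wall(point, wall_x):
    """Mirror a 2D point across the vertical wall x = wall_x."""
    x, y = point
    return (2.0 * wall_x - x, y)


def arrival_time(apparent_source, mic, c=SPEED_OF_SOUND):
    """Time for sound to travel from an (image) source to the mic."""
    return math.dist(apparent_source, mic) / c


def wall_from_image(source, image):
    """A wall lies halfway between the source and its first-order image."""
    return (source[0] + image[0]) / 2.0


source, mic = (1.0, 2.0), (3.0, 2.0)
left_wall, right_wall = 0.0, 5.0

# First-order echoes: the source mirrored once across each wall.
image_left = mirror_across_wall(source, left_wall)
image_right = mirror_across_wall(source, right_wall)

# Second-order echo: the right-wall image mirrored again across the left wall.
image_right_left = mirror_across_wall(image_right, left_wall)

times = {
    "direct": arrival_time(source, mic),
    "first order, left wall": arrival_time(image_left, mic),
    "first order, right wall": arrival_time(image_right, mic),
    "second order, right then left": arrival_time(image_right_left, mic),
}
for label, t in sorted(times.items(), key=lambda item: item[1]):
    print(f"{label}: {t * 1000:.2f} ms")
```

In this toy room the second-order echo travels 12 m against 4 m and 6 m for the two first-order echoes, so it arrives last. Once a first-order image has been located, `wall_from_image` recovers the wall position from it.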
The system was first tested in a simple empty room, one wall of which could be moved back and forth, and it consistently mapped the room to within one millimeter of its actual dimensions. It was then tried in a more complex environment, an alcove within the Lausanne Cathedral, where it was said to deliver “good partial results.” The researchers believe that further tests using more microphones will prove more successful.
It is now hoped that once perfected, the technology could be used in architecture, forensics, or as an indoor navigational tool on smartphones. Police might even be able to use it to map the room that a mobile call is originating from, if the caller wanders around the room as they’re talking.
Source: EPFL
The only obvious limitation is how quiet you can make the room when undertaking this survey. The church is a nice example, as is a library perhaps. Don't think you could image a train station or busy area easily.
...unless you move to the upper audible frequency band, 13-15 kHz, and send a specific range of pulsed tones from at least two sources within the area being surveyed. May not have to be much louder than 80-90 dB as most crowd noise is in a lower band. Might get a dirty look from the odd dog or bird in the area, but a small price to pay :)
I don't think they will ever get the accuracy that they desire from echoes only. However, the echoes PLUS a stereo photo or two, or the echoes PLUS a photo of a laser scan or two will probably give amazing results.