DARPA's new 1.8-gigapixel camera is a super high-resolution eye in the sky
DARPA recently revealed information on its ARGUS-IS (Autonomous Real-Time Ground Ubiquitous Surveillance Imaging System), a surveillance camera that uses hundreds of smartphone image sensors to record a 1.8 gigapixel image. Designed to be carried by a drone (probably an MQ-1 Predator), ARGUS can, from an altitude of 20,000 ft (6,100 m), keep a real-time video eye on an area 4.5 miles (7.2 km) across at a resolution of about six inches (15 cm).
One of the greatest needs of a ground commander in these days of asymmetrical warfare is to know what is happening on the field of action. This alone allows a commander to guide forces to where they will have the greatest effectiveness, while also substantially reducing the chances of surprise actions by enemy forces. This level of situational awareness is difficult enough to acquire on a conventional battlefield, but has been nearly impossible when the field of action includes spatially messy theaters, such as towns, cities, and oil refineries.
Improving situational awareness, particularly in asymmetric warfare scenarios, has been one of DARPA's primary missions in recent years. The ARGUS-IS program is developing a high-resolution, wide-area video surveillance system that provides real-time video across a large theater of action, identifies and tracks moving objects, and provides up to 65 individually targeted video windows for close-up observation.
The 1.8 gigapixel digital camera is the simple part of the ARGUS-IS system, consisting of a matrix of CMOS optical sensors, high quality imaging optics, and a six-axis stabilized gimbal mounting system. The sensor is made up of a matrix of 368 Aptina MT9P031 5-megapixel CMOS image sensors of the kind used in smartphones. Each sensor has an active area of 5.7 x 4.3 mm, so the width of the sensor matrix is about 90 mm (3.5 in). Using a little trigonometry and basic optics, we can estimate the focal length of the imaging lens to be about 85 mm (3.35 in).
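As a rough check on that estimate, here is a minimal sketch using the pinhole-camera (similar triangles) approximation for a camera looking straight down at flat ground. It computes the focal length two ways: from the overall field of view, and from the ground footprint of a single pixel. The 2.2 micron pixel pitch is an assumption taken from the sensor's published specifications rather than from the figures above, and the function names are purely illustrative. The two simple estimates bracket the figure of about 85 mm.

```python
# Pinhole-camera estimate of the ARGUS-IS lens focal length.
ALTITUDE_M = 6100.0          # 20,000 ft, from the article
GROUND_WIDTH_M = 7200.0      # 4.5 miles across, from the article
SENSOR_WIDTH_MM = 90.0       # width of the sensor matrix, from the article
GROUND_RESOLUTION_M = 0.15   # about six inches, from the article
PIXEL_PITCH_UM = 2.2         # assumption: pixel size from the sensor datasheet


def focal_length_from_field_mm(sensor_width_mm, ground_width_m, altitude_m):
    """Similar triangles: sensor width / focal length = ground width / altitude."""
    return sensor_width_mm * altitude_m / ground_width_m


def focal_length_from_pixel_mm(pixel_pitch_um, ground_resolution_m, altitude_m):
    """The same relation applied to one pixel and its footprint on the ground."""
    pixel_pitch_mm = pixel_pitch_um / 1000.0
    return pixel_pitch_mm * altitude_m / ground_resolution_m


f_field = focal_length_from_field_mm(SENSOR_WIDTH_MM, GROUND_WIDTH_M, ALTITUDE_M)
f_pixel = focal_length_from_pixel_mm(PIXEL_PITCH_UM, GROUND_RESOLUTION_M, ALTITUDE_M)
print(f"from field of view:   {f_field:.0f} mm")
print(f"from pixel footprint: {f_pixel:.0f} mm")
```

The spread between the two numbers mostly reflects how loosely the inputs are rounded ("about six inches", "about 90 mm"), so neither should be read as the actual lens specification.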
Using commercial off-the-shelf (COTS) image sensors for wide-angle imaging requires telecentric imaging optics; otherwise, massive computer processing would be needed to correct the images. When a telecentric lens is used, the focused light hits the image sensors perpendicularly. This avoids brightness variations resulting from the lenslets positioned over the CMOS pixels and color distortions due to misalignment of the incoming light with the pixels' Bayer filters.
The camera has four lenses, which are used to avoid gaps in the image when the 368 separate images are combined into a single master image. The image above shows that the optically active portion (grey inner rectangle) of a CMOS image sensor does not fill the chip on which it is fabricated – there is extra room needed for wiring. If the 368 image sensors of the ARGUS-IS were packed in a single matrix, a significant part of the surveilled field of view would not be imaged.
Instead, the ARGUS-IS sensor matrix is split into four parts, each having 92 sensors. In the image above, the matrix is split into green, red, blue, and yellow submatrices. (These colors are not the ones to which the sensors are sensitive; they are simply for reference.) The desired image (in the center of the figure) is divided into repeating 2 x 2 patterns of sensor-sized tiles. The green sensors are in the upper left, the red sensors in the upper right, the blue sensors in the lower left, and the yellow sensors in the lower right of each 2 x 2 pattern. Each of the four submatrices now has plenty of room around the edges of the sensors to accommodate the necessary wiring structures. Each of the four camera lenses feeds one of the submatrices, and the four partial images are then electronically stitched together into a single image covering the entire field of view.
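The interleaving described above can be sketched in a few lines. This is only an illustration of the 2 x 2 pattern: the 4 x 6 tile grid is a hypothetical size chosen to keep the printout small (the real layout of the 368 sensors is not given here), and the function name is made up for the example.

```python
from collections import Counter

# Position inside each repeating 2 x 2 cell -> submatrix, as described in the text.
SUBMATRIX_BY_PARITY = {
    (0, 0): "green",   # upper left
    (0, 1): "red",     # upper right
    (1, 0): "blue",    # lower left
    (1, 1): "yellow",  # lower right
}


def submatrix_for(row, col):
    """Return the submatrix (and hence the lens) that images the tile at (row, col)."""
    return SUBMATRIX_BY_PARITY[(row % 2, col % 2)]


# Hypothetical tile grid, NOT the real ARGUS-IS dimensions.
ROWS, COLS = 4, 6
layout = [[submatrix_for(r, c) for c in range(COLS)] for r in range(ROWS)]

# Print one letter per tile: G, R, B or Y.
for row in layout:
    print(" ".join(name[0].upper() for name in row))

# Count how many tiles each submatrix is responsible for.
counts = Counter(name for row in layout for name in row)
print(dict(counts))
```

Because neighboring tiles always belong to different submatrices, the sensors within any one submatrix sit at every other tile position in both directions, which is where the room for wiring comes from.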
Now comes the hard part. The ARGUS-IS takes 12 frames a second to maintain video surveillance over its field of view. The sensor data amounts to 12 bits per pixel, so the camera delivers a flood of raw image data amounting to 32.4 GB/s, while the Common Data Link used by the ARGUS-IS has a capacity of 34.25 MB/s. Clearly, a great deal of data compression must take place in the airborne ARGUS unit. For this purpose, a 32-processor unit that handles both data compression and object tracking is flown along with the ARGUS-IS camera.
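The arithmetic behind those figures is easy to reproduce. The sketch below uses only the numbers quoted above (1.8 gigapixels, 12 bits per pixel, 12 frames per second, and a 34.25 MB/s link) and decimal units (1 GB = 1,000 MB); the variable names are illustrative.

```python
PIXELS_PER_FRAME = 1.8e9    # 1.8 gigapixels
BITS_PER_PIXEL = 12
FRAMES_PER_SECOND = 12
LINK_MB_PER_S = 34.25       # Common Data Link capacity quoted in the article

# Raw sensor output in bits per second, then in gigabytes per second.
raw_bits_per_s = PIXELS_PER_FRAME * BITS_PER_PIXEL * FRAMES_PER_SECOND
raw_gb_per_s = raw_bits_per_s / 8 / 1e9

# How much the data must shrink to fit through the link.
required_compression = raw_gb_per_s * 1000 / LINK_MB_PER_S

print(f"raw data rate:        {raw_gb_per_s:.1f} GB/s")
print(f"required compression: about {required_compression:.0f} to 1")
```

The required ratio comes out a little under a thousand to one, which is why the on-board processing has to do far more than ordinary video compression.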
Lawrence Livermore National Laboratory (LLNL) was given the task of developing methods to compress and analyze the raw video data. Most of the visual information in an aerial image does not change from frame to frame: rooftops don't change unless someone walks on them (or a bird flies by). The LLNL software works by identifying interesting moving objects, tracking them as they move, and recording changes in their appearance. The researchers claim that this approach, together with JPEG 2000 video compression, results in a thousand-fold compression of the raw video data. This amazing level of compression allows the use of the Common Data Link for air-to-ground communications.
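To illustrate why a mostly static scene compresses so well, here is a minimal sketch of simple frame differencing on a toy grayscale image. This is not the LLNL software, whose methods are not described here; it is a stand-in technique, and the function name, tile size, and threshold are all assumptions made for the example. Only tiles that differ from the previous frame would need to be transmitted.

```python
def changed_tiles(previous, current, tile=2, threshold=10):
    """Return (tile_row, tile_col) for each tile whose pixels changed noticeably.

    Frames are lists of rows of integer brightness values. A tile counts as
    changed if any pixel in it differs by more than `threshold`.
    """
    rows, cols = len(current), len(current[0])
    changed = []
    for top in range(0, rows, tile):
        for left in range(0, cols, tile):
            biggest_diff = max(
                abs(current[r][c] - previous[r][c])
                for r in range(top, min(top + tile, rows))
                for c in range(left, min(left + tile, cols))
            )
            if biggest_diff > threshold:
                changed.append((top // tile, left // tile))
    return changed


# Toy 4 x 6 scene: a uniform background, then one bright "vehicle" pixel appears.
background = [[50] * 6 for _ in range(4)]
frame = [row[:] for row in background]
frame[1][4] = 200

moving = changed_tiles(background, frame)
print(moving)  # [(0, 2)]
```

In this toy case only one of the six tiles changes, so five sixths of the frame could be skipped; a real system would also have to cope with camera motion, lighting changes, and sensor noise.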
DARPA's aim is to be able to store 70 hours of imagery data within the ground station so that a commander can look at an area that was ignored in yesterday's real-time surveillance, and see the entire day's video record of that area. Storing the uncompressed raw video for that period would require roughly eight petabytes (32.4 GB/s sustained for 70 hours). Instead, the compressed data stream from the ARGUS-IS is stored, which only requires about six terabytes of data storage, only twice the size of my US$200 backup drive.
Now that the video data is acquired, people need additional help to figure out what is happening. The traditional interface between video data and human analysts (eyeballs) is far too inefficient to cope with this volume of data. This is not a problem for ARGUS-IS alone; it also limits the effective use of a number of other high-volume intelligence assets.
Helping to make sense of everything happening over a ten to twenty square mile field of action is the job of the Ground Exploitation System (GES). The GES provides a visual interface to the ground imagery rather like that of Google Earth, allowing dozens of users to view the background imagery, moving target indicators that follow tens of thousands of ground targets, and 65 VGA-sized video windows that keep track of locations of particular interest.
While some of the higher-level functions of the ARGUS-IS system are still being optimized, the overall functionality of the system is amazing, especially given that its capabilities are probably considerably greater than those currently being revealed. A better feel for the "basic capabilities" of the ARGUS-IS is provided by a video from the PBS television show NOVA, which was given unprecedented access to the ARGUS-IS system by DARPA.