How Google's Pixel phones take such terrific photos, especially in shaky hands
Google's wizardry with computational imaging is at the heart of what makes its Pixel phones such impressive photography tools, despite each relying on a single 12.2-megapixel camera. A new paper reveals some of the magic behind the impressive Night Sight and Super-Res Zoom features.
Smartphones have sprung out of nowhere to completely dominate the low end of the photography market in the last 10 years. While they can't compete with big-sensor DSLR and mirrorless rigs for pure image quality, they're so portable, so easy to use, so connected and so massively convenient that there's very little point in most people carrying a compact camera when there's such a handy gadget already in their pocket.
Google's Pixel 3 series is renowned for having one of the best cameras in the business, despite running just a single camera front and rear in a day and age when many competitors use up to four separate cameras to handle different conditions, plus 3D depth sensors that gauge distance and enable faux depth-of-field effects simulating large sensors paired with expensive wide-aperture lenses.
Where the others get their results by adding more hardware, Google excels through some very clever image processing that makes use of the wobbly hand-held technique the vast majority of phone pictures are taken with. A new paper and video give us some insight into exactly how it's done, specifically for two shooting modes: the astonishing Night Sight low-light mode, and Super-Res Zoom, which lets you zoom in and get clear shots that actually exceed the pixel count of the sensor they're shot with.
The key to it all is burst shooting, something that happens with almost every Pixel photo whether you ask for it or not. Press the shutter button and the camera captures a bunch of frames at once. This, plus a bunch of other processing, means you basically never have to worry about your subject blinking – the phone will sort through a couple dozen frames and simply serve up the one it thinks has the widest-open eyes and brightest smiles, making you feel like a terrific photographer.
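To get a feel for how a phone might pick a "keeper" out of a burst, here's a minimal sketch in Python. Google hasn't published its frame-selection scoring, so the gradient-energy sharpness metric and all the names below are purely illustrative:

```python
import numpy as np

def sharpness(frame: np.ndarray) -> float:
    """Score a frame by its gradient energy: blurrier frames score lower.
    (Illustrative metric only -- not Google's actual scoring.)"""
    gy, gx = np.gradient(frame.astype(float))
    return float(np.mean(gx**2 + gy**2))

def pick_best_frame(burst: list) -> int:
    """Return the index of the sharpest frame in the burst."""
    return int(np.argmax([sharpness(f) for f in burst]))

# Toy burst: a crisp checkerboard and a crudely blurred copy of it.
crisp = np.indices((8, 8)).sum(axis=0) % 2 * 255.0
blurred = (crisp + np.roll(crisp, 1, axis=1)) / 2  # 1-pixel smear
best = pick_best_frame([blurred, crisp])           # picks index 1, the crisp frame
```

A real pipeline would also score faces for open eyes and smiles, but the principle is the same: capture many frames, score them, keep the winner.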
But it also allows the software to get funky at sub-sensor resolution levels, pulling out details that you can't get out of single frames and building super-high-resolution shots similarly to how something like the Panasonic S1R does it, but using hand shake instead of sensor shift. This same shaking also contributes to noise- and moire-reduction algorithms that make these tiny phone cameras far better in low light than the sensors themselves have any right to be.
Put simply, the phone takes several shots from your wobbly hand-held shutter press, then lines them back up on top of one another. It then checks for motion within the frame – things that are actually in motion and not a result of camera wobble. This allows it to create a motion robustness map, figuring out which parts of the image it can super-resolve and which it can't.
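The align-then-check-for-motion step can be sketched roughly like this. Google's pipeline aligns small tiles with sub-pixel precision and builds a soft robustness map; this toy version uses whole-pixel brute-force alignment and a hard threshold, purely to show the shape of the idea (all names and thresholds are made up):

```python
import numpy as np

def align(ref: np.ndarray, frame: np.ndarray, max_shift: int = 3) -> np.ndarray:
    """Brute-force translational alignment: try every integer shift and
    keep the one that best matches the reference frame."""
    best, best_err = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            err = np.sum((ref - np.roll(frame, (dy, dx), axis=(0, 1)))**2)
            if err < best_err:
                best_err, best = err, (dy, dx)
    return np.roll(frame, best, axis=(0, 1))

def motion_mask(ref: np.ndarray, aligned: np.ndarray, thresh: float = 30.0) -> np.ndarray:
    """Per-pixel robustness: True where the aligned frame still disagrees
    with the reference, i.e. where something in the scene actually moved."""
    return np.abs(ref - aligned) > thresh

ref = np.zeros((16, 16)); ref[4:8, 4:8] = 255.0  # static bright square
frame = np.roll(ref, (1, 2), axis=(0, 1))        # camera wobble: whole frame shifted
frame[12, 12] = 255.0                            # something that genuinely moved
aligned = align(ref, frame)
mask = motion_mask(ref, aligned)                 # flags only the moving object
```

After alignment, the wobble-induced shift is cancelled out, so the only pixels left flagged in the mask are the ones where the scene itself changed – exactly the regions the merge step must treat cautiously.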
Then, in the parts of the image that are well controlled for motion and nicely overlaid, it begins combining the frames. The more frames it has to work with, the more information it can potentially pull out about details too small for a single pixel in regular resolution. It can use this information, along with calculations about edges, flats and detailed areas, to take a very educated guess at what's happening at a sub-pixel resolution, building photos more detailed than should be possible on a 12-megapixel sensor and scaling the resolution up.
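The payoff of combining frames is easy to demonstrate: averaging N aligned frames cuts random sensor noise by roughly the square root of N. The sketch below uses a plain mean over perfectly aligned synthetic frames, which is far cruder than Google's robust, edge-aware merge, but it shows why stacking frames helps so much in low light:

```python
import numpy as np

rng = np.random.default_rng(0)
scene = np.full((32, 32), 128.0)  # the "true" scene: a flat gray patch

# A 16-frame burst, each frame corrupted by independent sensor noise.
burst = [scene + rng.normal(0, 10, scene.shape) for _ in range(16)]

# Naive merge of perfectly aligned frames: just average them.
merged = np.mean(burst, axis=0)

single_noise = float(np.std(burst[0] - scene))  # ~10
merged_noise = float(np.std(merged - scene))    # ~10 / sqrt(16) = ~2.5
```

With 16 frames the residual noise drops to about a quarter of a single exposure's – which is why a burst from a tiny sensor can rival a much larger one once merged.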
Thus, Super-Res Zoom lets you pinch to zoom in and shoot photos that resolve past the limits of the camera. And the same technique of gathering information across multiple, slightly offset frames allows Google to pull impressive amounts of color and contrast information out of low-light shots in Night Sight mode, making for exceptionally good noise control and demosaicing, as well as removal of moire (in shots with high-frequency patterns) and chromatic aberration, or color bleed, in shots with sharp transitions between dark and light areas.
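Here's an idealized illustration of why slightly offset frames carry extra resolution. We sample a fine checkerboard at four half-pixel offsets: each individual low-res frame comes out completely flat (the pattern aliases away), yet scattering the samples back onto a 2x grid recovers the scene exactly. Real hand shake produces random, non-integer offsets and needs robust kernel weighting, so treat this as a best-case sketch:

```python
import numpy as np

hi = np.indices((8, 8)).sum(axis=0) % 2 * 255.0  # "true" fine-detail scene

# Four hand-shake offsets, each half a low-res pixel apart.
offsets = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Each low-res frame samples every second pixel; on its own, each 4x4
# frame is a uniform gray -- the checkerboard is invisible at this scale.
frames = [hi[dy::2, dx::2] for dy, dx in offsets]

# Scatter each frame's samples back onto the 2x-resolution grid.
recon = np.zeros_like(hi)
for (dy, dx), f in zip(offsets, frames):
    recon[dy::2, dx::2] = f
# recon now matches the original fine-detail scene exactly
```

No single frame contains the pattern, but the set of offset frames does – the extra resolution was hiding in the hand shake all along.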
If you'd like to get right down into the nitty-gritty of how Google's doing these steps, you can check out the paper released last week, or enjoy a more digestible explanation of the super-res zoom effect from last year at the Google AI Blog. But the video below provides an excellent quick overview as well.