
Google algorithm constructs (weird, glitchy) faces from 8x8 pixel photos

From the 8x8 picture on the left, the Google Brain team's algorithm generated the next four 32x32 photos as possible options

Starting with the image on the left, a number of different algorithms were fed a downsampled 8x8 image and asked to generate 32x32 images; the three on the right are the Google team's results

We've seen before how Google is experimenting with its RAISR algorithm to add detail and sharpness to images, but a new paper from a team of Google Brain researchers shows how machine learning might take things to a whole new level.

While the RAISR project sought to bring crispness and clarity to images that were already decent photos, the new Pixel Recursive Super Resolution paper shows how images might eventually be upsized from tiny, blocky, 8x8 pixel sprites that don't look like anything until you squint, to a far more detailed 32x32 pixel format.
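To get a feel for just how little an 8x8 image carries, here's a minimal NumPy sketch: it block-averages a 32x32 image down to 8x8 and then does a naive nearest-neighbour upscale back, which is all you can do without a learned prior (the synthetic random image and function names are mine, purely for illustration):

```python
import numpy as np

def downsample(img, factor):
    """Block-average an image by an integer factor (e.g. 32x32 -> 8x8)."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def nearest_upscale(img, factor):
    """Naive nearest-neighbour upscaling: repeat each pixel factor x factor times."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

# A synthetic 32x32 grayscale stand-in for a photo.
rng = np.random.default_rng(0)
hi = rng.random((32, 32))

lo = downsample(hi, 4)          # the blocky 8x8 input the algorithm sees
naive = nearest_upscale(lo, 4)  # a 32x32 guess with no prior knowledge

print(lo.shape, naive.shape)    # (8, 8) (32, 32)
```

The naive result is consistent with the 8x8 input but contains none of the lost detail; filling in that detail plausibly is exactly what the learned model is for.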

From the almost indecipherable 8x8 image on the left, the Google Brain team's algorithm generated the 32x32 image in the middle, which is amazingly close to the original images on the right

The process works by first ingesting a ton of similar photos at high resolution – in this case, tightly cropped celebrity headshots. The computer rapidly downscales these shots to the same blocky, low-res format as the image it's trying to upscale, and works backwards from there.
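That corpus-preparation step can be sketched in a few lines of NumPy: downscale each high-res photo to produce (low-res, high-res) pairs for the model to learn from. The block-averaging and the 32x32/8x8 sizes mirror the setup described here; the function name and random stand-in images are mine:

```python
import numpy as np

def make_training_pairs(photos, factor=4):
    """Block-average each high-res photo down by `factor`, pairing the
    low-res result with its original so a model can learn the mapping."""
    pairs = []
    for hi in photos:
        h, w = hi.shape
        lo = hi.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
        pairs.append((lo, hi))
    return pairs

rng = np.random.default_rng(0)
headshots = [rng.random((32, 32)) for _ in range(3)]  # stand-ins for real photos
pairs = make_training_pairs(headshots)
print([lo.shape for lo, hi in pairs])  # [(8, 8), (8, 8), (8, 8)]
```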

In a celebrity headshot, for example, it knows where roughly to find the eyes, nose, mouth, hair and jawline. Having located them, it can go through its extensive database of prior photos to see what kinds of pixel structures would usually go where, to build an estimate of how the photo might have looked in higher resolution.
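As a loose illustration of that "recall what usually goes where" idea, here's a crude exemplar-based stand-in – emphatically not the paper's actual pixel-recursive neural network, which synthesizes new pixels rather than recalling whole photos. Given an 8x8 input, it returns the high-res photo in its database whose downsampled version matches best:

```python
import numpy as np

def downsample(img, factor=4):
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def exemplar_upscale(lo, database):
    """Return the high-res exemplar whose downsampled version best matches `lo`.

    `database` is a list of high-res images (e.g. celebrity headshots).
    This is a toy stand-in for a learned prior: instead of synthesizing
    pixels, it just recalls the closest photo it has seen before.
    """
    best, best_dist = None, np.inf
    for hi in database:
        d = np.sum((downsample(hi) - lo) ** 2)
        if d < best_dist:
            best, best_dist = hi, d
    return best

rng = np.random.default_rng(1)
database = [rng.random((32, 32)) for _ in range(50)]

target = database[7]  # pretend this was the original photo
guess = exemplar_upscale(downsample(target), database)
print(np.array_equal(guess, target))  # True: it recalls the matching exemplar
```

The real model blends such learned structure per pixel rather than copying one photo, which is why its outputs can mix features in the uncanny ways described below.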

Images of bedroom interiors and celebrity headshots were used to test the new algorithm

That explains how things can end up looking as glitchy and terrifying as some of the examples the algorithm comes up with – you can see the system throwing different facial features at an amorphous facial blob to see which ones stick. Most of the results are pretty comical.

Even so, it has shown a decent ability to fool people already, with about 10 percent of its celebrity headshot images fooling human judges into guessing they were taken by a camera.

The results are even better without the complexities of the human face to contend with – it fooled people with a success rate of about 28 percent when the photos were of bedroom interiors instead.

The best- and worst-performing images, as ranked by how often they were able to fool human subjects into thinking they were taken by a camera

Obviously, this is a very limited technology. You can put to rest any suspicion that it might end up identifying people out of security camera footage. It's purely based on a learning machine's best guess for what's happening behind those pixels, in the context of a ton of images it's viewed previously. It's digital art as much as it is science.

I do wonder, though, whether an approach like this might be able to deliver much more accurate results if it's fed a string of pixelated face image frames from a video; whether there might be enough information in a whole string of these kinds of low-res images to piece together a more accurate face model.
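In an idealized setting, that intuition holds: if several low-res frames sample the scene at different sub-pixel offsets, and the offsets are known, they interleave back into the full-resolution signal. Here's a 1-D NumPy sketch of the idea (my own toy construction – real video would require estimating the shifts and handling blur and noise):

```python
import numpy as np

factor = 4
rng = np.random.default_rng(2)
hi = rng.random(32)  # a 1-D stand-in for a high-res scan line

# Each "video frame" samples the scene at a different sub-pixel offset.
frames = [hi[k::factor] for k in range(factor)]  # four 8-sample low-res frames

# Knowing the offsets, the frames interleave back onto the fine grid exactly.
recon = np.empty_like(hi)
for k, frame in enumerate(frames):
    recon[k::factor] = frame

print(np.allclose(recon, hi))  # True: four shifted frames carry the full detail
```

So a handful of frames really can contain far more information than any single one – which is the principle behind multi-frame super-resolution.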

Full paper: Google Brain

2 comments
Bob
You can't turn a lack of information into a good picture without more data. However, I would like to find a photo program that turns a blurry or out of focus photo into a sharp one. Anybody know of one?
EH
Yes, a few frames of data would give a lot more constraints and a more accurate interpolation. There have been algorithms designed to up-res moving pictures since at least S. Farsiu et al.'s "Fast and Robust Multiframe Super Resolution" in IEEE Transactions on Image Processing, 2004. "Super-resolution" algorithms are related to "compressed sensing," another big buzzword.