Face detection software has slowly crept into mainstream use, from Facebook photo tagging to Android phone unlocking, but new research looks set to move the technology on significantly. Scientists at Yahoo Labs and Stanford University have come up with a new approach that can register faces at any angle, even when they are partially hidden, making them easier than ever to detect.
Sachin Farfade and Mohammad Saberian at Yahoo Labs in California, and Li-Jia Li at Stanford University, say their new algorithm is simpler and more accurate than many alternative methods. To understand how it works, we first need to take a quick historical detour to 2001 and the revolutionary breakthrough made by computer scientists Paul Viola and Michael Jones.
Viola and Jones overcame an impasse in face detection technology by ignoring the thornier issue of recognition (working out whose face it is) and concentrating on spotting that a face is present at all. Their algorithm looks for a light vertical line (the nose) crossed by a dark horizontal line (the eyes) in a "detection cascade," a series of quick tests that discards non-face regions early. This has proved very effective at detecting faces seen head-on and is incorporated into many of the digital cameras currently on the market.
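One ingredient of that pattern, a dark horizontal band (the eyes) sitting above a lighter region, can be measured cheaply with a summed-area table. The Python sketch below is only an illustration under simplifying assumptions: the function names, the toy 4x4 image and the thresholds are all hypothetical, and a real Viola-Jones detector learns a large set of such features and their thresholds from training data rather than using hand-picked ones.

```python
def integral_image(img):
    """Build a summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def eye_band_feature(ii, top, left, height, width):
    """Lower-half brightness minus upper-half brightness.

    Large and positive when the upper band (eyes) is darker than
    the band beneath it (cheeks).
    """
    half = height // 2
    upper = rect_sum(ii, top, left, half, width)
    lower = rect_sum(ii, top + half, left, half, width)
    return lower - upper


def cascade_accepts(ii, stages):
    """Run the stages in order and reject at the first failed test.

    Each stage is (top, left, height, width, threshold). The values
    used below are hand-picked for the toy image, not learned.
    """
    for top, left, height, width, threshold in stages:
        if eye_band_feature(ii, top, left, height, width) < threshold:
            return False  # early rejection keeps the cascade fast
    return True


# Toy window: two dark rows (eyes) above two bright rows (cheeks).
face_like = [[10] * 4, [10] * 4, [200] * 4, [200] * 4]
flat = [[100] * 4 for _ in range(4)]
stages = [(0, 0, 4, 4, 500), (0, 1, 4, 2, 200)]

print(cascade_accepts(integral_image(face_like), stages))  # True
print(cascade_accepts(integral_image(flat), stages))       # False
```

Because every rectangle sum costs four lookups regardless of its size, most image windows can be thrown out after one or two stages, which is what makes the cascade fast. It also hints at why the approach struggles with tilted or covered faces: the expected light and dark bands are no longer where the feature looks for them.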
The technique isn't as good at spotting faces at an angle or those which are partially concealed, though, which brings us back to Farfade, Saberian and Li. The team has taken a fundamentally different approach from its predecessors, employing an advanced type of machine learning known as a deep convolutional neural network. Essentially, a huge database of annotated images is used to teach the software what a face looks like.
Farfade and his colleagues have built a database containing 200,000 images of faces and 20 million images without faces. The neural network powering their detection engine was then trained over 50,000 iterations, each using a batch of 128 images. The result is a tool that can spot faces at many different angles (even upside down) and pick out individual faces in a picture that contains a lot of them.
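To put those training figures in perspective, the short Python sketch below works out how many images the network is shown in total and draws one batch of the stated size. It is a rough illustration, not the researchers' code: the split of 32 faces per batch is an assumption made here for the example, the images are stood in for by index numbers, and the function names are hypothetical.

```python
import random

# Figures quoted in the article.
NUM_ITERATIONS = 50_000
BATCH_SIZE = 128
NUM_FACES = 200_000
NUM_NON_FACES = 20_000_000


def total_samples_drawn(iterations=NUM_ITERATIONS, batch_size=BATCH_SIZE):
    """Number of images shown to the network over the whole training run."""
    return iterations * batch_size


def sample_batch(rng, batch_size=BATCH_SIZE, faces_per_batch=32):
    """Draw one batch as a shuffled list of (label, image_index) pairs.

    faces_per_batch is an assumed value for illustration only.
    """
    faces = [("face", rng.randrange(NUM_FACES))
             for _ in range(faces_per_batch)]
    others = [("non_face", rng.randrange(NUM_NON_FACES))
              for _ in range(batch_size - faces_per_batch)]
    batch = faces + others
    rng.shuffle(batch)
    return batch


print(total_samples_drawn())  # 6400000

batch = sample_batch(random.Random(0))
print(len(batch))  # 128
```

In other words, 50,000 iterations of 128 images means the network sees 6.4 million samples during training, fewer than the 20.2 million images in the database, so how each batch is sampled matters.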
"We evaluated the proposed method with other deep learning based methods and showed that our method results in faster and more accurate results," say the team members, calling their creation the Deep Dense Face Detector. "We are planning to use better sampling strategies and more sophisticated data augmentation techniques to further improve performance of the proposed method for detecting occluded and rotated faces."
As the technology filters through into consumer and commercial products, it could mean that eventually your Kinect sensor and the CCTV system on your local high street will be able to spot you more quickly than ever. It can even be used retrospectively to look through old photos and videos, according to the researchers.
Source: Cornell University Library via MIT Technology Review