Facebook is now using AI to describe photos to the blind
Browse through your Facebook News Feed and you'll see photos playing a prominent part, which means visually impaired users miss out on a lot of updates from their friends. Now Facebook's engineers have harnessed an artificial intelligence network to describe these pictures to blind and partially sighted users.
Facebook is calling the system "automatic alternative text," and it's based on a neural network with billions of parameters, trained on millions of example images. Such neural networks – layered systems of simple processing units loosely modeled on the human brain – are playing an increasingly important role in modern computing.
The AI software doesn't actually "see" the picture, but it can compare the objects in it against patterns learned from millions of similar photos and make an educated guess about what's being shown. Part of the challenge, Facebook says, is getting computers to recognize what's most important in an image, whether that's the people, the background or the "action."
For each image, the AI system returns a confidence score indicating how sure it is about what's in the picture. If this is above 80 percent, an automatically generated caption appears. According to the engineers behind the system, that target is already being hit for half of all the pictures on the social network, and the underlying technology is getting better all the time (another key characteristic of neural networks).
When objects and people have been identified, Facebook's software constructs a sentence to describe the picture, usually ordered by how confident the AI is about the presence of each element. If there's some ambiguity about the picture, the sentence starts with "image may contain" to express that uncertainty.
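The process described above can be sketched in a few lines of Python. This is purely an illustration of the logic as reported – filter out low-confidence tags, order the rest by confidence, and prefix the result with "Image may contain" – not Facebook's actual implementation; the threshold value, tag names and function are assumptions for the example.

```python
# Hypothetical sketch of the captioning logic described in the article.
# The 0.8 cutoff mirrors the reported 80 percent confidence threshold;
# everything else (labels, structure) is invented for illustration.

CONFIDENCE_THRESHOLD = 0.8  # captions only appear above this score


def build_alt_text(detections):
    """detections: list of (label, confidence) pairs from the recognizer."""
    confident = [(label, score) for label, score in detections
                 if score >= CONFIDENCE_THRESHOLD]
    if not confident:
        return None  # below the threshold, no caption is generated
    # Order labels by how confident the model is about each element
    confident.sort(key=lambda pair: pair[1], reverse=True)
    labels = [label for label, _ in confident]
    return "Image may contain: " + ", ".join(labels) + "."


example = [("two people", 0.96), ("smiling", 0.91),
           ("outdoor", 0.85), ("dog", 0.55)]
print(build_alt_text(example))
# -> Image may contain: two people, smiling, outdoor.
```

Note how the low-confidence "dog" tag is silently dropped rather than risk a wrong description – the trade-off the article implies between coverage and accuracy.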
The feature is live now in the Facebook iOS app, as long as your language is set to English and you're in the US, UK, Canada, Australia or New Zealand. Facebook says it hopes to roll out the service to more platforms, languages and markets in the near future. It actually works with any screen reader software – on iOS you can enable it via the VoiceOver tool in the Accessibility section of Settings (under General), for example.
Coincidentally, Twitter has also started experimenting with a similar feature, though in this instance captions are added manually. Users on iOS and Android are being encouraged to add their own alt text captions for the benefit of the visually impaired. Letting humans do the work means more accuracy in the description, but it does depend on people putting in the time and effort to explain what they're posting.
One reader commented: "This feature is 99.9% about improving big data and data mining for advertisers to profile their customer base with targeted ads, and 0.1% 'for the blind.' I'm not saying that's a bad thing per se; if you aren't paying for the product, you are the product, as they say. But they aren't being terribly forthcoming about what the tech will really be useful for."