Facebook is now using AI to describe photos to the blind


Browse through your Facebook News Feed and you'll see photos play a prominent part, meaning visually impaired users are missing out on a lot of updates from their friends. Now Facebook's engineers have harnessed the power of an artificial intelligence network to describe these pictures to blind or partially blind users.

Facebook is calling the system "automatic alternative text" and it's based on a neural network with billions of parameters, trained on millions of examples. Such neural networks – vast, layered models loosely inspired by the way the human brain works – are playing an increasingly important role in modern computing.

The AI software doesn't actually "see" the picture, but it can match the patterns in it against what it has learned from millions of example photos and make an educated guess about what's being shown. Part of the challenge, Facebook says, is in getting computers to recognize what's most important in an image, whether that's the people, the background or the "action."

For each image, the AI system returns a confidence score indicating how sure it is that it can identify what's in the picture. If the score is above 80 percent, an automatically generated caption appears. According to the engineers behind the system, that threshold is already being cleared for half of all the pictures on the social network, and the underlying technology is getting better all the time (improving with more examples is another key characteristic of neural networks).

When objects and people have been identified, Facebook's software constructs a sentence to describe the picture, usually ordered by how confident the AI is about the presence of each element. If there's some ambiguity about the picture, the sentence starts with "image may contain" to express that uncertainty.
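To make the thresholding and ordering concrete, here is a minimal Python sketch of how such a caption could be assembled. It is not Facebook's code: the function name build_alt_text, the (label, score) input format and the example labels are all hypothetical, and for simplicity the sketch always uses the hedged "Image may contain" opener rather than deciding when ambiguity warrants it.

```python
from typing import List, Tuple

# The 80 percent cut-off described in the article, expressed as a fraction.
CONFIDENCE_THRESHOLD = 0.80


def build_alt_text(detections: List[Tuple[str, float]],
                   threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Build a caption from (label, confidence) pairs.

    Hypothetical helper: keeps only labels scoring above the threshold,
    orders them from most to least confident, and returns an empty
    string when nothing is confident enough to caption.
    """
    confident = [(label, score) for label, score in detections
                 if score > threshold]
    if not confident:
        return ""

    # Most confident elements come first in the sentence.
    confident.sort(key=lambda item: item[1], reverse=True)
    labels = [label for label, _ in confident]

    if len(labels) == 1:
        listing = labels[0]
    else:
        listing = ", ".join(labels[:-1]) + " and " + labels[-1]
    return "Image may contain: " + listing
```

For example, build_alt_text([("tree", 0.91), ("two people", 0.97), ("dog", 0.55), ("outdoor", 0.85)]) drops the low-scoring "dog" and returns "Image may contain: two people, tree and outdoor".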

The feature is live now in the Facebook iOS app, as long as your language is set to English and you're in the US, UK, Canada, Australia or New Zealand. Facebook says it hopes to roll out the service to more platforms, languages and markets in the near future. It actually works with any screen reader software – on iOS you can enable it via the VoiceOver tool in the Accessibility section of Settings (under General), for example.

Coincidentally, Twitter has also started experimenting with a similar feature, though in this instance captions are added manually. Users on iOS and Android are being encouraged to add their own alt text captions for the benefit of the visually impaired. Letting humans do the work means more accuracy in the description, but it does depend on people putting in the time and effort to explain what they're posting.

Source: Facebook

1 comment
Daishi
"for the blind" This feature is 99.9% about improving big data and data mining for advertisers to profile their customer base with targeted ads and 0.1% "for the blind". I'm not saying that's a bad thing per se, if you aren't paying for the product you are the product as they say but they aren't being terribly forward with the implications of what the tech will be useful for.