Microsoft Captionbot will tell you what's in your photos

Gallery (5 images):
1/5: Microsoft Captionbot will take a look at your image and tell you what's going on
2/5: The system saves images that have been uploaded in an attempt to improve its hit rate
3/5: There are three separate elements to Captionbot's capabilities
4/5: Microsoft's own images always seem to return positive with correct captions, ours weren't quite as successful
5/5: Captionbot lets users upload their own photo, though there's also a range of stock images to use while the system is in testing

Scrolling through your Facebook, Twitter or Instagram feed is proof positive that we love taking photos - of our food, our children, our drunken antics and, most of all, ourselves. No matter how good your happy snap, finding a way to caption it can be tricky. It's still early days, but one day you could hand off the responsibility to a Captionbot.

Powered by Microsoft's Cognitive Services, the bot looks over your images and gives rudimentary descriptions of what it can see using a Computer Vision API, an Emotion API and a Bing Image API. This is the same base software Microsoft has used for its How Old Do I Look? system.

To actually create the captions, this system has been coupled with the language system from Tay, Microsoft's attempt at a chat bot that was shut down after a vulnerability led to it tweeting racist and sexist remarks.

The photo captioning system is not completely accurate, but it attempts to describe the person in an image, what they're doing and their emotions in the moment. It can also recognize animals and describe landscapes, although it did respond with "I am not really confident" to both the images we uploaded, before mistaking one of our male journalists for a woman. Okay, so that journalist was me...
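
To give a feel for how a bot might hedge its answers like this, here is a minimal Python sketch. It is not Microsoft's code: the function name `qualify_caption`, the 0-to-1 confidence score and the thresholds are all assumptions made for illustration, and only the hedging phrases echo the kind of responses Captionbot returned.

```python
def qualify_caption(description: str, confidence: float) -> str:
    """Prefix a caption with wording that reflects a confidence score.

    Hypothetical helper: the score range and thresholds below are
    illustrative assumptions, not values taken from Captionbot.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")

    if confidence >= 0.8:
        # High confidence: state the caption fairly plainly.
        prefix = "I am pretty sure it's"
    elif confidence >= 0.5:
        # Medium confidence: soften the claim.
        prefix = "I think it's"
    else:
        # Low confidence: say so up front.
        prefix = "I am not really confident, but I think it's"

    return f"{prefix} {description}."


if __name__ == "__main__":
    print(qualify_caption("a person on a surf board", 0.3))
```

Keeping the wording separate from the recognition step means the same score could just as easily be shown as a numeric rating instead of a phrase.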

What Microsoft's system won't do is read the caption aloud, which means blind and visually impaired people may still need to turn to Facebook's bot for help. The Facebook bot suggests what's in an image and qualifies each response with a rating of how confident it is in the description.

At the moment, the Captionbot system is in testing. Once it has returned a caption, you can rate the response. Before you try to corrupt it by uploading all the crazy, lewd photos you've got, it's worth bearing in mind that the system keeps all the photos it has analyzed.

Source: Microsoft

5 comments
tedfire
Well, out of ten pictures I tried, it got them all wrong!
It said a koala was a dog. It said a skeleton was a bird in front of a mirror and it said the Earth was a pair of skis! Back to the drawing board, Microsoft!
DavidB
Ah, but is it smart enough to know that "Susan and me at the beach" is the correct form, rather than the increasingly ubiquitous "Susan and I at the beach"?
That would be some impressive Artificial Intelligence in an area where actual intelligence seems to be lacking.
guzmanchinky
Hahaha! So much fun! Seriously, get some friends together tonight and set up a drinking game to guess how well this AI guesses on each image. The results I got are pretty hilarious! I seriously admire the science that goes into this, however. But you can see this is where AI is still way behind humans. For now...
Stephen N Russell
Update & use for: Education Training planning marketing alone.
kwalispecial
I understand that there is some pretty cool engineering here, but it still has a long way to go.
Gold pocket watch > thought it was a cell phone
Kid reading a book > got that one right
Recurve bow and arrows leaning against a target > street sign (not a terrible guess...)
Owl perched on roof > bird on a while (nice job)
But here was the best one: Hand written note, pen on white paper, measurements for a tuxedo > "I think it's a person on a surf board in a skate park."