Scrolling through your Facebook, Twitter or Instagram feed is proof positive that we love taking photos - of our food, our children, our drunken antics and, most of all, ourselves. No matter how good your happy snap, finding a way to caption it can be tricky. It's still early days, but one day you could hand off the responsibility to a Captionbot.
Powered by Microsoft's Cognitive Services, the bot looks over your images and gives rudimentary descriptions of what it can see using a Computer Vision API, an Emotion API and a Bing Image API. This is the same base software Microsoft has used for its How Old Do I Look? system.
To actually create the captions, this system has been coupled with the language system from Tay, Microsoft's attempt at a chat bot that was shut down after a vulnerability led to it tweeting racist and sexist remarks.
The photo captioning system is not completely accurate, but it attempts to describe the person in an image, what they're doing and their emotions in the moment. It can also recognize animals and describe landscapes, although it did respond with "I am not really confident" to both of the images we uploaded, before mistaking one of our male journalists for a woman. Okay, so that journalist was me...
What Microsoft's system won't do is read the caption aloud, which means blind and visually impaired people may still need to turn to Facebook's bot for help. The Facebook bot gives suggestions as to what's in an image, with each response qualified by a rating of how confident it is in the description.
At the moment, the Captionbot system is in testing. Once it's returned a caption, you can rate the response. Before you try to corrupt it by uploading all the crazy, lewd photos you've got, it's worth bearing in mind that the system keeps all the photos it has analyzed.