Word-picture association is one of the basic mechanisms of human memory. As children, it helps us learn language by verbalizing what we see; as adults, it is an invaluable aid to visualizing broader concepts, and it may help those with a language-based learning disability (LBLD). Now researchers from the University of Washington and the Allen Institute for Artificial Intelligence have created the first fully automated computer program, named LEVAN, that teaches itself everything there is to know about a visual concept by associating words with images.
LEVAN (Learning EVerything about ANything) is designed to search for a given word among the millions of lines of text held on Google Books, then associate the results with matching images it finds on the Web to learn all possible variations of that concept. The program then displays the outcome as a wide-ranging list of images that users can explore to help them grasp subjects quickly and in great detail.
"It is all about discovering associations between textual and visual data," said Ali Farhadi, a University of Washington assistant professor of computer science and engineering. "The program learns to tightly couple rich sets of phrases with pixels in images. This means that it can recognize instances of specific concepts when it sees them."
The program uses Google’s Ngram feature to conduct its word searches. An n-gram is a contiguous sequence of "n" items taken from a given sequence of speech or text. The items can be syllables, phonemes, letters, or words, depending on the application. In this case, the items are words, and the search term might be "horse."
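The n-gram idea described above can be sketched in a few lines of Python. This is only an illustration of the definition, not LEVAN's code or Google's Ngram service; the function name `ngrams` and the sample sentence are made up for the example.

```python
def ngrams(items, n):
    """Return every contiguous run of n items, in order of appearance."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # Slide a window of width n across the sequence, one position at a time.
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


# Word-level bigrams (n = 2) from an illustrative sentence.
words = "the jumping horse cleared the fence".split()
bigrams = ngrams(words, 2)

# The same function works on letters, since a string is a sequence too.
letter_trigrams = ngrams("horse", 3)
```

Here `bigrams` contains pairs such as `("jumping", "horse")`, the kind of two-word phrase in which a modifier sits next to the search term.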
Once the word is entered, LEVAN searches the results for modifiers of the concept and keeps only the visual ones. Phrases that can't be visualized, such as "my horse," are left out, but terms such as "jumping horse" or "wooden horse" are retained. LEVAN then searches available Web images until it finds matches for each retained phrase, then combines and categorizes them for display.
Unlike other association systems, such as Watson, which analyze the question put to them by grouping words together to find statistically related phrases, the LEVAN algorithm combines both word grouping and image recognition to produce its results.
Because LEVAN organizes visual information about a concept in a useful way, the team envisages it being used by a variety of applications across computer vision and natural language processing to improve comprehension and aid word-image associative learning.
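The filtering step can be pictured with a small Python sketch. The article does not say how LEVAN decides whether a modifier is visual, so the fixed word list below is purely a hypothetical stand-in for that decision, and the names `NON_VISUAL_MODIFIERS` and `visual_variations` are invented for illustration.

```python
# ASSUMPTION: a hand-written list standing in for LEVAN's real test of
# whether a modifier can be visualized; the article does not describe it.
NON_VISUAL_MODIFIERS = {"my", "your", "his", "her", "our", "their", "the", "a"}


def visual_variations(phrases, concept):
    """Keep '<modifier> <concept>' phrases whose modifier is not on the list."""
    kept = []
    for phrase in phrases:
        words = phrase.lower().split()
        is_modified_concept = len(words) == 2 and words[1] == concept
        if is_modified_concept and words[0] not in NON_VISUAL_MODIFIERS:
            kept.append(phrase)
    return kept


candidates = ["my horse", "jumping horse", "wooden horse", "horse race"]
kept = visual_variations(candidates, "horse")
```

With these inputs, `kept` holds "jumping horse" and "wooden horse", mirroring the examples in the text. A real system would need a far richer test than a word list.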
To date, the system has word-image associations available for over 50,000 variations within 150 concepts (ranging from "airplane" to "walking"), and has marked up more than 10 million images with reference bounding boxes. However, as the concepts of the LEVAN archive are still quite limited, it still has a lot more to learn. LEVAN researchers are therefore inviting the public to help out by adding their own words for LEVAN to search and associate.
Anyone wishing to participate can head over to the LEVAN submission page.
The project is to be presented at the upcoming Computer Vision and Pattern Recognition annual conference in Columbus, Ohio.
The short video below explains the concepts behind the LEVAN project.
Source: University of Washington
For example, take "reading." Reading is a town in Pennsylvania and in England. A reading is something provided by a measurement instrument. As a modifier, "reading" describes glasses and other objects.
Once LEVAN hits "reading," the categories are listed in small, discrete silos that leave one with the same sort of mess that Google Search delivers.
But whereas Google is a lost cause, this project might turn out pretty cool.