When and if we ever do get our personal robot assistants, it would be nice to think that we could "be ourselves" in front of them, doing things such as scratching our butts or checking our deodorant - because they're just robots, right? They're not going to know what we're doing. Well ... thanks to research currently being conducted at Cornell University, there's already a Microsoft Kinect system that can correctly identify people's activities, based on observation of their movements. If such technology were incorporated into a robot, it's possible that it could admonish you for chewing with your mouth open - although more likely, it might offer to help you lift a heavy object.
In the research project, the Kinect's RGBD (Red, Green, Blue, Depth) camera was used to observe four people performing 12 different activities in five different settings - office, kitchen, bedroom, bathroom, and living room. The activities included things like brushing teeth, cooking, writing on a whiteboard, working on a computer, and drinking water. The data was run through an algorithm known as a hierarchical maximum entropy Markov model, which breaks activities down into more manageable sub-activities. It states, essentially, that if a person is seen to do A, B, C, and D, then E is probably what they're doing. If the system had first seen the person perform a training set of the activities, it was able to recognize those activities with 84.3 percent accuracy "in the field." If it hadn't seen the person before, it managed 64.2 percent accuracy.
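To make the "if A, B, C, and D, then probably E" idea concrete, here is a minimal sketch in Python. It is not the Cornell implementation - a real hierarchical maximum entropy Markov model learns feature-based transition distributions from skeleton and depth data - and the sub-activity labels and probabilities below are invented purely for illustration. It simply shows how scoring an observed sequence of sub-activities against per-activity models lets the best-matching activity win.

```python
# Simplified stand-in for the hierarchical Markov-model idea: each activity
# is modeled as likely transitions between sub-activities, and an observed
# sequence is assigned to whichever activity model explains it best.
# Labels and probabilities are illustrative assumptions, not study values.

import math

# Hypothetical P(next sub-activity | current sub-activity), one table per activity.
ACTIVITY_MODELS = {
    "drinking water": {
        ("reach", "grasp_cup"): 0.8,
        ("grasp_cup", "lift_to_mouth"): 0.7,
        ("lift_to_mouth", "tilt"): 0.9,
        ("tilt", "place_down"): 0.8,
    },
    "brushing teeth": {
        ("reach", "grasp_brush"): 0.8,
        ("grasp_brush", "lift_to_mouth"): 0.6,
        ("lift_to_mouth", "scrub"): 0.9,
        ("scrub", "rinse"): 0.7,
    },
}

SMOOTHING = 0.01  # probability assigned to transitions a model has never seen


def score(sub_activities, model):
    """Log-probability of an observed sub-activity sequence under one model."""
    total = 0.0
    for prev, curr in zip(sub_activities, sub_activities[1:]):
        total += math.log(model.get((prev, curr), SMOOTHING))
    return total


def classify(sub_activities):
    """Return the activity whose model best explains the observed sequence."""
    return max(ACTIVITY_MODELS, key=lambda a: score(sub_activities, ACTIVITY_MODELS[a]))


if __name__ == "__main__":
    observed = ["reach", "grasp_cup", "lift_to_mouth", "tilt", "place_down"]
    print(classify(observed))  # -> "drinking water"
```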
A personal assistant robot equipped with such a system could use it in home care situations, checking that the people in its care were drinking enough water, brushing their teeth regularly, taking their pills, and so on. Depending on the robot's abilities, if it saw that a person was having difficulty doing something, it could even step in and help.
In the experiments, the system was able to avoid getting thrown off by unrelated gestures mixed in with the activities, and it didn't seem to matter whether the person it was watching was left- or right-handed. The Cornell researchers did admit, however, that all of their subjects were performing out in the open, and said they weren't sure how well the system would perform if objects were blocking part of its view. They also suggested that it would work better if the system could learn to recognize certain key objects (toothbrushes, forks, etc.) that are contextual to certain activities.
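One way that object suggestion could plug into the earlier sketch is as a simple log-prior: if a (hypothetical) object detector reports a toothbrush or a cup in the scene, the activities associated with that object get a boost. The object-to-activity mapping and weights below are illustrative assumptions, not anything from the study.

```python
# Hedged sketch of using contextual objects as extra evidence: detected
# objects contribute a log-prior that nudges the activity scores.
# Object lists and prior values are made up for illustration.

import math

# Hypothetical mapping from detected objects to the activities they suggest.
OBJECT_PRIORS = {
    "toothbrush": {"brushing teeth": 0.9, "drinking water": 0.1},
    "cup": {"drinking water": 0.8, "brushing teeth": 0.2},
}


def object_bonus(detected_objects, activity):
    """Sum of log-priors contributed by detected objects for one activity."""
    bonus = 0.0
    for obj in detected_objects:
        prior = OBJECT_PRIORS.get(obj, {}).get(activity, 0.5)  # 0.5 = uninformative
        bonus += math.log(prior)
    return bonus


if __name__ == "__main__":
    print(object_bonus(["toothbrush"], "brushing teeth"))  # small penalty (~ -0.11)
    print(object_bonus(["toothbrush"], "drinking water"))  # large penalty (~ -2.30)
```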
Source: I PROGRAMMER