6 research outputs found

    Learning in vision and robotics

    Get PDF
    I present my work on learning from video and robotic input. This is an important problem, with numerous potential applications. The use of machine learning makes it possible to obtain models which can handle noise and variation without explicitly programming them. It also raises the possibility of robots which can interact more seamlessly with humans rather than only exhibiting hard-coded behaviors. I will present my work in two areas: video action recognition, and robot navigation. First, I present a video action recognition method which represents actions in video by sequences of retinotopic appearance and motion detectors, learns such models automatically from training data, and allow actions in new video to be recognized and localized completely automatically. Second, I present a new method which allows a mobile robot to learn word meanings from a combination of robot sensor measurements and sentential descriptions corresponding to a set of robotically driven paths. These word meanings support automatic driving from sentential input, and generation of sentential description of new paths. Finally, I also present work on a new action recognition dataset, and comparisons of the performance of recent methods on this dataset and others

    Grounding robot motion in natural language and visual perception

    Get PDF
    The current state of the art in military and first responder ground robots involves heavy physical and cognitive burdens on the human operator while taking little to no advantage of the potential autonomy of robotic technology. The robots currently in use are rugged remote-controlled vehicles. Their interaction modalities, usually utilizing a game controller connected to a computer, require a dedicated operator who has limited capacity for other tasks. I present research which aims to ease these burdens by incorporating multiple modes of robotic sensing into a system which allows humans to interact with robots through a natural-language interface. I conduct this research on a custom-built six-wheeled mobile robot. First I present a unified framework which supports grounding natural-language semantics in robotic driving. This framework supports learning the meanings of nouns and prepositions from sentential descriptions of paths driven by the robot, as well as using such meanings to both generate a sentential description of a path and perform automated driving of a path specified in natural language. One limitation of this framework is that it requires as input the locations of the (initially nameless) objects in the floor plan. Next I present a method to automatically detect, localize, and label objects in the robot’s environment using only the robot’s video feed and corresponding odometry. This method produces a map of the robot’s environment in which objects are differentiated by abstract class labels. Finally, I present work that unifies the previous two approaches. This method detects, localizes, and labels objects, as the previous method does. However, this new method integrates natural-language descriptions to learn actual object names, rather than abstract labels

    Driving Under the Influence (of Language)

    No full text
    corecore