63 research outputs found

    Task-adaptable, Pervasive Perception for Robots Performing Everyday Manipulation

    Intelligent robotic agents that help us in our day-to-day chores have been an aspiration of robotics researchers for decades. More than fifty years after the creation of the first intelligent mobile robotic agent, robots still struggle to perform seemingly simple tasks, such as setting or cleaning a table. One reason is that the unstructured environments these robots are expected to work in impose demanding requirements on a robot's perception system. Depending on the manipulation task the robot is required to execute, different parts of the environment need to be examined, the objects in it found, and the functional parts of those objects identified. This is challenging, since both the visual appearance of the objects and the scenes they are found in vary widely. This thesis proposes to treat robotic visual perception for everyday manipulation tasks as an open question-answering problem. To this end, RoboSherlock, a framework for creating task-adaptable, pervasive perception systems, is presented. Using the framework, robot perception is addressed from a system's perspective, and contributions to the state of the art are proposed that scale robot perception toward the needs of human-level manipulation. The contributions of the thesis center on the task-adaptability and pervasiveness of perception systems. A perception task-language and a language interpreter that generates task-relevant perception plans are proposed. The task-language and task-interpreter leverage knowledge representation and knowledge-based reasoning to enhance the question-answering capabilities of the system.
    Pervasiveness, a seamless integration of past, present and future percepts, is achieved through three main contributions: a novel way of recording, replaying and inspecting perceptual episodic memories; a new perception component that enables pervasive operation and maintains an object belief state; and a novel prospection component that enables robots to relive their past experiences and anticipate possible future scenarios. The contributions are validated through several real-world robotic experiments that demonstrate how the proposed system enhances robot perception.

    Action Categorisation in Multimodal Instructions

    We present an exploratory study of the (semi-)automatic categorisation of actions in Dutch multimodal first aid instructions, where the actions needed to successfully execute the procedure in question are presented verbally and in pictures. We start with the categorisation of verbalised actions and expect that this will later facilitate the identification of those actions in the pictures, which is known to be hard. Comparisons of, and user-based experimentation with, the verbal and visual representations will allow us to determine the effectiveness of picture-text combinations and will eventually support the automatic generation of multimodal documents. We used Natural Language Processing tools to identify and categorise 2,388 verbs in a corpus of 78 multimodal instructions (MIs). We show that the main action structure of an instruction can be retrieved through verb identification using the Alpino parser followed by a manual selection operation. The selected main action verbs were subsequently generalised and categorised with the use of Cornetto, a lexical resource that combines a Dutch Wordnet and a Dutch Reference Lexicon. Results show that these tools are useful but also have limitations, which make human intervention essential for an accurate categorisation of actions in multimodal instructions.

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives that measure the performance of multimedia search engines. From a socio-economic perspective, we inventory the impact and legal consequences of these technical advances and point out future directions of research.