
    Robo brain: Large-scale knowledge engine for robots

    In this paper we introduce a knowledge engine, which learns and shares knowledge representations, for robots to carry out a variety of tasks. Building such an engine brings with it the challenge of dealing with multiple data modalities, including symbols, natural language, haptic senses, robot trajectories, and visual features, among many others. The knowledge stored in the engine comes from multiple sources, including physical interactions that robots have while performing tasks (perception, planning and control), knowledge bases from the WWW, and learned representations from leading robotics research groups. We discuss various technical aspects and associated challenges, such as modeling the correctness of knowledge, inferring latent information, and formulating different robotic tasks as queries to the knowledge engine. We describe the system architecture and how it supports different mechanisms for users and robots to interact with the engine. Finally, we demonstrate its use in three important research areas: grounding natural language, perception, and planning, which are key building blocks for many robotic tasks. This knowledge engine is a collaborative effort, and we call it RoboBrain.
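
    As a minimal illustration of what "formulating robotic tasks as queries to the knowledge engine" could look like, the Python sketch below stores facts with a belief score and answers ranked queries. The KnowledgeEngine class, its methods, and the example facts are hypothetical illustrations, not the actual RoboBrain API.

    # Hypothetical sketch: a tiny knowledge store whose facts carry a
    # belief score, illustrating "modeling the correctness of knowledge"
    # and task-as-query lookups. Not the real RoboBrain interface.
    from collections import defaultdict

    class KnowledgeEngine:
        def __init__(self):
            # (subject, relation) -> list of (object, belief in [0, 1])
            self.facts = defaultdict(list)

        def add(self, subject, relation, obj, belief):
            self.facts[(subject, relation)].append((obj, belief))

        def query(self, subject, relation, min_belief=0.5):
            # Return candidate answers above a belief threshold, best first.
            hits = self.facts.get((subject, relation), [])
            return sorted((h for h in hits if h[1] >= min_belief),
                          key=lambda h: -h[1])

    engine = KnowledgeEngine()
    engine.add("mug", "grasped_by", "handle_grasp", 0.9)
    engine.add("mug", "grasped_by", "rim_grasp", 0.4)
    print(engine.query("mug", "grasped_by"))  # [('handle_grasp', 0.9)]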

    Ideas Matchmaking for Supporting Innovators and Entrepreneurs

    In this paper we present a system able to crawl content from the Web related to entrepreneurship and technology, to be matched with ideas proposed by users on the Innovvoice platform. We argue that such a service is a valuable component of an ideabator platform, supporting innovators and potential entrepreneurs.
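
    The abstract does not specify the matching algorithm, so the sketch below shows one plausible baseline: ranking crawled articles against a user's idea by TF-IDF cosine similarity. The example texts and the use of scikit-learn are assumptions, not details of the Innovvoice system.

    # Hypothetical sketch: match crawled web content to a proposed idea
    # by TF-IDF cosine similarity (a possible baseline, not the paper's
    # actual method).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    idea = ["a mobile app that matches mentors with startup founders"]
    crawled = [
        "new platform connects entrepreneurs with experienced mentors",
        "review of the latest graphics cards for gaming",
    ]

    vec = TfidfVectorizer(stop_words="english")
    matrix = vec.fit_transform(idea + crawled)
    scores = cosine_similarity(matrix[:1], matrix[1:])[0]
    for text, score in sorted(zip(crawled, scores), key=lambda p: -p[1]):
        print(f"{score:.2f}  {text}")  # the mentoring article ranks first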

    DIANNE: a modular framework for designing, training and deploying deep neural networks on heterogeneous distributed infrastructure

    Deep learning has shown tremendous results on various machine learning tasks, but the nature of the problems being tackled and the size of state-of-the-art deep neural networks often require training and deploying models on distributed infrastructure. DIANNE is a modular framework designed for dynamic (re)distribution of deep learning models and procedures. Besides providing elementary network building blocks as well as various training and evaluation routines, DIANNE focuses on dynamic deployment on heterogeneous distributed infrastructure, abstraction of Internet of Things (IoT) sensors, integration with external systems, and graphical user interfaces to build and deploy networks, while retaining the performance of similar deep learning frameworks. In this paper the DIANNE framework is proposed as an all-in-one solution for deep learning, enabling data and model parallelism through a modular design, offloading to local compute power, and the ability to abstract between simulated and real environments.
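
    To make the idea of elementary, redistributable network building blocks concrete, here is a minimal sketch in Python/NumPy. The Linear and ReLU classes below are illustrative stand-ins under that assumption; they do not reproduce DIANNE's real API.

    # Hypothetical sketch: a network as a chain of small, independently
    # deployable modules; only activations cross module boundaries, so
    # each module could in principle run on a different device.
    import numpy as np

    class Linear:
        def __init__(self, n_in, n_out):
            self.w = np.random.randn(n_in, n_out) * 0.01
            self.b = np.zeros(n_out)
        def forward(self, x):
            return x @ self.w + self.b

    class ReLU:
        def forward(self, x):
            return np.maximum(x, 0.0)

    chain = [Linear(4, 8), ReLU(), Linear(8, 2)]
    x = np.random.randn(1, 4)
    for module in chain:          # in DIANNE's spirit, this loop could
        x = module.forward(x)     # span several machines
    print(x.shape)  # (1, 2)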

    CARTIER: Cartographic lAnguage Reasoning Targeted at Instruction Execution for Robots

    This work explores the capacity of large language models (LLMs) to address problems at the intersection of spatial planning and natural language interfaces for navigation. Our focus is on following relatively complex instructions that are more akin to natural conversation than the traditional explicit procedural directives seen in robotics. Unlike most prior work, where navigation directives are provided as imperative commands (e.g., go to the fridge), we examine implicit directives within conversational interactions. We leverage the 3D simulator AI2Thor to create complex and repeatable scenarios at scale, and augment it by adding complex language queries for 40 object types. We demonstrate that a robot can better parse descriptive language queries than existing methods by using an LLM to interpret the user interaction in the context of a list of the objects in the scene.
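
    The core step the abstract describes, using an LLM to resolve an implicit, conversational request against the objects in the scene, can be sketched as prompt construction. The prompt wording, object list, and the query_llm placeholder below are illustrative assumptions, not CARTIER's actual prompts.

    # Hypothetical sketch: ground an implicit directive against a scene
    # object list via an LLM. `query_llm` stands in for any
    # chat-completion API and is deliberately left unimplemented.
    SCENE_OBJECTS = ["fridge", "sofa", "television", "sink", "bed"]

    def build_prompt(utterance):
        objects = ", ".join(SCENE_OBJECTS)
        return (f"Objects in the scene: {objects}.\n"
                f'User says: "{utterance}"\n'
                "Which single object should the robot navigate to? "
                "Answer with one object name from the list.")

    def query_llm(prompt):
        raise NotImplementedError  # plug in an LLM client here

    prompt = build_prompt("I'm thirsty and could use something cold.")
    print(prompt)  # an LLM should answer: fridge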

    Searching in a web-based infrastructure for smart things

    Given the expected high number of accessible digitally augmented devices and their communication requirements, this paper presents our work on creating a Web-based infrastructure for smart things to facilitate the integration, look-up, and interaction with such devices for human users and machines. To exploit the locality of interactions with and between smart things, the proposed infrastructure treats the location of a smart thing as its main property and is therefore structured hierarchically according to logical place identifiers. We discuss the infrastructure's look-up mechanism, which leverages Web patterns to foster scalability and load balancing and features an advanced caching mechanism that greatly reduces the response time and number of exchanged messages. These properties are demonstrated in an evaluation in a simulated smart environment.
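
    The look-up mechanism lends itself to a short sketch: things are registered under hierarchical place identifiers, a query walks up the hierarchy, and a cache absorbs repeated queries. The path scheme and registry below are invented for illustration; they are not the paper's implementation.

    # Hypothetical sketch: location-keyed look-up over hierarchical
    # place identifiers, with a cache that avoids repeated registry
    # traversals (fewer exchanged messages on a cache hit).
    registry = {
        "/ch/zurich/building1/room3": ["lamp-42", "thermostat-7"],
        "/ch/zurich/building1": ["door-lock-1"],
    }
    cache = {}

    def lookup(place):
        if place in cache:
            return cache[place]  # cache hit: no registry traversal
        things, node = [], place
        while node:  # collect things at the place and all ancestors
            things += registry.get(node, [])
            node = node.rsplit("/", 1)[0]
        cache[place] = things
        return things

    print(lookup("/ch/zurich/building1/room3"))
    # ['lamp-42', 'thermostat-7', 'door-lock-1']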

    ViMantic, a distributed robotic architecture for semantic mapping in indoor environments

    Semantic maps augment traditional representations of robot workspaces, typically based on their geometry and/or topology, with meta-information about the properties, relations and functionalities of their composing elements. A piece of such information could be: fridges are appliances typically found in kitchens and employed to keep food in good condition. Thereby, semantic maps allow for the efficient execution of high-level robotic tasks, e.g. "Hey robot, store the leftover salad". This paper presents ViMantic, a novel semantic mapping architecture for building and maintaining such maps, which brings together a number of features demanded by modern mobile robotic systems, including: (i) a formal model, based on ontologies, which defines the semantics of the problem at hand and establishes mechanisms for its manipulation; (ii) techniques for processing sensory information and automatically populating maps with, for example, objects detected by cutting-edge CNNs; (iii) distributed execution capabilities through a client–server design, making the knowledge in the maps accessible and extendable to other robots/agents; (iv) a user interface that allows for visualization of and interaction with relevant parts of the maps through a virtual environment; and (v) public availability, hence being ready to use on robotic platforms. The suitability of ViMantic has been assessed using Robot@Home, a vast repository of data collected by a robot in different houses. The experiments carried out consider different scenarios with one or multiple robots, from which we have extracted satisfactory results regarding automatic population, execution times, and the memory footprint of the resulting semantic maps. This work has been supported by the research projects WISER (DPI2017-84827-R), funded by the Spanish Government and financed by the European Regional Development Fund (FEDER), and MoveCare (ICT-26-2016b-GA-732158), funded by the European H2020 programme, as well as by a postdoc contract from the I-PPIT program of the University of Málaga and the UG PhD scholarship program from the University of Groningen. Funding for open access charge: Universidad de Málaga/CBU.
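
    A minimal sketch of what a semantic map entry might look like, and how its meta-information can resolve a task such as "store the leftover salad". The SemanticObject dataclass and its fields are illustrative assumptions; ViMantic's actual formal model is ontology-based and far richer.

    # Hypothetical sketch: map entries that pair geometry with
    # ontology-style properties, so high-level tasks can be resolved
    # by querying the map rather than raw sensor data.
    from dataclasses import dataclass, field

    @dataclass
    class SemanticObject:
        label: str          # e.g. a CNN detection: "fridge"
        position: tuple     # (x, y, z) in the map frame
        room: str           # topological place, e.g. "kitchen"
        properties: dict = field(default_factory=dict)

    semantic_map = [
        SemanticObject("fridge", (2.1, 0.4, 0.0), "kitchen",
                       {"category": "appliance", "stores": "food"}),
        SemanticObject("salad", (2.3, 0.5, 0.9), "kitchen"),
    ]

    def where_to_store(item_label):
        # A real system would consult the ontology for the item's
        # category; here we simply find an object that stores food.
        for obj in semantic_map:
            if obj.properties.get("stores") == "food":
                return obj
        return None

    print(where_to_store("salad").label)  # fridge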

    An Extensible Framework for Creating Personal Archives of Web Resources Requiring Authentication

    The key factors for the success of the World Wide Web are its large size and the lack of centralized control over its contents. In recent years, many advances have been made in preserving web content, but much of this content (namely, social media content) was not archived, or to this day is still not being archived, for various reasons. Tools built to accomplish this frequently break because of the dynamic structure of social media websites. Because many social media websites exhibit a common hierarchy of content, it would be worthwhile to set up a means of referencing this hierarchy so that tools can leverage it and adapt as the target websites evolve. As relying on the service itself to provide this means is problematic in the context of archiving, the only way to avoid these shortcomings is to rely on the original context in which the user views the content, i.e., the web browser. In this thesis I describe an abstract specification, and concrete implementations of it, that allow tools to leverage the context of the web browser to capture content into personal web archives. These tools are then able to accomplish personal web archiving in a way that makes them more robust. As evaluation, I make a change in the hierarchy of a synthetic social media website and its respective specification, and show that an adapted tool, using the specification, continues to function and is able to archive the social media website.
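
    The central idea, an external specification of a site's content hierarchy that tools consult instead of hard-coding selectors, can be sketched as follows. The selectors, the fictional page, and the use of BeautifulSoup are assumptions for illustration, not the thesis's concrete format.

    # Hypothetical sketch: a versioned spec maps semantic roles to CSS
    # selectors; when the site's hierarchy changes, the archiving tool
    # keeps working by swapping specs, not by changing its code.
    from bs4 import BeautifulSoup

    SPEC_V1 = {"post": "div.stream > article", "author": "span.user"}
    SPEC_V2 = {"post": "main section.post", "author": "a.profile-name"}

    HTML = ('<main><section class="post">'
            '<a class="profile-name">ada</a> Hello world'
            '</section></main>')

    def extract_posts(html, spec):
        soup = BeautifulSoup(html, "html.parser")
        return [{"author": p.select_one(spec["author"]).get_text(),
                 "text": p.get_text(" ", strip=True)}
                for p in soup.select(spec["post"])]

    print(extract_posts(HTML, SPEC_V2))
    # [{'author': 'ada', 'text': 'ada Hello world'}]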

    RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation

    We present RoboGen, a generative robotic agent that automatically learns diverse robotic skills at scale via generative simulation. RoboGen leverages the latest advancements in foundation and generative models. Instead of directly using or adapting these models to produce policies or low-level actions, we advocate a generative scheme that uses these models to automatically generate diversified tasks, scenes, and training supervisions, thereby scaling up robotic skill learning with minimal human supervision. Our approach equips a robotic agent with a self-guided propose-generate-learn cycle: the agent first proposes interesting tasks and skills to develop, then generates corresponding simulation environments by populating pertinent objects and assets with proper spatial configurations. Afterwards, the agent decomposes the proposed high-level task into sub-tasks, selects the optimal learning approach (reinforcement learning, motion planning, or trajectory optimization), generates the required training supervision, and then learns policies to acquire the proposed skill. Our work attempts to extract the extensive and versatile knowledge embedded in large-scale models and transfer it to the field of robotics. Our fully generative pipeline can be queried repeatedly, producing an endless stream of skill demonstrations associated with diverse tasks and environments.
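
    The propose-generate-learn cycle can be summarized as a loop. Every function body below is a placeholder, since RoboGen implements these stages with foundation/generative models and a physics simulator; the stub outputs are invented.

    # Hypothetical sketch of the self-guided propose-generate-learn
    # cycle; all stage implementations are stubs.
    def propose_task():
        return {"task": "open the microwave", "skills": ["grasp", "pull"]}

    def generate_scene(task):
        return {"objects": ["microwave", "table"], "layout": "kitchen"}

    def learn(subtask, scene):
        # Here the agent would pick RL, motion planning, or trajectory
        # optimization and generate the training supervision it needs.
        return f"policy({subtask})"

    def robogen_cycle(n_iterations):
        policies = []
        for _ in range(n_iterations):
            task = propose_task()
            scene = generate_scene(task)
            for skill in task["skills"]:   # decompose into sub-tasks
                policies.append(learn(skill, scene))
        return policies

    print(robogen_cycle(1))  # ['policy(grasp)', 'policy(pull)']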

    A system for large-scale image and video retrieval on everyday scenes

    A growing amount of multimedia data is generated on the web today, both in size and in diversity. This has made accurate content retrieval over these large and complex collections a challenging problem. Motivated by the need for systems that enable scalable and efficient search, we propose QIK (Querying Images Using Contextual Knowledge). QIK leverages advances in deep learning (DL) and natural language processing (NLP) for scene understanding to enable large-scale multimedia retrieval on everyday scenes with common objects. The system consists of three major components: the Indexer, the Query Processor, and the Video Processor. Given an image, the Indexer performs probabilistic image understanding (PIU). The generated PIU consists of the most probable captions, parsed and represented as tree structures using NLP techniques, together with detected objects. The PIUs are stored and indexed in a database system. For a query image, the Query Processor generates the most probable caption and parses it into the corresponding tree structure. An optimized tree-pattern query is then constructed and executed on the database to retrieve a set of candidate images. The fetched candidate images are ranked using the tree-edit distance metric computed on the tree structures. Given a video, the Video Processor extracts a sequence of key scenes that are posed to the Query Processor to retrieve a set of candidate scenes. The candidate scene parse trees corresponding to a video are extracted and ranked based on the number of matching scenes. We evaluated the performance of our system on large-scale image and video retrieval tasks over datasets containing everyday scenes and observed that it outperforms state-of-the-art techniques in terms of mean average precision.
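
    The ranking step, ordering candidates by tree-edit distance between caption parse trees, can be sketched with the zss library's Zhang-Shasha implementation. The toy parse trees below are invented; QIK derives its trees from generated captions via NLP parsing.

    # Hypothetical sketch: rank candidate images by tree-edit distance
    # between a query caption's parse tree and each candidate's.
    from zss import Node, simple_distance

    def tree(label, *children):
        node = Node(label)
        for child in children:
            node.addkid(child)
        return node

    query = tree("sits", tree("dog"), tree("on", tree("sofa")))
    candidates = {
        "img1": tree("sits", tree("cat"), tree("on", tree("sofa"))),
        "img2": tree("runs", tree("dog"), tree("in", tree("park"))),
    }
    ranked = sorted(candidates,
                    key=lambda k: simple_distance(query, candidates[k]))
    print(ranked)  # ['img1', 'img2']: img1's parse tree is closest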