437 research outputs found

    Dyna-DM: Dynamic Object-aware Self-supervised Monocular Depth Maps

    Self-supervised monocular depth estimation has been a subject of intense study in recent years, because of its applications in robotics and autonomous driving. Much of the recent work focuses on improving depth estimation by increasing architecture complexity. This paper shows that state-of-the-art performance can also be achieved by improving the learning process rather than increasing model complexity. More specifically, we propose (i) disregarding small potentially dynamic objects when training, and (ii) employing an appearance-based approach to separately estimate object pose for truly dynamic objects. We demonstrate that these simplifications reduce GPU memory usage by 29% and result in qualitatively and quantitatively improved depth maps. The code is available at https://github.com/kieran514/Dyna-DM

    LearnBlock: A Robot-Agnostic Educational Programming Tool

    Education is evolving to prepare students for the current sociotechnical changes. An increasing effort to introduce programming and other STEM-related subjects into the core curriculum of primary and secondary education is taking place around the world. The use of robots stands out among STEM initiatives, since robots are proving to be an engaging tool for learning programming and other STEM-related content. Block-based programming is the option chosen for most educational robotic platforms. However, many robotics kits include their own software tools, as well as their own set of programming blocks. LearnBlock, a new educational programming tool, is proposed here. Its major novelty is its loosely coupled software architecture, which makes it, to the best of our knowledge, the first robot-agnostic educational tool. Robot-agnosticism is provided not only in block code, but also in generated code, unifying the translation from blocks to the final programming language. The set of blocks can be easily extended by implementing additional Python functions, without modifying the core code of the tool. Moreover, LearnBlock provides an integrated educational programming environment that facilitates a progressive transition from a visual to a general-purpose programming language. To evaluate LearnBlock and demonstrate that it is platform-agnostic, several tests were conducted. Each of them consists of a program implementing a robot behaviour. The block code of each test can run on several educational robots without changes.

    SocNav1: A Dataset to Benchmark and Learn Social Navigation Conventions

    Datasets are essential to the development and evaluation of machine learning and artificial intelligence algorithms. As new tasks are addressed, new datasets are required. Training algorithms for human-aware navigation is an example of this need. Different factors make designing and gathering data for human-aware navigation datasets challenging. Firstly, the problem itself is subjective: different dataset contributors will very frequently disagree to some extent on their labels. Secondly, the number of variables to consider is undetermined and culture-dependent. This paper presents SocNav1, a dataset for social navigation conventions. SocNav1 aims at evaluating the robots’ ability to assess the level of discomfort that their presence might generate among humans. The 9280 samples in SocNav1 seem to be enough for machine learning purposes given the relatively small size of the data structures describing the scenarios. Furthermore, SocNav1 is particularly well-suited to be used to benchmark non-Euclidean machine learning algorithms such as graph neural networks. This paper describes the proposed dataset and the method employed to gather the data. To provide a further understanding of the nature of the dataset, an analysis and validation of the collected data are also presented.

    Multi-person 3D pose estimation from unlabelled data

    Its numerous applications make multi-human 3D pose estimation a remarkably impactful area of research. Nevertheless, assuming a multiple-view system composed of several regular RGB cameras, 3D multi-pose estimation presents several challenges. First of all, each person must be uniquely identified in the different views to separate the 2D information provided by the cameras. Secondly, the 3D pose estimation process from the multi-view 2D information of each person must be robust against noise and potential occlusions in the scenario. In this work, we address these two challenges with the help of deep learning. Specifically, we present a model based on Graph Neural Networks capable of predicting the cross-view correspondence of the people in the scenario along with a Multilayer Perceptron that takes the 2D points to yield the 3D poses of each person. These two models are trained in a self-supervised manner, thus avoiding the need for large datasets with 3D annotations

    Multimodal Bayesian Network for Artificial Perception

    In order to make machines perceive their external environment coherently, multiple sources of sensory information derived from several different modalities can be used (e.g. cameras, LIDAR, stereo, RGB-D, and radars). All these different sources of information can be efficiently merged to form a robust perception of the environment. Some of the mechanisms that underlie this merging of the sensor information are highlighted in this chapter, showing that, depending on the type of information, different combination and integration strategies can be used and that prior knowledge is often required for interpreting the sensory signals efficiently. The notion that perception involves Bayesian inference is an increasingly popular position taken by a considerable number of researchers. Bayesian models have provided insights into many perceptual phenomena, showing that they are a valid approach to dealing with real-world uncertainties and to robust classification, including classification in time-dependent problems. This chapter addresses the use of Bayesian networks applied to sensory perception in the following areas: mobile robotics, autonomous driving systems, advanced driver assistance systems, sensor fusion for object detection, and EEG-based mental states classification.

    Multi-camera Torso Pose Estimation using Graph Neural Networks

    Estimating the location and orientation of humans is an essential skill for service and assistive robots. To achieve a reliable estimation in a wide area such as an apartment, multiple RGBD cameras are frequently used. Such setups have two drawbacks. Firstly, they are relatively expensive. Secondly, they seldom perform an effective data fusion using the multiple camera sources at an early stage of the processing pipeline. Occlusions and partial views make this second point very relevant in these scenarios. The proposal presented in this paper makes use of graph neural networks to merge the information acquired from multiple camera sources, achieving a mean absolute error below 125 mm for the location and 10 degrees for the orientation using low-resolution RGB images. The experiments, conducted in an apartment with three cameras, benchmarked two different graph neural network implementations and a third architecture based on fully connected layers. The software used has been released as open-source in a public repository.

    A Deep Evolutionary Approach to Bioinspired Classifier Optimisation for Brain-Machine Interaction

    This study suggests a new approach to EEG data classification by exploring the idea of using evolutionary computation to both select useful discriminative EEG features and optimise the topology of Artificial Neural Networks. An evolutionary algorithm is applied to select the most informative features from an initial set of 2550 EEG statistical features. Optimisation of a Multilayer Perceptron (MLP) is performed with an evolutionary approach before classification to estimate the best hyperparameters of the network. Deep learning and tuning with Long Short-Term Memory (LSTM) are also explored, and Adaptive Boosting of the two types of models is tested for each problem. Three experiments are provided for comparison using different classifiers: one for attention state classification, one for emotional sentiment classification, and a third experiment in which the goal is to guess the number a subject is thinking of. The obtained results show that an Adaptive Boosted LSTM can achieve an accuracy of 84.44%, 97.06%, and 9.94% on the attentional, emotional, and number datasets, respectively. An evolutionary-optimised MLP achieves results close to the Adaptive Boosted LSTM for the first two experiments and significantly higher for the number-guessing experiment, with an Adaptive Boosted DEvo MLP reaching 31.35%, while being significantly quicker to train and classify. In particular, the accuracy of the nonboosted DEvo MLP was 79.81%, 96.11%, and 27.07% on the same benchmarks. Two datasets for the experiments were gathered using a Muse EEG headband with four electrodes corresponding to the TP9, AF7, AF8, and TP10 locations of the international EEG placement standard. The EEG MindBigData digits dataset was gathered from the TP9, FP1, FP2, and TP10 locations.

    Socially aware robot navigation system in human-populated and interactive environments based on an adaptive spatial density function and space affordances

    Traditionally, robots are known to society mostly through the wide use of manipulators, which are generally placed in controlled environments such as factories. However, with the advances in the area of mobile robotics, they are increasingly inserted into social contexts, i.e., in the presence of people. The adoption of socially acceptable behaviours demands a trade-off between social comfort and other metrics of efficiency. For navigation tasks, for example, humans must be differentiated from other ordinary objects in the scene. In this work, we propose a novel human-aware navigation strategy built upon the use of an adaptive spatial density function that efficiently clusters groups of people according to their spatial arrangement. Space affordances are also used for defining potential activity spaces considering the objects in the scene. The proposed function defines regions where navigation is either discouraged or forbidden. To implement a socially acceptable navigation, the navigation architecture combines a probabilistic roadmap and rapidly-exploring random tree path planners, and an adaptation of the elastic band algorithm. Trials carried out in real and simulated environments demonstrate that the use of the clustering algorithm and social rules in the navigation architecture does not hinder the navigation performance.

    Attentional Selection for Action in Mobile Robots

    In this chapter, a novel computational model of visual attention based on the selection for action theory has been presented. In our system, attention is conceived as an intermediary between visual perception and action control, solving two fundamental behavioural questions