5,714 research outputs found

    An Empirical Evaluation of Deep Learning on Highway Driving

    Full text link
    Numerous groups have applied a variety of deep learning techniques to computer vision problems in highway perception scenarios. In this paper, we presented a number of empirical evaluations of recent deep learning advances. Computer vision, combined with deep learning, has the potential to bring about a relatively inexpensive, robust solution to autonomous driving. To prepare deep learning for industry uptake and practical applications, neural networks will require large data sets that represent all possible driving environments and scenarios. We collect a large data set of highway data and apply deep learning and computer vision algorithms to problems such as car and lane detection. We show how existing convolutional neural networks (CNNs) can be used to perform lane and vehicle detection while running at frame rates required for a real-time system. Our results lend credence to the hypothesis that deep learning holds promise for autonomous driving.Comment: Added a video for lane detectio

    Do-It-Yourself Single Camera 3D Pointer Input Device

    Full text link
    We present a new algorithm for single camera 3D reconstruction, or 3D input for human-computer interfaces, based on precise tracking of an elongated object, such as a pen, having a pattern of colored bands. To configure the system, the user provides no more than one labelled image of a handmade pointer, measurements of its colored bands, and the camera's pinhole projection matrix. Other systems are of much higher cost and complexity, requiring combinations of multiple cameras, stereocameras, and pointers with sensors and lights. Instead of relying on information from multiple devices, we examine our single view more closely, integrating geometric and appearance constraints to robustly track the pointer in the presence of occlusion and distractor objects. By probing objects of known geometry with the pointer, we demonstrate acceptable accuracy of 3D localization.Comment: 8 pages, 6 figures, 2018 15th Conference on Computer and Robot Visio

    Recurrent Neural Networks For Accurate RSSI Indoor Localization

    Full text link
    This paper proposes recurrent neuron networks (RNNs) for a fingerprinting indoor localization using WiFi. Instead of locating user's position one at a time as in the cases of conventional algorithms, our RNN solution aims at trajectory positioning and takes into account the relation among the received signal strength indicator (RSSI) measurements in a trajectory. Furthermore, a weighted average filter is proposed for both input RSSI data and sequential output locations to enhance the accuracy among the temporal fluctuations of RSSI. The results using different types of RNN including vanilla RNN, long short-term memory (LSTM), gated recurrent unit (GRU) and bidirectional LSTM (BiLSTM) are presented. On-site experiments demonstrate that the proposed structure achieves an average localization error of 0.750.75 m with 80%80\% of the errors under 11 m, which outperforms the conventional KNN algorithms and probabilistic algorithms by approximately 30%30\% under the same test environment.Comment: Received signal strength indicator (RSSI), WiFi indoor localization, recurrent neuron network (RNN), long shortterm memory (LSTM), fingerprint-based localizatio

    Find your Way by Observing the Sun and Other Semantic Cues

    Full text link
    In this paper we present a robust, efficient and affordable approach to self-localization which does not require neither GPS nor knowledge about the appearance of the world. Towards this goal, we utilize freely available cartographic maps and derive a probabilistic model that exploits semantic cues in the form of sun direction, presence of an intersection, road type, speed limit as well as the ego-car trajectory in order to produce very reliable localization results. Our experimental evaluation shows that our approach can localize much faster (in terms of driving time) with less computation and more robustly than competing approaches, which ignore semantic information

    A Model-Driven Engineering Approach for ROS using Ontological Semantics

    Full text link
    This paper presents a novel ontology-driven software engineering approach for the development of industrial robotics control software. It introduces the ReApp architecture that synthesizes model-driven engineering with semantic technologies to facilitate the development and reuse of ROS-based components and applications. In ReApp, we show how different ontological classification systems for hardware, software, and capabilities help developers in discovering suitable software components for their tasks and in applying them correctly. The proposed model-driven tooling enables developers to work at higher abstraction levels and fosters automatic code generation. It is underpinned by ontologies to minimize discontinuities in the development workflow, with an integrated development environment presenting a seamless interface to the user. First results show the viability and synergy of the selected approach when searching for or developing software with reuse in mind.Comment: Presented at DSLRob 2015 (arXiv:1601.00877), Stefan Zander, Georg Heppner, Georg Neugschwandtner, Ramez Awad, Marc Essinger and Nadia Ahmed: A Model-Driven Engineering Approach for ROS using Ontological Semantic

    Recovering 6D Object Pose: A Review and Multi-modal Analysis

    Full text link
    A large number of studies analyse object detection and pose estimation at visual level in 2D, discussing the effects of challenges such as occlusion, clutter, texture, etc., on the performances of the methods, which work in the context of RGB modality. Interpreting the depth data, the study in this paper presents thorough multi-modal analyses. It discusses the above-mentioned challenges for full 6D object pose estimation in RGB-D images comparing the performances of several 6D detectors in order to answer the following questions: What is the current position of the computer vision community for maintaining "automation" in robotic manipulation? What next steps should the community take for improving "autonomy" in robotics while handling objects? Our findings include: (i) reasonably accurate results are obtained on textured-objects at varying viewpoints with cluttered backgrounds. (ii) Heavy existence of occlusion and clutter severely affects the detectors, and similar-looking distractors is the biggest challenge in recovering instances' 6D. (iii) Template-based methods and random forest-based learning algorithms underlie object detection and 6D pose estimation. Recent paradigm is to learn deep discriminative feature representations and to adopt CNNs taking RGB images as input. (iv) Depending on the availability of large-scale 6D annotated depth datasets, feature representations can be learnt on these datasets, and then the learnt representations can be customized for the 6D problem

    Cross Hallway Detection and Indoor Localization Using Flash Laser Detection and Ranging

    Get PDF
    A flash LADAR is investigated as a source of navigation information to support cross-hallway detection and relative localization. To accomplish this, a dynamic, flexible simulation was developed that simulated the LADAR and the noise of a LADAR system. Using simulated LADAR data, algorithms were developed that were shown to be effective at detecting cross hallways in simulated ideal environments and in simulated environments with noise. Relative position was determined in the same situations. A SwissRanger SR4000 flash LADAR was then used to collect real data and to verify algorithm performance in real environments. Hallway detection was shown to be possible in all real data sets, and the relative position-finding algorithm was shown to be accurate when compared to the absolute accuracy of the LADAR. Thus, flash LADAR is concluded to be an effective source for indoor navigation information

    Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition

    Full text link
    This paper presents a self-supervised method for visual detection of the active speaker in a multi-person spoken interaction scenario. Active speaker detection is a fundamental prerequisite for any artificial cognitive system attempting to acquire language in social settings. The proposed method is intended to complement the acoustic detection of the active speaker, thus improving the system robustness in noisy conditions. The method can detect an arbitrary number of possibly overlapping active speakers based exclusively on visual information about their face. Furthermore, the method does not rely on external annotations, thus complying with cognitive development. Instead, the method uses information from the auditory modality to support learning in the visual domain. This paper reports an extensive evaluation of the proposed method using a large multi-person face-to-face interaction dataset. The results show good performance in a speaker dependent setting. However, in a speaker independent setting the proposed method yields a significantly lower performance. We believe that the proposed method represents an essential component of any artificial cognitive system or robotic platform engaging in social interactions.Comment: 10 pages, IEEE Transactions on Cognitive and Developmental System
    • …
    corecore