1,570 research outputs found

    A simple technique for improving multi-class classification with neural networks

    We present a novel method to perform multi-class pattern classification with neural networks and test it on a challenging 3D hand gesture recognition problem. Our method consists of a standard one-against-all (OAA) classification, followed by another network layer that classifies the resulting class scores, possibly augmented by the original raw input vector. This allows the network to disambiguate hard-to-separate classes, as the distribution of class scores carries considerable information as well and is in fact often used for assessing the confidence of a decision. We show that this approach significantly boosts our results, overall as well as for particularly difficult cases, on the hard 10-class gesture classification task.

    Comment: European Symposium on Artificial Neural Networks (ESANN), Jun 2015, Bruges, Belgium
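
    As a rough illustration of the described architecture, the sketch below stacks a second classification layer on top of the OAA score vector, optionally concatenated with the raw input. It is a minimal sketch in PyTorch; all layer sizes and names are illustrative assumptions, not the authors' code.

```python
# A minimal sketch of the described two-stage idea (PyTorch). Layer
# sizes and names are illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn

class TwoStageOAA(nn.Module):
    def __init__(self, in_dim, n_classes, hidden=64, use_raw_input=True):
        super().__init__()
        self.use_raw_input = use_raw_input
        # Stage 1: standard one-against-all classifier, one score per class.
        self.oaa = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )
        # Stage 2: re-classify the score distribution, optionally
        # augmented with the raw input vector.
        stage2_in = n_classes + (in_dim if use_raw_input else 0)
        self.refine = nn.Linear(stage2_in, n_classes)

    def forward(self, x):
        scores = self.oaa(x)
        if self.use_raw_input:
            scores = torch.cat([scores, x], dim=1)
        return self.refine(scores)
```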

    On the Importance of Accurate Geometry Data for Dense 3D Vision Tasks

    Learning-based methods for dense 3D vision problems typically train on 3D sensor data. Each sensing principle used to measure distance has its own advantages and drawbacks, which are typically neither compared nor discussed in the literature due to a lack of multi-modal datasets. Texture-less regions are problematic for structure from motion and stereo, reflective material poses issues for active sensing, and distances to translucent objects are intricate to measure with existing hardware. Training on inaccurate or corrupt data induces model bias and hampers generalisation capabilities, and these effects remain unnoticed if the sensor measurement is treated as ground truth during evaluation. This paper investigates the effect of sensor errors on the dense 3D vision tasks of depth estimation and reconstruction. We rigorously show the significant impact of sensor characteristics on the learned predictions and observe generalisation issues arising from various technologies in everyday household environments. For evaluation, we introduce a carefully designed dataset (available at https://github.com/Junggy/HAMMER-dataset) comprising measurements from commodity sensors, namely D-ToF, I-ToF, passive/active stereo, and monocular RGB+P. Our study quantifies the considerable impact of sensor noise and paves the way to improved dense vision estimates and targeted data fusion.

    Comment: Accepted at CVPR 2023, Main Paper + Supp. Mat. arXiv admin note: substantial text overlap with arXiv:2205.0456
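
    The point about treating sensor measurements as ground truth can be made concrete with a small evaluation sketch: the same depth prediction yields a different error figure depending on which sensor's map is used as the reference. The sensor names, random data, and metric below are assumptions for illustration only.

```python
# Hypothetical sketch (NumPy): the same depth prediction scored against
# reference maps from different sensors gives different error figures,
# which is why treating any one sensor as ground truth can mislead.
# Sensor names, random data, and the AbsRel metric are assumptions.
import numpy as np

def abs_rel(pred, ref):
    """Mean absolute relative depth error over valid (positive) pixels."""
    valid = ref > 0  # sensors commonly mark failed pixels as 0 (or NaN)
    return np.mean(np.abs(pred[valid] - ref[valid]) / ref[valid])

pred = np.random.uniform(0.5, 5.0, (480, 640))  # hypothetical prediction (m)
references = {  # stand-ins for per-sensor depth maps of the same scene
    "d_tof": np.random.uniform(0.5, 5.0, (480, 640)),
    "i_tof": np.random.uniform(0.5, 5.0, (480, 640)),
    "active_stereo": np.random.uniform(0.5, 5.0, (480, 640)),
}
for name, ref in references.items():
    print(f"{name}: AbsRel = {abs_rel(pred, ref):.3f}")
```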

    Multimodal human hand motion sensing and analysis - a review

    Review of constraints on vision-based gesture recognition for human–computer interaction

    The ability of computers to recognise hand gestures visually is essential for progress in human-computer interaction. Gesture recognition has applications ranging from sign language to medical assistance to virtual reality. However, gesture recognition is extremely challenging, not only because of its diverse contexts, multiple interpretations, and spatio-temporal variations but also because of the complex non-rigid properties of the hand. This study surveys major constraints on vision-based gesture recognition occurring in detection and pre-processing, representation and feature extraction, and recognition. Current challenges are explored in detail.
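
    To make the three surveyed stages concrete, here is a toy pipeline skeleton; every implementation choice below (intensity thresholding, histogram features, nearest-prototype matching) is a placeholder assumption, not a method recommended by the review.

```python
# A toy skeleton of the three stages the survey organises its constraints
# around; all implementations are placeholder assumptions.
import numpy as np

def detect_and_preprocess(frame):
    """Stage 1: locate the hand region (here: naive intensity threshold)."""
    mask = frame > frame.mean()
    ys, xs = np.nonzero(mask)
    if xs.size == 0:  # nothing detected: fall back to the whole frame
        return frame
    return frame[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

def extract_features(hand):
    """Stage 2: represent the region (here: a coarse intensity histogram)."""
    hist, _ = np.histogram(hand, bins=16, range=(0.0, 1.0))
    return hist / max(hist.sum(), 1)

def recognise(features, prototypes):
    """Stage 3: classify (here: nearest-prototype matching)."""
    return min(prototypes, key=lambda g: np.linalg.norm(features - prototypes[g]))

# Usage with made-up data: intensities in [0, 1], two dummy gestures.
frame = np.random.rand(120, 160)
prototypes = {"open_hand": np.ones(16) / 16, "fist": np.zeros(16)}
print(recognise(extract_features(detect_and_preprocess(frame)), prototypes))
```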

    Optimization of a Simultaneous Localization and Mapping (SLAM) System for an Autonomous Vehicle Using a 2-Dimensional Light Detection and Ranging Sensor (LiDAR) by Sensor Fusion

    Fully autonomous vehicles must accurately estimate the extent of their environment as well as their location within it. A popular approach to organizing such information is creating a map of a given physical environment and defining a point in this map representing the vehicle’s location. Simultaneous Localization and Mapping (SLAM) is an algorithm that takes input from a Light Detection and Ranging (LiDAR) sensor to simultaneously construct a map of the vehicle’s physical environment and determine the vehicle’s location in this map based on feature recognition. An accurate SLAM method has two fundamental requirements: accurate distance measurements and an accurate assessment of location. This work investigates methods for optimizing the distance measurement accuracy of a 2D LiDAR sensor system combined with laser range finders, ultrasonic sensors, and stereo camera vision, in particular a method using recurrent neural networks. Sensor fusion techniques with infrared, camera, and ultrasonic sensors are implemented to investigate their effects on distance measurement accuracy. It was found that using a recurrent neural network to fuse data from a 2D LiDAR with laser range finders and ultrasonic sensors outperforms the raw sensor data in both accuracy (error reduced from 46.6% to 3.0%) and precision (standard deviation reduced from 0.62 m to 0.0015 m). These results demonstrate the effectiveness of machine-learning-based fusion algorithms for noise reduction, measurement accuracy improvement, and outlier removal, which would give SLAM vehicles more robust performance.
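
    A minimal sketch of the recurrent fusion idea follows, assuming a plain LSTM that reads a window of raw per-sensor distance readings and regresses a single fused distance; the layer sizes, sensor count, and MSE training objective are assumptions rather than the paper's exact design.

```python
# A minimal sketch (PyTorch) of the recurrent fusion idea: an LSTM reads
# a window of raw per-sensor distance readings and regresses one fused
# distance. Layer sizes, the sensor count, and the MSE objective are
# assumptions; the paper's exact architecture may differ.
import torch
import torch.nn as nn

class RNNDistanceFusion(nn.Module):
    def __init__(self, n_sensors=3, hidden=32):
        super().__init__()
        # One input channel per sensor, e.g. 2D LiDAR range, laser
        # range finder, ultrasonic sensor.
        self.rnn = nn.LSTM(input_size=n_sensors, hidden_size=hidden,
                           batch_first=True)
        self.head = nn.Linear(hidden, 1)  # fused distance estimate

    def forward(self, readings):
        # readings: (batch, time_steps, n_sensors), raw distances in metres
        out, _ = self.rnn(readings)
        return self.head(out[:, -1, :])  # predict from the final time step

# Hypothetical training step against a trusted reference distance.
model = RNNDistanceFusion()
readings = torch.rand(8, 20, 3) * 10.0   # made-up noisy sensor windows
target = torch.rand(8, 1) * 10.0         # made-up ground-truth distances
loss = nn.functional.mse_loss(model(readings), target)
loss.backward()
```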