
    Learning to Detect Ground Control Points for Improving the Accuracy of Stereo Matching

    While machine learning has been instrumental to the ongoing progress in most areas of computer vision, it has not been applied to the problem of stereo matching with similar frequency or success. We present a supervised learning approach for predicting the correctness of stereo matches based on a random forest and a set of features that capture various forms of information about each pixel. We show highly competitive results in predicting the correctness of matches and in confidence estimation, which allows us to rank pixels according to the reliability of their assigned disparities. Moreover, we show how these confidence values can be used to improve the accuracy of disparity maps by integrating them with an MRF-based stereo algorithm. This is an important distinction from current literature, which has mainly focused on sparsification, i.e. removing potentially erroneous disparities to generate quasi-dense disparity maps.
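    As a rough illustration of the approach described above (not the authors' implementation), the sketch below trains a random forest on hypothetical per-pixel features and uses the predicted class probability as a per-pixel confidence for ranking disparities; the feature names and synthetic labels are assumptions for demonstration only.

```python
# Minimal sketch: random-forest confidence for stereo matches.
# Features and data are synthetic stand-ins, not the paper's feature set.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical per-pixel features: matching cost, ratio to the 2nd-best cost,
# left-right consistency error, distance to the nearest image border.
n_pixels = 5000
X = rng.normal(size=(n_pixels, 4))
# Labels: 1 = disparity within the error threshold, 0 = erroneous match.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n_pixels) > 0).astype(int)

forest = RandomForestClassifier(n_estimators=100, max_depth=12, random_state=0)
forest.fit(X, y)

# Confidence = predicted probability of a correct match; sorting by it ranks
# pixels by reliability, e.g. for sparsification or as a prior in an
# MRF-based refinement of the disparity map.
confidence = forest.predict_proba(X)[:, 1]
ranking = np.argsort(-confidence)
print("Ten most reliable pixels:", ranking[:10])
```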

    Learning Matchable Image Transformations for Long-term Metric Visual Localization

    Long-term metric self-localization is an essential capability of autonomous mobile robots, but remains challenging for vision-based systems due to appearance changes caused by lighting, weather, or seasonal variations. While experience-based mapping has proven to be an effective technique for bridging the 'appearance gap', the number of experiences required for reliable metric localization over days or months can be very large, and methods for reducing the necessary number of experiences are needed for this approach to scale. Taking inspiration from color constancy theory, we learn a nonlinear RGB-to-grayscale mapping that explicitly maximizes the number of inlier feature matches for images captured under different lighting and weather conditions, and use it as a pre-processing step in a conventional single-experience localization pipeline to improve its robustness to appearance change. We train this mapping by approximating the target non-differentiable localization pipeline with a deep neural network, and find that incorporating a learned low-dimensional context feature can further improve cross-appearance feature matching. Using synthetic and real-world datasets, we demonstrate substantial improvements in localization performance across day-night cycles, enabling continuous metric localization over a 30-hour period using a single mapping experience, and allowing experience-based localization to scale to long deployments with dramatically reduced data requirements. Comment: In IEEE Robotics and Automation Letters (RA-L) and presented at the IEEE International Conference on Robotics and Automation (ICRA'20), Paris, France, May 31-June 4, 2020
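    A minimal sketch of the core idea, assuming a per-pixel multilayer perceptron as the learned RGB-to-grayscale mapping (the paper trains against a learned differentiable surrogate of the localization pipeline; the photometric-consistency loss on aligned day/night patches used here is only a stand-in):

```python
# Minimal sketch: a learned per-pixel RGB-to-grayscale mapping (PyTorch).
# The loss is a placeholder assumption, not the paper's surrogate objective.
import torch
import torch.nn as nn

class RGB2Gray(nn.Module):
    def __init__(self, hidden: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),   # grayscale in [0, 1]
        )

    def forward(self, rgb: torch.Tensor) -> torch.Tensor:
        # rgb: (..., 3) -> grayscale: (...)
        return self.net(rgb).squeeze(-1)

model = RGB2Gray()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Hypothetical aligned day/night RGB patches, shape (N, H, W, 3), values in [0, 1].
day = torch.rand(8, 64, 64, 3)
night = torch.rand(8, 64, 64, 3)

for step in range(100):
    # Encourage the transformed images to look alike across appearance change.
    loss = torch.mean((model(day) - model(night)) ** 2)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

    At inference time the trained mapping would be applied to each incoming RGB image before feature extraction in the existing single-experience localization pipeline.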

    Deep learning in remote sensing: a review

    Standing at the paradigm shift towards data-intensive science, machine learning techniques are becoming increasingly important. In particular, as a major breakthrough in the field, deep learning has proven to be an extremely powerful tool in many fields. Shall we embrace deep learning as the key to all? Or, should we resist a 'black-box' solution? There are controversial opinions in the remote sensing community. In this article, we analyze the challenges of using deep learning for remote sensing data analysis, review the recent advances, and provide resources to make deep learning in remote sensing ridiculously simple to start with. More importantly, we advocate that remote sensing scientists bring their expertise into deep learning and use it as an implicit general model to tackle unprecedented, large-scale, influential challenges such as climate change and urbanization. Comment: Accepted for publication in the IEEE Geoscience and Remote Sensing Magazine

    Pattern recognition to detect fetal alcohol syndrome using stereo facial images

    Fetal alcohol syndrome (FAS) is a condition which is caused by excessive consumption of alcohol by the mother during pregnancy. A FAS diagnosis depends on the presence of growth retardation, central nervous system and neurodevelopment abnormalities together with facial malformations. The main facial features which best distinguish children with and without FAS are smooth philtrum, thin upper lip and short palpebral fissures. Diagnosis of the facial phenotype associated with FAS can be done using methods such as direct facial anthropometry and photogrammetry. The project described here used information obtained from stereo facial images and applied facial shape analysis and pattern recognition to distinguish between children with FAS and control children. Other researchers have reported on identifying FAS through the classification of 2D landmark coordinates and 3D landmark information in the form of Procrustes residuals. This project built on this previous work with the use of 3D information combined with texture as features for facial classification. Stereo facial images of children were used to obtain the 3D coordinates of those facial landmarks which play a role in defining the FAS facial phenotype. Two datasets were used: the first consisted of facial images of 34 children whose facial shapes had previously been analysed with respect to FAS. The second dataset consisted of a new set of images from 40 subjects. Elastic bunch graph matching was used on the frontal facial images of the study population to obtain texture information, in the form of jets, around selected landmarks. Their 2D coordinates were also extracted during the process. Faces were classified using k-nearest neighbor (kNN), linear discriminant analysis (LDA) and support vector machine (SVM) classifiers. Principal component analysis was used for dimensionality reduction while classification accuracy was assessed using leave-one-out cross-validation. For dataset 1, using 2D coordinates together with texture information as features during classification produced a best classification accuracy of 72.7% with kNN, 75.8% with LDA and 78.8% with SVM. When the 2D coordinates were replaced by Procrustes residuals (which encode 3D facial shape information), the best classification accuracies were 69.7% with kNN, 81.8% with LDA and 78.6% with SVM. LDA produced the most consistent classification results. The classification accuracies for dataset 2 were lower than for dataset 1. The different conditions during data collection and the possible differences in the ethnic composition of the datasets were identified as likely causes for this decrease in classification accuracy.
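    The classification protocol described above can be illustrated with a small, hypothetical sketch (not the thesis code): PCA for dimensionality reduction followed by kNN, LDA, and SVM classifiers, each evaluated with leave-one-out cross-validation. The random feature matrix stands in for the concatenated landmark coordinates (or Procrustes residuals) and Gabor-jet texture features, and the subject count is taken from the abstract's first dataset.

```python
# Minimal sketch: leave-one-out evaluation of kNN, LDA and SVM on PCA-reduced
# features. Data are random placeholders for landmark + texture features.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(0)
n_subjects, n_features = 34, 120                 # dataset-1 size from the abstract
X = rng.normal(size=(n_subjects, n_features))    # stand-in feature vectors
y = rng.integers(0, 2, size=n_subjects)          # 1 = FAS, 0 = control

classifiers = {
    "kNN": KNeighborsClassifier(n_neighbors=3),
    "LDA": LinearDiscriminantAnalysis(),
    "SVM": SVC(kernel="linear"),
}

for name, clf in classifiers.items():
    pipe = make_pipeline(PCA(n_components=10), clf)
    acc = cross_val_score(pipe, X, y, cv=LeaveOneOut()).mean()
    print(f"{name}: leave-one-out accuracy = {acc:.3f}")
```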

    Enhanced Assessment of the Wound-Healing Process by Accurate Multiview Tissue Classification
