233,836 research outputs found

    Efficient Online Surface Correction for Real-time Large-Scale 3D Reconstruction

    Full text link
    State-of-the-art methods for large-scale 3D reconstruction from RGB-D sensors usually reduce drift in camera tracking by globally optimizing the estimated camera poses in real-time without simultaneously updating the reconstructed surface on pose changes. We propose an efficient on-the-fly surface correction method for globally consistent dense 3D reconstruction of large-scale scenes. Our approach uses a dense Visual RGB-D SLAM system that estimates the camera motion in real-time on a CPU and refines it in a global pose graph optimization. Consecutive RGB-D frames are locally fused into keyframes, which are incorporated into a sparse voxel hashed Signed Distance Field (SDF) on the GPU. On pose graph updates, the SDF volume is corrected on-the-fly using a novel keyframe re-integration strategy with reduced GPU-host streaming. We demonstrate in an extensive quantitative evaluation that our method is up to 93% more runtime efficient compared to the state-of-the-art and requires significantly less memory, with only negligible loss of surface quality. Overall, our system requires only a single GPU and allows for real-time surface correction of large environments.Comment: British Machine Vision Conference (BMVC), London, September 201

    Atypical audiovisual speech integration in infants at risk for autism

    Get PDF
    The language difficulties often seen in individuals with autism might stem from an inability to integrate audiovisual information, a skill important for language development. We investigated whether 9-month-old siblings of older children with autism, who are at an increased risk of developing autism, are able to integrate audiovisual speech cues. We used an eye-tracker to record where infants looked when shown a screen displaying two faces of the same model, where one face is articulating/ba/and the other/ga/, with one face congruent with the syllable sound being presented simultaneously, the other face incongruent. This method was successful in showing that infants at low risk can integrate audiovisual speech: they looked for the same amount of time at the mouths in both the fusible visual/ga/− audio/ba/and the congruent visual/ba/− audio/ba/displays, indicating that the auditory and visual streams fuse into a McGurk-type of syllabic percept in the incongruent condition. It also showed that low-risk infants could perceive a mismatch between auditory and visual cues: they looked longer at the mouth in the mismatched, non-fusible visual/ba/− audio/ga/display compared with the congruent visual/ga/− audio/ga/display, demonstrating that they perceive an uncommon, and therefore interesting, speech-like percept when looking at the incongruent mouth (repeated ANOVA: displays x fusion/mismatch conditions interaction: F(1,16) = 17.153, p = 0.001). The looking behaviour of high-risk infants did not differ according to the type of display, suggesting difficulties in matching auditory and visual information (repeated ANOVA, displays x conditions interaction: F(1,25) = 0.09, p = 0.767), in contrast to low-risk infants (repeated ANOVA: displays x conditions x low/high-risk groups interaction: F(1,41) = 4.466, p = 0.041). In some cases this reduced ability might lead to the poor communication skills characteristic of autism

    Fusion of facial regions using color information in a forensic scenario

    Full text link
    Comunicación presentada en: 18th Iberoamerican Congress on Pattern Recognition, CIARP 2013; Havana; Cuba; 20-23 November 2013The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-41827-3_50This paper reports an analysis of the benefits of using color information on a region-based face recognition system. Three different color spaces are analysed (RGB, YCbCr, lαβ) in a very challenging scenario matching good quality mugshot images against video surveillance images. This scenario is of special interest for forensics, where examiners carry out a comparison of two face images using the global information of the faces, but paying special attention to each individual facial region (eyes, nose, mouth, etc.). This work analyses the discriminative power of 15 facial regions comparing both the grayscale and color information. Results show a significant improvement of performance when fusing several regions of the face compared to just using the whole face image. A further improvement of performance is achieved when color information is consideredThis work has been partially supported by contract with Spanish Guardia Civil and projects BBfor2 (FP7-ITN-238803), bio-Challenge (TEC2009-11186), Bio Shield (TEC2012-34881), Contexts (S2009/TIC-1485), TeraSense (CSD2008-00068) and "Cátedra UAM-Telefónica
    • …
    corecore