Multimodal Deep Learning for Robust RGB-D Object Recognition
Robust object recognition is a crucial ingredient of many, if not all,
real-world robotics applications. This paper leverages recent progress on
Convolutional Neural Networks (CNNs) and proposes a novel RGB-D architecture
for object recognition. Our architecture is composed of two separate CNN
processing streams - one for each modality - which are consecutively combined
with a late fusion network. We focus on learning with imperfect sensor data, a
typical problem in real-world robotics tasks. For accurate learning, we
introduce a multi-stage training methodology and two crucial ingredients for
handling depth data with CNNs. The first is an effective encoding of depth
information for CNNs that enables learning without the need for large depth
datasets. The second is a data augmentation scheme for robust learning with depth
images that corrupts them with realistic noise patterns. We present
state-of-the-art results on the RGB-D object dataset and show recognition in
challenging, noisy real-world RGB-D settings.
Comment: Final version submitted to IROS'2015, results unchanged,
reformulation of some text passages in abstract and introduction.
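The two-stream, late-fusion layout described in this abstract can be illustrated with a minimal sketch. It is not the authors' network: each CNN stream is replaced by a single dense layer with ReLU, and all layer sizes, random weights, and function names (`make_layer`, `late_fusion_forward`) are assumptions chosen only to show where the fusion happens.

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    m = max(x)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def make_layer(rng, n_out, n_in):
    """Hypothetical helper: random weights and zero bias for a dense layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def late_fusion_forward(rgb_input, depth_input,
                        rgb_stream, depth_stream, fusion_layer):
    # Each modality is processed by its own stream, with no shared weights.
    rgb_features = relu(linear(*rgb_stream, rgb_input))
    depth_features = relu(linear(*depth_stream, depth_input))
    # Late fusion: concatenate the per-stream features, then classify jointly.
    fused = rgb_features + depth_features
    return softmax(linear(*fusion_layer, fused))


rng = random.Random(0)
rgb_stream = make_layer(rng, 4, 6)     # stand-in for the RGB CNN stream
depth_stream = make_layer(rng, 4, 6)   # stand-in for the depth CNN stream
fusion_layer = make_layer(rng, 3, 8)   # 8 = 4 RGB + 4 depth features
rgb_input = [rng.random() for _ in range(6)]
depth_input = [rng.random() for _ in range(6)]

probs = late_fusion_forward(rgb_input, depth_input,
                            rgb_stream, depth_stream, fusion_layer)
print(probs)  # class probabilities over 3 hypothetical object classes
```

The point of the sketch is the ordering: the modalities stay separate until the final layers, so each stream could in principle be trained on its own before the fusion layer is trained on top, which is one plausible reading of the multi-stage training the abstract mentions.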
A Machine Learning Approach to Eye Blink Detection in Low-Light Videos
Inadequate lighting can harm the accuracy of blink detection systems, which play a crucial role in fatigue detection technology, transportation, and security applications. Although some video capture devices are now equipped with a flashlight to improve lighting, users occasionally forget to activate it, resulting in darker videos. Consequently, there is a pressing need to improve blink detection systems so that they accurately detect eye blinks in low-light videos. This research proposes a machine learning-based blink detection system for low-light videos. The system was evaluated on 31 videos, each 5 to 10 seconds long, featuring male and female test subjects aged between 20 and 22, and its accuracy was measured using the confusion matrix method. The results indicate that the machine learning approach achieved 100% accuracy in detecting blinks on these low-light test videos. However, further development is needed to account for more complex and diverse real-life situations. Future studies could focus on developing more sophisticated algorithms and expanding the pool of test subjects to improve blink detection in low-light conditions. Such advances would support practical use of the system in a broader range of scenarios, ultimately enhancing its effectiveness in fatigue detection technology.
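For readers unfamiliar with the confusion matrix method mentioned in this abstract, the following is a minimal sketch of how accuracy is derived from it for a binary blink / no-blink decision. The labels are made-up illustrative values, not data from the study, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = blink)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:  # truth == 1 and pred == 0
            fn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Accuracy = correct decisions / all decisions."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


# Illustrative labels only (assumed for this example, not from the paper).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(tp, fp, tn, fn)              # 3 1 3 1
print(accuracy(tp, fp, tn, fn))    # 0.75
```

An accuracy of 100%, as reported above, corresponds to the case where both the false-positive and false-negative counts are zero on the evaluated videos.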
Tactile Mapping and Localization from High-Resolution Tactile Imprints
This work studies the problem of shape reconstruction and object localization
using a vision-based tactile sensor, GelSlim. The main contributions are the
recovery of local shapes from contact, an approach to reconstruct the tactile
shape of objects from tactile imprints, and an accurate method for object
localization of previously reconstructed objects. The algorithms can be applied
to a large variety of 3D objects and provide accurate tactile feedback for
in-hand manipulation. Results show that by exploiting the dense tactile
information we can reconstruct the shape of objects with high accuracy and do
on-line object identification and localization, opening the door to reactive
manipulation guided by tactile sensing. We provide videos and supplemental
information on the project's website:
http://web.mit.edu/mcube/research/tactile_localization.html.
Comment: ICRA 2019, 7 pages, 7 figures. Website:
http://web.mit.edu/mcube/research/tactile_localization.html Video:
https://youtu.be/uMkspjmDbq