UcoSLAM: Simultaneous Localization and Mapping by Fusion of KeyPoints and Squared Planar Markers
This paper proposes a novel approach for Simultaneous Localization and
Mapping by fusing natural and artificial landmarks. Most of the SLAM approaches
use natural landmarks (such as keypoints). However, they are unstable over
time, repetitive in many cases or insufficient for a robust tracking (e.g. in
indoor buildings). On the other hand, other approaches have employed artificial
landmarks (such as squared fiducial markers) placed in the environment to help
tracking and relocalization. We propose a method that integrates both
approaches in order to achieve long-term robust tracking in many scenarios.
Our method has been compared with the state-of-the-art methods ORB-SLAM2 and
LDSO on the public datasets KITTI, EuRoC-MAV, TUM and SPM, obtaining better
precision, robustness and speed. Our tests also show that the combination of
markers and keypoints achieves better accuracy than either of them
independently.
Comment: Paper submitted to Pattern Recognition
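The fallback logic the abstract describes, keypoints when they suffice and fiducial markers otherwise, can be sketched as follows. This is a minimal illustration with hypothetical names and thresholds, not UcoSLAM's actual implementation:

```python
def fuse_pose(keypoint_pose, marker_pose, n_keypoints, min_keypoints=30):
    """Pick the pose estimate to trust for the current frame.

    keypoint_pose / marker_pose: (x, y, theta) tuples, or None if that
    subsystem failed to localize on this frame. The threshold
    min_keypoints is illustrative, not a value from the paper.
    """
    if keypoint_pose is not None and n_keypoints >= min_keypoints:
        return keypoint_pose   # enough natural landmarks: trust keypoints
    if marker_pose is not None:
        return marker_pose     # fall back to the artificial markers
    return keypoint_pose       # may be None, i.e. tracking lost
```

In a full system both estimates would instead be fused jointly in the map optimization; the sketch only shows why having both sources improves robustness when one of them degrades.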
Sample-level CNN Architectures for Music Auto-tagging Using Raw Waveforms
Recent work has shown that the end-to-end approach using convolutional neural
network (CNN) is effective in various types of machine learning tasks. For
audio signals, the approach takes raw waveforms as input using a 1-D
convolution layer. In this paper, we improve the 1-D CNN architecture for music
auto-tagging by adopting building blocks from state-of-the-art image
classification models, ResNets and SENets, and adding multi-level feature
aggregation to it. We compare different combinations of the modules in building
CNN architectures. The results show that they achieve significant improvements
over previous state-of-the-art models on the MagnaTagATune dataset and
comparable results on the Million Song Dataset. Furthermore, we analyze and
visualize our model to show how the 1-D CNN operates.
Comment: Accepted for publication at ICASSP 2018
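One of the SENet building blocks the abstract mentions, the squeeze-and-excitation module, can be sketched in NumPy for a 1-D feature map. The weights `w1`/`w2` stand in for the two fully connected layers; real implementations use a channel-reduction ratio and a deep-learning framework, so this is only an illustration of the mechanism:

```python
import numpy as np

def se_block(x, w1, w2):
    """Squeeze-and-excitation over a (channels, time) 1-D feature map."""
    z = x.mean(axis=1)                    # squeeze: global average pool over time
    s = np.maximum(w1 @ z, 0.0)           # excitation: FC layer + ReLU
    s = 1.0 / (1.0 + np.exp(-(w2 @ s)))   # FC layer + sigmoid -> channel weights
    return x * s[:, None]                 # rescale each channel of the input
```

Each channel of the convolutional feature map is thus re-weighted by a learned, input-dependent scalar, which is what lets SE blocks emphasize informative channels.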
Iris segmentation using a non-decimated wavelet transform
This paper presents an iris segmentation algorithm. The proposed technique applies a histogram-based method to the input eye image to extract a point within the pupil. The image is then intensity-sampled over M equiangular radial scan lines, generating M 1-dimensional signals. A fuzzy multi-scale edge detection algorithm is then applied to each of the resulting radial signals to accurately detect and locate one positive edge point per signal. A uniform cubic B-spline approximation method is further applied to the detected edges, determining the iris outer boundary. The histogram of the area within the extracted outer iris boundary of the eye image is finally used to extract the pupil outer boundary. Experimental results on a number of eye test images taken under visible wavelength from the UBIRIS.v1 and UBIRIS.v2 databases show that the proposed segmentation method accurately extracts the iris boundaries.
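The equiangular radial sampling step can be sketched as follows, assuming a grayscale image and a pupil point (cx, cy) already found by the histogram stage. Nearest-neighbour sampling and the parameter names are illustrative, not the paper's exact implementation:

```python
import numpy as np

def radial_signals(image, cx, cy, M=64, R=40):
    """Sample `image` along M equiangular rays of length R starting at
    (cx, cy), returning an (M, R) array of 1-D intensity signals.
    Each row is one radial scan line; edge detection is then run per row."""
    angles = np.linspace(0.0, 2 * np.pi, M, endpoint=False)
    r = np.arange(R)
    # (M, R) grids of sample coordinates, clipped to the image bounds
    xs = np.clip(np.round(cx + np.outer(np.cos(angles), r)).astype(int),
                 0, image.shape[1] - 1)
    ys = np.clip(np.round(cy + np.outer(np.sin(angles), r)).astype(int),
                 0, image.shape[0] - 1)
    return image[ys, xs]
```

The iris/sclera transition then appears as one edge point in each row, which is what the fuzzy multi-scale edge detector locates before the B-spline fit.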
Cooperation of different neuronal systems during hand sign recognition.
Hand signs with symbolic meaning can often be utilized more successfully than words to communicate an intention; however, the underlying brain mechanisms are undefined. The present study using magnetoencephalography (MEG) demonstrates that the primary visual, mirror neuron, social recognition and object recognition systems are involved in hand sign recognition. MEG detected well-orchestrated multiple brain regional electrical activity among these neuronal systems. During the assessment of the meaning of hand signs, the inferior parietal, superior temporal sulcus (STS) and inferior occipitotemporal regions were simultaneously activated. These three regions showed similar time courses in their electrical activity, suggesting that they work together during hand sign recognition by integrating information in the ventral and dorsal pathways through the STS. The results also demonstrated marked right hemispheric predominance, suggesting that hand expression is processed in a manner similar to that in which social signs, such as facial expressions, are processed.
Person re-identification via efficient inference in fully connected CRF
In this paper, we address the problem of person re-identification, i.e.,
retrieving instances from a gallery that were generated by the same person
as the given probe image. This is very challenging because the person's
appearance usually undergoes significant variations due to changes in
illumination, camera angle and view, background clutter, and occlusion over the
camera network. In this paper, we assume that the matched gallery images should
not only be similar to the probe, but also be similar to each other, under a
suitable metric. We express this assumption with a fully connected CRF model in
which each node corresponds to a gallery image and every pair of nodes is
connected by an edge. A label variable is associated with each node to indicate
whether the corresponding image is from the target person. We define the unary
potential for each node using existing feature extraction and matching
techniques, which reflects the similarity between the probe and a gallery
image, and define the pairwise potential for each edge as a weighted
combination of Gaussian kernels, which encodes the appearance similarity
between a pair of gallery images. The specific form of the pairwise potential
allows us to exploit an efficient inference algorithm to calculate the marginal
distribution of each label variable in this densely connected CRF. We show the
superiority of our method by applying it to public datasets and comparing it
with the state of the art.
Comment: 7 pages, 4 figures
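The marginal computation for such a fully connected binary CRF can be sketched with a naive mean-field update. This is the generic O(N²) iteration over illustrative unary costs and pairwise weights; the paper's efficient Gaussian-kernel inference (which avoids the explicit N×N product) is not reproduced here:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def mean_field(unary, pairwise, iters=10):
    """Approximate label marginals for a fully connected binary CRF.

    unary:    (N, 2) per-node label costs (lower = preferred)
    pairwise: (N, N) symmetric weights, e.g. from a weighted
              combination of Gaussian kernels on appearance features
    Returns a (N, 2) array of approximate marginals q[i, l].
    """
    q = softmax(-unary)
    for _ in range(iters):
        msg = pairwise @ q            # expected agreement with neighbours
        q = softmax(-unary + msg)     # agreement raises a label's probability
    return q
```

Strongly connected gallery images thus pull each other toward the same target/non-target label, which is exactly the "matched images should also be similar to each other" assumption expressed as inference.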
Affective feedback: an investigation into the role of emotions in the information seeking process
User feedback is considered to be a critical element in the information seeking process, especially in relation to relevance assessment. Current feedback techniques determine content relevance with respect to the cognitive and situational levels of interaction that occur between the user and the retrieval system. However, apart from real-life problems and information objects, users interact with intentions, motivations and feelings, which can be seen as critical aspects of cognition and decision-making. The study presented in this paper serves as a starting point for the exploration of the role of emotions in the information seeking process. Results show that emotions not only interweave with different physiological, psychological and cognitive processes, but also form distinctive patterns according to the specific task and the specific user.