262 research outputs found
Delineation of line patterns in images using B-COSFIRE filters
Delineation of line patterns in images is a basic step required in various
applications such as blood vessel detection in medical images, segmentation of
rivers or roads in aerial images, detection of cracks in walls or pavements,
etc. In this paper we present trainable B-COSFIRE filters, which are a model of
some neurons in area V1 of the primary visual cortex, and apply it to the
delineation of line patterns in different kinds of images. B-COSFIRE filters
are trainable as their selectivity is determined in an automatic configuration
process given a prototype pattern of interest. They are configurable to detect
any preferred line structure (e.g. segments, corners, cross-overs, etc.), so
usable for automatic data representation learning. We carried out experiments
on two data sets, namely a line-network data set from INRIA and a data set of
retinal fundus images named IOSTAR. The results that we achieved confirm the
robustness of the proposed approach and its effectiveness in the delineation of
line structures in different kinds of images.Comment: International Work Conference on Bioinspired Intelligence, July
10-13, 201
Learning sound representations using trainable COPE feature extractors
Sound analysis research has mainly been focused on speech and music
processing. The deployed methodologies are not suitable for analysis of sounds
with varying background noise, in many cases with very low signal-to-noise
ratio (SNR). In this paper, we present a method for the detection of patterns
of interest in audio signals. We propose novel trainable feature extractors,
which we call COPE (Combination of Peaks of Energy). The structure of a COPE
feature extractor is determined using a single prototype sound pattern in an
automatic configuration process, which is a type of representation learning. We
construct a set of COPE feature extractors, configured on a number of training
patterns. Then we take their responses to build feature vectors that we use in
combination with a classifier to detect and classify patterns of interest in
audio signals. We carried out experiments on four public data sets: MIVIA audio
events, MIVIA road events, ESC-10 and TU Dortmund data sets. The results that
we achieved (recognition rate equal to 91.71% on the MIVIA audio events, 94% on
the MIVIA road events, 81.25% on the ESC-10 and 94.27% on the TU Dortmund)
demonstrate the effectiveness of the proposed method and are higher than the
ones obtained by other existing approaches. The COPE feature extractors have
high robustness to variations of SNR. Real-time performance is achieved even
when the value of a large number of features is computed.Comment: Accepted for publication in Pattern Recognitio
Alien Registration- Nicolai, Nicola (Portland, Cumberland County)
https://digitalmaine.com/alien_docs/25665/thumbnail.jp
Brain-Inspired Algorithms for Processing of Visual Data
The study of the visual system of the brain has attracted the attention and interest of many neuro-scientists, that derived computational models of some types of neuron that compose it. These findings inspired researchers in image processing and computer vision to deploy such models to solve problems of visual data processing. In this paper, we review approaches for image processing and computer vision, the design of which is based on neuro-scientific findings about the functions of some neurons in the visual cortex. Furthermore, we analyze the connection between the hierarchical organization of the visual system of the brain and the structure of Convolutional Networks (ConvNets). We pay particular attention to the mechanisms of inhibition of the responses of some neurons, which provide the visual system with improved stability to changing input stimuli, and discuss their implementation in image processing operators and in ConvNets.</p
Data-Efficient Large Scale Place Recognition with Graded Similarity Supervision
Visual place recognition (VPR) is a fundamental task of computer vision for visual localization. Existing methods are trained using image pairs that either depict the same place or not. Such a binary indication does not consider continuous relations of similarity between images of the same place taken from different positions, determined by the continuous nature of camera pose. The binary similarity induces a noisy supervision signal into the training of VPR methods, which stall in local minima and require expensive hard mining algorithms to guarantee convergence. Motivated by the fact that two images of the same place only partially share visual cues due to camera pose differences, we deploy an automatic re-annotation strategy to re-label VPR datasets. We compute graded similarity labels for image pairs based on available localization metadata. Furthermore, we propose a new Generalized Contrastive Loss (GCL) that uses graded similarity labels for training contrastive networks. We demonstrate that the use of the new labels and GCL allow to dispense from hard-pair mining, and to train image descriptors that perform better in VPR by nearest neighbor search, obtaining superior or comparable results than methods that require expensive hard-pair mining and re-ranking techniques.</p
Generalized Contrastive Optimization of Siamese Networks for Place Recognition
Visual place recognition is a challenging task in computer vision and a key
component of camera-based localization and navigation systems. Recently,
Convolutional Neural Networks (CNNs) achieved high results and good
generalization capabilities. They are usually trained using pairs or triplets
of images labeled as either similar or dissimilar, in a binary fashion. In
practice, the similarity between two images is not binary, but rather
continuous. Furthermore, training these CNNs is computationally complex and
involves costly pair and triplet mining strategies.
We propose a Generalized Contrastive loss (GCL) function that relies on image
similarity as a continuous measure, and use it to train a siamese CNN.
Furthermore, we propose three techniques for automatic annotation of image
pairs with labels indicating their degree of similarity, and deploy them to
re-annotate the MSLS, TB-Places, and 7Scenes datasets.
We demonstrate that siamese CNNs trained using the GCL function and the
improved annotations consistently outperform their binary counterparts. Our
models trained on MSLS outperform the state-of-the-art methods, including
NetVLAD, and generalize well on the Pittsburgh, TokyoTM and Tokyo 24/7
datasets. Furthermore, training a siamese network using the GCL function does
not require complex pair mining. We release the source code at
https://github.com/marialeyvallina/generalized_contrastive_loss.Comment: Under revie
- …