UW-ProCCaps: UnderWater Progressive Colourisation with Capsules
Underwater images are fundamental for studying and understanding the status
of marine life. We focus on reducing the memory required for image storage:
memory consumption during the collection phase limits how long that phase
can last, forcing additional image-collection campaigns. We present a novel
machine-learning model that reconstructs the colours of underwater images
from their luminance channel alone, thus saving 2/3 of the available storage
space. Our model specialises in underwater colour reconstruction and consists
of an encoder-decoder architecture. The encoder comprises a convolutional
encoder and a parallel specialised classifier trained with webly-supervised
data. Both the encoder and the decoder use capsule layers to capture the
features of the entities in the image. The colour reconstruction process
combines progressive and generative adversarial training: progressive
training lays the groundwork for a generative adversarial routine that
refines the colours, producing bright, saturated images that bring the scene
back to life. We validate the model both qualitatively and quantitatively on
four benchmark datasets. This is the first attempt at colour reconstruction
for greyscale underwater images. Extensive results on four benchmark datasets
demonstrate that our solution outperforms state-of-the-art (SOTA) solutions.
We also demonstrate that the generated colourisation enhances image quality
compared with SOTA enhancement models.
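The storage claim follows directly from keeping one channel instead of three. A minimal sketch, assuming ITU-R BT.601 luma weights purely for illustration (the abstract does not specify the colour space; the paper may use the LAB L channel instead):

```python
# Minimal sketch: keep only the luminance channel of an RGB image.
# BT.601 luma weights are an assumption for illustration; the actual
# colour space used by UW-ProCCaps is not specified in the abstract.

def rgb_to_luma(pixels):
    """Convert a list of (R, G, B) tuples to single luma values."""
    return [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in pixels]

rgb_image = [(200, 120, 40), (10, 200, 90), (0, 0, 255)]
luma_image = rgb_to_luma(rgb_image)

# One stored channel instead of three: 2/3 of the storage is saved.
saved = 1 - len(luma_image) / (len(rgb_image) * 3)
print(luma_image, f"storage saved: {saved:.0%}")
```

The model's task is then the inverse mapping: predicting the two discarded chroma channels from this single stored channel.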
Improving MRI-based Knee Disorder Diagnosis with Pyramidal Feature Details
This paper presents MRPyrNet, a new convolutional neural network (CNN) architecture that improves the capabilities of CNN-based pipelines for knee injury detection via magnetic resonance imaging (MRI). Existing works showed that anomalies are localized in small-sized knee regions that appear in particular areas of MRI scans. Based on such facts, MRPyrNet exploits a Feature Pyramid Network to enhance small appearing features and Pyramidal Detail Pooling to capture such relevant information in a robust way. Experimental results on two publicly available datasets demonstrate that MRPyrNet improves the ACL tear and meniscal tear diagnosis capabilities of two state-of-the-art methodologies. Code is available at https://git.io/JtMPH
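The pooling idea can be sketched generically: pool the same feature map at several scales and concatenate, so small localized responses survive alongside coarse context. This is a hedged illustration of pyramid pooling in general, not the exact Pyramidal Detail Pooling of MRPyrNet, whose details are not given in the abstract:

```python
# Illustrative sketch of pyramid-style pooling on a 1-D feature map:
# max-pool at several window sizes and concatenate the results.
# This is a generic pyramid-pooling idea, not MRPyrNet's exact module.

def max_pool_1d(features, window):
    """Non-overlapping max pooling over a 1-D feature list."""
    return [max(features[i:i + window]) for i in range(0, len(features), window)]

def pyramid_pool(features, windows=(1, 2, 4)):
    """Concatenate max-pooled views of the input at several scales."""
    pooled = []
    for w in windows:
        pooled.extend(max_pool_1d(features, w))
    return pooled

feats = [0.1, 0.9, 0.2, 0.3]
print(pyramid_pool(feats))
```

The fine-scale entries preserve the small anomaly responses the paper targets, while the coarse entries keep their surrounding context.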
A supervised extreme learning committee for food recognition
Food recognition is an emerging topic in computer vision. The problem is being addressed especially in health-oriented systems where it is used as a support for food diary applications. The goal is to improve current food diaries, where the users have to manually insert their daily food intake, with an automatic recognition of the food type, quantity and the consequent calorie-intake estimation. In addition to the classical recognition challenges, the food recognition problem is characterized by the absence of a rigid structure of the food and by large intra-class variations. To tackle such challenges, a food recognition system based on a committee classification is proposed. The aim is to provide a system capable of automatically choosing the optimal features for food recognition out of the existing plethora of available ones (e.g., color, texture, etc.). Following this idea, each committee member, i.e., an Extreme Learning Machine, is trained to specialize on a single feature type. Then, a Structural Support Vector Machine is exploited to produce the final ranking of possible matches by filtering out the irrelevant features and thus merging only the relevant ones. Experimental results show that the proposed system outperforms state-of-the-art works on four publicly available benchmark datasets. © 2016 Elsevier Inc. All rights reserved.
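The committee scheme can be caricatured as one scorer per feature type plus a fusion step that drops irrelevant members. In this toy sketch, simple weighted score fusion stands in for both the Extreme Learning Machines and the Structural SVM, and all names and weights are made up for illustration:

```python
# Toy sketch of the committee idea: one scorer per feature type, with a
# final ranking that keeps only the relevant members. Weighted score
# fusion here stands in for the ELMs and the Structural SVM of the paper.

def committee_rank(member_scores, relevance, threshold=0.5):
    """Fuse per-feature class scores, dropping irrelevant members.

    member_scores: {feature_name: {class_name: score}}
    relevance:     {feature_name: weight in [0, 1]}
    """
    fused = {}
    for feature, scores in member_scores.items():
        if relevance.get(feature, 0.0) < threshold:
            continue  # filter out irrelevant features
        for cls, s in scores.items():
            fused[cls] = fused.get(cls, 0.0) + relevance[feature] * s
    return sorted(fused, key=fused.get, reverse=True)

scores = {
    "color":   {"pizza": 0.9, "salad": 0.4},
    "texture": {"pizza": 0.6, "salad": 0.7},
    "shape":   {"pizza": 0.2, "salad": 0.9},  # below the relevance threshold
}
relevance = {"color": 0.8, "texture": 0.9, "shape": 0.1}
print(committee_rank(scores, relevance))
```

The "shape" member is filtered out before fusion, mirroring how the Structural SVM merges only the relevant features into the final ranking.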
Temporal Model Adaptation for Person Re-Identification
Person re-identification is an open and challenging problem in computer
vision. The majority of efforts have been spent either on designing the best
feature representation or on learning the optimal matching metric, and most
approaches have neglected the problem of adapting the selected features or
the learned model over time. To address this problem, we propose a temporal
model adaptation scheme with a human in the loop. We first introduce a
similarity-dissimilarity learning method which can be trained incrementally
by means of a stochastic alternating direction method of multipliers (ADMM)
optimization procedure. Then, to achieve temporal adaptation with limited
human effort, we exploit a graph-based approach to present the user with only
the most informative probe-gallery matches that should be used to update the
model. Results on three datasets show that our approach performs on par with
or better than state-of-the-art approaches while reducing the manual pairwise
labeling effort by about 80%.
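The human-in-the-loop step can be sketched as asking the annotator only about the most informative probe-gallery pairs. Here "informative" is approximated by uncertainty (similarity closest to a decision threshold); the paper uses a graph-based criterion instead, so this is purely an illustrative assumption:

```python
# Sketch of limited-effort labeling: surface only the most ambiguous
# probe-gallery matches to the human. Uncertainty sampling stands in
# for the paper's graph-based selection criterion.

def most_informative(pairs, threshold=0.5, budget=2):
    """Return the `budget` pairs whose similarity is most ambiguous."""
    return sorted(pairs, key=lambda p: abs(p[1] - threshold))[:budget]

# (pair id, similarity score); confident matches/non-matches are skipped.
pairs = [("p1-g3", 0.95), ("p2-g7", 0.52), ("p3-g1", 0.48), ("p4-g9", 0.05)]
print(most_informative(pairs))
```

Only the two near-threshold pairs go to the annotator; the confident ones contribute nothing worth the labeling cost, which is how the ~80% effort reduction arises.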
Multi Branch Siamese Network For Person Re-Identification
To capture robust person features, learning discriminative, style-invariant
and view-invariant descriptors is a key challenge in person
re-identification (re-id). Most deep re-id models learn a single-scale
feature representation, which cannot capture compact, style-invariant
representations. In this paper, we present a multi-branch Siamese deep
neural network with multiple classifiers to overcome these issues. The
multi-branch learning of the network creates a stronger descriptor with
fine-grained information from the global features of a person.
Camera-to-camera image translation is performed with a generative
adversarial network to generate diverse data and add style invariance to
the learned features. Experimental results on benchmark datasets demonstrate
that the proposed method performs better than other state-of-the-art methods.
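The multi-branch idea reduces to several branches producing feature vectors from the same input, concatenated into one stronger descriptor. The branch functions below are toy stand-ins for the CNN branches and classifiers of the paper:

```python
# Sketch of multi-branch feature learning: each branch maps the same
# input to a feature vector, and the outputs are concatenated into one
# descriptor. Toy branches stand in for the network's CNN branches.

def multi_branch_descriptor(image, branches):
    """Concatenate the feature vectors produced by each branch."""
    descriptor = []
    for branch in branches:
        descriptor.extend(branch(image))
    return descriptor

# Toy "branches" over a flat pixel list: global mean and per-half means,
# mimicking coarse global plus finer-grained views of the same person.
branches = [
    lambda img: [sum(img) / len(img)],
    lambda img: [sum(img[: len(img) // 2]) / (len(img) // 2)],
    lambda img: [sum(img[len(img) // 2:]) / (len(img) - len(img) // 2)],
]
image = [1.0, 2.0, 3.0, 4.0]
print(multi_branch_descriptor(image, branches))
```

The concatenated descriptor carries both the global summary and the finer-grained per-region information, which is the stated benefit of the multi-branch design.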
Tracking-by-Trackers with a Distilled and Reinforced Model
Visual object tracking has generally been tackled by reasoning independently about fast processing algorithms, accurate online adaptation methods, and fusion of trackers. In this paper, we unify these goals by proposing a novel tracking methodology that takes advantage of other visual trackers, offline and online. A compact student model is trained via the marriage of knowledge distillation and reinforcement learning: the first transfers and compresses the tracking knowledge of other trackers; the second enables the learning of evaluation measures which are then exploited online. After learning, the student can be used to build (i) a very fast single-shot tracker, (ii) a tracker with a simple and effective online adaptation mechanism, and (iii) a tracker that performs fusion of other trackers. Extensive validation shows that the proposed algorithms compete with real-time state-of-the-art trackers.
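The distillation half of the training can be sketched as output matching: soften a teacher tracker's outputs with a temperature and penalize the student's divergence from them. This is the generic Hinton-style distillation signal, not the paper's exact losses, and it omits the reinforcement-learning component entirely:

```python
import math

# Minimal sketch of a distillation signal: temperature-softened teacher
# outputs compared to the student's via KL divergence. The paper's actual
# losses and its reinforcement-learning part are richer than this.

def softmax(logits, temperature=1.0):
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_kl(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions."""
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return sum(ti * math.log(ti / si) for ti, si in zip(t, s))

teacher = [3.0, 1.0, 0.2]
matching_student = [3.0, 1.0, 0.2]
poor_student = [0.2, 1.0, 3.0]
print(distillation_kl(teacher, matching_student))  # 0: identical outputs
print(distillation_kl(teacher, poor_student))
```

Minimizing this quantity over many teachers is what lets a single compact student absorb, and later stand in for, the ensemble of trackers.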
Pose-Normalized Image Generation for Person Re-identification
Person Re-identification (re-id) faces two major challenges: the lack of
cross-view paired training data and learning discriminative identity-sensitive
and view-invariant features in the presence of large pose variations. In this
work, we address both problems by proposing a novel deep person image
generation model for synthesizing realistic person images conditional on the
pose. The model is based on a generative adversarial network (GAN) designed
specifically for pose normalization in re-id, thus termed pose-normalization
GAN (PN-GAN). With the synthesized images, we can learn a new type of deep
re-id feature free of the influence of pose variations. We show that this
feature is strong on its own and complementary to features learned with the
original images. Importantly, under the transfer learning setting, we show that
our model generalizes well to any new re-id dataset without the need for
collecting any training data for model fine-tuning. The model thus has the
potential to make re-id models truly scalable.