Conditional Similarity Networks
What makes images similar? To measure the similarity between images, they are
typically embedded in a feature-vector space, in which their distances preserve
the relative dissimilarity. However, when learning such similarity embeddings
the simplifying assumption is commonly made that images are only compared to
one unique measure of similarity. A main reason for this is that contradicting
notions of similarities cannot be captured in a single space. To address this
shortcoming, we propose Conditional Similarity Networks (CSNs) that learn
embeddings differentiated into semantically distinct subspaces that capture the
different notions of similarities. CSNs jointly learn a disentangled embedding
where features for different similarities are encoded in separate dimensions as
well as masks that select and reweight relevant dimensions to induce a subspace
that encodes a specific similarity notion. We show that our approach learns
interpretable image representations with visually relevant semantic subspaces.
Further, when evaluating on triplet questions from multiple similarity notions
our model even outperforms the accuracy obtained by training individual
specialized networks for each notion separately. Comment: CVPR 201
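The masking idea in the abstract above can be illustrated with a minimal sketch. It assumes the mask is applied elementwise to the difference of two embeddings before taking the Euclidean norm, and that training uses a standard margin-based triplet loss; the function names, the 4-dimensional toy embedding, and the "color"/"shape" notions are hypothetical and not taken from the paper.

```python
import math

def masked_distance(x, y, mask):
    """Euclidean distance between x and y after reweighting each dimension by mask."""
    return math.sqrt(sum((m * (a - b)) ** 2 for a, b, m in zip(x, y, mask)))

def triplet_loss(anchor, positive, negative, mask, margin=0.2):
    """Hinge loss that asks the positive to be closer than the negative under the mask."""
    d_pos = masked_distance(anchor, positive, mask)
    d_neg = masked_distance(anchor, negative, mask)
    return max(0.0, d_pos - d_neg + margin)

# Hypothetical toy embedding: dims 0-1 encode "color", dims 2-3 encode "shape".
a = [1.0, 0.0, 0.0, 1.0]
b = [1.0, 0.0, 1.0, 0.0]  # same color as a, different shape
c = [0.0, 1.0, 0.0, 1.0]  # same shape as a, different color

# In a CSN the masks would be learned; here they are fixed by hand.
color_mask = [1.0, 1.0, 0.0, 0.0]
shape_mask = [0.0, 0.0, 1.0, 1.0]

print(masked_distance(a, b, color_mask))  # a and b agree on color
print(masked_distance(a, b, shape_mask))  # a and b differ on shape
print(triplet_loss(a, b, c, color_mask))  # triplet is satisfied under color
```

The same pair of images is close under one mask and far under the other, which is the contradiction a single unmasked space cannot represent.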
Adaptive Nonparametric Image Parsing
In this paper, we present an adaptive nonparametric solution to the image
parsing task, namely annotating each image pixel with its corresponding
category label. For a given test image, first, a locality-aware retrieval set
is extracted from the training data based on super-pixel matching similarities,
which are augmented with feature extraction for better differentiation of local
super-pixels. Then, the category of each super-pixel is initialized by the
majority vote of the k-nearest-neighbor super-pixels in the retrieval set.
Instead of fixing k as in traditional nonparametric approaches, here we
propose a novel adaptive nonparametric approach which determines the
sample-specific k for each test image. In particular, k is adaptively set to
be the smallest number of nearest super-pixels with which the images in the
retrieval set obtain the best category prediction. Finally, the initial
super-pixel labels are further refined by contextual smoothing. Extensive
experiments on challenging datasets demonstrate the superiority of the new
solution over other state-of-the-art nonparametric solutions. Comment: 11 page
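A simplified sketch of the adaptive choice of k described above follows. It assumes leave-one-out accuracy on the retrieval set as the selection criterion and uses one-dimensional toy features; the function names, the candidate values, and the data are hypothetical, and the paper's actual per-image procedure on super-pixels may differ.

```python
from collections import Counter

def knn_label(query, samples, k):
    """Majority vote among the k samples nearest to query; samples are (feature, label) pairs."""
    nearest = sorted(
        samples,
        key=lambda s: sum((q - f) ** 2 for q, f in zip(query, s[0])),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def adaptive_k(retrieval_set, candidates):
    """Smallest candidate k that reaches the best leave-one-out accuracy on the retrieval set."""
    best_k, best_correct = None, -1
    for k in sorted(candidates):
        correct = 0
        for i, (feature, label) in enumerate(retrieval_set):
            rest = retrieval_set[:i] + retrieval_set[i + 1:]
            if knn_label(feature, rest, k) == label:
                correct += 1
        if correct > best_correct:  # strict '>' keeps the smallest k on ties
            best_k, best_correct = k, correct
    return best_k

# Hypothetical retrieval set: two clean clusters plus one mislabeled "grass" sample at 0.15.
retrieval_set = [
    ([0.0], "sky"), ([0.1], "sky"), ([0.2], "sky"), ([0.3], "sky"),
    ([0.15], "grass"),
    ([1.0], "grass"), ([1.1], "grass"), ([1.2], "grass"), ([1.3], "grass"),
]

k = adaptive_k(retrieval_set, candidates=[1, 3, 5])
print(k, knn_label([0.16], retrieval_set, k))
```

With k = 1 the mislabeled sample misleads its neighbours, so a larger k is selected; among the values that tie for best accuracy, the smallest one is kept.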
Spectral Graph Convolutions for Population-based Disease Prediction
Exploiting the wealth of imaging and non-imaging information for disease
prediction tasks requires models capable of representing, at the same time,
individual features as well as data associations between subjects from
potentially large populations. Graphs provide a natural framework for such
tasks, yet previous graph-based approaches focus on pairwise similarities
without modelling the subjects' individual characteristics and features. On the
other hand, relying solely on subject-specific imaging feature vectors fails to
model the interaction and similarity between subjects, which can reduce
performance. In this paper, we introduce the novel concept of Graph
Convolutional Networks (GCN) for brain analysis in populations, combining
imaging and non-imaging data. We represent populations as a sparse graph whose
vertices are associated with image-based feature vectors and whose edges
encode phenotypic information. This structure was used to train a GCN model on
partially labelled graphs, aiming to infer the classes of unlabelled nodes from
the node features and pairwise associations between subjects. We demonstrate
the potential of the method on the challenging ADNI and ABIDE databases, as a
proof of concept of the benefit from integrating contextual information in
classification tasks. This has a clear impact on the quality of the
predictions, leading to 69.5% accuracy for ABIDE (outperforming the current
state of the art of 66.8%) and 77% for ADNI for prediction of MCI conversion,
significantly outperforming standard linear classifiers where only individual
features are considered. Comment: International Conference on Medical Image Computing and
Computer-Assisted Interventions (MICCAI) 201
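To make the population-graph idea concrete, here is a minimal sketch of one graph-convolution step. Note the substitution: it uses the first-order propagation rule ReLU(D^{-1/2} (A + I) D^{-1/2} H W) rather than the spectral filters the title refers to, and the adjacency matrix, features, and weights are hypothetical toy values, not anything from the paper.

```python
import math

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense, symmetric adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain dense matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, features, weights):
    """One propagation step: mix each subject's features with its neighbours', then apply ReLU."""
    propagated = matmul(normalize_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weights)]

# Hypothetical population of three subjects: 0 and 1 share phenotypic traits, 2 is isolated.
adj = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # stand-ins for imaging feature vectors
weights = [[1.0, 0.0], [0.0, 1.0]]               # identity weights, for readability only

for row in gcn_layer(adj, features, weights):
    print(row)
```

Connected subjects end up with averaged features while the isolated subject keeps its own, which is how edge information can influence the prediction for an unlabelled node.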
Feature aggregation and region-aware learning for detection of splicing forgery
Detection of image splicing forgery has become an increasingly difficult task due to the scale variations of the forged areas and the traces of manipulation being covered by post-processing techniques. Most existing methods fail to jointly exploit multi-scale local and global information and ignore the correlations between tampered and real regions across images, which degrades detection performance on multi-scale tampered regions. To tackle these challenges, we propose a novel method based on feature aggregation and region-aware learning to detect manipulated areas of varying scales. Specifically, we first integrate multi-level adjacency features using a feature selection mechanism to improve feature representation. Second, a cross-domain correlation aggregation module is devised to perform correlation enhancement of local features from a CNN and global representations from a Transformer, allowing for a complementary fusion of dual-domain information. Third, a region-aware learning mechanism is designed to improve feature discrimination by comparing the similarities and differences of the features between different regions. Extensive evaluations on benchmark datasets indicate the effectiveness of the method in detecting multi-scale spliced tampered regions.
Measuring concept similarities in multimedia ontologies: analysis and evaluations
The recent development of large-scale multimedia concept ontologies has provided new momentum for research in the semantic analysis of multimedia repositories. Different methods for generic concept detection have been extensively studied, but the question of how to exploit the structure of a multimedia ontology and existing inter-concept relations has not received similar attention. In this paper, we present a clustering-based method for modeling semantic concepts on low-level feature spaces and study the evaluation of the quality of such models with entropy-based methods. We cover a variety of methods for assessing the similarity of different concepts in a multimedia ontology. We study three ontologies and apply the proposed techniques in experiments involving the visual and semantic similarities, manual annotation of video, and concept detection. The results show that modeling inter-concept relations can provide a promising resource for many different application areas in semantic multimedia processing.