Hybrid multi-layer Deep CNN/Aggregator feature for image classification
Deep Convolutional Neural Networks (DCNN) have established a remarkable
performance benchmark in the field of image classification, displacing
classical approaches based on hand-tailored aggregations of local descriptors.
Yet DCNNs impose high computational burdens both at training and at testing
time, and training them requires collecting and annotating large amounts of
training data. Supervised adaptation methods have been proposed in the
literature that partially re-learn a transferred DCNN structure from a new
target dataset. Yet these require expensive bounding-box annotations and are
still computationally expensive to learn. In this paper, we address these
shortcomings of DCNN adaptation schemes by proposing a hybrid approach that
combines conventional, unsupervised aggregators such as Bag-of-Words (BoW),
with the DCNN pipeline by treating the output of intermediate layers as densely
extracted local descriptors.
We test a variant of our approach that uses only intermediate DCNN layers on
the standard PASCAL VOC 2007 dataset and show performance significantly higher
than the standard BoW model and comparable to Fisher vector aggregation but
with a feature that is 150 times smaller. A second variant of our approach that
includes the fully connected DCNN layers significantly outperforms Fisher
vector schemes and performs comparably to DCNN approaches adapted to PASCAL VOC
2007, yet at only a small fraction of the training and testing cost.
Comment: Accepted at the ICASSP 2015 conference, 5 pages including references, 4
figures and 2 tables
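
The idea of treating an intermediate layer's output as densely extracted local descriptors and aggregating them with Bag-of-Words can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the feature map and codebook are random stand-ins (assumptions), hard nearest-codeword assignment is one common BoW variant, and the function name `bow_from_feature_map` is hypothetical.

```python
import numpy as np

def bow_from_feature_map(feature_map, codebook):
    """Aggregate a (C, H, W) layer output into an L1-normalized BoW histogram."""
    c, h, w = feature_map.shape
    # Each spatial position contributes one C-dimensional local descriptor.
    descriptors = feature_map.reshape(c, h * w).T            # shape (H*W, C)
    # Squared Euclidean distance from every descriptor to every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    assignments = d2.argmin(axis=1)                           # hard assignment
    hist = np.bincount(assignments, minlength=codebook.shape[0]).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 5, 5))  # assumed stand-in for a layer output
codebook = rng.standard_normal((4, 8))        # assumed K=4 codewords, e.g. from k-means
hist = bow_from_feature_map(feature_map, codebook)
```

The resulting histogram has one bin per codeword, so its size depends on the codebook rather than on the spatial resolution of the layer.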
Adaptive Density Estimation for Generative Models
Unsupervised learning of generative models has seen tremendous progress over
recent years, in particular due to generative adversarial networks (GANs),
variational autoencoders, and flow-based models. GANs have dramatically
improved sample quality, but suffer from two drawbacks: (i) they mode-drop,
i.e., do not cover the full support of the train data, and (ii) they do not
allow for likelihood evaluations on held-out data. In contrast,
likelihood-based training encourages models to cover the full support of the
train data, but yields poorer samples. These mutual shortcomings can in
principle be addressed by training generative latent variable models in a
hybrid adversarial-likelihood manner. However, we show that commonly made
parametric assumptions create a conflict between them, making successful hybrid
models non-trivial. As a solution, we propose to use deep invertible
transformations in the latent variable decoder. This approach allows for
likelihood computations in image space, is more efficient than fully invertible
models, and can take full advantage of adversarial training. We show that our
model significantly improves over existing hybrid models: offering GAN-like
samples, IS and FID scores that are competitive with fully adversarial models,
and improved likelihood scores.
Hyperbolic Geometry in Computer Vision: A Novel Framework for Convolutional Neural Networks
Real-world visual data exhibit intrinsic hierarchical structures that can be
represented effectively in hyperbolic spaces. Hyperbolic neural networks (HNNs)
are a promising approach for learning feature representations in such spaces.
However, current methods in computer vision rely on Euclidean backbones and
only project features to the hyperbolic space in the task heads, limiting their
ability to fully leverage the benefits of hyperbolic geometry. To address this,
we present HCNN, the first fully hyperbolic convolutional neural network (CNN)
designed for computer vision tasks. Based on the Lorentz model, we generalize
fundamental components of CNNs and propose novel formulations of the
convolutional layer, batch normalization, and multinomial logistic regression
(MLR). Experimentation on standard vision tasks demonstrates the effectiveness
of our HCNN framework and the Lorentz model in both hybrid and fully hyperbolic
settings. Overall, we aim to pave the way for future research in hyperbolic
computer vision by offering a new paradigm for interpreting and analyzing
visual data. Our code is publicly available at
https://github.com/kschwethelm/HyperbolicCV
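
For readers unfamiliar with the Lorentz model mentioned above, the basic operations it builds on can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the HCNN repository: curvature is fixed to -1, and the function names are hypothetical.

```python
import numpy as np

def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[..., 0] * y[..., 0] + (x[..., 1:] * y[..., 1:]).sum(axis=-1)

def expmap_origin(u):
    """Map a Euclidean vector u (tangent at the origin) onto the hyperboloid."""
    norm = np.linalg.norm(u)
    if norm == 0.0:
        return np.concatenate(([1.0], np.zeros_like(u)))
    # Time component cosh(|u|), space component sinh(|u|) * u / |u|.
    return np.concatenate(([np.cosh(norm)], np.sinh(norm) * u / norm))

def lorentz_distance(x, y):
    """Geodesic distance on the hyperboloid (curvature -1 assumed)."""
    # Clip guards against arguments slightly below 1 from rounding.
    return np.arccosh(np.clip(-lorentz_inner(x, y), 1.0, None))

origin = np.array([1.0, 0.0, 0.0])
u = np.array([0.3, -0.4])              # Euclidean norm 0.5
point = expmap_origin(u)
dist = lorentz_distance(origin, point)
```

Points produced by `expmap_origin` satisfy the hyperboloid constraint (Lorentzian self-inner-product of -1), and their distance to the origin equals the Euclidean norm of `u`.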
Learning Heterogeneous Similarity Measures for Hybrid-Recommendations in Meta-Mining
The notion of meta-mining has appeared recently and extends the traditional
meta-learning in two ways. First, it does not learn meta-models that provide
support only for the learning algorithm selection task but ones that support
the whole data-mining process. In addition, it abandons the so-called black-box
approach to algorithm description followed in meta-learning. Now, in addition to
datasets, algorithms and workflows also have descriptors. For the
latter two these descriptions are semantic, describing properties of the
algorithms. With the availability of descriptors both for datasets and data
mining workflows the traditional modelling techniques followed in
meta-learning, typically based on classification and regression algorithms, are
no longer appropriate. Instead we are faced with a problem the nature of which
is much more similar to the problems that appear in recommendation systems. The
most important meta-mining requirements are that suggestions should use only
dataset and workflow descriptors and that the cold-start problem, e.g. providing
workflow suggestions for new datasets, is handled.
In this paper we take a different view on the meta-mining modelling problem
and treat it as a recommender problem. In order to account for the meta-mining
specificities we derive a novel metric-based-learning recommender approach. Our
method learns two homogeneous metrics, one in the dataset and one in the
workflow space, and a heterogeneous one in the dataset-workflow space. All
learned metrics reflect similarities established from the dataset-workflow
preference matrix. We demonstrate our method on meta-mining over biological
(microarray dataset) problems. The application of our method is not limited to
the meta-mining problem; its formulation is general enough that it can be
applied to problems with similar requirements.
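
One way to picture a heterogeneous similarity between dataset and workflow descriptors is a low-rank bilinear score, sketched below. This is an assumption made for illustration only, not the formulation from the paper: the projection matrices `U` and `V` are random stand-ins for learned parameters, and all names and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_workflows, d_data, d_flow, rank = 5, 6, 4, 3

# Assumed projections into a shared space; a real model would fit these
# to the dataset-workflow preference matrix.
U = rng.standard_normal((d_data, rank))
V = rng.standard_normal((d_flow, rank))

def heterogeneous_scores(x, W, U, V):
    """Score every workflow (rows of W) for one dataset descriptor x."""
    return (x @ U) @ (W @ V).T

W = rng.standard_normal((n_workflows, d_flow))  # workflow descriptors
x_new = rng.standard_normal(d_data)             # new dataset with no preference history
scores = heterogeneous_scores(x_new, W, U, V)
ranking = np.argsort(-scores)                   # best-scoring workflow first
```

Because the score depends only on descriptors, a dataset never seen during training can still be ranked against all workflows, which is the cold-start setting described above.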
Adapting Sequence to Sequence models for Text Normalization in Social Media
Social media offer an abundant source of valuable raw data; however, informal
writing can quickly become a bottleneck for many natural language processing
(NLP) tasks. Off-the-shelf tools are usually trained on formal text and cannot
explicitly handle noise found in short online posts. Moreover, the variety of
frequently occurring linguistic variations presents several challenges, even
for humans who might not be able to comprehend the meaning of such posts,
especially when they contain slang and abbreviations. Text Normalization aims
to transform online user-generated text to a canonical form. Current text
normalization systems rely on string or phonetic similarity and classification
models that work in a local fashion. We argue that processing contextual
information is crucial for this task and introduce a social media text
normalization hybrid word-character attention-based encoder-decoder model that
can serve as a pre-processing step for NLP applications to adapt to noisy text
in social media. Our character-based component is trained on synthetic
adversarial examples that are designed to capture errors commonly found in
online user-generated text. Experiments show that our model surpasses neural
architectures designed for text normalization and achieves comparable
performance with state-of-the-art related work.
Comment: Accepted at the 13th International AAAI Conference on Web and Social
Media (ICWSM 2019)