X-CNN: Cross-modal Convolutional Neural Networks for Sparse Datasets
In this paper we propose cross-modal convolutional neural networks (X-CNNs),
a novel biologically inspired type of CNN architectures, treating gradient
descent-specialised CNNs as individual units of processing in a larger-scale
network topology, while allowing for unconstrained information flow and/or
weight sharing between analogous hidden layers of the network---thus
generalising the already well-established concept of neural network ensembles
(where information typically may flow only between the output layers of the
individual networks). The constituent networks are individually designed to
learn the output function on their own subset of the input data, after which
cross-connections between them are introduced after each pooling operation to
periodically allow for information exchange between them. This injection of
knowledge into a model (by prior partition of the input data through domain
knowledge or unsupervised methods) is expected to yield greatest returns in
sparse data environments, which are typically less suitable for training CNNs.
For evaluation purposes, we have compared a standard four-layer CNN as well as
a sophisticated FitNet4 architecture against their cross-modal variants on the
CIFAR-10 and CIFAR-100 datasets with differing percentages of the training data
being removed, and find that at lower levels of data availability, the X-CNNs
significantly outperform their baselines (typically providing a 2--6% benefit,
depending on the dataset size and whether data augmentation is used), while
still maintaining an edge on all of the full dataset tests.
Comment: To appear in the 7th IEEE Symposium Series on Computational
Intelligence (IEEE SSCI 2016), 8 pages, 6 figures. Minor revisions, in
response to reviewers' comment
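The cross-connection idea described in this abstract can be illustrated with a minimal NumPy sketch. It assumes two constituent streams whose feature maps are pooled and then exchanged through a 1x1 convolution followed by channel concatenation; this is only one plausible realisation, since the abstract does not specify the exact operation. All function names, shapes, and weights below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def max_pool_2x2(fmap):
    """2x2 max pooling on a (channels, height, width) map with even height/width."""
    c, h, w = fmap.shape
    return fmap.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def conv_1x1(fmap, weights):
    """1x1 convolution: mixes channels per pixel; weights has shape (out_c, in_c)."""
    return np.einsum("oc,chw->ohw", weights, fmap)

def cross_connect(fmap_a, fmap_b, w_a_to_b, w_b_to_a):
    """Assumed exchange step: each stream keeps its own features and appends a
    1x1-convolved summary of the other stream's features."""
    new_a = np.concatenate([fmap_a, conv_1x1(fmap_b, w_b_to_a)], axis=0)
    new_b = np.concatenate([fmap_b, conv_1x1(fmap_a, w_a_to_b)], axis=0)
    return new_a, new_b

# Two hypothetical streams, each fed by its own subset of the input data.
stream_a = max_pool_2x2(rng.normal(size=(8, 16, 16)))
stream_b = max_pool_2x2(rng.normal(size=(4, 16, 16)))

# Hypothetical (untrained) cross-connection weights.
w_a_to_b = rng.normal(size=(2, 8))
w_b_to_a = rng.normal(size=(2, 4))

stream_a, stream_b = cross_connect(stream_a, stream_b, w_a_to_b, w_b_to_a)
print(stream_a.shape, stream_b.shape)
```

In a trained network the cross-connection weights would be learned jointly with the rest of the model; here they are random and serve only to show how information could flow between analogous hidden layers after pooling.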
Looking Beyond Appearances: Synthetic Training Data for Deep CNNs in Re-identification
Re-identification is generally carried out by encoding the appearance of a
subject in terms of outfit, suggesting scenarios where people do not change
their attire. In this paper we overcome this restriction, by proposing a
framework based on a deep convolutional neural network, SOMAnet, that
additionally models other discriminative aspects, namely, structural attributes
of the human figure (e.g. height, obesity, gender). Our method is unique in
many respects. First, SOMAnet is based on the Inception architecture, departing
from the usual siamese framework. This spares expensive data preparation
(pairing images across cameras) and allows the understanding of what the
network learned. Second, and most notably, the training data consists of a
synthetic 100K instance dataset, SOMAset, created by photorealistic human body
generation software. Synthetic data represents a good compromise between
realistic imagery, usually not required in re-identification since surveillance
cameras capture low-resolution silhouettes, and complete control of the
samples, which is useful in order to customize the data w.r.t. the surveillance
scenario at-hand, e.g. ethnicity. SOMAnet, trained on SOMAset and fine-tuned on
recent re-identification benchmarks, outperforms all competitors, matching
subjects even with different apparel. The combination of synthetic data with
Inception architectures opens up new research avenues in re-identification.
Comment: 14 page
Multi-level Feature Fusion-based CNN for Local Climate Zone Classification from Sentinel-2 Images: Benchmark Results on the So2Sat LCZ42 Dataset
As a unique classification scheme for urban forms and functions, the local
climate zone (LCZ) system provides essential general information for any
studies related to urban environments, especially on a large scale. Remote
sensing data-based classification approaches are the key to large-scale mapping
and monitoring of LCZs. The potential of deep learning-based approaches is not
yet fully explored, even though advanced convolutional neural networks (CNNs)
continue to push the frontiers for various computer vision tasks. One reason is
that published studies are based on different datasets, usually at a regional
scale, which makes it impossible to fairly and consistently compare the
potential of different CNNs for real-world scenarios. This study is based on
the big So2Sat LCZ42 benchmark dataset dedicated to LCZ classification. Using
this dataset, we studied a range of CNNs of varying sizes. In addition, we
propose a CNN to classify LCZs from Sentinel-2 images, Sen2LCZ-Net. Using this
base network, we propose fusing multi-level features using the extended
Sen2LCZ-Net-MF. With this proposed simple network architecture and the highly
competitive benchmark dataset, we obtain results that are better than those
obtained by the state-of-the-art CNNs, while requiring less computation with
fewer layers and parameters. Large-scale LCZ classification examples of
completely unseen areas are presented, demonstrating the potential of our
proposed Sen2LCZ-Net-MF as well as the So2Sat LCZ42 dataset. We also
intensively investigated the influence of network depth and width and the
effectiveness of the design choices made for Sen2LCZ-Net-MF. Our work will
provide important baselines for future CNN-based algorithm developments for
both LCZ classification and other urban land cover land use classification
Face Attribute Prediction Using Off-the-Shelf CNN Features
Predicting attributes from face images in the wild is a challenging computer
vision problem. To automatically describe face attributes from face-containing
images, traditionally one needs to cascade three technical blocks --- face
localization, facial descriptor construction, and attribute classification ---
in a pipeline. As a typical classification problem, face attribute prediction
has been addressed using deep learning. Current state-of-the-art performance
was achieved by using two cascaded Convolutional Neural Networks (CNNs), which
were specifically trained to learn face localization and attribute description.
In this paper, we experiment with an alternative way of employing the power of
deep representations from CNNs. Combining with conventional face localization
techniques, we use off-the-shelf architectures trained for face recognition to
build facial descriptors. Recognizing that the describable face attributes are
diverse, our face descriptors are constructed from different levels of the CNNs
for different attributes to best facilitate face attribute prediction.
Experiments on two large datasets, LFWA and CelebA, show that our approach is
entirely comparable to the state-of-the-art. Our findings not only demonstrate
an efficient face attribute prediction approach, but also raise an important
question: how to leverage the power of off-the-shelf CNN representations for
novel tasks.
Comment: In proceeding of 2016 International Conference on Biometrics (ICB
Toward Fast and Reliable Potential Energy Surfaces for Metallic Pt Clusters by Hierarchical Delta Neural Networks.
Data-driven machine learning force fields (MLFs) are more and more popular in atomistic simulations and exploit machine learning methods to predict energies and forces for unknown structures based on the knowledge learned from an existing reference database. The latter usually comes from density functional theory calculations. One main drawback of MLFs is that physical laws are not incorporated in the machine learning models, and instead, MLFs are designed to be very flexible to simulate complex quantum chemistry potential energy surface (PES). In general, MLFs have poor transferability, and hence, a very large trainset is required to span all the target feature space to get a reliable MLF. This procedure becomes more troublesome when the PES is complicated, with a large number of degrees of freedom, in which building a large database is inevitable and very expensive, especially when accurate but costly exchange-correlation functionals have to be used. In this manuscript, we exploit a high-dimensional neural network potential (HDNNP) on Pt clusters of sizes from 6 to 20 as one example. Our standard level of energy calculation is DFT GGA (PBE) using a plane wave basis set. We introduce an approximate but fast level with the PBE functional and a minimal atomic orbital basis set, and then, a more accurate but expensive level, using a hybrid functional or nonlocal vdW functional and a plane wave basis set, is reliably predicted by learning the difference with HDNNP. The results show that such a differential approach (named ΔHDNNP) can deliver very accurate predictions (error <10 meV/atom) in reference to converged basis set energies as well as more accurate but expensive xc functionals. The overall speedup can be as large as 900 for a 20 atom Pt cluster. 
More importantly, ΔHDNNP shows much better transferability due to the intrinsic smoothness of the delta potential energy surface, and accordingly, one can use much smaller trainset data to obtain better accuracy than the conventional HDNNP. A multilayer ΔHDNNP is thus proposed to obtain very accurate predictions versus expensive nonlocal vdW functional calculations in which the required trainset is further reduced. The approach can be easily generalized to any other machine learning methods and opens a path to study the structure and dynamics of Pt clusters and nanoparticles
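The differential (delta-learning) idea in this abstract can be sketched on a toy problem: instead of learning the expensive energy directly, a model learns the smoother difference between an expensive and a cheap level, and the prediction is the cheap energy plus the learned correction. In the sketch below a low-degree polynomial stands in for the HDNNP, and both energy functions are hypothetical one-dimensional stand-ins, not DFT quantities.

```python
import numpy as np

def e_low(x):
    """Hypothetical cheap level of theory (captures the rough landscape)."""
    return np.sin(5 * x)

def e_high(x):
    """Hypothetical expensive level: cheap level plus a smooth correction."""
    return np.sin(5 * x) + 0.1 * x**2

def fit_model(x, y, degree=2):
    """Stand-in regressor (assumption): a least-squares polynomial, not a neural network."""
    return np.polynomial.Polynomial.fit(x, y, degree)

# Small training set, as in a sparse reference database.
x_train = np.linspace(-1.0, 1.0, 8)
x_test = np.linspace(-1.0, 1.0, 200)

# Conventional approach: learn the expensive energy directly.
direct_model = fit_model(x_train, e_high(x_train))
err_direct = np.max(np.abs(direct_model(x_test) - e_high(x_test)))

# Delta approach: learn only the difference, then add it to the cheap level.
delta_model = fit_model(x_train, e_high(x_train) - e_low(x_train))
err_delta = np.max(np.abs(e_low(x_test) + delta_model(x_test) - e_high(x_test)))

print(f"direct max error: {err_direct:.3e}")
print(f"delta  max error: {err_delta:.3e}")
```

Because the difference between the two toy levels is much smoother than either level alone, the same small training set yields a far smaller error for the delta model, which mirrors the transferability argument made in the abstract.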
Aerodynamic Optimization of High-Speed Trains Nose using a Genetic Algorithm and Artificial Neural Network
An aerodynamic optimization of a high-speed train's characteristics in terms of front wind action sensitivity is carried out in this paper. In particular, a genetic algorithm (GA) is used to perform a shape optimization study of a high-speed train nose. The nose is parametrically defined via Bézier curves, so that a wide range of geometries is included in the design space as possible optimal solutions. The main disadvantage of using a GA is the large number of evaluations needed before such an optimum is found. Here, the use of metamodels is proposed to replace the Navier-Stokes solver. Among all the possibilities, Response Surface Models and Artificial Neural Networks (ANN) are considered. The best prediction and generalization results are obtained with the ANN, which is then applied in the GA code. The paper shows the feasibility of using a GA in combination with an ANN for this problem, and the solutions achieved are included
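The workflow described in this abstract, a GA that queries a cheap metamodel instead of the flow solver, can be sketched as follows. This is a minimal illustration under stated assumptions: `expensive_drag` is a hypothetical stand-in for a Navier-Stokes evaluation, the surrogate is a quadratic response surface fitted by least squares rather than an ANN, and the three design parameters are abstract numbers in [0, 1] rather than Bézier control points.

```python
import numpy as np

rng = np.random.default_rng(0)

def expensive_drag(x):
    """Hypothetical stand-in for a costly CFD evaluation; minimum at x_i = 0.3."""
    return np.sum((x - 0.3) ** 2, axis=-1)

def features(x):
    """Quadratic response-surface features: 1, x_i, x_i^2."""
    return np.hstack([np.ones((len(x), 1)), x, x ** 2])

# Step 1: evaluate the expensive model on a limited sample and fit the surrogate.
n_params = 3
x_samples = rng.uniform(0.0, 1.0, size=(40, n_params))
coef, *_ = np.linalg.lstsq(features(x_samples), expensive_drag(x_samples), rcond=None)

def surrogate(x):
    """Cheap metamodel used in place of the expensive evaluation."""
    return features(x) @ coef

# Step 2: run a simple GA that only ever calls the surrogate.
def run_ga(fitness, n_params, pop_size=40, generations=60):
    pop = rng.uniform(0.0, 1.0, size=(pop_size, n_params))
    for _ in range(generations):
        scores = fitness(pop)
        parents = pop[np.argsort(scores)[: pop_size // 2]]  # truncation selection
        mates = parents[rng.permutation(len(parents))]
        mask = rng.random(parents.shape) < 0.5
        children = np.where(mask, parents, mates)  # uniform crossover
        children = children + rng.normal(0.0, 0.05, size=children.shape)  # mutation
        children = np.clip(children, 0.0, 1.0)
        pop = np.vstack([parents, children])  # parents survive (elitism)
    return pop[np.argmin(fitness(pop))]

best = run_ga(surrogate, n_params)
print("best parameters:", best)
print("expensive model at best:", expensive_drag(best))
```

The expensive function is called only 40 times to build the surrogate, while the GA performs thousands of surrogate evaluations; a final check of the best candidate with the expensive model is a sensible last step in such a workflow.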