34,054 research outputs found
A Nonparametric Ensemble Binary Classifier and its Statistical Properties
In this work, we propose an ensemble of classification trees (CT) and
artificial neural networks (ANN). Several statistical properties including
universal consistency and upper bound of an important parameter of the proposed
classifier are shown. Numerical evidence is also provided using various real
life data sets to assess the performance of the model. Our proposed
nonparametric ensemble classifier doesn't suffer from the `curse of
dimensionality' and can be used in a wide variety of feature selection cum
classification problems. Performance of the proposed model is quite better when
compared to many other state-of-the-art models used for similar situations
Looking Beyond Appearances: Synthetic Training Data for Deep CNNs in Re-identification
Re-identification is generally carried out by encoding the appearance of a
subject in terms of outfit, suggesting scenarios where people do not change
their attire. In this paper we overcome this restriction, by proposing a
framework based on a deep convolutional neural network, SOMAnet, that
additionally models other discriminative aspects, namely, structural attributes
of the human figure (e.g. height, obesity, gender). Our method is unique in
many respects. First, SOMAnet is based on the Inception architecture, departing
from the usual siamese framework. This spares expensive data preparation
(pairing images across cameras) and allows the understanding of what the
network learned. Second, and most notably, the training data consists of a
synthetic 100K instance dataset, SOMAset, created by photorealistic human body
generation software. Synthetic data represents a good compromise between
realistic imagery, usually not required in re-identification since surveillance
cameras capture low-resolution silhouettes, and complete control of the
samples, which is useful in order to customize the data w.r.t. the surveillance
scenario at-hand, e.g. ethnicity. SOMAnet, trained on SOMAset and fine-tuned on
recent re-identification benchmarks, outperforms all competitors, matching
subjects even with different apparel. The combination of synthetic data with
Inception architectures opens up new research avenues in re-identification.Comment: 14 page
In All Likelihood, Deep Belief Is Not Enough
Statistical models of natural stimuli provide an important tool for
researchers in the fields of machine learning and computational neuroscience. A
canonical way to quantitatively assess and compare the performance of
statistical models is given by the likelihood. One class of statistical models
which has recently gained increasing popularity and has been applied to a
variety of complex data are deep belief networks. Analyses of these models,
however, have been typically limited to qualitative analyses based on samples
due to the computationally intractable nature of the model likelihood.
Motivated by these circumstances, the present article provides a consistent
estimator for the likelihood that is both computationally tractable and simple
to apply in practice. Using this estimator, a deep belief network which has
been suggested for the modeling of natural image patches is quantitatively
investigated and compared to other models of natural image patches. Contrary to
earlier claims based on qualitative results, the results presented in this
article provide evidence that the model under investigation is not a
particularly good model for natural image
GraphVite: A High-Performance CPU-GPU Hybrid System for Node Embedding
Learning continuous representations of nodes is attracting growing interest
in both academia and industry recently, due to their simplicity and
effectiveness in a variety of applications. Most of existing node embedding
algorithms and systems are capable of processing networks with hundreds of
thousands or a few millions of nodes. However, how to scale them to networks
that have tens of millions or even hundreds of millions of nodes remains a
challenging problem. In this paper, we propose GraphVite, a high-performance
CPU-GPU hybrid system for training node embeddings, by co-optimizing the
algorithm and the system. On the CPU end, augmented edge samples are parallelly
generated by random walks in an online fashion on the network, and serve as the
training data. On the GPU end, a novel parallel negative sampling is proposed
to leverage multiple GPUs to train node embeddings simultaneously, without much
data transfer and synchronization. Moreover, an efficient collaboration
strategy is proposed to further reduce the synchronization cost between CPUs
and GPUs. Experiments on multiple real-world networks show that GraphVite is
super efficient. It takes only about one minute for a network with 1 million
nodes and 5 million edges on a single machine with 4 GPUs, and takes around 20
hours for a network with 66 million nodes and 1.8 billion edges. Compared to
the current fastest system, GraphVite is about 50 times faster without any
sacrifice on performance.Comment: accepted at WWW 201
- …