Parametric t-Distributed Stochastic Exemplar-centered Embedding
Parametric embedding methods such as parametric t-SNE (pt-SNE) have been
widely adopted for data visualization and out-of-sample data embedding without
further computationally expensive optimization or approximation. However,
pt-SNE is highly sensitive to the batch-size hyperparameter due to
conflicting optimization goals, and it often produces dramatically different
embeddings for different user-defined perplexities. To effectively
solve these issues, we present parametric t-distributed stochastic
exemplar-centered embedding methods. Our strategy learns embedding parameters
by comparing given data only with precomputed exemplars, resulting in a cost
function with linear computational and memory complexity, which is further
reduced by noise contrastive samples. Moreover, we propose a shallow embedding
network with high-order feature interactions for data visualization, which is
much easier to tune yet performs comparably to the deep neural network
employed by pt-SNE. We empirically demonstrate, using several
benchmark datasets, that our proposed methods significantly outperform pt-SNE
in terms of robustness, visual effects, and quantitative evaluations.
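As a rough illustration of why comparing data only with exemplars gives linear cost, the sketch below computes heavy-tailed (Student-t) similarities between n embedded points and m exemplars, so it touches n × m pairs rather than n × n. The function name, the per-point normalization, and the toy coordinates are assumptions made for illustration; the abstract does not spell out the exact cost function.

```python
def exemplar_similarities(points, exemplars):
    """Student-t similarities between each embedded point and each exemplar.

    Hypothetical helper: rows are normalized per point, which is an assumption.
    Work is O(n * m) for n points and m exemplars, so linear in n for fixed m.
    """
    rows = []
    for p in points:
        # Student-t kernel with one degree of freedom: 1 / (1 + squared distance)
        weights = [
            1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(p, e)))
            for e in exemplars
        ]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


# Toy 2-D embedding: three points, two exemplars (made-up coordinates).
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
exemplars = [(0.0, 0.0), (5.0, 5.0)]
sims = exemplar_similarities(points, exemplars)
```

Each row of `sims` has one entry per exemplar, so memory also grows with n × m instead of n × n.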
Exemplar-Centered Supervised Shallow Parametric Data Embedding
Metric learning methods for dimensionality reduction in combination with
k-Nearest Neighbors (kNN) have been extensively deployed in many
classification, data embedding, and information retrieval applications.
However, most of these approaches involve pairwise training data comparisons
and thus have quadratic computational complexity with respect to the size of
the training set, preventing them from scaling to large datasets. Moreover,
during testing, comparing test data against all the training data points is
also expensive in terms of both computational cost and resources required.
Furthermore, previous metrics are either too constrained or too expressive to
be well learned. To effectively solve these issues, we present an
exemplar-centered supervised shallow parametric data embedding model, using a
Maximally Collapsing Metric Learning (MCML) objective. Our strategy learns a
shallow high-order parametric embedding function and compares training/test
data only with learned or precomputed exemplars, resulting in a cost function
with linear computational complexity for both training and testing. We also
empirically demonstrate, using several benchmark datasets, that for
classification in two-dimensional embedding space, our approach not only
speeds up kNN by hundreds of times, but also outperforms state-of-the-art
supervised embedding approaches.
Comment: accepted to IJCAI201
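To make the exemplar-based MCML idea more concrete, here is a minimal sketch of a per-point loss: the KL divergence from a "collapsed" target distribution to a softmax over exemplars. The function name, the equal-mass target over same-label exemplars, and the toy data are assumptions for illustration, not the authors' exact formulation.

```python
import math


def mcml_exemplar_loss(z, label, exemplars, exemplar_labels):
    """KL divergence from a 'collapsed' target to the softmax over exemplars.

    Assumptions (not from the abstract): the target puts equal mass on the
    exemplars that share the point's label, and similarities are a softmax
    over negative squared Euclidean distances in the embedding space.
    """
    sq_dists = [sum((a - b) ** 2 for a, b in zip(z, e)) for e in exemplars]
    # Log-softmax of -sq_dists, shifted by the minimum distance for stability.
    shift = min(sq_dists)
    log_norm = math.log(sum(math.exp(-(d - shift)) for d in sq_dists))
    log_p = [-(d - shift) - log_norm for d in sq_dists]

    same = [i for i, lab in enumerate(exemplar_labels) if lab == label]
    if not same:
        raise ValueError("no exemplar shares this label")
    target = 1.0 / len(same)
    return sum(target * (math.log(target) - log_p[i]) for i in same)


# Made-up 2-D exemplars, one per class.
exemplars = [(0.0, 0.0), (4.0, 0.0)]
exemplar_labels = ["a", "b"]
near_own = mcml_exemplar_loss((0.2, 0.0), "a", exemplars, exemplar_labels)
near_other = mcml_exemplar_loss((3.8, 0.0), "a", exemplars, exemplar_labels)
```

A point embedded near an exemplar of its own class (`near_own`) incurs a much smaller loss than one embedded near a different class's exemplar (`near_other`), and each evaluation costs time proportional to the number of exemplars rather than the number of training points.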
Applied deep learning in intelligent transportation systems and embedding exploration
Deep learning techniques have achieved tremendous success in many real-world applications in recent years and show great potential in many areas, including transportation. Although transportation has become increasingly indispensable in people’s daily lives, related problems such as traffic congestion and energy waste have not been fully solved, and some have become even more critical. This dissertation uses deep learning to address three fundamental problems in the transportation field: (1) passenger demand prediction, (2) transportation mode detection, and (3) traffic light control. It also extends the application of deep learning to an embedding system for visualization and data retrieval.
The first part of this dissertation presents a Spatio-TEmporal Fuzzy neural Network (STEF-Net), which accurately predicts passenger demand by incorporating the complex interactions of all known important factors, such as temporal, spatial, and external information. Specifically, a convolutional long short-term memory network simultaneously captures spatio-temporal feature interactions, while a fuzzy neural network models external factors. A novel feature fusion method with convolution and an attention layer is proposed to preserve temporal relations and discriminative spatio-temporal feature interactions. Experiments on a large-scale real-world dataset show that the proposed model outperforms state-of-the-art approaches.
The second part is a light-weight and energy-efficient system which detects transportation modes using only accelerometer sensors in smartphones. Understanding people’s transportation modes is beneficial to many civilian applications, such as urban transportation planning. The system collects accelerometer data in an efficient way and leverages a convolutional neural network to determine transportation modes. Different architectures and classification methods are tested with the proposed convolutional neural network to optimize the system design. Performance evaluation shows that the proposed approach achieves better accuracy than existing work in detecting people’s transportation modes.
The third component of this dissertation is a deep reinforcement learning model, based on Q-learning, for controlling traffic lights. Inefficient traffic light control causes numerous problems, such as long delays and wasted energy. In the proposed model, the complex traffic scenario is quantified as states by collecting data and dividing the whole intersection into grids. The actions are changes to the traffic light timing, and the problem is modeled as a high-dimensional Markov decision process. The reward is the difference in cumulative waiting time between two cycles. To solve the model, a convolutional neural network is employed to map states to expected rewards (Q-values), and it is further improved by several components, such as a dueling network, a target network, double Q-learning, and prioritized experience replay. Simulation results in Simulation of Urban MObility (SUMO) show the efficiency of the proposed model in controlling traffic lights.
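The reward and the double Q-learning target described above can be sketched as follows. This is an illustration only: the function names, the sign convention of the reward, the discount factor `gamma`, and the example numbers are all assumptions, and plain lists stand in for network outputs.

```python
def cycle_reward(prev_cumulative_wait, curr_cumulative_wait):
    """Reward as the drop in cumulative waiting time between two cycles.

    The sign convention (previous minus current, so shorter waits are
    rewarded) is an assumption; the text only says 'difference'.
    """
    return prev_cumulative_wait - curr_cumulative_wait


def double_q_target(reward, next_q_online, next_q_target, gamma=0.99):
    """Double Q-learning target for one transition.

    The online network picks the next action; the target network scores it.
    gamma is a hypothetical discount factor.
    """
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Made-up numbers: waiting time fell from 120 s to 90 s over one cycle.
r = cycle_reward(prev_cumulative_wait=120.0, curr_cumulative_wait=90.0)
y = double_q_target(
    r,
    next_q_online=[1.0, 3.0, 2.0],     # online network prefers action 1
    next_q_target=[10.0, 20.0, 30.0],  # target network's value for action 1 is used
    gamma=0.5,
)
```

Selecting the action with one network and evaluating it with the other is what distinguishes double Q-learning from taking the maximum of the target network's own estimates.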
The last part of this dissertation studies the hierarchical structure in an embedding system. Traditional embedding approaches associate a real-valued embedding vector with each symbol or data point, which yields storage-inefficient representations and fails to effectively encode the internal semantic structure of the data. A regularized autoencoder framework is proposed to learn compact Hierarchical K-way D-dimensional (HKD) discrete embeddings of data points, aiming to capture the semantic structure of the data. Experimental results on synthetic and real-world datasets show that the proposed HKD embedding can effectively reveal the semantic structure of data via visualization and greatly reduce the search space of nearest neighbor retrieval while preserving high accuracy.
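A small sketch may help show how a K-way D-dimensional discrete code can save storage and narrow a nearest-neighbor search. The helper names, the prefix-bucketing scheme, and the example codes are hypothetical; in particular, treating a shared code prefix as a shared coarse category is an assumption about how the hierarchy might be used.

```python
from collections import defaultdict


def code_bits(k, d):
    """Bits needed to store one D-dimensional K-way discrete code."""
    # (k - 1).bit_length() is the number of bits for symbols 0 .. k-1.
    return d * (k - 1).bit_length()


def bucket_by_prefix(codes, depth):
    """Group item ids by the first `depth` symbols of their code.

    Treating a shared prefix as a shared coarse category is an assumption
    about how the hierarchy is used; it is not taken from the text.
    """
    buckets = defaultdict(list)
    for item_id, code in codes.items():
        buckets[tuple(code[:depth])].append(item_id)
    return dict(buckets)


# Made-up codes with K = 4 symbols per position and D = 3 positions.
codes = {"cat": (0, 1, 2), "dog": (0, 1, 3), "car": (2, 0, 1)}
bits = code_bits(k=4, d=3)
buckets = bucket_by_prefix(codes, depth=2)
candidates = buckets[codes["cat"][:2]]
```

Here each code takes 6 bits, compared with 96 bits for three 32-bit floats, and a query for "cat" only needs to examine the items in its own prefix bucket instead of the whole collection.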
A Deterministic and Generalized Framework for Unsupervised Learning with Restricted Boltzmann Machines
Restricted Boltzmann machines (RBMs) are energy-based neural networks that
are commonly used as building blocks for deep neural architectures. In this
work, we derive a deterministic framework for the training, evaluation, and
use of RBMs based upon the Thouless-Anderson-Palmer (TAP) mean-field
approximation from spin-glass theory, which applies to widely connected
systems with weak interactions. While the TAP approach has been
extensively studied for fully-visible binary spin systems, our construction is
generalized to latent-variable models, as well as to arbitrarily distributed
real-valued spin systems with bounded support. In our numerical experiments, we
demonstrate the effective deterministic training of our proposed models and are
able to show interesting features of unsupervised learning which could not be
directly observed with sampling. Additionally, we demonstrate how to utilize
our TAP-based framework for leveraging trained RBMs as joint priors in
denoising problems.