Exemplar-Centered Supervised Shallow Parametric Data Embedding
Metric learning methods for dimensionality reduction in combination with
k-Nearest Neighbors (kNN) have been extensively deployed in many
classification, data embedding, and information retrieval applications.
However, most of these approaches involve pairwise training data comparisons,
and thus have quadratic computational complexity with respect to the size of
the training set, preventing them from scaling to large datasets. Moreover,
during testing, comparing test data against all the training data points is
also expensive in terms of both computational cost and resources required.
Furthermore, previous metrics are either too constrained or too expressive to
be well learned. To effectively solve these issues, we present an
exemplar-centered supervised shallow parametric data embedding model, using a
Maximally Collapsing Metric Learning (MCML) objective. Our strategy learns a
shallow high-order parametric embedding function and compares training/test
data only with learned or precomputed exemplars, resulting in a cost function
with linear computational complexity for both training and testing. We also
empirically demonstrate, using several benchmark datasets, that for
classification in two-dimensional embedding space, our approach not only
speeds up kNN by hundreds of times, but also outperforms state-of-the-art
supervised embedding approaches.
Comment: accepted to IJCAI201
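The abstract's central cost argument, comparing each point only with a small set of exemplars rather than with every training point, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: `embed` is an identity stand-in for the learned embedding function, the exemplars are simply per-class means (the abstract says only "learned or precomputed"), and the softmax over negative squared distances is a generic MCML-style neighbor probability.

```python
import math

def embed(x):
    # Hypothetical stand-in for the learned embedding function (identity here).
    return x

def sq_dist(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def class_means(points, labels):
    # Precompute one exemplar per class: the mean of its embedded points.
    # One pass over the data, so the cost is linear in the training-set size.
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        z = embed(p)
        if y not in sums:
            sums[y] = [0.0] * len(z)
            counts[y] = 0
        sums[y] = [s + zi for s, zi in zip(sums[y], z)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def exemplar_probs(x, exemplars):
    # Softmax over negative squared distances to the exemplars only.
    z = embed(x)
    d = {y: sq_dist(z, e) for y, e in exemplars.items()}
    m = min(d.values())  # shift for numerical stability
    w = {y: math.exp(-(dy - m)) for y, dy in d.items()}
    total = sum(w.values())
    return {y: wy / total for y, wy in w.items()}

def predict(x, exemplars):
    # Assign the class of the most probable exemplar.
    probs = exemplar_probs(x, exemplars)
    return max(probs, key=probs.get)

train_x = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.2]]
train_y = ["a", "a", "b", "b"]
exemplars = class_means(train_x, train_y)
print(predict([0.1, 0.0], exemplars))  # prints "a"
```

With n training points and m exemplars, each query costs m distance computations instead of n, which is where the linear (rather than quadratic) training cost and the test-time speedup over kNN come from when m is much smaller than n.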
A Context-aware Attention Network for Interactive Question Answering
Neural network based sequence-to-sequence models in an encoder-decoder
framework have been successfully applied to solve Question Answering (QA)
problems, predicting answers from statements and questions. However, almost all
previous models have failed to consider detailed context information and
unknown states under which systems do not have enough information to answer
given questions. These scenarios with incomplete or ambiguous information are
very common in the setting of Interactive Question Answering (IQA). To address
this challenge, we develop a novel model, employing context-dependent
word-level attention for more accurate statement representations and
question-guided sentence-level attention for better context modeling. We also
generate unique IQA datasets to test our model, which will be made publicly
available. Employing these attention mechanisms, our model accurately
determines when it can output an answer and when it must generate a
supplementary question for additional input, depending on the context.
When available, user feedback is encoded and directly applied to update
sentence-level attention to infer an answer. Extensive experiments on QA and
IQA datasets quantitatively demonstrate the effectiveness of our model with
significant improvement over state-of-the-art conventional QA models.
Comment: 9 page
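The two attention levels described in this abstract can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the paper's architecture: the dot-product scoring, the toy 2-d word vectors, and the `context` and `question` vectors are all hypothetical, and the recurrent encoders a real model would use are omitted.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(vectors, query):
    # Score each vector against the query, then return the attention
    # weights and the weighted sum of the vectors.
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return weights, pooled

# Toy word vectors for two statements (hypothetical values).
statements = [
    [[1.0, 0.0], [0.5, 0.5]],
    [[0.0, 1.0], [0.0, 2.0]],
]
context = [1.0, 1.0]   # hypothetical context vector for word-level attention
question = [0.0, 1.0]  # hypothetical question vector for sentence-level attention

# Word-level attention: one representation per statement.
sentence_vecs = [attend(words, context)[1] for words in statements]

# Sentence-level attention: the question decides which statements matter.
sent_weights, context_vec = attend(sentence_vecs, question)
print(sent_weights)
```

In this toy setup the second statement aligns with the question vector, so it receives the larger sentence-level weight. In the setting the abstract describes, encoded user feedback would update these sentence-level weights before an answer is inferred.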
A Deep Spatio-Temporal Fuzzy Neural Network for Passenger Demand Prediction
In spite of its importance, passenger demand prediction is a highly
challenging problem, because the demand is simultaneously influenced by the
complex interactions among many spatial and temporal factors and other external
factors such as weather. To address this problem, we propose a Spatio-TEmporal
Fuzzy neural Network (STEF-Net) to accurately predict passenger demands
incorporating the complex interactions of all known important factors. We
design an end-to-end learning framework with different neural networks modeling
different factors. Specifically, we propose to capture spatio-temporal feature
interactions via a convolutional long short-term memory network and model
external factors via a fuzzy neural network that handles data uncertainty
significantly better than deterministic methods. To preserve the temporal relations
when fusing the two networks and to emphasize discriminative spatio-temporal feature
interactions, we employ a novel feature fusion method with a convolution
operation and an attention layer. As far as we know, our work is the first to
fuse a deep recurrent neural network and a fuzzy neural network to model
complex spatial-temporal feature interactions with additional uncertain input
features for predictive learning. Experiments on a large-scale real-world
dataset show that our model achieves more than 10% improvement over the
state-of-the-art approaches.
Comment: https://epubs.siam.org/doi/abs/10.1137/1.9781611975673.1