Deep Short Text Classification with Knowledge Powered Attention
Short text classification is an important task in Natural Language
Processing (NLP). Unlike paragraphs or documents, short texts are more
ambiguous because they lack sufficient contextual information, which poses a
great challenge for classification. In this paper, we retrieve knowledge from
an external knowledge source to enhance the semantic representation of short
texts. We take conceptual information as a kind of knowledge and incorporate it
into deep neural networks. For the purpose of measuring the importance of
knowledge, we introduce attention mechanisms and propose deep Short Text
Classification with Knowledge powered Attention (STCKA). We utilize Concept
towards Short Text (C-ST) attention and Concept towards Concept Set (C-CS)
attention to acquire the weight of concepts from two aspects, and then classify a
short text with the help of conceptual information. Unlike traditional
approaches, our model acts like a human being who has intrinsic ability to make
decisions based on observation (i.e., training data for machines) and pays more
attention to important knowledge. We also conduct extensive experiments on four
public datasets for different tasks. The experimental results and case studies
show that our model outperforms the state-of-the-art methods, justifying the
effectiveness of knowledge powered attention.
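As an illustration of weighting concepts from two aspects, the following minimal sketch scores each concept against the short text and against the concept set, then merges the two scores. It is not the STCKA implementation: the dot-product scoring, the mean concept vector, the mixing parameter `gamma`, and all function names are assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def concept_weights(text_vec, concept_vecs, gamma=0.5):
    """Combine two attention views into one weight per concept.

    Assumption: plain dot products stand in for learned scoring networks.
    """
    # Concept-towards-short-text view: how well each concept matches the text.
    cst = softmax([dot(c, text_vec) for c in concept_vecs])
    # Concept-towards-concept-set view: how well each concept matches the
    # mean of all concepts (an assumed summary of the concept set).
    dim = len(concept_vecs[0])
    n = len(concept_vecs)
    mean_vec = [sum(c[i] for c in concept_vecs) / n for i in range(dim)]
    ccs = softmax([dot(c, mean_vec) for c in concept_vecs])
    # Hypothetical mixing step: gamma trades off the two views.
    return softmax([gamma * a + (1 - gamma) * b for a, b in zip(cst, ccs)])


def pooled_concept(text_vec, concept_vecs, gamma=0.5):
    """Weighted sum of concept vectors using the combined weights."""
    weights = concept_weights(text_vec, concept_vecs, gamma)
    dim = len(concept_vecs[0])
    return [sum(w * c[i] for w, c in zip(weights, concept_vecs))
            for i in range(dim)]
```

Here `text_vec` stands in for an encoded short text and `concept_vecs` for embeddings of retrieved concepts; in a trained model, learned scoring layers would normally replace the dot products.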
Continual Driving Policy Optimization with Closed-Loop Individualized Curricula
The safety of autonomous vehicles (AV) has been a long-standing top concern,
stemming from the absence of rare and safety-critical scenarios in the
long-tail naturalistic driving distribution. To tackle this challenge, a surge
of research in scenario-based autonomous driving has emerged, with a focus on
generating high-risk driving scenarios and applying them to conduct
safety-critical testing of AV models. However, limited work has explored
the reuse of these extensive scenarios to iteratively improve AV models.
Moreover, it remains intractable and challenging to filter through gigantic
scenario libraries collected from other AV models with distinct behaviors
in order to extract transferable information for current AV improvement.
Therefore, we develop a continual driving policy optimization framework
featuring Closed-Loop Individualized Curricula (CLIC), which we factorize into
a set of standardized sub-modules for flexible implementation choices: AV
Evaluation, Scenario Selection, and AV Training. CLIC frames AV Evaluation as a
collision prediction task, where it estimates the chance of AV failures in
these scenarios at each iteration. Subsequently, by re-sampling from historical
scenarios based on these failure probabilities, CLIC tailors individualized
curricula for downstream training, aligning them with the evaluated capability
of AV. Accordingly, CLIC not only maximizes the utilization of the vast
pre-collected scenario library for closed-loop driving policy optimization but
also facilitates AV improvement by individualizing its training with more
challenging cases out of those poorly organized scenarios. Experimental results
clearly indicate that CLIC surpasses other curriculum-based training
strategies, showing substantial improvement in managing risky scenarios, while
still maintaining proficiency in handling simpler cases.
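The re-sampling step described in this abstract can be sketched as weighted sampling over a scenario library. This is only an illustration under stated assumptions: sampling with replacement in proportion to the predicted failure probability is one plausible reading, and the function name `sample_curriculum` and its parameters are hypothetical, not taken from CLIC.

```python
import random


def sample_curriculum(scenarios, failure_probs, k, seed=None):
    """Draw k training scenarios, favoring those the AV is predicted to fail.

    Assumption: selection probability is proportional to the predicted
    failure probability, and sampling is done with replacement.
    """
    if len(scenarios) != len(failure_probs):
        raise ValueError("scenarios and failure_probs must have equal length")
    if any(p < 0 for p in failure_probs):
        raise ValueError("failure probabilities must be non-negative")
    if sum(failure_probs) <= 0:
        raise ValueError("at least one failure probability must be positive")
    rng = random.Random(seed)  # local generator keeps the draw reproducible
    return rng.choices(scenarios, weights=failure_probs, k=k)
```

In a closed loop, the failure probabilities would be re-estimated after each round of training, so the sampled curriculum shifts as the policy improves.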
EGC: Image Generation and Classification via a Diffusion Energy-Based Model
Learning image classification and image generation using the same set of
network parameters is a challenging problem. Recent advanced approaches that perform
well in one task often exhibit poor performance in the other. This work
introduces an energy-based classifier and generator, namely EGC, which can
achieve superior performance in both tasks using a single neural network.
Unlike a conventional classifier that outputs a label given an image (i.e., a
conditional distribution $p(y|\mathbf{x})$), the forward pass in EGC is a
classifier that outputs a joint distribution $p(\mathbf{x},y)$, enabling an
image generator in its backward pass by marginalizing out the label $y$. This
is done by estimating the energy and classification probability given a noisy
image in the forward pass, while denoising it using the score function
estimated in the backward pass. EGC achieves competitive generation results
compared with state-of-the-art approaches on ImageNet-1k, CelebA-HQ and LSUN
Church, while achieving superior classification accuracy and robustness against
adversarial attacks on CIFAR-10. This work represents the first successful
attempt to simultaneously excel in both tasks using a single set of network
parameters. We believe that EGC bridges the gap between discriminative and
generative learning.
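One common way to read a classifier as a joint model, assumed here for illustration and not confirmed by the abstract as EGC's exact formulation, is to treat each logit as an unnormalized log joint score for an (image, label) pair. Summing over labels then gives an unnormalized marginal over images, whose negative log acts as an energy, while normalizing over labels recovers the usual class probabilities. The function names below are hypothetical.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def energy_and_class_probs(logits):
    """Return (energy, class probabilities) for one input's logits.

    Assumption: logits[y] is an unnormalized log joint score for (x, y).
    Marginalizing out y gives energy(x) = -logsumexp(logits);
    conditioning on x gives the softmax class probabilities.
    """
    log_marginal = logsumexp(logits)
    energy = -log_marginal
    probs = [math.exp(l - log_marginal) for l in logits]
    return energy, probs
```

A score function for denoising would be the gradient of the negative energy with respect to the input, which requires an automatic differentiation framework and is left out of this sketch.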
Deep Learning of Potential Outcomes
This review systematizes the emerging literature for causal inference using
deep neural networks under the potential outcomes framework. It provides an
intuitive introduction on how deep learning can be used to estimate/predict
heterogeneous treatment effects and extend causal inference to settings where
confounding is non-linear, time varying, or encoded in text, networks, and
images. To maximize accessibility, we also introduce prerequisite concepts from
causal inference and deep learning. The survey differs from other treatments of
deep learning and causal inference in its sharp focus on observational causal
estimation, its extended exposition of key algorithms, and its detailed
tutorials for implementing, training, and selecting among deep estimators in
TensorFlow 2, available at github.com/kochbj/Deep-Learning-for-Causal-Inference
Graph ODE with Factorized Prototypes for Modeling Complicated Interacting Dynamics
This paper studies the problem of modeling interacting dynamical systems,
which is critical for understanding physical dynamics and biological processes.
Recent research predominantly uses geometric graphs to represent these
interactions, which are then captured by powerful graph neural networks (GNNs).
However, predicting interacting dynamics in challenging scenarios such as
out-of-distribution shift and complicated underlying rules remains unsolved. In
this paper, we propose a new approach named Graph ODE with factorized
prototypes (GOAT) to address the problem. The core of GOAT is to incorporate
factorized prototypes from contextual knowledge into a continuous graph ODE
framework. Specifically, GOAT employs representation disentanglement and system
parameters to extract both object-level and system-level contexts from
historical trajectories, which allows us to explicitly model their independent
influence and thus enhances the generalization capability under system changes.
Then, we integrate these disentangled latent representations into a graph ODE
model, which determines a combination of various interacting prototypes for
enhanced model expressivity. The entire model is optimized using an end-to-end
variational inference framework to maximize the likelihood. Extensive
experiments in both in-distribution and out-of-distribution settings validate
the superiority of GOAT.