44 research outputs found
Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences
We propose a neural sequence-to-sequence model for direction following, a
task that is essential to realizing effective autonomous agents. Our
alignment-based encoder-decoder model with long short-term memory recurrent
neural networks (LSTM-RNN) translates natural language instructions to action
sequences based upon a representation of the observable world state. We
introduce a multi-level aligner that empowers our model to focus on sentence
"regions" salient to the current world state by using multiple abstractions of
the input sentence. In contrast to existing methods, our model uses no
specialized linguistic resources (e.g., parsers) or task-specific annotations
(e.g., seed lexicons). It is therefore generalizable, yet still achieves the
best results reported to-date on a benchmark single-sentence dataset and
competitive results for the limited-training multi-sentence setting. We analyze
our model through a series of ablations that elucidate the contributions of the
primary components of our model.Comment: To appear at AAAI 2016 (and an extended version of a NIPS 2015
Multimodal Machine Learning workshop paper
Neural Probabilistic Methods for Event Sequence Modeling
This thesis focuses on modeling event sequences, namely, sequences of discrete events in continuous time. We build a family of generative probabilistic models that is able to reason about what events will happen in the future and when, given the history of previous events. Under our models, each event—as it happens—is allowed to update the future intensities of multiple event types, and the intensity of each event type—as nothing happens—is allowed to evolve with time along a trajectory.
We use neural networks to allow the “updates” and “trajectories” to be complex and realistic. In the purely neural version of our model, all future event intensities are conditioned on the hidden state of a continuous-time LSTM, which has consumed every past event as it happened. To exploit domain-specific knowledge of how an event might only affect a few—but not all—future event intensities, we propose to introduce domain-specific structure into the model. We design a modeling language, by which a domain expert can write down the rules of a temporal deductive database. The database tracks facts over time; the rules deduce facts from other facts and from past events. Each fact has a time-varying state, computed by a neural network whose topology is determined by the fact’s provenance, including its experience of the past events that have contributed to deducing it. The possible event types at any time are given by special facts, whose intensities are neurally modeled alongside their states.
We develop efficient methods for training our models, and doing inference with them. Applying the general principle of noise-contrastive estimation, we work out a stochastic training objective that is less expensive to optimize than the log-likelihood, which people typically maximize for parameter estimation. As in the discrete-time case that inspired us, the parameters that maximize our objective will provably maximize the log-likelihood as well. For the scenarios where we are given incomplete sequences, we propose particle smoothing—a form of sequential importance sampling—to impute the missing events.
This thesis includes extensive experiments, demonstrating the effectiveness of our models and algorithms. On many synthetic and real-world datasets, on held-out sequences, we show empirically: (1) our purely neural model achieves competitive likelihood and predictive accuracy; (2) our neural-symbolic model improves prediction by encoding appropriate domain knowledge in the architecture; (3) for models to achieve the same level of log-likelihood, our noise-contrastive estimation needs considerably fewer function evaluations and less wall-clock time than maximum likelihood estimation; (4) our particle smoothing method is effective at inferring the ground-truth unobserved events.
In this thesis, I will also discuss a few future research directions, including embedding our models within a reinforcement learner to discover causal structure and learn an intervention policy
Explicit Planning Helps Language Models in Logical Reasoning
Language models have been shown to perform remarkably well on a wide range of
natural language processing tasks. In this paper, we propose a novel system
that uses language models to perform multi-step logical reasoning. Our system
incorporates explicit planning into its inference procedure, thus able to make
more informed reasoning decisions at each step by looking ahead into their
future effects. In our experiments, our full system significantly outperforms
other competing systems. On a multiple-choice question answering task, our
system performs competitively compared to GPT-3-davinci despite having only
around 1.5B parameters. We conduct several ablation studies to demonstrate that
explicit planning plays a crucial role in the system's performance
HYPRO: A Hybridly Normalized Probabilistic Model for Long-Horizon Prediction of Event Sequences
In this paper, we tackle the important yet under-investigated problem of
making long-horizon prediction of event sequences. Existing state-of-the-art
models do not perform well at this task due to their autoregressive structure.
We propose HYPRO, a hybridly normalized probabilistic model that naturally fits
this task: its first part is an autoregressive base model that learns to
propose predictions; its second part is an energy function that learns to
reweight the proposals such that more realistic predictions end up with higher
probabilities. We also propose efficient training and inference algorithms for
this model. Experiments on multiple real-world datasets demonstrate that our
proposed HYPRO model can significantly outperform previous models at making
long-horizon predictions of future events. We also conduct a range of ablation
studies to investigate the effectiveness of each component of our proposed
methods.Comment: NeurIPS 2022 camera-read