Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences
We propose a neural sequence-to-sequence model for direction following, a
task that is essential to realizing effective autonomous agents. Our
alignment-based encoder-decoder model with long short-term memory recurrent
neural networks (LSTM-RNN) translates natural language instructions to action
sequences based upon a representation of the observable world state. We
introduce a multi-level aligner that empowers our model to focus on sentence
"regions" salient to the current world state by using multiple abstractions of
the input sentence. In contrast to existing methods, our model uses no
specialized linguistic resources (e.g., parsers) or task-specific annotations
(e.g., seed lexicons). It is therefore generalizable, yet still achieves the
best results reported to date on a benchmark single-sentence dataset and
competitive results for the limited-training multi-sentence setting. We analyze
our model through a series of ablations that elucidate the contributions of the
primary components of our model.
Comment: To appear at AAAI 2016 (and an extended version of a NIPS 2015
Multimodal Machine Learning workshop paper)
PHT-bot: Deep-Learning based system for automatic risk stratification of COPD patients based upon signs of Pulmonary Hypertension
Chronic Obstructive Pulmonary Disease (COPD) is a leading cause of morbidity
and mortality worldwide. Identifying those at highest risk of deterioration
would allow more effective distribution of preventative and surveillance
resources. Secondary pulmonary hypertension is a manifestation of advanced
COPD, which can be reliably diagnosed by the main Pulmonary Artery (PA) to
Ascending Aorta (Ao) ratio. In effect, a PA diameter to Ao diameter ratio of
greater than 1 has been demonstrated to be a reliable marker of increased
pulmonary arterial pressure. Although clinically valuable and readily
visualized, manual assessment of the PA and Ao diameters is time-consuming
and under-reported. The present study describes a non-invasive method
to measure the diameters of both the Ao and the PA from contrast-enhanced chest
Computed Tomography (CT). The solution applies deep learning techniques in
order to select the correct axial slice to measure, and to segment both
arteries. The system achieves test Pearson correlation coefficient scores of
93% for the Ao and 92% for the PA. To the best of our knowledge, it is the
first such fully automated solution.
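
To make the two quantities in this abstract concrete, here is a minimal sketch in Python. It is not the authors' pipeline: the function names (`pa_ao_ratio`, `flags_raised_pressure`, `pearson`) and the sample diameters are hypothetical, and the slice selection and segmentation steps that produce the diameters are assumed to have already run. It only shows the PA:Ao ratio test against the threshold of 1 and the Pearson correlation used to compare automated and manual measurements.

```python
from math import sqrt

def pa_ao_ratio(pa_mm, ao_mm):
    """Ratio of pulmonary artery diameter to ascending aorta diameter."""
    if ao_mm <= 0:
        raise ValueError("aorta diameter must be positive")
    return pa_mm / ao_mm

def flags_raised_pressure(pa_mm, ao_mm, threshold=1.0):
    """True when the PA:Ao ratio exceeds the threshold (1 in the abstract)."""
    return pa_ao_ratio(pa_mm, ao_mm) > threshold

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical diameters in millimetres, for illustration only.
print(flags_raised_pressure(pa_mm=32.0, ao_mm=30.0))
print(flags_raised_pressure(pa_mm=27.0, ao_mm=33.0))
```

In an evaluation like the one the abstract reports, `pearson` would be applied once to the automated and manual Ao diameters and once to the PA diameters.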
Video Time: Properties, Encoders and Evaluation
Time-aware encoding of frame sequences in a video is a fundamental problem in
video understanding. While many have attempted to model time in videos, an explicit
study on quantifying video time is missing. To fill this lacuna, we aim to
evaluate video time explicitly. We describe three properties of video time,
namely a) temporal asymmetry, b) temporal continuity and c) temporal causality.
Based on each, we formulate a task that quantifies the associated property.
This allows assessing the effectiveness of modern video encoders, like C3D and
LSTM, in their ability to model time. Our analysis provides insights about
existing encoders while also leading us to propose a new video time encoder,
which is better suited for the video time recognition tasks than C3D and LSTM.
We believe the proposed meta-analysis can provide a reasonable baseline to
assess video time encoders on equal grounds on a set of temporal-aware tasks.
Comment: 14 pages, BMVC 201
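
The abstract does not spell out the tasks it builds from each property. As one plausible illustration of a temporal-asymmetry task, the sketch below assumes a forward-versus-reversed classification setup: each frame sequence yields one example in its original order (label 1) and one in reverse order (label 0), and an encoder that models time well should separate the two. The function name `make_direction_examples` and the labelling scheme are assumptions, not taken from the paper.

```python
def make_direction_examples(videos):
    """Build (frames, label) pairs for a forward-vs-reversed task.

    `videos` is a list of frame sequences. Frames can be any objects,
    for example feature vectors or frame indices.
    Label 1 marks the original order, label 0 the reversed order.
    """
    examples = []
    for frames in videos:
        forward = list(frames)
        backward = forward[::-1]
        examples.append((forward, 1))
        examples.append((backward, 0))
    return examples

# Frame indices stand in for real frames in this toy example.
for frames, label in make_direction_examples([[0, 1, 2, 3]]):
    print(label, frames)
```

A video encoder such as C3D or an LSTM would then be trained or probed on these pairs, with classification accuracy serving as the score for this property.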