80,655 research outputs found

    Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences

    Full text link
    We propose a neural sequence-to-sequence model for direction following, a task that is essential to realizing effective autonomous agents. Our alignment-based encoder-decoder model with long short-term memory recurrent neural networks (LSTM-RNN) translates natural language instructions to action sequences based upon a representation of the observable world state. We introduce a multi-level aligner that empowers our model to focus on sentence "regions" salient to the current world state by using multiple abstractions of the input sentence. In contrast to existing methods, our model uses no specialized linguistic resources (e.g., parsers) or task-specific annotations (e.g., seed lexicons). It is therefore generalizable, yet still achieves the best results reported to-date on a benchmark single-sentence dataset and competitive results for the limited-training multi-sentence setting. We analyze our model through a series of ablations that elucidate the contributions of the primary components of our model.Comment: To appear at AAAI 2016 (and an extended version of a NIPS 2015 Multimodal Machine Learning workshop paper

    PHT-bot: Deep-Learning based system for automatic risk stratification of COPD patients based upon signs of Pulmonary Hypertension

    Full text link
    Chronic Obstructive Pulmonary Disease (COPD) is a leading cause of morbidity and mortality worldwide. Identifying those at highest risk of deterioration would allow more effective distribution of preventative and surveillance resources. Secondary pulmonary hypertension is a manifestation of advanced COPD, which can be reliably diagnosed by the main Pulmonary Artery (PA) to Ascending Aorta (Ao) ratio. In effect, a PA diameter to Ao diameter ratio of greater than 1 has been demonstrated to be a reliable marker of increased pulmonary arterial pressure. Although clinically valuable and readily visualized, the manual assessment of the PA and the Ao diameters is time consuming and under-reported. The present study describes a non invasive method to measure the diameters of both the Ao and the PA from contrast-enhanced chest Computed Tomography (CT). The solution applies deep learning techniques in order to select the correct axial slice to measure, and to segment both arteries. The system achieves test Pearson correlation coefficient scores of 93% for the Ao and 92% for the PA. To the best of our knowledge, it is the first such fully automated solution

    Video Time: Properties, Encoders and Evaluation

    Get PDF
    Time-aware encoding of frame sequences in a video is a fundamental problem in video understanding. While many attempted to model time in videos, an explicit study on quantifying video time is missing. To fill this lacuna, we aim to evaluate video time explicitly. We describe three properties of video time, namely a) temporal asymmetry, b)temporal continuity and c) temporal causality. Based on each we formulate a task able to quantify the associated property. This allows assessing the effectiveness of modern video encoders, like C3D and LSTM, in their ability to model time. Our analysis provides insights about existing encoders while also leading us to propose a new video time encoder, which is better suited for the video time recognition tasks than C3D and LSTM. We believe the proposed meta-analysis can provide a reasonable baseline to assess video time encoders on equal grounds on a set of temporal-aware tasks.Comment: 14 pages, BMVC 201
    • …
    corecore