6,715 research outputs found
An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling
For most deep learning practitioners, sequence modeling is synonymous with
recurrent networks. Yet recent results indicate that convolutional
architectures can outperform recurrent networks on tasks such as audio
synthesis and machine translation. Given a new sequence modeling task or
dataset, which architecture should one use? We conduct a systematic evaluation
of generic convolutional and recurrent architectures for sequence modeling. The
models are evaluated across a broad range of standard tasks that are commonly
used to benchmark recurrent networks. Our results indicate that a simple
convolutional architecture outperforms canonical recurrent networks such as
LSTMs across a diverse range of tasks and datasets, while demonstrating longer
effective memory. We conclude that the common association between sequence
modeling and recurrent networks should be reconsidered, and convolutional
networks should be regarded as a natural starting point for sequence modeling
tasks. To assist related work, we have made code available at
http://github.com/locuslab/TCN
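The core building block of such a convolutional sequence model can be sketched in a few lines. The following is a minimal, framework-free illustration of a causal dilated convolution; the kernel weights and inputs are arbitrary examples, not trained values:

```python
# Minimal sketch of the causal, dilated 1-D convolution at the heart of a
# temporal convolutional network (TCN). Pure Python, no framework.

def causal_dilated_conv(x, kernel, dilation=1):
    """Convolve sequence x with `kernel`, looking only at past inputs.

    Output at step t depends on x[t], x[t-d], x[t-2d], ... so no future
    information leaks in -- the property that makes the convolution causal.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(kernel):
            j = t - i * dilation          # tap i*dilation steps into the past
            if j >= 0:                    # implicit zero-padding before start
                acc += w * x[j]
        out.append(acc)
    return out

x = [1.0, 2.0, 3.0, 4.0, 5.0]
print(causal_dilated_conv(x, kernel=[0.5, 0.5], dilation=1))
# -> [0.5, 1.5, 2.5, 3.5, 4.5]: each output mixes the current and previous step
```

Stacking such layers with exponentially increasing dilation is what gives a TCN its long effective memory: with dilation doubling per layer, the receptive field grows exponentially with depth.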
Visual Attention Model for Cross-sectional Stock Return Prediction and End-to-End Multimodal Market Representation Learning
Technical and fundamental analysis are traditional tools used to analyze
individual stocks; however, the finance literature has shown that the price
movement of each individual stock correlates heavily with other stocks,
especially those within the same sector. In this paper we propose a general
purpose market representation that incorporates fundamental and technical
indicators and relationships between individual stocks. We treat the daily
stock market as a "market image" where rows (grouped by market sector)
represent individual stocks and columns represent indicators. We apply a
convolutional neural network over this market image to build market features in
a hierarchical way. We use a recurrent neural network, with an attention
mechanism over the market feature maps, to model temporal dynamics in the
market. We show that our proposed model outperforms strong baselines in both
short-term and long-term stock return prediction tasks. We also show another
use for our market image: to construct concise and dense market embeddings
suitable for downstream prediction tasks. Comment: Accepted as a full paper at the 32nd International FLAIRS Conference.
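The "market image" layout can be illustrated with a toy example. The tickers, sectors, and indicator values below are fabricated for illustration; the point is only the row-grouping by sector that lets a convolution pick up sector-local structure:

```python
# Hypothetical sketch of the "market image": rows are stocks grouped by
# sector, columns are per-stock indicators. All names and numbers are
# made up for illustration.

indicators = ["return_1d", "volume_z", "pe_ratio"]   # columns
stocks = [  # (ticker, sector, indicator values) -- fabricated data
    ("AAA", "tech",   [0.012, 1.3, 24.0]),
    ("BBB", "tech",   [0.007, 0.2, 31.5]),
    ("CCC", "energy", [-0.004, -0.8, 11.2]),
]

# Group rows by sector so a convolution sees sector-local neighborhoods.
stocks.sort(key=lambda s: s[1])
market_image = [values for _, _, values in stocks]

for row in market_image:
    print(row)
```

A 2-D convolution applied over this matrix then aggregates neighboring stocks within a sector and neighboring indicators, building market features hierarchically.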
Learning distant cause and effect using only local and immediate credit assignment
We present a recurrent neural network memory that uses sparse coding to
create a combinatoric encoding of sequential inputs. Using several examples, we
show that the network can associate distant causes and effects in a discrete
stochastic process, predict partially-observable higher-order sequences, and
enable a DQN agent to navigate a maze by giving it memory. The network uses
only biologically-plausible, local and immediate credit assignment. Memory
requirements are typically one order of magnitude less than existing LSTM, GRU
and autoregressive feed-forward sequence learning models. The most significant
limitation of the memory is generalization to unseen input sequences. We
explore this limitation by measuring next-word prediction perplexity on the
Penn Treebank dataset. Comment: 11 pages, 5 figures, 2 tables.
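The sparse-coding step underlying such a combinatoric encoding can be sketched as a k-winners-take-all operation; the activation vector and k below are arbitrary illustrative choices, not the paper's configuration:

```python
# Tiny sketch of sparse coding via k-winners-take-all: keep only the k
# largest activations and zero the rest, yielding a sparse code whose
# active-unit combinations can represent many distinct inputs.

def k_sparse(activations, k):
    """Zero all but the k largest entries (k-winners-take-all)."""
    if k >= len(activations):
        return list(activations)
    threshold = sorted(activations, reverse=True)[k - 1]
    kept = 0
    out = []
    for a in activations:
        if a >= threshold and kept < k:
            out.append(a)
            kept += 1
        else:
            out.append(0.0)
    return out

print(k_sparse([0.1, 0.9, 0.3, 0.7, 0.2], k=2))
# -> [0.0, 0.9, 0.0, 0.7, 0.0]
```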
Time Perception Machine: Temporal Point Processes for the When, Where and What of Activity Prediction
Numerous powerful point process models have been developed to understand
temporal patterns in sequential data from fields such as health-care,
electronic commerce, social networks, and natural disaster forecasting. In this
paper, we develop novel models for learning the temporal distribution of human
activities in streaming data (e.g., videos and person trajectories). We propose
an integrated framework of neural networks and temporal point processes for
predicting when the next activity will happen. Because point processes are
limited to taking event frames as input, we propose a simple yet effective
mechanism to extract features at frames of interest while also preserving the
rich information in the remaining frames. We evaluate our model on two
challenging datasets. The results show that our model outperforms traditional
statistical point process approaches significantly, demonstrating its
effectiveness in capturing the underlying temporal dynamics as well as the
correlation within sequential activities. Furthermore, we also extend our model
to a joint estimation framework for predicting the timing, spatial location,
and category of the activity simultaneously, to answer the when, where, and
what of activity prediction.
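For intuition on the "when" question, the simplest point-process case is a constant conditional intensity, under which the gap to the next event is exponentially distributed. A sketch follows; the rate value is an arbitrary example, and the paper's models learn a time-varying intensity with neural networks rather than assuming it constant:

```python
# For a conditional intensity lambda(t), the next-gap density is
# f(t) = lambda(t) * exp(-integral_0^t lambda(s) ds). With constant
# intensity this is an exponential distribution with mean 1 / lambda.

import math

def expected_gap_constant(rate):
    """Mean waiting time to the next event under constant intensity."""
    return 1.0 / rate

def gap_survival(rate, t):
    """P(no event within time t) = exp(-rate * t) for constant intensity."""
    return math.exp(-rate * t)

print(expected_gap_constant(2.0))          # 0.5 time units on average
print(round(gap_survival(2.0, 1.0), 4))    # probability the gap exceeds 1.0
```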
Audio-Linguistic Embeddings for Spoken Sentences
We propose spoken sentence embeddings which capture both acoustic and
linguistic content. While existing works operate at the character, phoneme, or
word level, our method learns long-term dependencies by modeling speech at the
sentence level. Formulated as an audio-linguistic multitask learning problem,
our encoder-decoder model simultaneously reconstructs acoustic and natural
language features from audio. Our results show that spoken sentence embeddings
outperform phoneme and word-level baselines on speech recognition and emotion
recognition tasks. Ablation studies show that our embeddings can better model
high-level acoustic concepts while retaining linguistic content. Overall, our
work illustrates the viability of generic, multi-modal sentence embeddings for
spoken language understanding. Comment: International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 201
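The multitask objective the abstract implies can be sketched as one shared sentence embedding decoded into both acoustic and linguistic targets, with the two reconstruction errors combined. The equal weighting and the MSE loss below are assumptions for illustration, not the paper's configuration:

```python
# Sketch of an audio-linguistic multitask loss: a weighted sum of the
# acoustic and linguistic reconstruction errors from a shared embedding.

def mse(pred, target):
    """Mean squared reconstruction error."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def multitask_loss(ac_pred, ac_true, lg_pred, lg_true, weight=0.5):
    """Weighted sum of acoustic and linguistic reconstruction losses."""
    return weight * mse(ac_pred, ac_true) + (1.0 - weight) * mse(lg_pred, lg_true)

loss = multitask_loss([0.2, 0.4], [0.0, 0.4], [1.0], [0.0])
print(loss)
```

Training against both targets at once is what pushes the embedding to retain linguistic content alongside high-level acoustic structure.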
Multi-Cast Attention Networks for Retrieval-based Question Answering and Response Prediction
Attention is typically used to select informative sub-phrases that are used
for prediction. This paper investigates the novel use of attention as a form of
feature augmentation, i.e., casted attention. We propose Multi-Cast Attention
Networks (MCAN), a new attention mechanism and general model architecture for a
potpourri of ranking tasks in the conversational modeling and question
answering domains. Our approach performs a series of soft attention operations,
each time casting a scalar feature upon the inner word embeddings. The key idea
is to provide a real-valued hint (feature) to a subsequent encoder layer and is
targeted at improving the representation learning process. There are several
advantages to this design, e.g., it allows an arbitrary number of attention
mechanisms to be casted, allowing for multiple attention types (e.g.,
co-attention, intra-attention) and attention variants (e.g., alignment-pooling,
max-pooling, mean-pooling) to be executed simultaneously. This not only
eliminates the costly need to tune the nature of the co-attention layer, but
also provides greater extents of explainability to practitioners. Via extensive
experiments on four well-known benchmark datasets, we show that MCAN achieves
state-of-the-art performance. On the Ubuntu Dialogue Corpus, MCAN outperforms
existing state-of-the-art models. MCAN also achieves the best
score to date on the well-studied TrecQA dataset. Comment: Accepted to KDD 2018
(paper titled only "Multi-Cast Attention Networks" in the KDD version).
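The "casting" idea, attaching an attention-derived scalar as an extra feature dimension on each word embedding, can be sketched as follows. The embeddings and scores are toy values, and the simple weighted-score pooling here stands in for the paper's learned compression functions:

```python
# Sketch of casted attention: run a soft attention pooling over word
# scores, then append the pooled scalar to every word embedding as a
# real-valued "hint" for the next encoder layer.

import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cast_attention(embeddings, scores):
    """Append one attention-derived scalar feature to every embedding."""
    weights = softmax(scores)
    pooled = sum(w * s for w, s in zip(weights, scores))  # scalar summary
    return [emb + [pooled] for emb in embeddings]

embs = [[0.1, 0.2], [0.3, 0.4]]
augmented = cast_attention(embs, scores=[1.0, 2.0])
print(augmented)  # each embedding gains the same scalar hint
```

Because each cast adds only a scalar per word, several attention types and pooling variants can be cast side by side without blowing up the representation size.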
GLoMo: Unsupervisedly Learned Relational Graphs as Transferable Representations
Modern deep transfer learning approaches have mainly focused on learning
generic feature vectors from one task that are transferable to other tasks,
such as word embeddings in language and pretrained convolutional features in
vision. However, these approaches usually transfer unary features and largely
ignore more structured graphical representations. This work explores the
possibility of learning generic latent relational graphs that capture
dependencies between pairs of data units (e.g., words or pixels) from
large-scale unlabeled data and transferring the graphs to downstream tasks. Our
proposed transfer learning framework improves performance on various tasks
including question answering, natural language inference, sentiment analysis,
and image classification. We also show that the learned graphs are generic
enough to be transferred to different embeddings on which the graphs have not
been trained (including GloVe embeddings, ELMo embeddings, and task-specific
RNN hidden units), or to embedding-free units such as image pixels.
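A minimal sketch of such a latent relational graph is a row-normalized pairwise affinity matrix over data units (words here). The dot-product affinity below is a simplifying assumption for illustration; GLoMo learns the scoring function from unlabeled data:

```python
# Sketch of a relational graph: a row-stochastic matrix G with G[i][j]
# proportional to the affinity between units i and j. Such a graph can be
# reused to mix any feature set defined over the same units.

import math

def relational_graph(vectors):
    """Return a row-stochastic affinity matrix over the input vectors."""
    n = len(vectors)
    graph = []
    for i in range(n):
        row = []
        for j in range(n):
            score = sum(a * b for a, b in zip(vectors[i], vectors[j]))
            row.append(math.exp(score))       # softmax numerator
        total = sum(row)
        graph.append([r / total for r in row])
    return graph

g = relational_graph([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
for row in g:
    print([round(v, 3) for v in row])  # each row sums to 1
```

Transfer then amounts to multiplying this graph against downstream embeddings (e.g., GloVe or ELMo vectors) that the graph was never trained on.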
Inter-Patient ECG Classification with Convolutional and Recurrent Neural Networks
The recent advances in ECG sensor devices provide opportunities for user
self-managed auto-diagnosis and monitoring services over the internet. This
imposes the requirements for generic ECG classification methods that are
inter-patient and device independent. In this paper, we present our work on
using the densely connected convolutional neural network (DenseNet) and gated
recurrent unit network (GRU) for addressing the inter-patient ECG
classification problem. A deep learning model architecture is proposed and
evaluated using the MIT-BIH Arrhythmia and Supraventricular Databases. The
results obtained show that without applying any complicated data pre-processing
or feature engineering methods, both of our models have considerably
outperformed the state-of-the-art performance for supraventricular (SVEB) and
ventricular (VEB) arrhythmia classifications on the unseen testing dataset
(with the F1 score improved from 51.08 to 61.25 for SVEB detection and from
88.59 to 89.75 for VEB detection respectively). As no patient-specific or
device-specific information is used at the training stage in this work, it can
be considered as a more generic approach for dealing with scenarios in which
varieties of ECG signals are collected from different patients using different
types of sensor devices. Comment: 10 pages, 8 figures.
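The dense connectivity pattern that DenseNet contributes here, where each layer consumes the concatenation of all earlier feature maps, can be sketched with stand-in layers. The toy "layer" below is an arbitrary function, not the trained network:

```python
# Sketch of a DenseNet-style dense block: every layer receives the
# concatenation of the input and all previous layers' outputs, so early
# ECG features stay directly accessible to deeper layers.

def dense_block(x, layers):
    """Apply layers; each sees the concatenation of all prior features."""
    features = [x]
    for layer in layers:
        concatenated = [v for f in features for v in f]
        features.append(layer(concatenated))
    return [v for f in features for v in f]

half_sum = lambda xs: [sum(xs) / 2.0]          # toy one-output "layer"
out = dense_block([1.0, 2.0], [half_sum, half_sum])
print(out)  # -> [1.0, 2.0, 1.5, 2.25]
```

In the ECG model, the output of such convolutional blocks is then fed to a GRU to capture temporal dependencies across heartbeats.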
Generalization Studies of Neural Network Models for Cardiac Disease Detection Using Limited Channel ECG
Acceleration of machine learning research in healthcare is challenged by lack
of large annotated and balanced datasets. Furthermore, dealing with measurement
inaccuracies and exploiting unsupervised data are considered to be central to
improving existing solutions. In particular, a primary objective in predictive
modeling is to generalize well to both unseen variations within the observed
classes, and unseen classes. In this work, we consider such a challenging
problem in machine-learning-driven diagnosis: detecting a gamut of
cardiovascular conditions (e.g., infarction, dysrhythmia) from limited
channel ECG measurements. Though deep neural networks have achieved
unprecedented success in predictive modeling, they rely solely on
discriminative models that can generalize poorly to unseen classes. We argue
that unsupervised learning can be utilized to construct effective latent spaces
that facilitate better generalization. This work extensively compares the
generalization of our proposed approach against a state-of-the-art deep
learning solution. Our results show significant improvements in F1-scores.Comment: IEEE Computing in Cardiology (CinC) 201
Encoding Source Language with Convolutional Neural Network for Machine Translation
The recently proposed neural network joint model (NNJM) (Devlin et al., 2014)
augments the n-gram target language model with a heuristically chosen source
context window, achieving state-of-the-art performance in SMT. In this paper,
we give a more systematic treatment by summarizing the relevant source
information through a convolutional architecture guided by the target
information. With different guiding signals during decoding, our specifically
designed convolution+gating architectures can pinpoint the parts of a source
sentence that are relevant to predicting a target word, and fuse them with the
context of entire source sentence to form a unified representation. This
representation, together with target language words, are fed to a deep neural
network (DNN) to form a stronger NNJM. Experiments on two NIST Chinese-English
translation tasks show that the proposed model can achieve significant
improvements over the previous NNJM by up to +1.08 BLEU points on average. Comment: Accepted as a full paper at ACL 201
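The convolution+gating idea can be illustrated element-wise: a sigmoid gate decides how much of each convolved source feature reaches the joint model. The weights below are illustrative, not trained NNJM parameters:

```python
# Sketch of gated source features: feature * sigmoid(gate), so a large
# positive gate logit passes the feature through and a large negative
# one suppresses it, letting the decoder's guiding signal pick out the
# relevant parts of the source sentence.

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_features(conv_out, gate_logits):
    """Element-wise gate applied to convolved source features."""
    return [f * sigmoid(g) for f, g in zip(conv_out, gate_logits)]

vals = gated_features([1.0, -2.0, 0.5], [10.0, -10.0, 0.0])
print([round(v, 3) for v in vals])
```

In the full model the gate logits would be computed from the target-side context, so different target words open different parts of the source representation.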