14,006 research outputs found

    Interpretable Structure-Evolving LSTM

    Full text link
    This paper develops a general framework for learning interpretable data representations via Long Short-Term Memory (LSTM) recurrent neural networks over hierarchical graph structures. Instead of learning LSTM models over pre-fixed structures, we propose to further learn the intermediate interpretable multi-level graph structures in a progressive and stochastic way from data during the LSTM network optimization. We thus call this model the structure-evolving LSTM. In particular, starting with an initial element-level graph representation where each node is a small data element, the structure-evolving LSTM gradually evolves the multi-level graph representations by stochastically merging graph nodes with high compatibilities along the stacked LSTM layers. In each LSTM layer, we estimate the compatibility of two connected nodes from their corresponding LSTM gate outputs, and use it to generate a merging probability. Candidate graph structures are then generated by grouping nodes into cliques according to their merging probabilities. We produce the new graph structure with a Metropolis-Hastings algorithm, which alleviates the risk of getting stuck in local optima through stochastic sampling with an acceptance probability. Once a graph structure is accepted, a higher-level graph is constructed by taking the partitioned cliques as its nodes. During the evolving process, the representation becomes more abstract at higher levels, where redundant information is filtered out, allowing more efficient propagation of long-range data dependencies. We evaluate the effectiveness of the structure-evolving LSTM on semantic object parsing and demonstrate its advantage over state-of-the-art LSTM models on standard benchmarks. Comment: To appear in CVPR 2017 as a spotlight paper.
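    The merging-and-acceptance step described in this abstract can be pictured with a short sketch. The snippet below is a minimal illustration, not the authors' implementation: the dot-product compatibility, the sigmoid merging probability, and the scores fed to the acceptance test are simplifying assumptions standing in for quantities the paper derives from LSTM gate outputs.

```python
# Minimal sketch (not the paper's code) of stochastic node merging with a
# Metropolis-Hastings style acceptance test. All names and the simple
# dot-product compatibility are illustrative assumptions.
import math
import random

def merging_probability(h_i, h_j):
    """Compatibility of two connected nodes, squashed to (0, 1).

    Here compatibility is a plain dot product of the nodes' states;
    in the paper it is estimated from LSTM gate outputs."""
    compat = sum(a * b for a, b in zip(h_i, h_j))
    return 1.0 / (1.0 + math.exp(-compat))      # sigmoid -> merging probability

def propose_partition(edges, states):
    """Sample candidate merges: each edge is merged with its merging probability."""
    merged = []
    for i, j in edges:
        if random.random() < merging_probability(states[i], states[j]):
            merged.append((i, j))
    return merged

def accept_candidate(score_new, score_old, temperature=1.0):
    """Accept improvements outright; otherwise accept with a probability that
    shrinks with the score gap, which helps avoid local optima."""
    if score_new >= score_old:
        return True
    return random.random() < math.exp((score_new - score_old) / temperature)

# Toy usage: four element-level nodes with 3-d states and a small edge set.
states = {0: [0.2, 0.1, 0.4], 1: [0.3, 0.0, 0.5], 2: [-0.4, 0.2, 0.1], 3: [0.1, -0.3, 0.2]}
edges = [(0, 1), (1, 2), (2, 3)]
candidate = propose_partition(edges, states)
if accept_candidate(score_new=0.8, score_old=0.7):
    print("accepted merges:", candidate)  # merged cliques become nodes of the next-level graph
```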

    Connectionist natural language parsing

    Get PDF
    The key developments of two decades of connectionist parsing are reviewed. Connectionist parsers are assessed according to their ability to learn to represent syntactic structures from examples automatically, without being presented with symbolic grammar rules. The review also considers the extent to which connectionist parsers offer computational models of human sentence processing and provide plausible accounts of psycholinguistic data. In considering these issues, special attention is paid to the level of realism, the nature of the modularity, and the type of processing found in a wide range of parsers.

    When Are Tree Structures Necessary for Deep Learning of Representations?

    Full text link
    Recursive neural models, which use syntactic parse trees to recursively generate representations bottom-up, are a popular architecture. But there have not been rigorous evaluations showing for exactly which tasks this syntax-based method is appropriate. In this paper we benchmark recursive neural models against sequential recurrent neural models (simple recurrent and LSTM models), enforcing apples-to-apples comparison as much as possible. We investigate four tasks: (1) sentiment classification at the sentence level and phrase level; (2) matching questions to answer-phrases; (3) discourse parsing; (4) semantic relation extraction (e.g., component-whole relations between nouns). Our goal is to better understand when, and why, recursive models can outperform simpler models. We find that recursive models help mainly on tasks (like semantic relation extraction) that require associating headwords across a long distance, particularly on very long sequences. We then introduce a method that allows recurrent models to achieve similar performance: breaking long sentences into clause-like units at punctuation and processing them separately before combining. Our results thus help clarify the limitations of both classes of models and suggest directions for improving recurrent models.
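    The clause-splitting trick mentioned in this abstract can be sketched in a few lines. The code below is an illustrative toy, not the paper's implementation: the hash-based embedding and the tanh recurrence stand in for learned embeddings and an LSTM, and averaging is just one simple way to combine the per-clause states.

```python
# Toy sketch of "split at punctuation, encode clauses separately, then combine".
# The embedding and recurrence here are deliberately simplistic stand-ins.
import math
import re

DIM = 8

def embed(token):
    """Toy deterministic token embedding (stand-in for learned embeddings)."""
    return [math.sin(hash(token) % 1000 + d) for d in range(DIM)]

def encode_clause(tokens):
    """Toy recurrent encoder: h_t = tanh(0.5*h_{t-1} + 0.5*x_t); returns the final state."""
    h = [0.0] * DIM
    for tok in tokens:
        x = embed(tok)
        h = [math.tanh(0.5 * hi + 0.5 * xi) for hi, xi in zip(h, x)]
    return h

def encode_sentence(sentence):
    """Split at clause punctuation, encode each clause separately, average the clause states."""
    clauses = [c.strip() for c in re.split(r"[,;:]", sentence) if c.strip()]
    clause_states = [encode_clause(c.split()) for c in clauses]
    return [sum(vals) / len(clause_states) for vals in zip(*clause_states)]

vec = encode_sentence("the board approved the merger, which analysts had expected, despite objections")
print(len(vec), vec[:3])
```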