Forgetting Exceptions is Harmful in Language Learning
We show that in language learning, contrary to received wisdom, keeping
exceptional training instances in memory can be beneficial for generalization
accuracy. We investigate this phenomenon empirically on a selection of
benchmark natural language processing tasks: grapheme-to-phoneme conversion,
part-of-speech tagging, prepositional-phrase attachment, and base noun phrase
chunking. In a first series of experiments we combine memory-based learning
with training set editing techniques, in which instances are edited based on
their typicality and class prediction strength. Results show that editing
exceptional instances (with low typicality or low class prediction strength)
tends to harm generalization accuracy. In a second series of experiments we
compare memory-based learning and decision-tree learning methods on the same
selection of tasks, and find that decision-tree learning often performs worse
than memory-based learning. Moreover, the decrease in performance can be linked
to the degree of abstraction from exceptions (i.e., pruning or eagerness). We
provide explanations for both results in terms of the properties of the natural
language processing tasks and the learning algorithms.
Comment: 31 pages, 7 figures, 10 tables. Uses 11pt, fullname, a4wide TeX styles. Pre-print version of article to appear in Machine Learning 11:1-3, Special Issue on Natural Language Learning. Figures on page 22 slightly compressed to avoid page overload.
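As a concrete illustration of the editing experiments described above, here is a minimal Python sketch (the instance base `X`, the labels `y`, the use of scikit-learn, and the 0.5 threshold are assumptions, not the paper's setup): it scores each instance by a simple class-prediction-strength measure under k-nearest-neighbour retrieval and removes the lowest-scoring, most exceptional instances.

```python
# Sketch (not the paper's code): editing a memory-based (k-NN) training set
# by class prediction strength (CPS). X is a hypothetical feature matrix,
# y a hypothetical label array; both are stand-ins for a real task's data.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def class_prediction_strength(X, y, k=5):
    """CPS of each instance: the fraction of its k nearest neighbours
    (excluding itself) that share its class label."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)            # idx[:, 0] is the instance itself
    neighbour_labels = y[idx[:, 1:]]     # shape (n_samples, k)
    return (neighbour_labels == y[:, None]).mean(axis=1)

def edit_exceptions(X, y, k=5, threshold=0.5):
    """Drop instances whose CPS falls below `threshold` (the exceptions).
    The paper's finding is that this kind of editing tends to hurt
    generalization accuracy on the NLP tasks studied."""
    keep = class_prediction_strength(X, y, k) >= threshold
    return X[keep], y[keep]

# Typical use: train one nearest-neighbour classifier on the full instance
# base and one on edit_exceptions(X, y), then compare held-out accuracy.
```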
Persistence pays off: Paying Attention to What the LSTM Gating Mechanism Persists
Language Models (LMs) are important components in several Natural Language
Processing systems. Recurrent Neural Network LMs composed of LSTM units,
especially those augmented with an external memory, have achieved
state-of-the-art results. However, these models still struggle to process long sequences, which are more likely to contain long-distance dependencies, because of information fading and a bias towards more recent information. In this paper we demonstrate an effective mechanism for retrieving information in a memory-augmented LSTM LM, based on attending to information in memory in proportion to the number of timesteps for which the LSTM gating mechanism has persisted the information.
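The abstract names the mechanism without specifying it, so the following NumPy sketch is one possible reading rather than the paper's implementation: derive a persistence count for each memory slot from the forget-gate activations, then rescale content-based attention weights by those counts. The retention threshold and all variable names are assumptions.

```python
# Interpretive sketch (not the paper's implementation): attend over memory
# slots in proportion to how long the LSTM gating mechanism persisted them.
import numpy as np

def persistence_counts(forget_gates, threshold=0.5):
    """forget_gates: (T,) mean forget-gate activation at each timestep.
    For each timestep t, count how many subsequent steps kept the gate
    open (>= threshold) without interruption, i.e. how long information
    written at t was persisted. The threshold is an assumption."""
    T = len(forget_gates)
    counts = np.zeros(T)
    for t in range(T):
        run = 0
        for u in range(t + 1, T):
            if forget_gates[u] < threshold:
                break
            run += 1
        counts[t] = run
    return counts

def persistence_weighted_attention(query, memory, counts, eps=1e-8):
    """query: (d,); memory: (T, d); counts: (T,).
    Content-based softmax attention, rescaled so that longer-persisted
    slots receive proportionally more weight."""
    scores = memory @ query
    scores -= scores.max()                        # numerical stability
    content = np.exp(scores) / np.exp(scores).sum()
    weighted = content * (counts + 1.0)           # +1 keeps fresh slots reachable
    return weighted / (weighted.sum() + eps)
```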
Neurocognitive profiles in autism spectrum disorder
The current research project examines the performance of a group of high-functioning young adult males with autism spectrum disorders on standardized measures of neurocognitive functioning, to determine whether distinct cognitive profiles of strengths and weaknesses emerge. Neuropsychological test data were examined across various domains: general cognitive ability, visuospatial processing, verbal learning and memory, visual learning and memory, working memory, reasoning, cognitive flexibility, attention, receptive language, expressive language, social and emotional processing, and fine motor skills. Data were analyzed using cluster analysis to assess for the presence and nature of unique clusters/subgroups based on neuropsychological test performance. Three unique clusters were derived from the analyses. This study highlights the well-documented heterogeneity across the spectrum of autism and suggests a method for parsing a heterogeneous sample of ASD subjects into smaller and more meaningful homogeneous groups using standardized neuropsychological assessments.
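As a rough sketch of the clustering step described above (assuming scikit-learn and a hypothetical matrix of standardized test scores; the study's own procedure and variables are not reproduced here):

```python
# Sketch of the clustering step described above. `scores` stands for a
# hypothetical participants x neuropsychological-domains matrix; it is not
# the study's data, and the study's three-cluster result is a finding, not
# something this code guarantees.
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def cluster_profiles(scores, candidate_k=(2, 3, 4, 5), random_state=0):
    """Standardize domain scores, then keep the k-means solution whose
    silhouette score is highest among the candidate cluster counts."""
    z = StandardScaler().fit_transform(scores)
    best = (None, None, -1.0)                 # (k, labels, silhouette)
    for k in candidate_k:
        labels = KMeans(n_clusters=k, n_init=10,
                        random_state=random_state).fit_predict(z)
        sil = silhouette_score(z, labels)
        if sil > best[2]:
            best = (k, labels, sil)
    return best
```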
Bilingual episodic memory: an introduction
Our current models of bilingual memory are essentially accounts of semantic memory whose goal is to explain bilingual lexical access to underlying imagistic and conceptual referents. While this research has included episodic memory, it has focused largely on recall for words, phrases, and sentences in the service of understanding the structure of semantic memory. Building on the four papers in this special issue, this article focuses on larger units of episodic memory (from quotidian events with simple narrative form to complex autobiographical memories) in service of developing a model of bilingual episodic memory. This requires integrating theory and research on how culture-specific narrative traditions inform encoding and retrieval with theory and research on the relation between (monolingual) semantic and episodic memory (Schank, 1982; Schank & Abelson, 1995; Tulving, 2002). Then, taking a cue from memory-based text processing studies in psycholinguistics (McKoon & Ratcliff, 1998), we suggest that as language forms surface in the progressive retrieval of features of an event, they trigger further forms within the same language, serving to guide a within-language/within-culture retrieval.
Understanding acceptability judgments: Additivity and working memory effects
Linguists build theories of grammar based largely on acceptability contrasts. But these contrasts can reflect grammatical constraints and/or constraints on language processing. How can theorists determine the extent to which the acceptability of an utterance depends on functional constraints? In a series of acceptability experiments, we consider two factors that might indicate processing contributions to acceptability contrasts: (1) the way constraints combine (i.e., additively or super-additively), and (2) the way a comprehender’s working memory resources influence acceptability judgments. Results suggest that multiple sources of processing difficulty combine to produce super-additive effects, but multiple grammatical violations do not. Furthermore, when acceptability judgments improve with higher working memory scores, this appears to be due to functional constraints. We conclude that tests of (super-)additivity and of differences in working memory can help to identify the effects of processing difficulty (due to functional constraints).
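The additivity question above can be phrased as a test for an interaction term. The sketch below is one conventional way to operationalize it (assuming pandas and statsmodels and a hypothetical two-factor design; it is not the paper's analysis):

```python
# Sketch of a (super-)additivity test. With two difficulty factors A and B
# (each coded 0/1), a purely additive combination predicts no A:B
# interaction in the ratings; a reliable interaction indicates
# super-additivity. `df` is a hypothetical long-format DataFrame with
# columns 'rating', 'A', and 'B'; it is not the paper's data or analysis.
import pandas as pd
import statsmodels.formula.api as smf

def test_superadditivity(df: pd.DataFrame):
    """Fit rating ~ A + B + A:B and return the interaction estimate
    and its p-value."""
    model = smf.ols("rating ~ A * B", data=df).fit()
    return model.params["A:B"], model.pvalues["A:B"]
```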
Optimizing Memory Efficiency for Convolution Kernels on Kepler GPUs
Convolution is a fundamental operation in many applications, such as computer
vision, natural language processing, image processing, etc. Recent successes of
convolutional neural networks in various deep learning applications put even
higher demand on fast convolution. The high computation throughput and memory
bandwidth of graphics processing units (GPUs) make GPUs a natural choice for
accelerating convolution operations. However, maximally exploiting the
available memory bandwidth of GPUs for convolution is a challenging task. This
paper introduces a general model to address the mismatch between the memory
bank width of GPUs and computation data width of threads. Based on this model,
we develop two convolution kernels, one for the general case and the other for
a special case with one input channel. By carefully optimizing memory access
patterns and computation patterns, we design a communication-optimized kernel
for the special case and a communication-reduced kernel for the general case.
Experimental data based on implementations on Kepler GPUs show that our kernels
achieve 5.16X and 35.5% average performance improvement over the latest cuDNN
library for the special case and the general case, respectively.
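The abstract points to a mismatch between the memory bank width of the GPU and the computation data width of the threads. As a rough, self-contained illustration of that mismatch (in Python rather than CUDA, and with a made-up access pattern; only the 32-bank count and the configurable 4-/8-byte bank width of Kepler are hardware facts), the sketch below counts shared-memory bank conflicts for a strided warp access under the two bank widths.

```python
# Illustration (not the paper's kernel): how element size, per-thread
# stride, and shared-memory bank width interact to create bank conflicts.
# Kepler GPUs have 32 shared-memory banks with a configurable 4- or 8-byte
# bank width; the access pattern below is a made-up example.
NUM_BANKS = 32

def conflict_degree(elem_bytes, stride_elems, bank_bytes, warp_size=32):
    """Worst-case conflict degree for a warp in which thread i reads
    element i * stride_elems of an array of elem_bytes-sized elements.
    A bank conflict is serialized once per distinct word touched in a
    bank, so a return value of 1 means conflict-free."""
    words_per_bank = {}
    for i in range(warp_size):
        byte = i * stride_elems * elem_bytes
        word = byte // bank_bytes
        bank = word % NUM_BANKS
        words_per_bank.setdefault(bank, set()).add(word)
    return max(len(words) for words in words_per_bank.values())

# 4-byte floats read with stride 2: a 2-way conflict under a 4-byte bank
# width, but conflict-free in 8-byte bank mode, where the bank width
# matches the effective 8-byte access stride.
print(conflict_degree(4, 2, bank_bytes=4))   # -> 2
print(conflict_degree(4, 2, bank_bytes=8))   # -> 1
```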
Evaluation of the NLP Components of the OVIS2 Spoken Dialogue System
The NWO Priority Programme Language and Speech Technology is a 5-year
research programme aiming at the development of spoken language information
systems. In the Programme, two alternative natural language processing (NLP)
modules are developed in parallel: a grammar-based (conventional, rule-based)
module and a data-oriented (memory-based, stochastic, DOP) module. In order to
compare the NLP modules, a formal evaluation has been carried out three years
after the start of the Programme. This paper describes the evaluation procedure
and the evaluation results. The grammar-based component performs much better
than the data-oriented one in this comparison.
Comment: Proceedings of CLIN 9
Memory-Based Lexical Acquisition and Processing
Current approaches to computational lexicology in language technology are
knowledge-based (competence-oriented) and try to abstract away from specific
formalisms, domains, and applications. This results in severe complexity,
acquisition and reusability bottlenecks. As an alternative, we propose a
particular performance-oriented approach to Natural Language Processing based
on automatic memory-based learning of linguistic (lexical) tasks. The
consequences of the approach for computational lexicology are discussed, and
the application of the approach on a number of lexical acquisition and
disambiguation tasks in phonology, morphology and syntax is described.
Comment: 18 pages