8,696 research outputs found
Integrated speech and morphological processing in a connectionist continuous speech understanding for Korean
A new tightly coupled speech and natural language integration model is
presented for a TDNN-based continuous possibly large vocabulary speech
recognition system for Korean. Unlike popular n-best techniques developed for
integrating mainly HMM-based speech recognition and natural language processing
at the word level, which is obviously inadequate for morphologically
complex agglutinative languages, our model constructs a spoken language system
based on morpheme-level speech and language integration. With this
integration scheme, the spoken Korean processing engine (SKOPE) is designed and
implemented using a TDNN-based diphone recognition module integrated with a
Viterbi-based lexical decoding and symbolic phonological/morphological
co-analysis. Our experimental results show that speaker-dependent continuous
eojeol (Korean word) recognition and integrated morphological analysis
can be achieved with a success rate of over 80.6% directly from speech inputs for
middle-level vocabularies.
Comment: LaTeX source with a4 style, 15 pages, to be published in computer
processing of oriental language journal
On Tree-Based Neural Sentence Modeling
Neural networks with tree-based sentence encoders have shown better results
on many downstream tasks. Most existing tree-based encoders adopt syntactic
parsing trees as the explicit structure prior. To study the effectiveness of
different tree structures, we replace the parsing trees with trivial trees
(i.e., binary balanced tree, left-branching tree and right-branching tree) in
the encoders. Though trivial trees contain no syntactic information, those
encoders get competitive or even better results on all of the ten downstream
tasks we investigated. This surprising result indicates that explicit syntax
guidance may not be the main contributor to the superior performances of
tree-based neural sentence modeling. Further analysis shows that tree modeling
gives better results when crucial words are closer to the final representation.
Additional experiments give more clues on how to design an effective tree-based
encoder. Our code is open-source and available at
https://github.com/ExplorerFreda/TreeEnc.
Comment: To appear at EMNLP 201
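The three trivial tree shapes named in this abstract can be illustrated with a minimal Python sketch. It is not taken from the authors' repository: the function names, the nested-tuple tree representation, and the rule for splitting odd-length spans in the balanced tree are all assumptions made for illustration. The helper `leaf_depths` reports how far each word sits from the root, which relates to the abstract's observation about crucial words being closer to the final representation.

```python
from typing import List, Tuple, Union

# A tree is either a word (leaf) or a pair of subtrees (hypothetical representation).
Tree = Union[str, Tuple["Tree", "Tree"]]


def left_branching(words: List[str]) -> Tree:
    """Combine words left to right: (((w1, w2), w3), w4)."""
    if not words:
        raise ValueError("need at least one word")
    tree: Tree = words[0]
    for word in words[1:]:
        tree = (tree, word)
    return tree


def right_branching(words: List[str]) -> Tree:
    """Combine words right to left: (w1, (w2, (w3, w4)))."""
    if not words:
        raise ValueError("need at least one word")
    tree: Tree = words[-1]
    for word in reversed(words[:-1]):
        tree = (word, tree)
    return tree


def balanced(words: List[str]) -> Tree:
    """Split the span in half recursively (left half gets the extra word)."""
    if not words:
        raise ValueError("need at least one word")
    if len(words) == 1:
        return words[0]
    mid = (len(words) + 1) // 2
    return (balanced(words[:mid]), balanced(words[mid:]))


def leaf_depths(tree: Tree, depth: int = 0) -> List[Tuple[str, int]]:
    """Return (word, distance from root) for every leaf, in sentence order."""
    if isinstance(tree, str):
        return [(tree, depth)]
    left, right = tree
    return leaf_depths(left, depth + 1) + leaf_depths(right, depth + 1)
```

In this sketch, a left-branching tree places the last word one step from the root and the first words deepest, while a right-branching tree does the reverse.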
Latent Tree Learning with Differentiable Parsers: Shift-Reduce Parsing and Chart Parsing
Latent tree learning models represent sentences by composing their words
according to an induced parse tree, all based on a downstream task. These
models often outperform baselines which use (externally provided) syntax trees
to drive the composition order. This work contributes (a) a new latent tree
learning model based on shift-reduce parsing, with competitive downstream
performance and non-trivial induced trees, and (b) an analysis of the trees
learned by our shift-reduce model and by a chart-based model.
Comment: ACL 2018 workshop on Relevance of Linguistic Structure in Neural
Architectures for NL
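How a shift-reduce transition sequence determines a composition order can be sketched as follows. This is an illustrative toy, not the model from the paper: in a latent tree learner the transitions would be predicted by a trained network, whereas here they are supplied by hand, and the name `shift_reduce_tree` and the nested-tuple output are assumptions.

```python
from typing import List, Tuple, Union

Tree = Union[str, Tuple["Tree", "Tree"]]


def shift_reduce_tree(words: List[str], transitions: List[str]) -> Tree:
    """Build a binary tree from words and a SHIFT/REDUCE transition sequence.

    SHIFT moves the next word onto the stack; REDUCE pops the top two
    stack items and pushes their combination. A real model would compose
    vector representations at each REDUCE instead of building tuples.
    """
    stack: List[Tree] = []
    next_word = 0
    for action in transitions:
        if action == "SHIFT":
            if next_word >= len(words):
                raise ValueError("SHIFT with no words left")
            stack.append(words[next_word])
            next_word += 1
        elif action == "REDUCE":
            if len(stack) < 2:
                raise ValueError("REDUCE needs two items on the stack")
            right = stack.pop()
            left = stack.pop()
            stack.append((left, right))
        else:
            raise ValueError(f"unknown transition: {action}")
    if next_word != len(words) or len(stack) != 1:
        raise ValueError("transitions do not yield a single complete tree")
    return stack[0]
```

A sentence of n words needs n SHIFT and n - 1 REDUCE actions; different orderings of the same actions give different trees over the same words.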
A Data-Oriented Approach to Semantic Interpretation
In Data-Oriented Parsing (DOP), an annotated language corpus is used as a
stochastic grammar. The most probable analysis of a new input sentence is
constructed by combining sub-analyses from the corpus in the most probable way.
This approach has been successfully used for syntactic analysis, using corpora
with syntactic annotations such as the Penn Treebank. If a corpus with
semantically annotated sentences is used, the same approach can also generate
the most probable semantic interpretation of an input sentence. The present
paper explains this semantic interpretation method, and summarizes the results
of a preliminary experiment. Semantic annotations were added to the syntactic
annotations of most of the sentences of the ATIS corpus. A data-oriented
semantic interpretation algorithm was successfully tested on this semantically
enriched corpus.
Comment: 10 pages, Postscript; to appear in Proceedings Workshop on
Corpus-Oriented Semantic Analysis, ECAI-96, Budapest
Connectionist natural language parsing
The key developments of two decades of connectionist parsing are reviewed. Connectionist parsers are assessed according to their ability to learn to represent syntactic structures from examples automatically, without being presented with symbolic grammar rules. This review also considers the extent to which connectionist parsers offer computational models of human sentence processing and provide plausible accounts of psycholinguistic data. In considering these issues, special attention is paid to the level of realism, the nature of the modularity, and the type of processing found in a wide range of parsers.