Structural parsing
Parsing is an essential part of natural language processing. In this paper, structural parsing, which is based on the theory of knowledge graphs, is introduced. Taking into account the semantic and syntactic features of natural language, both semantic and syntactic word graphs are formed. Grammar rules are derived from the syntactic word graphs. Due to the distinctions between Chinese and English, the grammar rules are given separately for the Chinese and English versions of the syntactic word graphs. Traditional parsing then yields a parse tree for a sentence, which can be used to map the sentence onto a sentence graph. This is called structural parsing. The relationship with utterance paths is discussed. As a result, chunk indicators are proposed to guide structural parsing.
Learning Fault-tolerant Speech Parsing with SCREEN
This paper describes a new approach and a system SCREEN for fault-tolerant
speech parsing. SCREEN stands for Symbolic Connectionist Robust EnterprisE for
Natural language. Speech parsing describes the syntactic and semantic analysis
of spontaneous spoken language. The general approach is based on incremental
immediate flat analysis, learning of syntactic and semantic speech parsing,
parallel integration of current hypotheses, and the consideration of various
forms of speech related errors. The goal for this approach is to explore the
parallel interactions between various knowledge sources for learning
incremental fault-tolerant speech parsing. This approach is examined in a
system SCREEN using various hybrid connectionist techniques. Hybrid
connectionist techniques are examined because of their promising properties of
inherent fault tolerance, learning, gradedness and parallel constraint
integration. The input for SCREEN consists of hypotheses about recognized words of a
spoken utterance potentially analyzed by a speech system; the output consists of
hypotheses about the flat syntactic and semantic analysis of the utterance. In
this paper we focus on the general approach, the overall architecture, and
examples for learning flat syntactic speech parsing. Unlike most other
speech language architectures, SCREEN emphasizes an interactive rather than an
autonomous position, learning rather than encoding, flat analysis rather than
in-depth analysis, and fault-tolerant processing of phonetic, syntactic and
semantic knowledge.
Comment: 6 pages, postscript, compressed, uuencoded, to appear in Proceedings
of AAAI 9
Joint Morphological and Syntactic Disambiguation
In morphologically rich languages, should morphological and syntactic disambiguation be treated sequentially or as a single problem? We describe several efficient, probabilistically interpretable ways to apply joint inference to morphological and syntactic disambiguation using lattice parsing. Joint inference is shown to compare favorably to pipeline parsing methods across a variety of component models. State-of-the-art performance on Hebrew Treebank parsing is demonstrated using the new method. The benefits of joint inference are modest with the current component models, but appear to increase as components themselves improve
Dependency parsing of Turkish
The suitability of different parsing methods for different languages is an important topic in
syntactic parsing. Especially lesser-studied languages, typologically different from the languages
for which methods were originally developed, pose interesting challenges in this respect.
This article presents an investigation of data-driven dependency parsing of Turkish, an agglutinative,
free-constituent-order language that can be seen as representative of a wider class
of languages of similar type. Our investigations show that morphological structure plays an
essential role in finding syntactic relations in such a language. In particular, we show that
employing sublexical representations called inflectional groups, rather than word forms, as the
basic parsing units improves parsing accuracy. We compare two different parsing methods, one
based on a probabilistic model with beam search, the other based on discriminative classifiers and
a deterministic parsing strategy, and show that the usefulness of sublexical units holds regardless
of parsing method. We examine the impact of morphological and lexical information in detail and
show that, properly used, this kind of information can improve parsing accuracy substantially.
Applying the techniques presented in this article, we achieve the highest reported accuracy for
parsing the Turkish Treebank.
A Lexicalized Tree-Adjoining Grammar for Vietnamese
In this paper, we present the first sizable grammar built for Vietnamese using LTAG, developed over the past two years, named vnLTAG. This grammar aims at modelling written language and is general enough to be both application- and domain-independent. It can be used for the morpho-syntactic tagging and syntactic parsing of Vietnamese texts, as well as text generation. We then present a robust parsing scheme using vnLTAG and a parser for the grammar. We finish with an evaluation using a test suite
A Diagram Is Worth A Dozen Images
Diagrams are common tools for representing complex concepts, relationships
and events, often when it would be difficult to portray the same information
with natural images. Understanding natural images has been extensively studied
in computer vision, while diagram understanding has received little attention.
In this paper, we study the problem of diagram interpretation and reasoning,
the challenging task of identifying the structure of a diagram and the
semantics of its constituents and their relationships. We introduce Diagram
Parse Graphs (DPG) as our representation to model the structure of diagrams. We
define syntactic parsing of diagrams as learning to infer DPGs for diagrams and
study semantic interpretation and reasoning of diagrams in the context of
diagram question answering. We devise an LSTM-based method for syntactic
parsing of diagrams and introduce a DPG-based attention model for diagram
question answering. We compile a new dataset of diagrams with exhaustive
annotations of constituents and relationships for over 5,000 diagrams and
15,000 questions and answers. Our results show the significance of our models
for syntactic parsing and question answering in diagrams using DPGs.
Keystroke dynamics as signal for shallow syntactic parsing
Keystroke dynamics have been extensively used in psycholinguistic and writing
research to gain insights into cognitive processing. But do keystroke logs
contain actual signal that can be used to learn better natural language
processing models?
We postulate that keystroke dynamics contain information about syntactic
structure that can inform shallow syntactic parsing. To test this hypothesis,
we explore labels derived from keystroke logs as an auxiliary task in a multi-task
bidirectional Long Short-Term Memory (bi-LSTM). Our experiments show promising
results on two shallow syntactic parsing tasks, chunking and CCG supertagging.
Our model is simple, has the advantage that data can come from distinct
sources, and produces models that are significantly better than models trained
on the text annotations alone.
Comment: In COLING 201