Grammar Is a System That Characterizes Talk in Interaction
Much of contemporary mainstream formal grammar theory is unable to provide analyses for language as it occurs in actual spoken interaction. Its analyses are developed for a cleaned-up version of language that omits the disfluencies, non-sentential utterances, gestures, and many other phenomena that are ubiquitous in spoken language. Using evidence from linguistics, conversation analysis, multimodal communication, psychology, language acquisition, and neuroscience, we show that these aspects of language use are rule-governed in much the same way as the phenomena captured by conventional grammars. Furthermore, we argue that over the past few years some of the tools required to provide precise characterizations of such phenomena have begun to emerge in theoretical and computational linguistics; hence, there is no reason to treat them as "second-class citizens" other than pre-theoretical assumptions about what should fall under the purview of grammar. Finally, we suggest that grammar formalisms covering such phenomena would provide a better foundation not just for linguistic analysis of face-to-face interaction, but also for sister disciplines such as research on spoken dialogue systems and psychological work on language acquisition.
Embedding Predications
Written communication is rarely a sequence of simple assertions. More often, in addition to simple assertions, authors express subjectivity, such as beliefs, speculations, opinions, intentions, and desires. Furthermore, they link statements of various kinds to form a coherent discourse that reflects their pragmatic intent. In computational semantics, extraction of simple assertions (propositional meaning) has attracted the greatest attention, while research that focuses on extra-propositional aspects of meaning has remained sparse overall and has been largely limited to narrowly defined categories, such as hedging or sentiment analysis, treated in isolation.
In this thesis, we contribute to the understanding of extra-propositional meaning in natural language understanding by providing a comprehensive account of the semantic phenomena that occur beyond simple assertions and examining how a coherent discourse is formed from lower-level semantic elements. Our approach is linguistically based, and we propose a general, unified treatment of the semantic phenomena involved, within a computationally viable framework. We identify semantic embedding as the core notion involved in expressing extra-propositional meaning. The embedding framework is based on the structural distinction between embedding and atomic predications, the former corresponding to extra-propositional aspects of meaning. It incorporates the notions of predication source, modality scale, and scope. We develop an embedding categorization scheme and a dictionary based on it, which provide the necessary means to interpret extra-propositional meaning with a compositional semantic interpretation methodology. Our syntax-driven methodology exploits syntactic dependencies to construct a semantic embedding graph of a document. Traversing the graph in a bottom-up manner guided by compositional operations, we construct predications corresponding to extra-propositional semantic content, which form the basis for addressing practical tasks. We focus on text from two distinct domains: news articles from the Wall Street Journal, and scientific articles focusing on molecular biology. Adopting a task-based evaluation strategy, we consider the easy adaptability of the core framework to practical tasks that involve some extra-propositional aspect as a measure of its success. The computational tasks we consider include hedge/uncertainty detection, scope resolution, negation detection, biological event extraction, and attribution resolution. Our competitive results in these tasks demonstrate the viability of our proposal.
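The bottom-up, compositional traversal described above can be sketched in a few lines. Note this is a toy illustration: the node labels, the "SPECULATION" embedding category, and the composition rule are assumptions for exposition, not the thesis's actual categorization scheme or dictionary.

```python
# Toy sketch: bottom-up traversal of a semantic embedding graph.
# Embedding predications (e.g., speculation cues) scope over the
# interpretations composed from their arguments; atomic predications
# carry the propositional content.

from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                       # lexical item heading the predication
    category: str                    # "atomic" or an embedding type, e.g. "SPECULATION"
    children: list = field(default_factory=list)

def interpret(node):
    """Compose predications bottom-up: interpret the children first,
    then wrap or combine them according to the node's category."""
    args = [interpret(c) for c in node.children]
    if node.category == "atomic":
        return node.label if not args else f"{node.label}({', '.join(args)})"
    # Embedding predication: scopes over the composed arguments.
    return f"{node.category}[{node.label}]({', '.join(args)})"

# "These results suggest that X binds Y."
graph = Node("suggest", "SPECULATION",
             [Node("bind", "atomic", [Node("X", "atomic"), Node("Y", "atomic")])])
print(interpret(graph))  # SPECULATION[suggest](bind(X, Y))
```

In this sketch the speculation cue "suggest" embeds the atomic predication `bind(X, Y)`, mirroring the structural distinction between embedding and atomic predications.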
Attribution: a computational approach
Our society is overwhelmed with an ever-growing amount of information. Effective management of this information requires novel ways to filter and select the most relevant pieces. Some of this information can be associated with the source or sources expressing it. Sources, and their relation to what they express, affect information and whether we perceive it as relevant, biased, or truthful. In news texts in particular, it is common practice to report third-party statements and opinions. Recognizing relations of attribution is therefore a necessary step toward detecting the statements and opinions of specific sources and selecting and evaluating information on the basis of its source.
The automatic identification of Attribution Relations has applications in numerous research areas. Work on quotation and opinion extraction, discourse, and factuality has partly addressed the annotation and identification of Attribution Relations. However, these disjoint efforts have produced a partial and partly inaccurate picture of attribution. Moreover, they have generated small or incomplete resources, limiting the applicability of machine learning approaches. Existing approaches to extracting Attribution Relations have focused on rule-based models, which are limited in both coverage and precision.
This thesis presents a computational approach to attribution that recasts attribution extraction as the identification of the attributed text, its source, and the lexical cue linking them in a relation. Drawing on a preliminary data-driven investigation, I present a comprehensive lexicalised approach to attribution and further refine and test a previously defined annotation scheme. The scheme has been used to create a corpus annotated with Attribution Relations, with the goal of contributing a large and complete resource that can lay the foundations for future attribution studies.
Based on this resource, I developed a system for the automatic extraction of attribution relations that surpasses traditional syntactic pattern-based approaches. The system is a pipeline of classification and sequence labelling models that identify and link each of the components of an attribution relation. The results show concrete opportunities for attribution-based applications.
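The pipeline structure described above, identifying the source, cue, and content spans of an attribution relation and then linking them, can be sketched as follows. The keyword lexicon and the position-based tagging heuristic are simplifying assumptions for illustration; the thesis uses trained classification and sequence-labelling models instead.

```python
# Illustrative two-stage pipeline: (1) label each token with an
# attribution role, (2) link the labelled spans into one relation.

CUE_VERBS = {"said", "claimed", "argued", "reported"}  # assumed toy lexicon

def tag_tokens(tokens):
    """Stage 1: assign each token a role (SOURCE, CUE, or CONTENT).
    Here a crude heuristic: everything before the cue verb is the
    source, everything after it is the attributed content."""
    tags = []
    cue_seen = False
    for tok in tokens:
        if tok.lower() in CUE_VERBS:
            tags.append("CUE")
            cue_seen = True
        elif not cue_seen:
            tags.append("SOURCE")
        else:
            tags.append("CONTENT")
    return tags

def extract_relation(tokens):
    """Stage 2: group the tagged tokens into the three components
    of a single attribution relation."""
    tags = tag_tokens(tokens)
    return {role: [t for t, g in zip(tokens, tags) if g == role]
            for role in ("SOURCE", "CUE", "CONTENT")}

rel = extract_relation("The minister said taxes will fall".split())
print(rel)  # {'SOURCE': ['The', 'minister'], 'CUE': ['said'], 'CONTENT': ['taxes', 'will', 'fall']}
```

A real system would replace both stages with learned models, but the decomposition into span identification followed by linking is the same.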
Head-Driven Phrase Structure Grammar
Head-Driven Phrase Structure Grammar (HPSG) is a constraint-based or declarative approach to linguistic knowledge, which analyses all descriptive levels (phonology, morphology, syntax, semantics, pragmatics) with feature-value pairs, structure sharing, and relational constraints. In syntax it assumes that expressions have a single, relatively simple constituent structure. This volume provides a state-of-the-art introduction to the framework. Various chapters discuss basic assumptions and formal foundations, describe the evolution of the framework, and go into the details of the main syntactic phenomena. Further chapters are devoted to non-syntactic levels of description. The book also considers related fields and research areas (gesture, sign languages, computational linguistics) and includes chapters comparing HPSG with other frameworks (Lexical Functional Grammar, Categorial Grammar, Construction Grammar, Dependency Grammar, and Minimalism).
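The feature-value pairs that HPSG builds on can be pictured as nested attribute structures combined by unification: two descriptions merge if their values are compatible and fail on a clash. The sketch below is a toy illustration with invented feature names (`HEAD`, `AGR`, `NUM`, `PER`), not an HPSG implementation.

```python
# Minimal feature-structure unification over nested dicts.

def unify(a, b):
    """Unify two feature structures; return the merged structure,
    or None if any feature's values clash."""
    if not isinstance(a, dict) or not isinstance(b, dict):
        return a if a == b else None   # atomic values must match exactly
    result = dict(a)
    for key, bval in b.items():
        if key in result:
            merged = unify(result[key], bval)
            if merged is None:
                return None            # clash propagates upward
            result[key] = merged
        else:
            result[key] = bval         # absent feature: just add it
    return result

# A verb's agreement constraints unify with a compatible subject...
verb = {"HEAD": {"AGR": {"NUM": "sg", "PER": "3"}}}
subj = {"HEAD": {"AGR": {"NUM": "sg"}}}
print(unify(verb, subj))   # {'HEAD': {'AGR': {'NUM': 'sg', 'PER': '3'}}}

# ...but fail against an incompatible one.
print(unify(verb, {"HEAD": {"AGR": {"NUM": "pl"}}}))  # None
```

Structure sharing, where two paths in a description point at the same value, requires variables or reentrancy on top of this, which the toy version omits.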