Search CORE

27,715 research outputs found

Data-Oriented Language Processing. An Overview

Author: Bod Rens
Scha Remko
Publication venue
Publication date: 01/01/1996
Field of study

During the last few years, a new approach to language processing has started to emerge, which has become known under various labels such as "data-oriented parsing", "corpus-based interpretation", and "tree-bank grammar" (cf. van den Berg et al. 1994; Bod 1992-96; Bod et al. 1996a/b; Bonnema 1996; Charniak 1996a/b; Goodman 1996; Kaplan 1996; Rajman 1995a/b; Scha 1990-92; Sekine & Grishman 1995; Sima'an et al. 1994; Sima'an 1995-96; Tugwell 1995). This approach, which we will call "data-oriented processing" or "DOP", embodies the assumption that human language perception and production works with representations of concrete past language experiences, rather than with abstract linguistic rules. The models that instantiate this approach therefore maintain large corpora of linguistic representations of previously occurring utterances. When processing a new input utterance, analyses of this utterance are constructed by combining fragments from the corpus; the occurrence-frequencies of the fragments are used to estimate which analysis is the most probable one. In this paper we give an in-depth discussion of a data-oriented processing model which employs a corpus of labelled phrase-structure trees. Then we review some other models that instantiate the DOP approach. Many of these models also employ labelled phrase-structure trees, but use different criteria for extracting fragments from the corpus or employ different disambiguation strategies (Bod 1996b; Charniak 1996a/b; Goodman 1996; Rajman 1995a/b; Sekine & Grishman 1995; Sima'an 1995-96); other models use richer formalisms for their corpus annotations (van den Berg et al. 1994; Bod et al., 1996a/b; Bonnema 1996; Kaplan 1996; Tugwell 1995).Comment: 34 pages, Postscrip

arXiv.org e-Print Archive

CiteSeerX

Joint Video and Text Parsing for Understanding Events and Answering Queries

Author: Choe Tae Eun
Lee Mun Wai
Meng Meng
Tu Kewei
Zhu Song-Chun
Publication venue
Publication date: 21/02/2014
Field of study

We propose a framework for parsing video and text jointly for understanding events and answering user queries. Our framework produces a parse graph that represents the compositional structures of spatial information (objects and scenes), temporal information (actions and events) and causal information (causalities between events and fluents) in the video and text. The knowledge representation of our framework is based on a spatial-temporal-causal And-Or graph (S/T/C-AOG), which jointly models possible hierarchical compositions of objects, scenes and events as well as their interactions and mutual contexts, and specifies the prior probabilistic distribution of the parse graphs. We present a probabilistic generative model for joint parsing that captures the relations between the input video/text, their corresponding parse graphs and the joint parse graph. Based on the probabilistic model, we propose a joint parsing system consisting of three modules: video parsing, text parsing and joint inference. Video parsing and text parsing produce two parse graphs from the input video and text respectively. The joint inference module produces a joint parse graph by performing matching, deduction and revision on the video and text parse graphs. The proposed framework has the following objectives: Firstly, we aim at deep semantic parsing of video and text that goes beyond the traditional bag-of-words approaches; Secondly, we perform parsing and reasoning across the spatial, temporal and causal dimensions based on the joint S/T/C-AOG representation; Thirdly, we show that deep joint parsing facilitates subsequent applications such as generating narrative text descriptions and answering queries in the forms of who, what, when, where and why. We empirically evaluated our system based on comparison against ground-truth as well as accuracy of query answering and obtained satisfactory results

arXiv.org e-Print Archive

CiteSeerX

An improved parser for data-oriented lexical-functional analysis

Author: Bod Rens
Publication venue
Publication date: 01/01/2000
Field of study

We present an LFG-DOP parser which uses fragments from LFG-annotated sentences to parse new sentences. Experiments with the Verbmobil and Homecentre corpora show that (1) Viterbi n best search performs about 100 times faster than Monte Carlo search while both achieve the same accuracy; (2) the DOP hypothesis which states that parse accuracy increases with increasing fragment size is confirmed for LFG-DOP; (3) LFG-DOP's relative frequency estimator performs worse than a discounted frequency estimator; and (4) LFG-DOP significantly outperforms Tree-DOP is evaluated on tree structures only.Comment: 8 page

arXiv.org e-Print Archive

CiteSeerX

International Migration, Integration and Social Cohesion online publications

Iterated learning and grounding: from holistic to compositional languages

Author: Vogt Paul
Publication venue
Publication date: 01/01/2003
Field of study

This paper presents a new computational model for studying the origins and evolution of compositional languages grounded through the interaction between agents and their environment. The model is based on previous work on adaptive grounding of lexicons and the iterated learning model. Although the model is still in a developmental phase, the first results show that a compositional language can emerge in which the structure reflects regularities present in the population's environment

CiteSeerX

CogPrints Cognitive Sciences Eprint Archive

Tilburg University Repository

Treo: Textual Syntax for Reo Connectors

Author: Arbab Farhad
Dokter Kasper
Publication venue: 'Open Publishing Association'
Publication date: 01/01/2018
Field of study

Reo is an interaction-centric model of concurrency for compositional specification of communication and coordination protocols. Formal verification tools exist to ensure correctness and compliance of protocols specified in Reo, which can readily be (re)used in different applications, or composed into more complex protocols. Recent benchmarks show that compiling such high-level Reo specifications produces executable code that can compete with or even beat the performance of hand-crafted programs written in languages such as C or Java using conventional concurrency constructs. The original declarative graphical syntax of Reo does not support intuitive constructs for parameter passing, iteration, recursion, or conditional specification. This shortcoming hinders Reo's uptake in large-scale practical applications. Although a number of Reo-inspired syntax alternatives have appeared in the past, none of them follows the primary design principles of Reo: a) declarative specification; b) all channel types and their sorts are user-defined; and c) channels compose via shared nodes. In this paper, we offer a textual syntax for Reo that respects these principles and supports flexible parameter passing, iteration, recursion, and conditional specification. In on-going work, we use this textual syntax to compile Reo into target languages such as Java, Promela, and Maude.Comment: In Proceedings MeTRiD 2018, arXiv:1806.0933

arXiv.org e-Print Archive

CWI's Institutional Repository

Leiden University Scholary Publications

GF-DOP: grammatical feature data-oriented parsing

Author: Finn Ríona
Hearne Mary
van Genabith Josef
Way Andy
Publication venue: CSLI Publications
Publication date: 01/01/2006
Field of study

This paper proposes an extension of Tree-DOP which approximates the LFG-DOP model. GF-DOP combines the robustness of the DOP model with some of the linguistic competence of LFG. LFG c-structure trees are augmented with LFG functional information, with the aim of (i) generating more informative parses than Tree-DOP; (ii) improving overall parse ranking by modelling grammatical features; and (iii) avoiding the inconsistent probability models of LFG-DOP. In a number of experiments on the HomeCentre corpus, we report on which (groups of) features most heavily influence parse quality, both positively and negatively

Irish Universities

DCU Online Research Access Service