
182 research outputs found

SemTAG, the LORIA toolbox for TAG-based Parsing and Generation

Author: Gardent Claire
Kow Eric
Parmentier Yannick
Publication venue: HAL CCSD
Publication date: 15/07/2006
Field of study

In this paper, we introduce SemTAG, a toolbox for TAG-based parsing and generation. This environment supports the development of wide-coverage grammars and differs from existing environments for TAG, such as XTAG, in that it includes a semantic dimension. SemTAG is open-source and freely available.

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Experiments on Building Language Resources for Multi-Modal Dialogue Systems

Author: Langlois David
Romary Laurent
Todirascu Amalia
Publication venue: HAL CCSD
Publication date: 01/01/2004
Field of study

International conference with proceedings and peer review. The paper presents the experiments made to adapt and synchronise the linguistic resources of the French language processing modules integrated in the MIAMM prototype, which is designed to handle multi-modal human-machine interactions. These experiments allowed us to identify a methodology for adapting multilingual resources for a dialogue system. In the paper, we describe the iterative joint process used to build linguistic resources for the two cooperative modules: speech recognition for speech modality and syntactic/semantic parsing

CiteSeerX

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Working together towards an ideal infrastructure for language learner corpora

Author: Boyd Adriane
Jansen Maarten
Lindström Tiedemann Therese
Mikelić Preradović Nives
Rosen Alexandr
Rosén Dan
Stemle Egon W.
Volodina Elena
Publication venue: Presses universitaires de Louvain
Publication date: 01/01/2019
Field of study

Peer reviewed

Helsingin yliopiston digitaalinen arkisto

Compiling and annotating a learner corpus for a morphologically rich language: CzeSL, a corpus of non-native Czech

Author: Hana Jiří
Jelínek Tomáš
Rosen Alexandr
Vidová Hladká Barbora
Škodová Svatava
Štindlová Barbora
Publication venue: Charles University in Prague, Karolinum Press
Publication date: 01/01/2020
Field of study

Learner corpora, linguistic collections documenting a language as used by learners, provide an important empirical foundation for language acquisition research and teaching practice. This book presents CzeSL, a corpus of non-native Czech, against the background of theoretical and practical issues in current learner corpus research. Languages with rich morphology and relatively free word order, including Czech, are particularly challenging for the analysis of learner language. The authors address both the complexity of learner error annotation, describing three complementary annotation schemes, and the complexity of describing non-native Czech in terms of standard linguistic categories. The book discusses in detail the practical aspects of corpus creation: the process of collection and annotation itself, the supporting tools, the resulting data, and their formats and search platforms. The chapter on use cases exemplifies the usefulness of learner corpora for teaching, language acquisition research, and computational linguistics. Any researcher developing learner corpora will surely appreciate the concluding chapter listing lessons learned and pitfalls to avoid.

CU Digital Repository

Directory of Open Access Books (DOAB)

Contributions to the Theory of Finite-State Based Grammars

Author: Yli-Jyrä Anssi
Publication venue: Helsingfors universitet
Publication date: 01/06/2005
Field of study

This dissertation is a theoretical study of finite-state based grammars used in natural language processing. The study is concerned with certain varieties of finite-state intersection grammars (FSIG) whose parsers define regular relations between surface strings and annotated surface strings. The study focuses on the following three aspects of FSIGs:
(i) Computational complexity of grammars under limiting parameters. In the study, the computational complexity in practical natural language processing is approached through performance-motivated parameters on structural complexity. Each parameter splits some grammars in the Chomsky hierarchy into an infinite set of subset approximations. When the approximations are regular, they seem to fall into the logarithmic-time hierarchy and the dot-depth hierarchy of star-free regular languages. This theoretical result is important and possibly relevant to grammar induction.
(ii) Linguistically applicable structural representations. Related to the linguistically applicable representations of syntactic entities, the study contains new bracketing schemes that cope with dependency links, left- and right-branching, crossing dependencies and spurious ambiguity. New grammar representations that resemble the Chomsky-Schützenberger representation of context-free languages are presented in the study, and they include, in particular, representations for mildly context-sensitive non-projective dependency grammars whose performance-motivated approximations are linear-time parseable.
(iii) Compilation and simplification of linguistic constraints. Efficient compilation methods for certain regular operations such as generalized restriction are presented. These include an elegant algorithm that has already been adopted as the approach in a proprietary finite-state tool. In addition to the compilation methods, an approach to on-the-fly simplifications of finite-state representations for parse forests is sketched.
These findings are tightly coupled with each other under the theme of locality. I argue that the findings help us to develop better, linguistically oriented formalisms for finite-state parsing and to develop more efficient parsers for natural language processing.
Keywords: syntactic parsing, finite-state automata, dependency grammar, first-order logic, linguistic performance, star-free regular approximations, mildly context-sensitive grammar

Helsingin yliopiston digitaalinen arkisto

Proceedings

Author: Langgård Per
Moshagen Sjur Nørstebø
Publication venue
Publication date: 19/10/2011
Field of study

Proceedings of the NODALIDA 2011 Workshop Visibility and Availability of LT Resources. Editors: Sjur Nørstebø Moshagen and Per Langgård. NEALT Proceedings Series, Vol. 13 (2011), vi+32 pp. © 2011 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/1697

DSpace at Tartu University Library

Fluent APIs in Functional Languages (full version)

Author: Gil Yossi
Roth Ori
Publication venue
Publication date: 02/11/2022
Field of study

Fluent API is an object-oriented pattern for smart and elegant embedded DSLs. As fluent API designs typically rely on function overloading, they are hard to realize in functional programming languages. We show how to write functional fluent APIs using parametric polymorphism and unification instead of overloading. Our designs support all regular and deterministic context-free DSLs and beyond.

arXiv.org e-Print Archive

CLiFF Notes: Research In Natural Language Processing at the University of Pennsylvania

Author: Faculty & Graduate Students
Publication venue: ScholarlyCommons
Publication date: 01/03/1992
Field of study

The Computational Linguistics Feedback Forum (CLiFF) is a group of students and faculty who gather once a week to discuss the members' current research. As the word feedback suggests, the group's purpose is the sharing of ideas. The group also promotes interdisciplinary contacts between researchers who share an interest in Cognitive Science. There is no single theme describing the research in Natural Language Processing at Penn. There is work done in CCG, Tree adjoining grammars, intonation, statistical methods, plan inference, instruction understanding, incremental interpretation, language acquisition, syntactic parsing, causal reasoning, free word order languages, ... and many other areas. With this in mind, rather than trying to summarize the varied work currently underway here at Penn, we suggest reading the following abstracts to see how the students and faculty themselves describe their work. Their abstracts illustrate the diversity of interests among the researchers, explain the areas of common interest, and describe some very interesting work in Cognitive Science. This report is a collection of abstracts from both faculty and graduate students in Computer Science, Psychology and Linguistics. We pride ourselves on the close working relations between these groups, as we believe that the communication among the different departments and the ongoing inter-departmental research not only improves the quality of our work, but makes much of that work possible.

ScholarlyCommons@Penn

Creating a Semantic Graph from Wikipedia

Author: Tanner Ryan
Publication venue: Digital Commons @ Trinity
Publication date: 01/04/2012
Field of study

With the continued need to organize and automate the use of data, solutions are needed to transform unstructured text into structured information. By treating dependency grammar functions as programming language functions, this process produces property maps which connect entities (people, places, events) with snippets of information. These maps are used to construct a semantic graph. By inputting Wikipedia, a large graph of information is produced representing a section of history. The resulting graph allows a user to quickly browse a topic and view the interconnections between entities across history.

Trinity University

An Estelle compiler

Author: Van Dijk Jacques
Publication venue: Department of Computer Science
Publication date: 01/01/1988
Field of study

The increasing development and use of computer networks has made it necessary to define international standards. Central to the standardization efforts is the concept of a Formal Description Technique (FDT), which is used to provide a definition medium for communication protocols and services. This document describes the design and implementation of one of the few existing compilers for one such FDT, the language "Estelle" ([ISO85], [ISO86], [ISO87]).

Cape Town University OpenUCT