Tree-Grammar Linear Typing for unified Super-Tagging/Probabilistic Parsing Models
We integrate super-tagging, guided parsing and probabilistic parsing in the framework of an item-based LTAG chart parser. Items are based on a linear typing of trees that encodes their expanding path, starting from their anchor.
1 Introduction
Practical implementations of LTAG parsing have to face heavy lexical ambiguity and combinatorial parsing ambiguity. The main techniques to address these issues are super-tagging (Joshi and Srinivas, 1994), which consists in disambiguating elementary trees before parsing; guided parsing, like head-driven parsing (van Noord, 1994) or anchor-driven parsing (Lavelli and Satta, 1991; Lopez, 1998); and probabilistic parsing (Schabes, 1992; Carroll and Weir, 1997). All of these approaches exploit specific properties of LTAG to improve parsing efficiency, but none is totally satisfactory. Guided parsing is a very useful means to limit overgeneration of spurious items in the chart, but it does not provide a new ambiguity bound. Besides, lexical ambiguity re..
Filtering Errors and Repairing Linguistic Anomalies for Spoken Dialogue Systems
Our work addresses the integration of speech recognition and language processing for complete spoken dialogue systems. To filter ill-recognized words, we design an on-line computation of word confidence scores based on the recognizer's output hypothesis. To infer as much information as possible from the retained sequence of words, we propose a bottom-up, robust syntactico-semantic parsing method relying on a lexicalized tree grammar and on integrated repair strategies.
Hybrid Approach to Spoken Query Processing in Document Retrieval System
In the context of the THISL spoken document retrieval system, we present a hybrid approach to spoken query processing that increases recognition rates and extracts relevant information for the application. The query processing is distributed between a grammar and a language model, based on the assumption that a query can be decomposed into two relatively independent parts: the addressing form, which is parsed with a grammar, and the queried content, which is scored with a domain-specific language model. Our aim is to retrieve the content sequence, which allows us to consult the database, but also to keep information about the query formulations in order to develop an interaction between the user and the retrieval engine. This leads us to work closely with the speech recogniser and to carry out the recognition and the query analysis together.
1. Introduction
Speech recognition technology has now reached a stage where it can reasonably provide baseline systems for spoken inter..
Parsing Strategy For Spoken Language Interfaces With A Lexicalized Tree Grammar
Our work addresses the integration of speech recognition and natural language processing for spoken dialogue systems. To deal with recognition errors, we investigate two repair strategies integrated into a parsing process based on a Lexicalized Tree Grammar. The first strategy is rooted in the recognition hypothesis, the other in the linguistic expectations. We present a formal framework to express the grammar, to describe the repair strategies and to anticipate further strategies.