
4 research outputs found

Error Mining on Dependency Trees

Author: Gardent Claire
Narayan Shashi
Publication venue: HAL CCSD
Publication date: 01/01/2012

In recent years, error mining approaches were developed to help identify the most likely sources of parsing failures in parsing systems using handcrafted grammars and lexicons. However, the techniques they use to enumerate and count n-grams build on the sequential nature of a text corpus and do not easily extend to structured data. In this paper, we propose an algorithm for mining trees and apply it to detect the most likely sources of generation failure. We show that this tree mining algorithm permits identifying not only errors in the generation system (grammar, lexicon) but also mismatches between the structures contained in the input and the input structures expected by our generator, as well as a few idiosyncrasies and errors in the input data.
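
The general idea of scoring tree-shaped forms by how often they co-occur with failure can be sketched as follows. This is a simplified illustration and not the algorithm from the paper: it assumes that the only forms mined are parent-child label pairs, that trees are nested `(label, children)` tuples, and that suspicion is a plain failure ratio. The names `edges` and `suspicion_rates` are hypothetical.

```python
from collections import Counter


def edges(tree):
    """Yield (parent_label, child_label) pairs from a nested (label, children) tuple."""
    label, children = tree
    for child in children:
        yield (label, child[0])
        yield from edges(child)


def suspicion_rates(inputs):
    """Map each edge to the share of inputs containing it that failed.

    `inputs` is an iterable of (tree, succeeded) pairs. This ratio is an
    assumed, simplified scoring, not the paper's suspicion measure.
    """
    total, failed = Counter(), Counter()
    for tree, succeeded in inputs:
        for form in set(edges(tree)):  # count each form once per input
            total[form] += 1
            if not succeeded:
                failed[form] += 1
    return {form: failed[form] / total[form] for form in total}


# Two toy dependency trees: the first was generated successfully, the second failed.
ok_tree = ("eat", [("cat", []), ("fish", [])])
bad_tree = ("eat", [("cat", []), ("quickly", [])])
rates = suspicion_rates([(ok_tree, True), (bad_tree, False)])
```

In this toy run, the edge that appears only in the failed input receives the highest ratio, while an edge shared by both inputs receives an intermediate one.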

CiteSeerX

INRIA a CCSD electronic archive server

Automatic test suite generation for PMCFG grammars

Author: Lindström Claessen Koen
Listenmaa Inari
Publication venue: EasyChair
Publication date: 01/01/2018

We present a method for finding errors in formalized natural language grammars, by automatically and systematically generating test cases that are intended to be judged by a human oracle. The method works on a per-construction basis; given a construction from the grammar, it generates a finite but complete set of test sentences (typically tens or hundreds), where that construction is used in all possible ways. Our method is an alternative to using a corpus or a treebank, where no such completeness guarantees can be made. The method is language-independent and is implemented for the grammar formalism PMCFG, but also works for weaker grammar formalisms. We evaluate the method on a number of different grammars for different natural languages, with sizes ranging from toy examples to real-world grammars.
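
As a rough illustration of exhaustive per-construction test generation, the sketch below enumerates every combination of slot fillers for a single template. It is a toy stand-in that assumes a flat string template rather than a PMCFG grammar, and the name `generate_cases` and the example fillers are hypothetical.

```python
from itertools import product


def generate_cases(template, fillers):
    """Yield one sentence per combination of slot fillers.

    `template` is a format string with named slots; `fillers` maps each
    slot name to its candidate values. Both are assumptions made for this
    sketch and do not reflect how a PMCFG construction is represented.
    """
    slots = list(fillers)
    for combo in product(*(fillers[slot] for slot in slots)):
        yield template.format(**dict(zip(slots, combo)))


# A toy determiner-noun construction with two fillers per slot.
cases = list(generate_cases(
    "{det} {noun}",
    {"det": ["this", "these"], "noun": ["cat", "cats"]},
))
```

The resulting list deliberately includes ill-formed combinations such as "this cats", since the point of such a test set is for a human to judge which outputs the grammar should not produce.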

Chalmers Research

Error Mining with Suspicion Trees: Seeing the Forest for the Trees

Author: Gardent Claire
Narayan Shashi
Publication venue: HAL CCSD
Publication date: 08/12/2012

In recent years, error mining approaches have been proposed to identify the most likely sources of errors in symbolic parsers and generators. However, the techniques used generate a flat list of suspicious forms ranked by decreasing order of suspicion. We introduce a novel algorithm that structures the output of error mining into a tree (called a suspicion tree) highlighting the relationships between suspicious forms. We illustrate the impact of our approach by applying it to detect and analyse the most likely sources of failure in surface realisation, and we show how the suspicion tree built by our algorithm helps present the errors identified by error mining in a linguistically meaningful way, thus providing better support for error analysis. The right frontier of the tree highlights the relative importance of the main error cases, while the subtrees of a node indicate how a given error case divides into smaller, more specific cases.

CiteSeerX

INRIA a CCSD electronic archive server

Mining for Parsing Failures

Author: de Kok Daniël
van Noord Gerardus
Publication venue: College Publications
Publication date: 01/01/2017

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen