Evaluating two methods for Treebank grammar compaction

Author: Gaizauskas R.
Hepple M.
Krotov A.
Wilks Y.
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/12/1999

Treebanks, such as the Penn Treebank, provide a basis for the automatic creation of broad coverage grammars. In the simplest case, rules can simply be ‘read off’ the parse-annotations of the corpus, producing either a simple or probabilistic context-free grammar. Such grammars, however, can be very large, presenting problems for the subsequent computational costs of parsing under the grammar. In this paper, we explore ways by which a treebank grammar can be reduced in size or ‘compacted’, which involve the use of two kinds of technique: (i) thresholding of rules by their number of occurrences; and (ii) a method of rule-parsing, which has both probabilistic and non-probabilistic variants. Our results show that by a combined use of these two techniques, a probabilistic context-free grammar can be reduced in size by 62% without any loss in parsing performance, and by 71% to give a gain in recall, but some loss in precision.
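As a rough illustration of technique (i) in the abstract above, the following sketch reads phrasal rules off a few bracketed parse trees, counts their occurrences, and drops rules seen fewer than a threshold number of times. The toy trees, the function names (`parse_tree`, `read_off_rules`, `threshold`), and the cut-off value are assumptions made for illustration; they are not taken from the paper, and the probabilistic variant is not covered.

```python
from collections import Counter

def parse_tree(text):
    """Parse a bracketed tree such as '(S (NP (DT the)))' into (label, children)."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        pos += 1                       # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])   # a word (leaf)
                pos += 1
        pos += 1                       # skip ")"
        return (label, children)

    return node()

def read_off_rules(tree, counts):
    """Count one rule (lhs, rhs) per phrasal node; lexical rules are skipped."""
    label, children = tree
    subtrees = [c for c in children if isinstance(c, tuple)]
    if subtrees:
        counts[(label, tuple(c[0] for c in subtrees))] += 1
        for c in subtrees:
            read_off_rules(c, counts)

def threshold(counts, min_count):
    """Keep only rules that occur at least min_count times."""
    return {rule: n for rule, n in counts.items() if n >= min_count}

# Hypothetical three-sentence "treebank" (not real Penn Treebank data).
treebank = [
    "(S (NP (DT the) (NN dog)) (VP (VBD barked)))",
    "(S (NP (DT a) (NN cat)) (VP (VBD slept)))",
    "(S (NP (DT the) (JJ old) (NN dog)) (VP (VBD slept)))",
]
counts = Counter()
for line in treebank:
    read_off_rules(parse_tree(line), counts)

compacted = threshold(counts, min_count=2)   # drops NP -> DT JJ NN (seen once)
```

Dividing each count by the total count for its left-hand side would turn the same table into relative-frequency rule probabilities, which is one common way to obtain a probabilistic grammar from such counts.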

Crossref

White Rose Research Online

University of Sheffield TREC-8 Q & A System

Author: Gaizauskas R.
Hepple M.
Humphreys K.
Sanderson M.
Publication venue: 'University of Aden - Faculty of Economics and Administration'
Publication date: 01/01/1999

The system entered by the University of Sheffield in the question answering track of TREC-8 is the result of coupling two existing technologies: information retrieval (IR) and information extraction (IE). In essence the approach is this: the IR system treats the question as a query and returns a set of top ranked documents or passages; the IE system uses NLP techniques to parse the question, analyse the top ranked documents or passages returned by the IR system, and instantiate a query variable in the semantic representation of the question against the semantic representation of the analysed documents or passages. Thus, while the IE system by no means attempts “full text understanding”, this approach is a relatively deep approach which attempts to work with meaning representations. Since the information retrieval systems we used were not our own (AT&T and UMass) and were used more or less “off the shelf”, this paper concentrates on describing the modifications made to our existing information extraction system to allow it to participate in the Q & A task.

White Rose Research Online

Compacting the Penn Treebank Grammar

Author: Gaizauskas Robert
Hepple Mark
Krotov Alexander
Wilks Yorick
Publication date: 01/01/1998

Treebanks, such as the Penn Treebank (PTB), offer a simple approach to obtaining a broad coverage grammar: one can simply read the grammar off the parse trees in the treebank. While such a grammar is easy to obtain, a square-root rate of growth of the rule set with corpus size suggests that the derived grammar is far from complete and that much more treebanked text would be required to obtain a complete grammar, if one exists at some limit. However, we offer an alternative explanation in terms of the underspecification of structures within the treebank. This hypothesis is explored by applying an algorithm to compact the derived grammar by eliminating redundant rules, that is, rules whose right-hand sides can be parsed by other rules. The size of the resulting compacted grammar, which is significantly less than that of the full treebank grammar, is shown to approach a limit. However, such a compacted grammar does not yield very good performance figures. A version of the compaction algorithm taking rule probabilities into account is proposed, which is argued to be more linguistically motivated. Combined with simple thresholding, this method can be used to give a 58% reduction in grammar size without significant change in parsing performance, and can produce a 69% reduction with some gain in recall, but a loss in precision. Comment: 5 pages, 2 figures
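The redundancy test described in the abstract above (a rule is redundant if its right-hand side can be parsed by other rules) can be sketched as follows. This is a minimal, non-probabilistic illustration, not the authors' implementation: the chart recogniser, the function names `derives` and `compact`, the toy grammar, and the order in which rules are tried are all assumptions, and rules with empty right-hand sides are assumed not to occur.

```python
def derives(lhs, seq, rules):
    """True if the label sequence `seq` can be parsed as `lhs` using `rules`."""
    n = len(seq)
    chart = {}   # (i, j) -> set of categories spanning seq[i:j]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            cell = {seq[i]} if length == 1 else set()
            chart[(i, j)] = cell

            def covers(rhs, k, start):
                # Can rhs[k:] be matched against seq[start:j]?
                if k == len(rhs):
                    return start == j
                remaining = len(rhs) - k - 1
                return any(
                    rhs[k] in chart[(start, mid)] and covers(rhs, k + 1, mid)
                    for mid in range(start + 1, j - remaining + 1)
                )

            # Iterate to a fixed point so chains of unary rules are handled.
            changed = True
            while changed:
                changed = False
                for rule_lhs, rhs in rules:
                    if (rule_lhs not in cell and len(rhs) <= length
                            and covers(rhs, 0, i)):
                        cell.add(rule_lhs)
                        changed = True
    return lhs in chart[(0, n)]

def compact(rules):
    """Remove each rule whose right-hand side the remaining rules can parse."""
    kept = list(rules)
    for rule in list(rules):
        others = [r for r in kept if r != rule]
        lhs, rhs = rule
        if derives(lhs, rhs, others):
            kept = others              # rule is redundant: drop it for good
    return kept

# Hypothetical toy grammar (not taken from the Penn Treebank).
grammar = [
    ("NP", ("DT", "NN")),
    ("NP", ("NP", "PP")),
    ("PP", ("IN", "NP")),
    ("NP", ("DT", "NN", "PP")),   # parsable as NP -> NP PP over NP -> DT NN
]
compacted_grammar = compact(grammar)
```

In this toy grammar only the flat rule `NP -> DT NN PP` is removed, since the other three rules together already analyse its right-hand side as an `NP`.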

arXiv.org e-Print Archive

CiteSeerX

Crossref

The Labour Relations Act and global competitiveness

Author: Hepple Bob
Publication venue: 'African Journals Online (AJOL)'
Publication date: 27/06/2016

No Abstract

AJOL - African Journals Online

Implementation of liquid culture for tuberculosis diagnosis in a remote setting: lessons learned.

Author: Cheruiyot C
Hepple P
Novoa-Cain J
Richter E
Ritmeijer K
Publication date: 01/03/2011

Although sputum smear microscopy is the primary method for tuberculosis (TB) diagnosis in low-resource settings, it has low sensitivity. The World Health Organization recommends the use of liquid culture techniques for TB diagnosis and drug susceptibility testing in low- and middle-income countries. An evaluation of samples from southern Sudan found that culture was able to detect cases of active pulmonary TB and extra-pulmonary TB missed by conventional smear microscopy. However, the long delays involved in obtaining culture results meant that they were usually not clinically useful, and high rates of non-tuberculous mycobacteria isolation made interpretation of results difficult. Improvements in diagnostic capacity and rapid speciation facilities, either on-site or through a local reference laboratory, are crucial.

MSF Field Research

The church in rural Missouri, Part 3. Clergymen in rural Missouri

Author: Hepple Lawrence Milton
Publication venue: University of Missouri, College of Agriculture, Agricultural Experiment Station
Publication date: 01/01/1958

"December 1958."

University of Missouri: MOspace

Grammar and processing of order and dependency: a categorial approach

Author: Hepple Mark
Publication venue: The University of Edinburgh

Edinburgh Research Archive

Experiments in Structure-Preserving Grammar Compaction

Author: Hepple Mark
van Genabith Josef
Publication date: 01/01/2000

Structure preserving grammar compaction (SPC) is a simple CFG compaction technique originally described in (van Genabith et al., 1999a, 1999b). It works by generalising category labels and in so doing plugs holes in the grammar. To date the method has been tested on small corpora only. In the present research we apply SPC to a large grammar extracted from the Penn Treebank and examine its effects on treebank grammar size and on rule accession rates (as an indicator of grammar completeness).

1 Introduction

Treebanks and resources compiled from treebanks are potentially very useful in NLP. Grammars extracted from treebanks, so-called treebank grammars (Charniak, 1996), can form the basis of large coverage NLP systems. Such treebank grammars, however, can suffer from several shortcomings: they commonly feature a large number of flat, highly specific rules that may be rarely used, with ensuing costs for processing (load) under the grammar.

CiteSeerX

Irish Universities

DCU Online Research Access Service

Rural social organization in Dent County, Missouri

Author: Almack Ronald B.
Hepple Lawrence Milton
Publication venue: University of Missouri, College of Agriculture, Agricultural Experiment Station
Publication date: 01/01/1950

Also available online. Digitized 2007 AES.

University of Missouri: MOspace