Introduction to Library Trends 47 (3) Winter 1999: Folkloristic Approaches in Library and Information Science

Author: Hearne Betsy
Publication venue: Graduate School of Library and Information Science. University of Illinois at Urbana-Champaign
Publication date: 01/01/1999

published or submitted for publication

Illinois Digital Environment for Access to Learning and Scholarship Repository

Disambiguation strategies for data-oriented translation

Author: Hearne Mary
Way Andy
Publication date: 01/01/2006

The Data-Oriented Translation (DOT) model, originally proposed in (Poutsma, 1998, 2003) and based on Data-Oriented Parsing (DOP) (e.g. Bod, Scha, & Sima'an, 2003), is best described as a hybrid model of translation, as it combines examples, linguistic information and a statistical translation model. Although theoretically interesting, it inherits the computational complexity associated with DOP. In this paper, we focus on one computational challenge for this model: efficiently selecting the 'best' translation to output. We present four different disambiguation strategies in terms of how they are implemented in our DOT system, along with experiments which investigate how they compare in terms of accuracy and efficiency.

CiteSeerX

Irish Universities

DCU Online Research Access Service

Seeing the wood for the trees: data-oriented translation

Author: Hearne Mary
Way Andy
Publication date: 01/01/2003

Data-Oriented Translation (DOT), which is based on Data-Oriented Parsing (DOP), comprises an experience-based approach to translation, where new translations are derived with reference to grammatical analyses of previous translations. Previous DOT experiments [Poutsma, 1998, Poutsma, 2000a, Poutsma, 2000b] were small in scale because important advances in DOP technology were not incorporated into the translation model. Despite this, related work [Way, 1999, Way, 2003a, Way, 2003b] reports that DOT models are viable in that solutions to ‘hard’ translation cases are readily available. However, it has not been shown to date that DOT models scale to larger datasets. In this work, we describe a novel DOT system, inspired by recent advances in DOP parsing technology. We test our system on larger, more complex corpora than have been used heretofore, and present both automatic and human evaluations which show that high quality translations can be achieved at reasonable speeds

CiteSeerX

DCU Online Research Access Service

Structured parameter estimation for LFG-DOP using Backoff

Author: Hearne Mary
Sima'an Khalil
Publication date: 01/01/2003

Despite its state-of-the-art performance, the Data-Oriented Parsing (DOP) model has been shown to suffer from biased parameter estimation, and the good performance seems more the result of ad hoc adjustments than correct probabilistic generalization over the data. In recent work, we developed a new estimation procedure, called Backoff Estimation, for DOP models that are based on Phrase-Structure annotations (so-called Tree-DOP models). Backoff Estimation deviates from earlier methods in that it treats the model parameters as a highly structured space of correlated events (backoffs), rather than a set of disjoint events. In this paper we show that the problem of biased estimates also holds for DOP models that are based on Lexical-Functional Grammar annotations (i.e. LFG-DOP), and that the LFG-DOP parameters also constitute a hierarchically structured space. Subsequently, we adapt the Backoff Estimation algorithm from Tree-DOP to LFG-DOP models. Backoff Estimation turns out to be a natural solution to some of the specific problems of robust parsing under LFG-DOP.

Irish Universities

DCU Online Research Access Service

Data-oriented parsing and the Penn Chinese treebank

Author: Hearne Mary
Way Andy
Publication date: 01/01/2004

We present an investigation into parsing the Penn Chinese Treebank using a Data-Oriented Parsing (DOP) approach. DOP comprises an experience-based approach to natural language parsing. Most published research in the DOP framework uses PStrees as its representation schema. Drawbacks of the DOP approach centre around issues of efficiency. We incorporate recent advances in DOP parsing techniques into a novel DOP parser which generates a compact representation of all subtrees which can be derived from any full parse tree. We compare our work to previous work on parsing the Penn Chinese Treebank, and provide both a quantitative and qualitative evaluation. While our results in terms of Precision and Recall are slightly below those published in related research, our approach requires no manual encoding of head rules, nor is a development phase per se necessary. We also note that certain constructions which were problematic in this previous work can be handled correctly by our DOP parser. Finally, we observe that the ‘DOP Hypothesis’ is confirmed for parsing the Penn Chinese Treebank

CiteSeerX

Irish Universities

DCU Online Research Access Service

Parallel Treebanks in Phrase-Based Statistical Machine Translation

Author: Hearne Mary
Tinsley John
Way Andy
Publication date: 01/01/2009

Given much recent discussion and the shift in focus of the field, it is becoming apparent that the incorporation of syntax is the way forward for the current state-of-the-art in machine translation (MT). Parallel treebanks are a relatively recent innovation and appear to be ideal candidates for MT training material. However, until recently there has been no other means to build them than by hand. In this paper, we describe how we make use of new tools to automatically build a large parallel treebank and extract a set of linguistically motivated phrase pairs from it. We show that adding these phrase pairs to the translation model of a baseline phrase-based statistical MT (PBSMT) system leads to significant improvements in translation quality. We describe further experiments on incorporating parallel treebank information into PBSMT, such as word alignments. We investigate the conditions under which the incorporation of parallel treebank data performs optimally. Finally, we discuss the potential of parallel treebanks in other paradigms of MT

CiteSeerX

Irish Universities

DCU Online Research Access Service

Comparing constituency and dependency representations for SMT phrase-extraction

Author: Hearne Mary
Ozdowska Sylwia
Tinsley John
Publication date: 01/01/2008

We consider the value of replacing and/or combining string-based methods with syntax-based methods for phrase-based statistical machine translation (PBSMT), and we also consider the relative merits of using constituency-annotated vs. dependency-annotated training data. We automatically derive two subtree-aligned treebanks, dependency-based and constituency-based, from a parallel English–French corpus and extract syntactically motivated word- and phrase-pairs. We automatically measure PBSMT quality. The results show that combining string-based and syntax-based word- and phrase-pairs can improve translation quality irrespective of the type of syntactic annotation. Furthermore, using dependency annotation yields greater translation quality than constituency annotation for PBSMT.

Irish Universities

DCU Online Research Access Service

Stated Preferences for Ecotourism Alternatives On the Standing Rock Sioux Indian Reservation

Author: Hearne Robert R.
Tuscherer Sheldon

Despite favorable locations and the potential for economic development, Native American tribes have not developed their ecotourism markets substantially. This paper presents a choice experiment analysis of potential tourist and local resident preferences for alternative ecotourism development scenarios for the Standing Rock Sioux Indian Reservation. The choice experiment elicitation featured attributes of both cultural and nature-based tourist attractions. Survey results demonstrated that visitors interviewed at powwows had significantly different preferences from those interviewed at local tourist attractions. Results from all samples showed positive preferences towards an amphitheater, a nature trail, and a bison meal, and no preference toward an ATV trail. Non-powwow tourists had significant willingness to pay for a number of potential attractions, including nature trails, a road through the bison pasture, and an interpretive center with an amphitheater show.

Keywords: choice experiments, ecotourism, Native Americans, Standing Rock Sioux, Resource/Energy Economics and Policy

Research Papers in Economics

THE USE OF CHOICE EXPERIMENTS TO ANALYZE CONSUMER PREFERENCES FOR ORGANIC PRODUCE IN COSTA RICA

Author: Hearne Robert R.
Volcan Mirel

Choice Experiments are used to elicit Costa Rican consumer preferences for different attributes of organic and conventional vegetables in a hypothetical market. Focus groups identified a primary concern with food safety and a secondary interest in the environmental impact of production practices. Two alternative national certification seals were proposed: 1) a "Blue Seal" certifying the Department of Public Health's approval for food safety; and 2) a "Green Seal" certifying the Ministry of Agriculture's approval for environmentally sound production practices. Three other attributes were selected: "Appearance", "Size", and "Price". These attributes, together with the proposed labels, were presented in different combinations to a sample of 432 Costa Rican consumers at ten supermarkets located in the urban Central Valley. The results of the multinomial logit model demonstrate that the attributes "Appearance" and "Price" have the strongest influence over the probability of choosing alternative scenarios. Also, there was a significant preference for the "Blue Seal" and for the "Blue Seal" and "Green Seal" combined. The socioeconomic variables turned out not to be significant in consumer choice. The results show a marginal willingness to pay (MWTP) of 20% for the "Blue Seal" certifying healthy produce, and an additional 19% for the "Green Seal". The favorable acceptance of the certification seals on the part of the Costa Rican consumer can imply a large internal market for organic and ecologically healthy produce.

Keywords: Consumer/Household Economics

Research Papers in Economics

Water Markets in Mexico: Opportunities and Constraints

Author: Hearne Robert R.
Trava Jose L.

In 1992, the Government of Mexico initiated a new national water law which decentralised water resources management and allowed the market transfer of water-use concessions between individual irrigators. These reforms were expected to improve water resources management through greater user participation in irrigation management, as well as to increase irrigators' incentives to improve water-use efficiency. At the time of its proposal, the 1992 Federal Water Law was considered to be the first step in the establishment of limited water markets. This paper addresses the opportunities and constraints to improved water resource use and allocation through the market incentives that result from transferable water-use permits. The paper reviews water allocation institutions in Mexico and provides case studies of water allocation and decision-making.

Keywords: Resource/Energy Economics and Policy

Research Papers in Economics