Improving the translation environment for professional translators
When using computer-aided translation systems in a typical, professional translation workflow, there are several stages at which there is room for improvement. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, both from a human-computer interaction point of view and from a purely technological side.
This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human-computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project.
A derivational rephrasing experiment for question answering
In Knowledge Management, variations in information expressions have proven a
real challenge. In particular, classical semantic relations (e.g. synonymy) do
not connect words with different parts-of-speech. The method proposed tries to
address this issue. It consists in building a derivational resource from a
morphological derivation tool together with derivational guidelines from a
dictionary in order to store only correct derivatives. This resource, combined
with a syntactic parser, a semantic disambiguator and some derivational
patterns, helps to reformulate an original sentence while keeping the initial
meaning in a convincing manner. This approach has been evaluated in three
different ways: the precision of the derivatives produced from a lemma; its
ability to provide well-formed reformulations from an original sentence,
preserving the initial meaning; and its impact on the results of a real
task, i.e. question answering. The evaluation of this approach through a
question answering system shows the pros and cons of this system, while
foreshadowing some interesting future developments.
Dutch hypernym detection : does decompounding help?
This research presents experiments carried out to improve the precision and recall of Dutch hypernym detection. To do so, we applied a data-driven semantic relation finder that starts from a list of automatically extracted domain-specific terms from technical corpora, and generates a list of hypernym relations between these terms. As Dutch technical terms often consist of compounds written in one orthographic unit, we investigated the impact of a decompounding module on the performance of the hypernym detection system.
In addition, we also improved the precision of the system by designing filters taking into account statistical and linguistic information.
The experimental results show that both the precision and recall of the hypernym detection system improved, and that the decompounding module is especially effective for hypernym detection in Dutch.
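The abstract does not say how the decompounding module works. As a rough illustration of why splitting compounds can expose hypernym candidates, the sketch below assumes a lexicon-based splitter and relies on Dutch compounds usually being right-headed, so the head of a compound is proposed as its hypernym when that head is itself an extracted term. The function names, the linking-element list and the toy lexicon are all hypothetical and are not taken from the paper.

```python
def modifier_variants(modifier):
    """Yield the modifier itself, then versions with a linking element removed."""
    yield modifier
    for linker in ("en", "s"):  # assumed, simplified list of Dutch linking elements
        if modifier.endswith(linker) and len(modifier) > len(linker):
            yield modifier[: -len(linker)]


def split_compound(word, lexicon, min_len=3):
    """Return (modifier, head) for the split with the longest known head, else None."""
    for i in range(min_len, len(word) - min_len + 1):
        modifier, head = word[:i], word[i:]
        if head in lexicon and any(m in lexicon for m in modifier_variants(modifier)):
            return modifier, head
    return None


def hypernym_candidates(terms, lexicon):
    """Propose (hyponym, hypernym) pairs where the compound's head is also a term."""
    term_set = set(terms)
    pairs = []
    for term in terms:
        split = split_compound(term, lexicon)
        if split is not None and split[1] in term_set:
            pairs.append((term, split[1]))
    return pairs


# Toy data, invented for illustration only.
lexicon = {"keuken", "tafel", "stoel", "station", "gebouw"}
terms = ["tafel", "keukentafel", "stoel"]
print(hypernym_candidates(terms, lexicon))
```

A real system would need a much larger lexicon and would combine such candidates with the statistical and linguistic filters the abstract mentions.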
Morphological Disambiguation by Voting Constraints
We present a constraint-based morphological disambiguation system in which
individual constraints vote on matching morphological parses, and
disambiguation of all the tokens in a sentence is performed at the end by
selecting parses that receive the highest votes. This constraint application
paradigm makes the outcome of the disambiguation independent of the rule
sequence, and hence relieves the rule developer from worrying about potentially
conflicting rule sequencing. Our results for disambiguating Turkish indicate
that using about 500 constraint rules and some additional simple statistics, we
can attain a recall of 95-96% and a precision of 94-95% with about 1.01 parses
per token. Our system is implemented in Prolog and we are currently
investigating an efficient implementation based on finite state transducers.
Comment: 8 pages, LaTeX source. To appear in Proceedings of ACL/EACL'97.
Compressed postscript also available as
ftp://ftp.cs.bilkent.edu.tr/pub/ko/acl97.ps.
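The voting idea in this abstract can be sketched as follows. This is a simplified illustration, not the authors' Prolog system: it assumes each constraint is a (weight, matcher) pair that votes for individual candidate parses, whereas the paper's rule format is not given in the abstract. The tags and constraints are invented toy examples. Because all votes are tallied before any parse is selected, the result does not depend on the order of the constraints.

```python
def disambiguate(sentence, constraints):
    """Keep, for every token, the candidate parses with the highest vote total.

    sentence:    list of tokens, each a list of candidate parses (strings here).
    constraints: list of (weight, matcher); matcher(sentence, i, parse) -> bool.
    """
    result = []
    for i, parses in enumerate(sentence):
        votes = {parse: 0 for parse in parses}
        # Every constraint votes; nothing is discarded until all votes are in.
        for weight, matches in constraints:
            for parse in parses:
                if matches(sentence, i, parse):
                    votes[parse] += weight
        best = max(votes.values())
        result.append([parse for parse in parses if votes[parse] == best])
    return result


# Hypothetical constraints over toy tags.
def noun_after_determiner(sentence, i, parse):
    return i > 0 and sentence[i - 1] == ["Det"] and parse == "Noun"


def prefer_verb(sentence, i, parse):
    return parse == "Verb"


constraints = [(2, noun_after_determiner), (1, prefer_verb)]
sentence = [["Det"], ["Noun", "Verb"]]  # second token is ambiguous

print(disambiguate(sentence, constraints))
print(disambiguate(sentence, list(reversed(constraints))))  # same result
```

Ties are kept rather than broken, which is one way a system like this can end up with slightly more than one parse per token on average.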
Ontologies and Information Extraction
This report argues that, even in the simplest cases, information extraction (IE) is an ontology-driven
process. It is not a mere text filtering method based on simple pattern
matching and keywords, because the extracted pieces of texts are interpreted
with respect to a predefined partial domain model. This report shows that
depending on the nature and the depth of the interpretation to be done for
extracting the information, more or less knowledge must be involved. This
report is mainly illustrated in biology, a domain in which there are critical
needs for content-based exploration of the scientific literature and which
is becoming a major application domain for IE.