Unity in diversity : integrating differing linguistic data in TUSNELDA
This paper describes the creation and preparation of TUSNELDA, a collection of corpus data built for linguistic research. This collection contains a number of linguistically annotated corpora which differ in various aspects such as language, text sorts / data types, encoded annotation levels, and linguistic theories underlying the annotation. The paper focuses on this variation on the one hand and, on the other hand, on how these heterogeneous data are integrated into one resource.
Natural language processing
Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST Chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems - text summarization, information extraction, information retrieval, etc., including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.
Software Infrastructure for Natural Language Processing
We classify and review current approaches to software infrastructure for research, development and delivery of NLP systems. The task is motivated by a discussion of current trends in the field of NLP and Language Engineering. We describe a system called GATE (a General Architecture for Text Engineering) that provides a software infrastructure on top of which heterogeneous NLP processing modules may be evaluated and refined individually, or may be combined into larger application systems. GATE aims to support both researchers and developers working on component technologies (e.g. parsing, tagging, morphological analysis) and those working on developing end-user applications (e.g. information extraction, text summarisation, document generation, machine translation, and second language learning). GATE promotes reuse of component technology, permits specialisation and collaboration in large-scale projects, and allows for the comparison and evaluation of alternative technologies. The first release of GATE is now available - see http://www.dcs.shef.ac.uk/research/groups/nlp/gate/
Comment: LaTeX, uses aclap.sty, 8 page
Localisation and linguistic anomalies
Interactive systems may seek to accommodate users whose first language is not English. Usually, this entails a focus on translation and related features of localisation. While such motivation is worthy, the results are often less than ideal. In raising awareness of the shortcomings of localisation, we hope to improve the prospects for successful second-language support. To this end, the present paper describes three varieties of linguistic irregularity that we have encountered in localised systems and suggests that these anomalies are direct results of localisation. This underlines the need for better end-user guidance in managing local language resources and supports our view that complementary local resources may hold the key to second-language user support.
Platform Relative Sensor Abstractions across Mobile Robots using Computer Vision and Sensor Integration
Uniform sensor management and abstraction across different robot platforms is a difficult task due to the sheer diversity of sensing devices. However, because these sensors can be grouped into categories that in essence provide the same information, we can capture their similarities and create abstractions. An example would be distance data measured by an assortment of range sensors, or alternatively extracted from a camera using image processing. This paper describes how software components make it possible to construct uniform high-level abstractions of sensor information across various robots in a way that supports the portability of common code that uses these abstractions (e.g. obstacle avoidance, wall following). We demonstrate our abstractions on a number of robots using different configurations of range sensors and cameras.
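The idea in this abstract, that different devices providing the same kind of information can sit behind one abstraction, might be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: the names `RangeSource`, `SonarRing`, `CameraRange` and `obstacle_ahead` are hypothetical, and the camera class takes precomputed depth estimates rather than performing any image processing.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class RangeSource(ABC):
    """Hypothetical common abstraction: anything that reports distances in metres."""

    @abstractmethod
    def distances(self) -> List[float]:
        """Return the current distance readings in metres."""


class SonarRing(RangeSource):
    """Hypothetical range sensor whose raw readings are in centimetres."""

    def __init__(self, readings_cm: Sequence[float]) -> None:
        self._readings_cm = list(readings_cm)

    def distances(self) -> List[float]:
        # Convert device-specific units to the shared unit (metres).
        return [r / 100.0 for r in self._readings_cm]


class CameraRange(RangeSource):
    """Hypothetical stand-in for distances extracted from camera images.

    Assumption: depth estimates (in metres) were already computed elsewhere.
    """

    def __init__(self, depth_m: Sequence[float]) -> None:
        self._depth_m = list(depth_m)

    def distances(self) -> List[float]:
        return list(self._depth_m)


def obstacle_ahead(source: RangeSource, threshold_m: float = 0.5) -> bool:
    """Portable behaviour code: depends only on the abstraction, not the device."""
    return any(d < threshold_m for d in source.distances())
```

In this sketch, `obstacle_ahead` can be called with either a `SonarRing` or a `CameraRange`, so the same obstacle-detection code would carry over between robots with different sensor configurations.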
GATE -- an Environment to Support Research and Development in Natural Language Engineering
We describe a software environment to support research and development in natural language (NL) engineering. This environment -- GATE (General Architecture for Text Engineering) -- aims to advance research in the area of machine processing of natural languages by providing a software infrastructure on top of which heterogeneous NL component modules may be evaluated and refined individually or may be combined into larger application systems. Thus, GATE aims to support both researchers and developers working on component technologies (e.g. parsing, tagging, morphological analysis) and those working on developing end-user applications (e.g. information extraction, text summarisation, document generation, machine translation, and second language learning). GATE will promote reuse of component technology, permit specialisation and collaboration in large-scale projects, and allow for the comparison and evaluation of alternative technologies. The first release of GATE is now available.