1,218 research outputs found
ATLAS: A flexible and extensible architecture for linguistic annotation
We describe a formal model for annotating linguistic artifacts, from which we
derive an application programming interface (API) to a suite of tools for
manipulating these annotations. The abstract logical model provides for a range
of storage formats and promotes the reuse of tools that interact through this
API. We focus first on ``Annotation Graphs,'' a graph model for annotations on
linear signals (such as text and speech) indexed by intervals, for which
efficient database storage and querying techniques are applicable. We note how
a wide range of existing annotated corpora can be mapped to this annotation
graph model. This model is then generalized to encompass a wider variety of
linguistic ``signals,'' including both naturally occurring phenomena (as
recorded in images, video, multi-modal interactions, etc.), as well as the
derived resources that are increasingly important to the engineering of natural
language processing systems (such as word lists, dictionaries, aligned
bilingual corpora, etc.). We conclude with a review of the current efforts
towards implementing key pieces of this architecture. Comment: 8 pages, 9 figures
Multilingual Text to Speech in embedded systems using RC8660
Most multilingual Text to Speech (TTS) systems are software applications that allow people with visual impairments or reading disabilities to listen to written material on a computer. This paper describes an approach to building a multilingual TTS and embedding it in a portable, low-cost, standalone embedded system for accessing and reading electronic documents, particularly in developing countries. Several TTS systems, such as Doubletalk, DECtalk, and Dolphin, are available on the market, as are products that use TTS, such as Talking OCR, Bill Reader, and Intel Reader, but these are not affordable or multilingual. For this system, the OMAP3530, an application processor board, is considered as the hardware platform to process the language-independent parts of the application, and the RC8660 is used as an integrated TTS processor.
The "Tiepstem" : an experimental Dutch keyboard-to-speech system for the speech impaired
An experimental Dutch keyboard-to-speech system has been developed to explore the possibilities and limitations of Dutch speech synthesis in a communication aid for the speech impaired. The system uses diphones and a formant synthesizer chip for speech synthesis. Input to the system is in pseudo-phonetic notation. Intonation contours using a declination line and various rises and falls are generated from an input consisting of punctuation and accent marks. The hardware design has resulted in a small, portable, battery-powered device. A short evaluation with users has been carried out, which has shown possibilities for such a device but has also indicated some problems with the current pseudo-phonetic input.
Validation of Coding Schemes and Coding Workbench
This report presents the methodology and results of the validation of the MATE best practice coding schemes and the MATE workbench. The validation phase covered the period from September 1999 to February 2000, and involved project partners as well as Advisory Panel members who kindly volunteered to act as external evaluators. The first part of the report focuses on the evaluation of the theoretical work in MATE, while the second part concentrates on the workbench. In both cases, a questionnaire has been used as a core tool to obtain feedback from evaluators. A major problem has been the short time available for evaluation, which has meant that less feedback than originally expected could be obtained. Evaluation of MATE results will continue after the end of the project.
META-NORD: Towards Sharing of Language Resources in Nordic and Baltic Countries
This paper introduces the META-NORD project, which develops the Nordic and Baltic part of the European open language resource infrastructure. META-NORD works on assembling, linking across languages, and making widely available the basic language resources used by developers, professionals and researchers to build specific products and applications. The goals of the project, the overall approach, and the specific focus lines on wordnets, terminology resources and treebanks are described. Moreover, results achieved in the first five months of the project, i.e. language whitepapers, metadata specification and IPR, are presented. Peer reviewed
An Intelligent Text Extraction and Navigation System
We present sppc, a high-performance system for intelligent text extraction and navigation from German free text documents. The main purpose of sppc is to extract as much linguistic structure as possible for performing domain-specific processing. sppc consists of a set of domain-independent shallow core components which are realized by means of cascaded weighted finite state machines and generic dynamic tries. All extracted information is represented uniformly in one data structure (called the text chart) in a highly compact and linked form in order to support indexing and navigation through the set of solutions. Germa