1,218 research outputs found

    ATLAS: A flexible and extensible architecture for linguistic annotation

    We describe a formal model for annotating linguistic artifacts, from which we derive an application programming interface (API) to a suite of tools for manipulating these annotations. The abstract logical model provides for a range of storage formats and promotes the reuse of tools that interact through this API. We focus first on "Annotation Graphs," a graph model for annotations on linear signals (such as text and speech) indexed by intervals, for which efficient database storage and querying techniques are applicable. We note how a wide range of existing annotated corpora can be mapped to this annotation graph model. This model is then generalized to encompass a wider variety of linguistic "signals," including both naturally occurring phenomena (as recorded in images, video, multi-modal interactions, etc.) and the derived resources that are increasingly important to the engineering of natural language processing systems (such as word lists, dictionaries, aligned bilingual corpora, etc.). We conclude with a review of the current efforts towards implementing key pieces of this architecture. Comment: 8 pages, 9 figures
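
The interval-indexed graph model described in this abstract can be illustrated with a minimal sketch. It is not the ATLAS API: the names `Arc`, `AnnotationGraph`, `add`, and `overlapping` are hypothetical, and offsets are stored directly on each arc for brevity rather than on separate graph nodes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Arc:
    """One labeled annotation spanning an interval of a linear signal."""
    start: float  # offset into the signal (e.g. seconds or characters)
    end: float
    tier: str     # annotation type, e.g. "word" or "phone" (illustrative)
    label: str


@dataclass
class AnnotationGraph:
    """Hypothetical container for arcs; not the interface from the paper."""
    arcs: list = field(default_factory=list)

    def add(self, start, end, tier, label):
        if end < start:
            raise ValueError("interval end precedes start")
        self.arcs.append(Arc(start, end, tier, label))

    def overlapping(self, start, end, tier=None):
        """Return arcs that overlap [start, end), optionally on one tier."""
        return [
            a for a in self.arcs
            if a.start < end and start < a.end
            and (tier is None or a.tier == tier)
        ]


# Two tiers of annotation over the same stretch of speech (made-up values).
graph = AnnotationGraph()
graph.add(0.00, 0.32, "word", "she")
graph.add(0.32, 0.61, "word", "had")
graph.add(0.00, 0.15, "phone", "sh")
graph.add(0.15, 0.32, "phone", "iy")

phones_in_first_word = graph.overlapping(0.00, 0.32, tier="phone")
```

Because every arc is keyed by an interval, a query such as "which phones fall inside this word" reduces to an overlap test, which is the kind of lookup that interval-based database storage is meant to make efficient.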

    Multilingual Text to Speech in embedded systems using RC8660

    Most multilingual Text to Speech (TTS) systems are software applications that allow people with visual impairments or reading disabilities to listen to written material using a computer. This paper describes an approach to building a multilingual TTS and embedding it in a portable, low-cost, standalone embedded system for accessing and reading electronic documents, particularly in developing countries. Several TTS systems, such as Doubletalk, DECtalk, and Dolphin, are available on the market, as are products that use TTS, such as Talking OCR, Bill Reader, and Intel Reader, but these are not affordable or not multilingual. In this design, an OMAP3530 application processor board serves as the hardware platform that processes the language-independent parts of the application, and the RC8660 is used as an integrated TTS processor.

    The "Tiepstem" : an experimental Dutch keyboard-to-speech system for the speech impaired

    An experimental Dutch keyboard-to-speech system has been developed to explore the possibilities and limitations of Dutch speech synthesis in a communication aid for the speech impaired. The system uses diphones and a formant synthesizer chip for speech synthesis. Input to the system is in pseudo-phonetic notation. Intonation contours, built from a declination line and various rises and falls, are generated from input consisting of punctuation and accent marks. The hardware design has resulted in a small, portable, battery-powered device. A short evaluation with users has been carried out, which has shown possibilities for such a device but has also indicated some problems with the current pseudo-phonetic input.

    Natural language software registry (second edition)

    Validation of Coding Schemes and Coding Workbench

    This report presents the methodology and results of the validation of the MATE best practice coding schemes and the MATE workbench. The validation phase covered the period from September 1999 to February 2000 and involved project partners as well as Advisory Panel members who kindly volunteered to act as external evaluators. The first part of the report focuses on the evaluation of the theoretical work in MATE, while the second part concentrates on the workbench. In both cases, a questionnaire was used as the core tool for obtaining feedback from evaluators. A major problem has been the short time available for evaluation, which meant that less feedback than originally expected could be obtained. Evaluation of MATE results will continue after the end of the project.

    META-NORD: Towards Sharing of Language Resources in Nordic and Baltic Countries

    This paper introduces the META-NORD project, which develops the Nordic and Baltic part of the European open language resource infrastructure. META-NORD works on assembling, linking across languages, and making widely available the basic language resources used by developers, professionals and researchers to build specific products and applications. The goals of the project, its overall approach, and its specific focus lines on wordnets, terminology resources and treebanks are described. Moreover, results achieved in the first five months of the project, i.e. language whitepapers, metadata specification and IPR, are presented. Peer reviewed

    IMAGINE Final Report

    No full text

    An Intelligent Text Extraction and Navigation System

    We present sppc, a high-performance system for intelligent text extraction and navigation from German free text documents. The main purpose of sppc is to extract as much linguistic structure as possible for performing domain-specific processing. sppc consists of a set of domain-independent shallow core components which are realized by means of cascaded weighted finite state machines and generic dynamic tries. All extracted information is represented uniformly in one data structure (called the text chart) in a highly compact and linked form in order to support indexing and navigation through the set of solutions. Germa