278 research outputs found

    A cross-language document retrieval system based on semantic annotation

    The paper describes a cross-lingual document retrieval system for the medical domain that uses a controlled vocabulary (UMLS) to construct an XML-based intermediary representation into which both queries and documents are mapped. The system assists in the retrieval of English and German medical scientific abstracts relevant to a German query document (an electronic patient record). The modularity of the system allows for deployment in other domains, given appropriate linguistic and semantic resources.

    Kaukolu: Hub of the semantic corporate intranet

    Due to their low entry barrier, easy deployment, and simple yet powerful features, wikis have gained popularity for agile knowledge management in communities of almost all sizes. Semantic wikis strive to give entered information more structure so that the wiki’s contents can be processed automatically. This facilitates enhanced navigation and search within the wiki itself, as well as simple reuse of information in external applications or for generating different views of the same information. These properties make semantic wikis especially interesting for corporate intranet deployment, implementing the Semantic Intranet. This paper looks at Kaukolu, an open source semantic wiki prototype that is being deployed in a corporate intranet. External applications use information authored in Kaukolu, effectively forming a cluster of applications that interact and share data.

    Phonological properties of Portuguese clitics: A Declarative Approach

    It has repeatedly been noted in the literature (Crysmann, 2000a; Spencer, 1991, among others) that weak pronominals in European Portuguese present diverging evidence as to their status as lexical affixes or postlexical clitics.

    Comparative Evaluation of Techniques for Word Recognition Improvement . . .

    Character recognition results are typically post-processed by dictionary look-up methods. Still, the quality of the resulting word hypotheses remains poor. This paper describes and compares three known methods for word-level postprocessing of OCRed documents, all of which are based on purely statistical means of syntactic language modelling. The three methods are described, with particular attention to their application to word syntax. The implementations have been tested on about 90 printed business letters of varying quality. The methods were trained on newspaper texts comprising some 34 million running words. Although the test set and training set cover different fields of language, the results are quite encouraging and show the methods to be useful in general. 1: Basic Idea and Overview. After several decades of research into different approaches and the development of well-working systems within the field of optical character recognition (OCR), results of commer..