82 research outputs found

    BioCaster: detecting public health rumors with a Web-based text mining system

    Summary: BioCaster is an ontology-based text mining system for detecting and tracking the distribution of infectious disease outbreaks from linguistic signals on the Web. The system continuously analyzes documents from over 1700 RSS feeds, classifies them for topical relevance, and plots them on a Google map using geocoded information. The background knowledge for bridging the gap between layman's terms and formal coding systems is contained in the freely available BioCaster ontology, which includes information in eight languages focused on the epidemiological role of pathogens, as well as geographical locations with their latitudes and longitudes. The system consists of four main stages: topic classification, named entity recognition (NER), disease/location detection, and event recognition. Higher-order event analysis is used to detect more precisely specified warning signals, which are then sent to registered users via email alerts. The system's topic recognition and entity identification are evaluated on a gold-standard corpus of annotated news articles.
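The four stages named in the summary can be illustrated with a toy sketch. This is not BioCaster's implementation: the lexicons, coordinates, function names, and keyword matching below are all hypothetical stand-ins for the ontology and trained components the summary describes.

```python
import re

# Hypothetical toy lexicons; the real BioCaster ontology is far richer.
DISEASES = {"cholera", "influenza", "dengue"}
LOCATIONS = {"bangkok": (13.75, 100.50), "hanoi": (21.03, 105.85)}
OUTBREAK_CUES = {"outbreak", "cases", "infected", "deaths"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def is_relevant(tokens):
    """Stage 1, topic classification: a keyword check stands in for a classifier."""
    return any(t in OUTBREAK_CUES for t in tokens)


def recognize_entities(tokens):
    """Stage 2, NER: label tokens by dictionary lookup (an assumption, not the real method)."""
    entities = []
    for t in tokens:
        if t in DISEASES:
            entities.append((t, "DISEASE"))
        elif t in LOCATIONS:
            entities.append((t, "LOCATION"))
    return entities


def detect_disease_location(entities):
    """Stage 3: pick the first disease and first location mentioned, if any."""
    diseases = [t for t, label in entities if label == "DISEASE"]
    places = [t for t, label in entities if label == "LOCATION"]
    return (diseases[0] if diseases else None, places[0] if places else None)


def recognize_event(text):
    """Stage 4: emit a geocoded event only when a relevant text names both."""
    tokens = tokenize(text)
    if not is_relevant(tokens):
        return None
    disease, place = detect_disease_location(recognize_entities(tokens))
    if disease is None or place is None:
        return None
    lat, lon = LOCATIONS[place]
    return {"disease": disease, "location": place, "lat": lat, "lon": lon}


event = recognize_event("Officials confirm a cholera outbreak in Hanoi.")
```

The returned dictionary carries the latitude and longitude that a map front end would need, mirroring the geocoded plotting mentioned in the summary.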

    Automatic Thai Ontology Construction and Maintenance System

    Ontologies are an essential resource for improving the performance of information processing systems in tasks such as information integration, document classification in taxonomies, information retrieval, and data cleaning in database systems. This paper proposes three methodologies for automatic Thai ontology construction and maintenance, from a technical corpus, a dictionary, and a thesaurus. For corpus-based ontology construction, a shallow parser is used for term extraction; syntactic-semantic constraints and named entity extraction are used to identify ontological relations. For dictionary-based ontology extraction, we applied a task-oriented parser to extract relational terms. Finally, we converted the Broader/Narrower relations of the domain-specific thesaurus to IS-A relations. The accuracy of the Automatic Thai Ontology Construction and Maintenance System on the agriculture corpus, dictionary, and thesaurus is 73%, 100%, and 91%, respectively. The accuracy of the organizing system is 87%.
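The thesaurus step, converting Broader/Narrower relations to IS-A relations, can be sketched as follows. This is a minimal illustration, not the paper's code: the `BT`/`NT` tags (common thesaurus abbreviations for broader term and narrower term), the tuple layout, and the sample entries are assumptions.

```python
def thesaurus_to_isa(entries):
    """Convert (term, relation, other) thesaurus entries to (child, parent) IS-A pairs.

    'BT' means `other` is a broader term than `term`;
    'NT' means `other` is a narrower term than `term`.
    """
    isa = set()
    for term, relation, other in entries:
        if relation == "BT":
            isa.add((term, other))   # term IS-A other
        elif relation == "NT":
            isa.add((other, term))   # other IS-A term
    # A set removes the duplicates that arise when both directions are listed.
    return sorted(isa)


# Hypothetical agriculture-style entries, for illustration only.
sample = [
    ("rice", "BT", "cereal"),
    ("cereal", "NT", "rice"),
    ("cereal", "NT", "maize"),
]
isa_pairs = thesaurus_to_isa(sample)
```

Because a thesaurus usually records each link from both ends, the sketch collapses the mirrored `BT` and `NT` entries into a single IS-A pair.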