802 research outputs found

    Semantic Modeling of Analytic-based Relationships with Direct Qualification

    Full text link
    Successfully modeling state and analytics-based semantic relationships of documents enhances representation, importance, relevancy, provenience, and priority of the document. These attributes are the core elements that form the machine-based knowledge representation for documents. However, modeling document relationships that can change over time can be inelegant, limited, complex or overly burdensome for semantic technologies. In this paper, we present Direct Qualification (DQ), an approach for modeling any semantically referenced document, concept, or named graph with results from associated applied analytics. The proposed approach supplements the traditional subject-object relationships by providing a third leg to the relationship; the qualification of how and why the relationship exists. To illustrate, we show a prototype of an event-based system with a realistic use case for applying DQ to relevancy analytics of PageRank and Hyperlink-Induced Topic Search (HITS).Comment: Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015

    Tagging and parsing with cascaded Markov models : automation of corpus annotation

    Get PDF
    This thesis presents new techniques for parsing natural language. They are based on Markov Models, which are commonly used in part-of-speech tagging for sequential processing on the world level. We show that Markov Models can be successfully applied to other levels of syntactic processing. first two classification task are handled: the assignment of grammatical functions and the labeling of non-terminal nodes. Then, Markov Models are used to recognize hierarchical syntactic structures. Each layer of a structure is represented by a separate Markov Model. The output of a lower layer is passed as input to a higher layer, hence the name: Cascaded Markov Models. Instead of simple symbols, the states emit partial context-free structures. The new techniques are applied to corpus annotation and partial parsing and are evaluated using corpora of different languages and domains.Ausgehend von Markov-Modellen, die fĂŒr das Part-of-Speech-Tagging eingesetzt werden, stellt diese Arbeit Verfahren vor, die Markov-Modelle auch auf weiteren Ebenen der syntaktischen Verarbeitung erfolgreich nutzen. Dies betrifft zum einen Klassifikationen wie die Zuweisung grammatischer Funktionen und die Bestimmung von Kategorien nichtterminaler Knoten, zum anderen die Zuweisung hierarchischer, syntaktischer Strukturen durch Markov-Modelle. Letzteres geschieht durch die ReprĂ€sentation jeder Ebene einer syntaktischen Struktur durch ein eigenes Markov-Modell, was den Namen des Verfahrens prĂ€gt: Kaskadierte Markov-Modelle. Deren ZustĂ€nde geben anstelle atomarer Symbole partielle kontextfreie Strukturen aus. Diese Verfahren kommen in der Korpusannotation und dem partiellen Parsing zum Einsatz und werden anhand mehrerer Korpora evaluiert

    Enrichment of Renaissance texts with proper names

    Get PDF
    International audienceThe Renom project proposes to enrich Renaissance texts by proper names. These texts present two new challenges: great diversity due to various spellings of words; numerous XML-TEI tags to save the exact format of original edition. The task consisted to add Named Entity tags to this format tagging with generally the left context and sometimes the right context of a name. To do that, we improved the free and open source program CasSys to parse texts with Unitex graph cascades and we built dictionaries and specific cascades. The slot error rate was 6.1%. Proper Names and maps. were to allow navigating into. So, this paper deals with Named Entity Recognition in Renaissance texts

    A New Open Information Extraction System Using Sentence Difficulty Estimation

    Get PDF
    The World Wide Web has a considerable amount of information expressed using natural language. While unstructured text is often difficult for machines to understand, Open Information Extraction (OIE) is a relation-independent extraction paradigm designed to extract assertions directly from massive and heterogeneous corpora. Allocation of low-cost computational resources is a main demand for Open Relation Extraction (ORE) systems. A large number of ORE methods have been proposed recently, covering a wide range of NLP tools, from ``shallow'' (e.g., part-of-speech tagging) to ``deep'' (e.g., semantic role labeling). There is a trade-off between NLP tools depth versus efficiency (computational cost) of ORE systems. This paper describes a novel approach called Sentence Difficulty Estimator for Open Information Extraction (SDE-OIE) for automatic estimation of relation extraction difficulty by developing some difficulty classifiers. These classifiers dedicate the input sentence to an appropriate OIE extractor in order to decrease the overall computational cost. Our evaluations show that an intelligent selection of a proper depth of ORE systems has a significant improvement on the effectiveness and scalability of SDE-OIE. It avoids wasting resources and achieves almost the same performance as its constituent deep extractor in a more reasonable time

    Data-Driven Spelling Correction using Weighted Finite-State Methods

    Get PDF
    This paper presents two systems for spelling correction formulated as a sequence labeling task. One of the systems is an unstructured classifier and the other one is structured. Both systems are implemented using weighted finite-state methods. The structured system delivers stateof-the-art results on the task of tweet normalization when compared with the recent AliSeTra system introduced by Eger et al. (2016) even though the system presented in the paper is simpler than AliSeTra because it does not include a model for input segmentation. In addition to experiments on tweet normalization, we present experiments on OCR post-processing using an Early Modern Finnish corpus of OCR processed newspaper text.Peer reviewe

    Digital signal processing for the analysis of fetal breathing movements

    Get PDF

    Tagging and parsing with cascaded Markov models : automation of corpus annotation

    Get PDF
    This thesis presents new techniques for parsing natural language. They are based on Markov Models, which are commonly used in part-of-speech tagging for sequential processing on the world level. We show that Markov Models can be successfully applied to other levels of syntactic processing. first two classification task are handled: the assignment of grammatical functions and the labeling of non-terminal nodes. Then, Markov Models are used to recognize hierarchical syntactic structures. Each layer of a structure is represented by a separate Markov Model. The output of a lower layer is passed as input to a higher layer, hence the name: Cascaded Markov Models. Instead of simple symbols, the states emit partial context-free structures. The new techniques are applied to corpus annotation and partial parsing and are evaluated using corpora of different languages and domains.Ausgehend von Markov-Modellen, die fĂŒr das Part-of-Speech-Tagging eingesetzt werden, stellt diese Arbeit Verfahren vor, die Markov-Modelle auch auf weiteren Ebenen der syntaktischen Verarbeitung erfolgreich nutzen. Dies betrifft zum einen Klassifikationen wie die Zuweisung grammatischer Funktionen und die Bestimmung von Kategorien nichtterminaler Knoten, zum anderen die Zuweisung hierarchischer, syntaktischer Strukturen durch Markov-Modelle. Letzteres geschieht durch die ReprĂ€sentation jeder Ebene einer syntaktischen Struktur durch ein eigenes Markov-Modell, was den Namen des Verfahrens prĂ€gt: Kaskadierte Markov-Modelle. Deren ZustĂ€nde geben anstelle atomarer Symbole partielle kontextfreie Strukturen aus. Diese Verfahren kommen in der Korpusannotation und dem partiellen Parsing zum Einsatz und werden anhand mehrerer Korpora evaluiert
    • 

    corecore