2,886 research outputs found

    Normalisation of imprecise temporal expressions extracted from text

    Information extraction systems and techniques have been widely used to deal with the increasing amount of unstructured data available nowadays. Time is among the different kinds of information that may be extracted from such unstructured data sources, including text documents. However, the inability to correctly identify and extract temporal information from text makes it difficult to understand how the extracted events are organised in chronological order. Furthermore, in many situations the meaning of temporal expressions (timexes) is imprecise, as in “less than 2 years” and “several weeks”, and cannot be accurately normalised, leading to interpretation errors. Although some approaches enable representing imprecise timexes, they are not designed to be applied to specific scenarios and are difficult to generalise. This paper presents a novel methodology to analyse and normalise imprecise temporal expressions by representing temporal imprecision in the form of membership functions, based on human interpretation of time in two different languages (Portuguese and English). Each resulting model is a generalisation of probability distributions in the form of trapezoidal and hexagonal fuzzy membership functions. We use an adapted F1-score to guide the choice of the best models for each kind of imprecise timex, and a weighted F1-score (F1-3D) as a complementary metric to identify relevant differences when comparing two normalisation models. We apply the proposed methodology to three distinct classes of imprecise timexes, and the resulting models give distinct insights into the way each kind of temporal expression is interpreted.
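A trapezoidal membership function of the kind the abstract describes can be sketched in a few lines. The breakpoints below (modelling "several weeks" in days) are purely illustrative assumptions, not the parameters fitted in the paper:

```python
def trapezoidal_mu(x, a, b, c, d):
    """Membership degree of x for a trapezoid with feet (a, d) and plateau [b, c]."""
    if x <= a or x >= d:
        return 0.0          # outside the support: definitely not a match
    if b <= x <= c:
        return 1.0          # on the plateau: fully compatible interpretation
    if x < b:
        return (x - a) / (b - a)  # rising edge
    return (d - x) / (d - c)      # falling edge

# Hypothetical model of "several weeks", measured in days.
several_weeks = lambda days: trapezoidal_mu(days, 7, 14, 28, 42)
```

A hexagonal membership function generalises this by splitting each sloped edge into two linear pieces, allowing a closer fit to the empirical distribution of human interpretations.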

    Endoscopic Transnasal Versus Open Transcranial Cranial Base Surgery: The Need For A Serene Assessment


    On Range Searching with Semialgebraic Sets II

    Let P be a set of n points in R^d. We present a linear-size data structure for answering range queries on P with constant-complexity semialgebraic sets as ranges, in time close to O(n^{1-1/d}). It essentially matches the performance of similar structures for simplex range searching and, for d >= 5, significantly improves earlier solutions by the first two authors obtained in 1994. This almost settles a long-standing open problem in range searching. The data structure is based on the polynomial-partitioning technique of Guth and Katz [arXiv:1011.4105], which shows that for a parameter r, 1 < r <= n, there exists a d-variate polynomial f of degree O(r^{1/d}) such that each connected component of R^d \ Z(f) contains at most n/r points of P, where Z(f) is the zero set of f. We present an efficient randomized algorithm for computing such a polynomial partition, which is of independent interest and is likely to have additional applications.
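For intuition, a semialgebraic range query asks for the points satisfying a fixed set of polynomial inequalities. The naive scan below is the O(n) baseline that the paper's data structure accelerates to time close to O(n^{1-1/d}); the unit-disk range is an illustrative example, not one from the paper:

```python
def semialgebraic_range_query(points, predicates):
    """Naive O(n) scan: report points p satisfying g(p) <= 0 for every polynomial g.
    A constant-complexity semialgebraic range is a boolean combination of a
    bounded number of such inequalities of bounded degree."""
    return [p for p in points if all(g(p) <= 0 for g in predicates)]

# Example range: the unit disk {(x, y) : x^2 + y^2 - 1 <= 0}.
pts = [(0.0, 0.0), (0.5, 0.5), (2.0, 0.0)]
inside = semialgebraic_range_query(pts, [lambda p: p[0]**2 + p[1]**2 - 1])
```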

    Data reliability assessment in a data warehouse opened on the Web

    This paper presents an ontology-driven workflow that feeds and queries a data warehouse opened on the Web. Data are extracted from data tables in Web documents. As Web documents are very heterogeneous in nature, a key issue in this workflow is the ability to assess the reliability of retrieved data. We first recall the main steps of our method to annotate and query Web data tables driven by a domain ontology. Then we propose an original method to assess Web data table reliability from a set of criteria by means of evidence theory. Finally, we show how we extend the workflow to integrate the reliability assessment step.
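Evidence theory (Dempster-Shafer) fuses mass functions from independent criteria with Dempster's rule of combination. The sketch below is a minimal generic implementation; the two "criteria" and their mass assignments are invented for illustration and are not the criteria used in the paper:

```python
def dempster_combine(m1, m2):
    """Combine two mass functions (dicts mapping frozenset focal elements
    to masses) by Dempster's rule, normalising out the conflict."""
    combined, conflict = {}, 0.0
    for A, w1 in m1.items():
        for B, w2 in m2.items():
            C = A & B
            if C:
                combined[C] = combined.get(C, 0.0) + w1 * w2
            else:
                conflict += w1 * w2  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    k = 1.0 - conflict
    return {A: w / k for A, w in combined.items()}

R = frozenset({"reliable"})
U = frozenset({"reliable", "unreliable"})       # total ignorance
m_provenance = {R: 0.6, U: 0.4}                 # hypothetical criterion 1
m_formatting = {R: 0.5, U: 0.5}                 # hypothetical criterion 2
fused = dempster_combine(m_provenance, m_formatting)
```

Combining the two sources strengthens the belief committed to "reliable" while the residual mass stays on the ignorance set.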

    Ontology-based knowledge representation and semantic search information retrieval: case study of the underutilized crops domain

    The aim of using semantic technologies in domain knowledge modeling is to introduce the semantic meaning of concepts in knowledge bases, such that they are both human-readable and machine-understandable. Due to their powerful knowledge representation formalism and associated inference mechanisms, ontology-based approaches have been increasingly adopted to formally represent domain knowledge. The primary objective of this thesis work has been to use semantic technologies in advancing knowledge-sharing of underutilized crops as a domain and to investigate the integration of underlying ontologies developed in OWL (Web Ontology Language) with augmented SWRL (Semantic Web Rule Language) rules for added expressiveness. The work further investigated generating ontologies from existing data sources and proposed a reverse-engineering approach of generating domain-specific conceptualizations through competency questions posed by prospective ontology users and domain experts. For utilization, a semantic search engine (the Onto-CropBase) has been developed to serve as a Web-based access point for the underutilized-crops ontology model. Relevant linked data in Resource Description Framework Schema (RDFS) were added for comprehensiveness in generating federated queries. While the OWL/SWRL combination offers a highly expressive ontology language for modeling knowledge domains, it is found to be lacking supplementary descriptive constructs to model complex real-life scenarios, a necessary requirement for a successful Semantic Web application. To this end, the common logic programming formalisms for extending Description Logic (DL)-based ontologies were explored, and the state of the art in SWRL expressiveness extensions was determined with a view to extending the SWRL formalism.
Subsequently, a novel fuzzy temporal extension to the Semantic Web Rule Language (FT-SWRL), which combines SWRL with fuzzy logic theories based on the valid-time temporal model, has been proposed to allow modeling imprecise temporal expressions in domain ontologies.

    Temporal detection and analysis of guideline interactions

    Background: Clinical practice guidelines (CPGs) are assuming a major role in the medical area, to guarantee the quality of medical assistance, supporting physicians with evidence-based information on interventions in the treatment of single pathologies. The treatment of patients affected by multiple diseases (comorbid patients) is one of the main challenges for modern healthcare. It requires the development of new methodologies supporting physicians in the treatment of interactions between CPGs. Several approaches have started to address this challenging problem. However, they suffer from a substantial limitation: they do not take into account the temporal dimension. Indeed, practically speaking, interactions occur in time. For instance, the effects of two actions taken from different guidelines may potentially conflict, but practical conflicts happen only if the times of execution of such actions are such that their effects overlap in time. Objectives: We aim at devising a methodology to detect and analyse interactions between CPGs that considers the temporal dimension. Methods: In this paper, we first extend our previous ontological model to deal with the fact that actions, goals, effects and interactions occur in time, and to model both qualitative and quantitative temporal constraints between them. Then, we identify different application scenarios and, for each of them, propose different types of facilities for user physicians, useful to support the temporal detection of interactions. Results: We provide a modular approach in which different Artificial Intelligence temporal reasoning techniques, based on temporal constraint propagation, are widely exploited to provide users with such facilities. We applied our methodology to two cases of comorbidities, using simplified versions of CPGs.
Conclusion: We propose an innovative approach to the detection and analysis of interactions between CPGs considering different sources of temporal information (CPGs, ontological knowledge and execution logs), which is the first one in the literature that takes into account temporal issues and accounts for different application scenarios.
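Quantitative temporal constraint propagation of the kind mentioned in the Methods section is commonly done over a simple temporal network (STN), where consistency reduces to detecting negative cycles. The sketch below uses Floyd-Warshall; the drug-administration scenario and its bounds are hypothetical, not taken from the paper's case studies:

```python
def stn_consistent(n, constraints):
    """Consistency check for a simple temporal network with n time points.
    constraints: list of (i, j, lo, hi) meaning lo <= t_j - t_i <= hi.
    Floyd-Warshall on the distance graph; a negative self-distance
    (negative cycle) means the constraint set is inconsistent."""
    INF = float("inf")
    d = [[0.0 if i == j else INF for j in range(n)] for i in range(n)]
    for i, j, lo, hi in constraints:
        d[i][j] = min(d[i][j], hi)    # edge encoding t_j - t_i <= hi
        d[j][i] = min(d[j][i], -lo)   # edge encoding t_i - t_j <= -lo
    for k in range(n):
        for a in range(n):
            for b in range(n):
                if d[a][k] + d[k][b] < d[a][b]:
                    d[a][b] = d[a][k] + d[k][b]
    return all(d[i][i] >= 0 for i in range(n))

# Hypothetical scenario: point 0 = drug A given, point 1 = drug B given,
# and B must follow A within 2 hours.
ok = stn_consistent(2, [(0, 1, 0, 2)])
bad = stn_consistent(2, [(0, 1, 3, 2)])  # lo > hi: infeasible
```

Propagating the shortest-path matrix also yields the tightest implied bounds between every pair of time points, which is what lets a system decide whether two effects can actually overlap in time.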

    EquiX---A Search and Query Language for XML

    EquiX is a search language for XML that combines the power of querying with the simplicity of searching. Requirements for such languages are discussed, and it is shown that EquiX meets the necessary criteria. Both a graphical abstract syntax and a formal concrete syntax are presented for EquiX queries. In addition, the semantics is defined and an evaluation algorithm is presented. The evaluation algorithm is polynomial under combined complexity. EquiX combines pattern matching, quantification and logical expressions to query both the data and metadata of XML documents. The result of a query in EquiX is a set of XML documents. A DTD describing the result documents is derived automatically from the query. Comment: technical report of the Hebrew University of Jerusalem, Israel.

    AAPOR Report on Big Data

    In recent years we have seen an increase in the amount of statistics in society describing different phenomena based on so-called Big Data. The term Big Data is used for a variety of data, as explained in the report, many of them characterized not just by their large volume, but also by their variety and velocity, the organic way in which they are created, and the new types of processes needed to analyze them and make inference from them. The changes in the nature of these new types of data, in their availability, and in the way they are collected and disseminated are fundamental. The change constitutes a paradigm shift for survey research. There is great potential in Big Data, but there are some fundamental challenges that have to be resolved before its full potential can be realized. In this report we give examples of different types of Big Data and their potential for survey research. We also describe the Big Data process and discuss its main challenges.