304 research outputs found

    A unified framework for managing provenance information in translational research

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>A critical aspect of the NIH <it>Translational Research </it>roadmap, which seeks to accelerate the delivery of "bench-side" discoveries to patient's "bedside," is the management of the <it>provenance </it>metadata that keeps track of the origin and history of data resources as they traverse the path from the bench to the bedside and back. A comprehensive provenance framework is essential for researchers to verify the quality of data, reproduce scientific results published in peer-reviewed literature, validate scientific process, and associate trust value with data and results. Traditional approaches to provenance management have focused on only partial sections of the translational research life cycle and they do not incorporate "domain semantics", which is essential to support domain-specific querying and analysis by scientists.</p> <p>Results</p> <p>We identify a common set of challenges in managing provenance information across the <it>pre-publication </it>and <it>post-publication </it>phases of data in the translational research lifecycle. We define the semantic provenance framework (SPF), underpinned by the Provenir upper-level provenance ontology, to address these challenges in the four stages of provenance metadata:</p> <p>(a) Provenance <b>collection </b>- during data generation</p> <p>(b) Provenance <b>representation </b>- to support interoperability, reasoning, and incorporate domain semantics</p> <p>(c) Provenance <b>storage </b>and <b>propagation </b>- to allow efficient storage and seamless propagation of provenance as the data is transferred across applications</p> <p>(d) Provenance <b>query </b>- to support queries with increasing complexity over large data size and also support knowledge discovery applications</p> <p>We apply the SPF to two exemplar translational research projects, namely the Semantic Problem Solving Environment for <it>Trypanosoma cruzi </it>(<it>T.cruzi </it>SPSE) and the Biomedical Knowledge Repository (BKR) project, to demonstrate its effectiveness.</p> <p>Conclusions</p> <p>The SPF provides a unified framework to effectively manage provenance of translational research data during pre and post-publication phases. This framework is underpinned by an upper-level provenance ontology called Provenir that is extended to create domain-specific provenance ontologies to facilitate provenance interoperability, seamless propagation of provenance, automated querying, and analysis.</p

    Massively parallel reasoning in transitive relationship hierarchies

    Get PDF
    This research focuses on building a parallel knowledge representation and reasoning system for the purpose of making progress in realizing human-like intelligence. To achieve human-like intelligence, it is necessary to model human reasoning processes by programs. Knowledge in the real world is huge in size, complex in structure, and is also constantly changing even in limited domains. Unfortunately, reasoning algorithms are very often intractable, which means that they are too slow for any practical applications. One technique to deal with this problem is to design special-purpose reasoners. Many past Al systems have worked rather nicely for limited problem sizes, but attempts to extend them to realistic subsets of world knowledge have led to difficulties. Even special purpose reasoners are not immune to this impasse. In this work, to overcome this problem, we are combining special purpose reasoners with massive We have developed and implemented a massively parallel transitive closure reasoner, called Hydra, that can dynamically assimilate any transitive, binary relation and efficiently answer queries using the transitive closure of all those relations. Within certain limitations, we achieve constant-time responses for transitive closure queries. Hydra can dynamically insert new concepts or new links into a. knowledge base for realistic problem sizes. To get near human-like reasoning capabilities requires the possibility of dynamic updates of the transitive relation hierarchies. Our incremental, massively parallel, update algorithms can achieve almost constant time updates of large knowledge bases. Hydra expands the boundaries of Knowledge Representation and Reasoning in a number of different directions: (1) Hydra improves the representational power of current systems. We have developed a set-based representation for class hierarchies that makes it easy to represent class hierarchies on arrays of processors. Furthermore, we have developed and implemented two methods for mapping this set-based representation onto the processor space of a Connection Machine. These two representations, the Grid Representation and the Double Strand Representation successively improve transitive closure reasoning in terms of speed and processor utilization. (2) Hydra allows fast rerieval and dynamic update of a large knowledge base. New fast update algorithms are formulated to dynamically insert new concepts or new relations into a knowledge base of thousands of nodes. (3) Hydra provides reasoning based on mixed hierarchical representations. We have designed representational tools and massively parallel reasoning algorithms to model reasoning in combined IS-A, Part-of, and Contained-in hierarchies. (4) Hydra\u27s reasoning facilities have been successfully applied to the Medical Entities Dictionary, a large medical vocabulary of Columbia Presbyterian Medical Center. As a result of (1) - (3), Hydra is more general than many current special-purpose reasoners, faster than currently existing general-purpose reasoners, and its knowledge base can be updated dynamically

    DynamiTE: Parallel Materialization of Dynamic RDF Data

    Get PDF
    One of the main advantages of using semantically annotated data is that machines can reason on it, deriving implicit knowledge from explicit information. In this context, materializing every possible implicit derivation from a given input can be computationally expensive, especially when considering large data volumes. Most of the solutions that address this problem rely on the assumption that the information is static, i.e., that it does not change, or changes very infrequently. However, the Web is extremely dynamic: online newspapers, blogs, social networks, etc., are frequently changed so that outdated information is removed and replaced with fresh data. This demands for a materialization that is not only scalable, but also reactive to changes. In this paper, we consider the problem of incremental materialization, that is, how to update the materialized derivations when new data is added or removed. To this purpose, we consider the Ļdf RDFS fragment [12], and present a parallel system that implements a number of algorithms to quickly recalculate the derivation. In case new data is added, our system uses a parallel version of the well-known semi-naive evaluation of Datalog. In case of removals, we have implemented two algorithms, one based on previous theoretical work, and another one that is more efficient since it does not require a complete scan of the input. We have evaluated the performance using a prototype system called DynamiTE, which organizes the knowledge bases with a number of indices to facilitate the query process and exploits parallelism to improve the performance. The results show that our methods are indeed capable to recalculate the derivation in a short time, opening the door to reasoning on much more dynamic data than is currently possible. Ā© 2013 Springer-Verlag

    Hybrid reasoning on OWL RL

    Get PDF

    A Hierarchical Path View Model for Path Finding in Intelligent Transportation Systems

    Full text link
    Effective path finding has been identified as an important requirement for dynamic route guidance in Intelligent Transportation Systems (ITS). Path finding is most efficient if the all-pair (shortest) paths are precomputed because path search requires only simple lookups of the precomputed path views. Such an approach however incurs path view maintenance (computation and update) and storage costs which can be unrealistically high for large ITS networks. To lower these costs, we propose a Hierarchical Path View Model (HPVM) that partitions an ITS road map, and then creates a hierarchical structure based on the road type classification. HPVM includes a map partition algorithm for creating the hierarchy, path view maintenance algorithms, and a heuristic hierarchical path finding algorithm that searches paths by traversing the hierarchy. HPVM captures the dynamicity of traffic change patterns better than the ITS path finding systems that use the hierarchical A * approach because: (1) during path search, HPVM traverses the hierarchy by dynamically selecting the connection points between two levels based on up-to-date traffic, and (2) HPVM can reroute the high-speed road traffic through local streets if needed. In this paper, we also present experimental results used to benchmark HPVM and to compare HPVM with alternative ITS path finding approaches, using both synthetic and real ITS maps that include a large Detroit map (> 28,000 nodes). The results show that the HPVM incurs much lower costs in path view maintenance and storage than the non-hierarchical path precomputation approach, and is more efficient in path search than the traditional ITS path finding using A * or hierarchical A * algorithms.Peer Reviewedhttp://deepblue.lib.umich.edu/bitstream/2027.42/45593/1/10707_2004_Article_142477.pd

    Identification of Design Principles

    Get PDF
    This report identifies those design principles for a (possibly new) query and transformation language for the Web supporting inference that are considered essential. Based upon these design principles an initial strawman is selected. Scenarios for querying the Semantic Web illustrate the design principles and their reflection in the initial strawman, i.e., a first draft of the query language to be designed and implemented by the REWERSE working group I4
    • ā€¦
    corecore