
    Mapping Physical Formats to Logical Models to Extract Data and Metadata: The Defuddle Parsing Engine

    Scientists, fueled by the desire for systems-level understanding of phenomena, increasingly need to share their results across multiple disciplines. Accomplishing this requires data to be annotated, contextualized, and readily searchable and translated into other formats. While these requirements can be addressed by custom programming or obviated by community standardization, neither approach has 'solved' the problem. In this paper, we describe a complementary approach: a general capability for articulating the format of arbitrary textual and binary data using a logical data model, expressed in XML-Schema, which can be used to provide annotation and context, extract metadata, and enable translation. This work is based on the draft specification for the Data Format Description Language and our open source "Defuddle" parser. We present an overview of the specification, detail the design of Defuddle, and discuss the benefits and challenges of this general approach to enabling discovery and sharing of diverse data sets.

    A 'fuzzy clustering' approach to conceptual confusion: how to classify natural ecological associations

    The concept of the marine ecological community has recently experienced renewed attention, mainly owing to a shift in conservation policies from targeting single and specific objectives (e.g. species) towards more integrated approaches. Despite the value of communities as distinct entities, e.g. for conservation purposes, there is still an ongoing debate on the nature of species associations. They are seen either as communities, cohesive units of non-randomly associated and interacting members, or as assemblages, groups of species that are randomly associated. We investigated such dualism using fuzzy logic applied to a large dataset in the German Bight (southeastern North Sea). Fuzzy logic provides the flexibility needed to describe complex patterns of natural systems. Assigning objects to more than one class, it enables the depiction of transitions, avoiding the rigid division into communities or assemblages. Therefore we identified areas with either structured or random species associations and mapped boundaries between communities or assemblages in this more natural way. We then described the impact of the chosen sampling design on the community identification. Four communities, their core areas and probability of occurrence were identified in the German Bight: AMPHIURA-FILIFORMIS, BATHYPOREIA-TELLINA, GONIADELLA-SPISULA, and PHORONIS. They were assessed by estimating overlap and compactness and supported by analysis of beta-diversity. Overall, 62% of the study area was characterized by high species turnover and instability. These areas are very relevant for conservation issues, but become undetectable when studies choose sampling designs with little information or at small spatial scales.

    The First Provenance Challenge

    The first Provenance Challenge was set up to provide a forum for the community to help understand the capabilities of different provenance systems and the expressiveness of their provenance representations. To this end, a Functional Magnetic Resonance Imaging workflow was defined, which participants had to either simulate or run in order to produce some provenance representation, from which a set of identified queries had to be implemented and executed. Sixteen teams responded to the challenge and submitted their inputs. In this paper, we present the challenge workflow and queries, and summarise the participants' contributions.

    Development of the RIOT Web Service and Information Technologies to enable mechanism reduction for HCCI simulations.

    Abstract. New approaches are being explored to facilitate multidisciplinary collaborative research of Homogeneous Charge Compression Ignition (HCCI) combustion processes. In this paper, collaborative sharing of the Range Identification and Optimization Toolkit (RIOT) and related data and models is discussed. RIOT is a developmental approach to reduce the computational cost of detailed chemical kinetic mechanisms, enabling their use in modeling kinetically controlled combustion applications such as HCCI. These approaches are being developed and piloted as a part of the Collaboratory for Multiscale Chemical Sciences (CMCS) project. The capabilities of the RIOT code are shared through a portlet in the CMCS portal that allows easy specification and processing of RIOT inputs, remote execution of RIOT, tracking of data pedigree, and translation of RIOT outputs to a table view and to a commonly used mechanism format.
    Introduction. The urgent need for high-efficiency, low-emission energy utilization technologies for transportation, power generation, and manufacturing processes presents difficult challenges to the combustion research community. The needed predictive understanding requires systematic knowledge across the full range of physical scales involved in combustion processes, from the properties and interactions of individual molecules to the dynamics and products of turbulent multi-phase reacting flows. Innovative experimental techniques and computational approaches are revolutionizing the rate at which chemical science research can produce the new information necessary to advance our combustion knowledge. But the increased volume and complexity of this information often make it even more difficult to derive the systems-level knowledge we need. Combustion researchers have responded by forming interdisciplinary communities intent on sharing information and coordinating research priorities. Such efforts face many barriers, however, including a lack of data accessibility and interoperability, missing metadata and pedigree information, a lack of efficient approaches for sharing data and analysis tools, and the challenges of working together across geography, disciplines, and a very diverse spectrum of applications and funding. This challenge is especially difficult for those developing, sharing and/or using detailed chemical models of combustion to treat the oxidation of practical fuels. This is a very complex problem, and the development of new chemistry models requires a series of steps that involve acquiring and keeping track of a large amount of data and its pedigree. Also, this data is developed using a diverse range of codes and experiments spanning ab initio chemistry codes, laboratory kinetics and flame experiments, all the way to reacting flow simulations on massively parallel computers. Each of these processes typically requires different data formats, and often the data and/or analysis codes are only accessible by personally contacting the creator. Chemical models are usually shared in a legacy file format, such as Chemki

    Genes in the Ureteric Budding Pathway: Association Study on Vesico-Ureteral Reflux Patients

    Vesico-ureteral reflux (VUR) is the retrograde passage of urine from the bladder to the urinary tract and causes 8.5% of end-stage renal disease in children. It is a complex genetic developmental disorder, in which ectopic embryonal ureteric budding is implicated in the pathogenesis. VUR is part of the spectrum of Congenital Anomalies of the Kidney and Urinary Tract (CAKUT). We performed an extensive association study for primary VUR using a two-stage, case-control design, investigating 44 candidate genes in the ureteric budding pathway in 409 Dutch VUR patients. The 44 genes were selected from the literature and a set of 567 single nucleotide polymorphisms (SNPs) capturing their genetic variation was genotyped in 207 cases and 554 controls. The 14 SNPs with p<0.005 were included in a follow-up study in 202 cases and 892 controls. Of the total cohort, ~50% showed a clear-cut primary VUR phenotype and ~25% had both a duplex collecting system and VUR. We also looked for association in these two extreme phenotype groups. None of the SNPs reached a significant p-value. Common genetic variants in four genes (GREM1, EYA1, ROBO2 and UPK3A) show a trend towards association with the development of primary VUR (GREM1, EYA1, ROBO2) or duplex collecting system (EYA1 and UPK3A). SNPs in three genes (TGFB1, GNB3 and VEGFA) have been shown to be associated with VUR in other populations. Only the result of rs1800469 in TGFB1 hinted at association in our study. This is the first extensive study of common variants in the genes of the ureteric budding pathway and the genetic susceptibility to primary VUR.