
    A framework for the management of changing biological experimentation

    There is no point expending time and effort developing a model if it is based on data that is out of date. Many models require large amounts of data from a variety of heterogeneous sources. This data is subject to frequent and unannounced changes. It may only be possible to know that data has fallen out of date by reconstructing the model with the new data, but this leads to further problems. How and when does the data change, and when does the model need to be rebuilt? At best, the model will need to be continually rebuilt in a desperate attempt to remain current. At worst, the model will be producing erroneous results. The recent advent of automated and semi-automated data-processing and analysis tools in the biological sciences has brought about a rapid expansion of publicly available data. Many problems arise in the attempt to deal with this magnitude of data; some have received more attention than others. One significant problem is that data within these publicly available databases is subject to change in an unannounced and unpredictable manner. Large amounts of complex data from multiple, heterogeneous sources are obtained and integrated using a variety of tools. These tools are also subject to frequent change, much like the biological data. Reconciling these changes, coupled with the interdisciplinary nature of in silico biological experimentation, presents a significant problem. We present the ExperimentBuilder, an application that records both the current and previous states of an experimental environment. Both the data and metadata about an experiment are recorded. The current and previous versions of each of these experimental components are maintained within the ExperimentBuilder. When any one of these components changes, the ExperimentBuilder estimates not only the impact within that specific experiment, but also traces the impact throughout the entire experimental environment. This is achieved with the use of keyword profiles, a heuristic tool for estimating the content of an experimental component. We can compare one experimental component to another regardless of their type and content and build a network of inter-component relationships for the entire environment. Ultimately, we can present the impact of an update as a complete cost to the entire environment in order to make an informed decision about whether to recalculate our results.
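
The abstract does not say how keyword profiles are built or compared, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a profile is a bag of word counts, that two components are compared by cosine similarity, and that the cost of an update is the sum of the similarities on the edges touching the changed component. All function names, the stopword list, the threshold, and the sample components are hypothetical and are not taken from the ExperimentBuilder.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "of", "and", "in", "to", "for", "is"}


def keyword_profile(text):
    """Build a keyword profile: counts of lowercase word tokens, minus stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def similarity(p, q):
    """Cosine similarity between two keyword profiles, in the range 0.0 to 1.0."""
    dot = sum(p[k] * q[k] for k in p.keys() & q.keys())
    norm = (math.sqrt(sum(v * v for v in p.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0


def build_network(components, threshold=0.2):
    """Link every pair of components whose profiles are similar enough.

    `components` maps a component name to any text describing it, so data,
    tools, and metadata can be compared regardless of their type.
    """
    profiles = {name: keyword_profile(text) for name, text in components.items()}
    names = sorted(profiles)
    edges = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            score = similarity(profiles[a], profiles[b])
            if score >= threshold:
                edges[(a, b)] = score
    return edges


def impact_of_update(changed, edges):
    """Crude cost estimate: sum of edge weights touching the changed component.

    This looks only at direct neighbours; tracing impact through the whole
    environment would need a graph traversal on top of this.
    """
    return sum(w for (a, b), w in edges.items() if changed in (a, b))


# Hypothetical experimental components, invented for illustration.
components = {
    "blast_results": "protein sequence alignment scores for yeast kinase",
    "kinase_model": "yeast kinase protein interaction model",
    "notes": "meeting agenda and lunch order",
}
edges = build_network(components)
cost = impact_of_update("blast_results", edges)
```

In this toy environment only the two kinase-related components share keywords, so an update to either one carries a non-zero cost while an update to the unrelated notes carries none. A decision rule such as "recalculate when the cost exceeds some budget" could sit on top of this, but the choice of budget is outside what the abstract describes.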

    V-Grid -- A Versioning Services Framework for the Grid

    A large variety of emerging Computational Grid applications require versioning services to support effective management of constantly changing datasets and implementations of data processing transformations. This paper presents V-Grid, a framework for generating Grid Data Services with versioning support from UML models that contain structural descriptions of the datasets and schema tuning information. The generated systems can be integrated using active rules to support dynamic composition of versioning services and large federated workspaces consisting of objects that reside in the individual systems.