A framework for the management of changing biological experimentation
There is no point expending time and effort developing a model if it is based on data that is out of date. Many models require large amounts of data from a variety of
heterogeneous sources. This data is subject to frequent and unannounced changes. It may only be possible to know that data has fallen out of date by reconstructing the
model with the new data, but this leads to further problems. How and when does the data change, and when does the model need to be rebuilt? At best, the model will need
to be continually rebuilt in a desperate attempt to remain current. At worst, the model will be producing erroneous results.
The recent advent of automated and semi-automated data-processing and analysis tools
in the biological sciences has brought about a rapid expansion of publicly available data.
Many problems arise in the attempt to deal with this magnitude of data; some have received more attention than others. One significant problem is that data within these
publicly available databases is subject to change in an unannounced and unpredictable
manner. Large amounts of complex data from multiple, heterogeneous sources are obtained and integrated using a variety of tools. These data and tools are also subject to
frequent change, much like the biological data. Reconciling these changes, coupled with
the interdisciplinary nature of in silico biological experimentation, presents a significant problem.
We present the ExperimentBuilder, an application that records both the current and previous states of an experimental environment. Both the data and metadata about
an experiment are recorded. The current and previous versions of each of these experimental components are maintained within the ExperimentBuilder. When any one
of these components changes, the ExperimentBuilder not only estimates the impact within that specific experiment but also traces the impact throughout the entire experimental environment. This is achieved with the use of keyword profiles, a heuristic tool for estimating the content of an experimental component. We can compare one
experimental component to another regardless of their type and content and build a network of inter-component relationships for the entire environment.
Ultimately, we can present the impact of an update as a complete cost to the entire
environment in order to make an informed decision about whether to recalculate our
results.
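The abstract does not say how keyword profiles are built or compared, so the following is only a minimal sketch of the general idea, not the ExperimentBuilder's actual method. It assumes a profile is a bag of word counts, that two components are related when the cosine similarity of their profiles exceeds a threshold, and that the cost of an update is the summed rebuild cost of every component reachable from the changed one. All names (`keyword_profile`, `similarity`, `build_network`, `impact_cost`), the threshold and the sample data are hypothetical.

```python
import math
import re
from collections import Counter, deque


def keyword_profile(text):
    """Count lowercase word tokens (an assumed stand-in for a keyword profile)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(p, q):
    """Cosine similarity between two profiles; 0.0 if either profile is empty."""
    dot = sum(p[k] * q[k] for k in p.keys() & q.keys())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(
        sum(v * v for v in q.values())
    )
    return dot / norm if norm else 0.0


def build_network(components, threshold=0.2):
    """Link every pair of components whose profile similarity meets the threshold."""
    profiles = {name: keyword_profile(text) for name, text in components.items()}
    names = sorted(profiles)
    network = {name: {} for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            s = similarity(profiles[a], profiles[b])
            if s >= threshold:
                network[a][b] = s
                network[b][a] = s
    return network


def impact_cost(network, changed, rebuild_cost):
    """Return (total cost, affected set) for everything reachable from `changed`."""
    seen = {changed}
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for neighbour in network[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return sum(rebuild_cost[name] for name in seen), seen


# Hypothetical environment: component name -> descriptive text or metadata.
components = {
    "sequence_db": "protein sequence database yeast kinase",
    "alignment_tool": "protein sequence alignment kinase scoring",
    "weather_notes": "rainfall temperature log",
}
# Hypothetical rebuild costs in arbitrary units.
rebuild_cost = {"sequence_db": 1.0, "alignment_tool": 4.0, "weather_notes": 2.0}

network = build_network(components)
total, affected = impact_cost(network, "sequence_db", rebuild_cost)
```

Because the profiles are plain token counts, a dataset, a tool description and a result file can all be compared in the same way, which mirrors the claim that components can be compared regardless of type. A real system would likely weight terms and attenuate impact with distance in the network; this sketch treats every reachable component as fully affected.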
V-Grid -- A Versioning Services Framework for the Grid
A large variety of emerging Computational Grid applications require versioning services to support effective management of constantly changing datasets and implementations of data processing transformations. This paper presents V-Grid, a framework for generating Grid Data Services with versioning support from UML models that contain structural descriptions of the datasets and schema tuning information. The generated systems can be integrated using active rules to support dynamic composition of versioning services and large federated workspaces consisting of objects that reside in the individual systems.