4 research outputs found

    Application of Systematic Data Mining for Prediction of Biological Quality Indices

    No full text
    Data mining is not merely the application of an algorithm to a data set; rather, it is a systematic approach that is essential if useful and meaningful patterns are to be obtained from data. This paper shows how systematic data mining can help to simplify the initial assessment of the quality of marine habitats in the western Baltic Sea. The Benthic Quality Index (BQI) was introduced within the European Union Water Framework Directive to assess the quality of marine habitats. The index is based on a sensitivity/tolerance classification and quantitative information on the composition of soft-bottom macrofauna. The calculation of the index is based on the exact designation of the found tax
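The abstract describes the BQI as an abundance-weighted combination of sensitivity/tolerance values. A minimal sketch of such a calculation is below; the function, the species names, and all numbers are illustrative assumptions, not the paper's data or code, and the formula follows the commonly cited BQI form (abundance-weighted sensitivity scaled by log-transformed species richness) rather than any variant the paper itself may use.

```python
import math

def benthic_quality_index(abundances, sensitivities):
    """Hypothetical BQI-style index: abundance-weighted taxon sensitivity,
    scaled by log10(species richness + 1).

    abundances:    taxon -> individual count at a station
    sensitivities: taxon -> sensitivity/tolerance value (e.g. ES50-type)
    """
    total = sum(abundances.values())
    if total == 0:
        return 0.0
    # Weight each taxon's sensitivity by its relative abundance.
    weighted = sum(
        (count / total) * sensitivities[taxon]
        for taxon, count in abundances.items()
    )
    richness = sum(1 for count in abundances.values() if count > 0)
    return weighted * math.log10(richness + 1)

# Illustrative station sample (invented numbers).
abund = {"Scoloplos armiger": 120, "Arctica islandica": 15, "Capitella capitata": 40}
sens = {"Scoloplos armiger": 7.5, "Arctica islandica": 9.1, "Capitella capitata": 2.0}
print(round(benthic_quality_index(abund, sens), 2))
```

The sketch also makes the paper's point concrete: the result depends entirely on the exact designation of each taxon, since a misidentified species pulls in the wrong sensitivity value.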

    Components and Aspects of an Integrated Data Management Approach

    No full text
    The Kiel Data Management Infrastructure (KDMI) started from a cooperation of three large-scale projects (SFB574, SFB754 and the Cluster of Excellence The Future Ocean) and the Leibniz Institute of Marine Sciences (IFM-GEOMAR). KDMI's key features focus on data provenance, which we consider to comprise the entire workflow from field sampling or measurement through lab work to data calculation and evaluation. Managing the data of each individual project participant in this way yields data management for the entire project and warrants the reusability of (meta)data. Accordingly, scientists provide a workflow definition of their data creation procedures, resulting in their target variables. The central idea in the development of the KDMI presented here is inspired by the object-oriented programming concept, which allows one object definition (workflow) and an unlimited number of object instances (data). Each definition is created via a graphical user interface and produces XML output stored in a database using a generic data model. On creation of a data instance, the KDMI translates the definition into web forms for the scientist; the generic data model then accepts all input following the given data provenance definition. An important aspect of the implementation phase is the possibility of a successive transition from daily measurement routines, which result in single spreadsheet files with well-known points of failure and limited reusability, to a central infrastructure as a single point of truth. An interim system allows users to upload and share data files from cruises and expeditions; it relates files to metadata such as where, when, what and who. As a proof of concept we use a 'truncated workflow' to migrate a selection of marine chemical data files and their structured metadata into the generic data model. A web application will allow data extraction for selectable parameters, time and geocoordinates.
The availability of these widely used data is expected to motivate more scientists to design their own workflows for their upcoming work and resulting data. This data provenance approach in terms of human workflows has several positive side effects: (1) the scientist controls the extent and timing of data and metadata prompts through workflow definitions, while (2) consistency and completeness (mandatory information) of metadata in the resulting XML document can be checked by XML validation; (3) storage of the entire data creation process (including raw data and processing steps) provides a multidimensional quality history accessible to all researchers, in addition to the commonly applied one-dimensional quality flag system, and thus (4) improves the reusability of the data; (5) the KDMI concept focuses on bringing data management infrastructure into the daily measurement routines instead of the final data management hassle at the end of each project; and (6) the KDMI can be extended to other scientific disciplines or new scientific procedures by simply adding new workflow definitions. Data input can start from this point, while domain-specific outputs for the newly added data instances will be created by the KDM team. The KDMI follows scientists' requests for Web 2.0-like (net)working platforms, but instead of sharing private details or making friends, it is all about sharing daily scientific work and data with project partners. For this purpose we have deployed a portal server (Liferay) where individual scientists are assigned to project communities and working groups or have their own working spaces. All these features are expected to raise the acceptance of the integrated data management applications and advance scientific collaboration.
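The definition/instance mechanism described above can be sketched as follows. This is an illustrative assumption, not the actual KDMI schema: the element and attribute names (`workflow`, `parameter`, `mandatory`, `value`) are invented for the example. It shows the core idea of point (2), checking a data instance for mandatory metadata against its workflow definition.

```python
import xml.etree.ElementTree as ET

# Hypothetical workflow definition (one definition, many instances):
# each parameter declares whether it is mandatory.
definition_xml = """
<workflow name="nutrient_analysis">
  <parameter name="station" mandatory="true"/>
  <parameter name="sampling_date" mandatory="true"/>
  <parameter name="analyst" mandatory="false"/>
  <parameter name="nitrate_umol_l" mandatory="true"/>
</workflow>
"""

# A data instance created against that definition, e.g. via a generated web form.
instance_xml = """
<instance workflow="nutrient_analysis">
  <value name="station">PS-117</value>
  <value name="nitrate_umol_l">12.4</value>
</instance>
"""

def missing_mandatory(definition, instance):
    """Return the mandatory parameters the instance fails to provide."""
    d = ET.fromstring(definition)
    i = ET.fromstring(instance)
    provided = {v.get("name") for v in i.findall("value")}
    return [p.get("name") for p in d.findall("parameter")
            if p.get("mandatory") == "true" and p.get("name") not in provided]

print(missing_mandatory(definition_xml, instance_xml))  # ['sampling_date']
```

In a production setting this completeness check would more likely be expressed as an XML Schema and enforced by a validating parser, but the principle is the same: the workflow definition, not ad hoc convention, decides which metadata are mandatory.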

    A path to filled archives

    No full text
    Reluctance to deposit data is rife among researchers, despite broad agreement on the principle of data sharing. More and better information will reach hitherto empty archives if professional support is given during data creation, not in a project's final phase.