48 research outputs found

    Analysis and Synthesis of Metadata Goals for Scientific Data

    The proliferation of discipline-specific metadata schemes contributes to artificial barriers that can impede interdisciplinary and transdisciplinary research. The authors considered this problem by examining the domains, objectives, and architectures of nine metadata schemes used to document scientific data in the physical, life, and social sciences. They used a mixed-methods content analysis and Greenberg’s (2005) metadata objectives, principles, domains, and architectural layout (MODAL) framework, and derived 22 metadata-related goals from textual content describing each metadata scheme. Relationships are identified between the domains (e.g., scientific discipline and type of data) and the categories of scheme objectives. For each strong correlation (>0.6), a Fisher’s exact test for nonparametric data was used to determine significance (p < .05). Significant relationships were found between the domains and objectives of the schemes. Schemes describing observational data are more likely to have “scheme harmonization” (compatibility and interoperability with related schemes) as an objective; schemes with the objective “abstraction” (a conceptual model exists separate from the technical implementation) also have the objective “sufficiency” (the scheme defines a minimal amount of information to meet the needs of the community); and schemes with the objective “data publication” do not have the objective “element refinement.” The analysis indicates that many metadata-driven goals expressed by communities are independent of scientific discipline or the type of data, although they are constrained by historical community practices and workflows as well as the technological environment at the time of scheme creation. The analysis reveals 11 fundamental goals for metadata documenting scientific data in support of sharing research data across disciplines and domains. The authors report these results and highlight the need for more metadata-related research, particularly in the context of recent funding agency policy changes.
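
    To make the reported test concrete, here is a minimal sketch of a Fisher’s exact test on a 2×2 contingency table relating a scheme domain to a scheme objective. The counts are invented for illustration, not the study’s data; only SciPy’s standard fisher_exact routine is assumed.

```python
# Hypothetical 2x2 contingency table: does a scheme document observational
# data (rows) vs. does it state "scheme harmonization" as an objective
# (columns)? Counts are invented for illustration.
from scipy.stats import fisher_exact

contingency = [
    [5, 1],  # observational-data schemes: objective present / absent
    [1, 2],  # other schemes:              objective present / absent
]

odds_ratio, p_value = fisher_exact(contingency, alternative="two-sided")
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("domain and objective are associated at p < .05")
```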

    Meta-Information as a Service: A Big Social Data Analysis Framework

    Social information services generate large amounts of data. Traditional social information service analysis techniques require that the data first be stored and only afterwards processed and analyzed; however, as the size of the data grows, the cost of storing and processing it increases. In this paper, we propose a ‘Meta-Information as a Service’ (MIaaS) framework that extracts data from various social information services and transforms it into useful information. The framework provides a new formal model to represent the services required for social information service data analysis, together with an efficient data model to store and access the information. We also propose a new Quality of Service (QoS) model to capture the dynamic features of social information services. We use sentiment analysis over social information services as a motivating scenario. Experiments were conducted on a real dataset, and the preliminary results demonstrate the feasibility of the proposed approach.
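
    As a rough illustration of the core idea (reducing raw social-service data to compact, queryable meta-information instead of storing it wholesale), the Python sketch below aggregates per-topic sentiment summaries on the fly. The lexicon, the MetaInfo record, and the post layout are hypothetical stand-ins, not the paper’s formal, data, or QoS models.

```python
# Sketch: stream (topic, text) posts and retain only aggregated
# meta-information (post count and mean sentiment) per topic, so the
# raw posts never need to be stored.
from dataclasses import dataclass

POSITIVE = {"good", "great", "love", "excellent"}  # toy sentiment lexicon
NEGATIVE = {"bad", "terrible", "hate", "poor"}

@dataclass
class MetaInfo:
    topic: str
    n_posts: int = 0
    sentiment_sum: int = 0

    @property
    def mean_sentiment(self) -> float:
        return self.sentiment_sum / self.n_posts if self.n_posts else 0.0

def sentiment(text: str) -> int:
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def extract_meta(posts):
    """Fold a stream of (topic, text) pairs into per-topic meta-information."""
    meta = {}
    for topic, text in posts:
        record = meta.setdefault(topic, MetaInfo(topic))
        record.n_posts += 1
        record.sentiment_sum += sentiment(text)
    return meta

posts = [("service_x", "great product, love it"),
         ("service_x", "terrible support"),
         ("service_y", "good value")]
for record in extract_meta(posts).values():
    print(record.topic, record.n_posts, f"{record.mean_sentiment:+.2f}")
```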

    Data Near Here: Bringing Relevant Data Closer to Scientists

    Large scientific repositories risk losing value as their holdings expand if locating particular datasets of interest demands ever more effort from scientists. We discuss the challenges that scientists face in locating relevant data and present our work applying Information Retrieval techniques to dataset search, as embodied in the Data Near Here application.
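
    As a generic sketch of the IR approach, the example below ranks invented dataset metadata records against a free-text query by TF-IDF cosine similarity. This is a textbook retrieval baseline, not necessarily Data Near Here’s actual scoring; the record names and contents are hypothetical.

```python
# Rank dataset metadata records against a query with TF-IDF + cosine
# similarity (scikit-learn). Records are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

datasets = {
    "cruise_2019_ctd": "temperature salinity depth estuary transect",
    "buoy_station_44": "wave height wind speed offshore buoy hourly",
    "river_discharge": "discharge flow rate river gauge daily",
}

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(datasets.values())

query = "river temperature"
scores = cosine_similarity(vectorizer.transform([query]), doc_matrix).ravel()

# Print datasets in decreasing order of relevance to the query.
for name, score in sorted(zip(datasets, scores), key=lambda pair: -pair[1]):
    print(f"{score:.3f}  {name}")
```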

    Vulnerabilities to agricultural production shocks: An extreme, plausible scenario for assessment of risk for the insurance sector

    Climate risks pose a threat to the function of the global food system and therefore also a hazard to the global financial sector, the stability of governments, and the food security and health of the world’s population. This paper presents a method to assess plausible impacts of an agricultural production shock and their potential materiality for global insurers. A hypothetical, near-term, plausible, extreme scenario was developed based upon modules of historical agricultural production shocks, linked under a warm-phase El Niño-Southern Oscillation (ENSO) meteorological framework. The scenario included teleconnected floods and droughts in disparate agricultural production regions around the world, as well as plausible, extreme biotic shocks. In this scenario, global crop yield declines of 10% for maize, 11% for soy, 7% for wheat and 7% for rice result in quadrupled commodity prices and commodity stock fluctuations, civil unrest, significant negative humanitarian consequences and major financial losses worldwide. This work illustrates a need for the scientific community to partner across sectors and industries towards better-integrated global data, modeling and analytical capacities, in order to better respond to and prepare for concurrent agricultural failure. Governments, humanitarian organizations and the private sector may collectively recognize significant benefits from more systematic assessment of exposure to agricultural climate risk.
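
    As a back-of-envelope check, the stated yield declines translate into production shortfalls by simple arithmetic, sketched below. The baseline tonnages are rough, illustrative figures; the quadrupling of prices in the scenario comes from the paper’s full analysis, not from this calculation.

```python
# Apply the scenario's yield declines to illustrative baseline global
# production figures (million tonnes, approximate orders of magnitude).
yield_decline = {"maize": 0.10, "soy": 0.11, "wheat": 0.07, "rice": 0.07}
baseline_mt = {"maize": 1000, "soy": 350, "wheat": 750, "rice": 500}

for crop, decline in yield_decline.items():
    lost = baseline_mt[crop] * decline
    print(f"{crop}: -{decline:.0%} of {baseline_mt[crop]} Mt "
          f"= {lost:.0f} Mt shortfall")
```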

    Development of a Metadata Management System for an Interdisciplinary Research Project

    Exploring the motive for data publication in open data initiative: Linking intention to action

    Data management practices in Educational Research

    Data is very important in any research experiment because it occupies a central place in decision-making based on the findings that result from its analysis. Given this central role, an asset as important as data deserves effective management in order to protect its integrity and provide an opportunity for effective problem-solving. The main thrust of this paper is to examine the data management practices that scholars should adopt to maintain the quality of their research data. To this end, the paper clarifies the relevant concepts and discusses data management practices from data generation to data shredding. In light of this discussion, it is recommended, among other things, that tertiary institutions in every part of the world endeavour to establish a data management unit saddled with the primary duty of formulating research data management policies and hosting research data, and that every journal, as a matter of compulsion, require the submission of the dataset corresponding to each empirical paper submitted by authors and scholars.

    Towards mainstreaming of biodiversity data publishing: recommendations of the GBIF Data Publishing Framework Task Group

    Background: Data are the evidentiary basis for scientific hypotheses, analyses and publication, for policy formation and for decision-making. They are essential to the evaluation and testing of results by peer scientists both present and future. There is broad consensus in the scientific and conservation communities that data should be freely and openly available in a sustained, persistent and secure way, and standards for 'free' and 'open' access to data have accordingly become well developed in recent years. The question of effective access to data remains highly problematic.
    Discussion: Specifically with respect to scientific publishing, the ability to critically evaluate a published scientific hypothesis or scientific report is contingent on the examination, analysis, evaluation and, if feasible, the re-generation of the data on which its conclusions are based. It is not coincidental that in the recent 'climategate' controversies, the quality and integrity of data and their analytical treatment were central to the debate. There is recent evidence that even when scientific data are requested for evaluation they may not be available. The history of dissemination of scientific results has been marked by paradigm shifts driven by the emergence of new technologies. In recent decades, the advance of computer-based technology linked to global communications networks has created the potential for broader and more consistent dissemination of scientific information and data. Yet, in this digital era, scientists and conservationists, organizations and institutions have often been slow to make data available. Community studies suggest that the withholding of data can be attributed to a lack of awareness, a lack of technical capacity, concerns that data should be withheld for reasons of perceived personal or organizational self-interest, or a lack of adequate mechanisms for attribution.
    Conclusions: There is a clear need for institutionalization of a 'data publishing framework' that can address sociocultural, technical-infrastructural, policy, political and legal constraints, as well as issues of sustainability and financial support. To address these aspects of a data publishing framework - a systematic, standard approach to the formal definition and public disclosure of data - in the context of biodiversity data, the Global Biodiversity Information Facility (GBIF, the single inter-governmental body most clearly mandated to undertake such an effort) convened a Data Publishing Framework Task Group. We conceive this data publishing framework as an environment conducive to ensuring free and open access to the world's biodiversity data. Here, we present the recommendations of that Task Group, which are intended to encourage free and open access to the world's biodiversity data.

    Iterative Near-Term Ecological Forecasting: Needs, Opportunities, And Challenges

    Two foundational questions about sustainability are “How are ecosystems and the services they provide going to change in the future?” and “How do human decisions affect these trajectories?” Answering these questions requires an ability to forecast ecological processes. Unfortunately, most ecological forecasts focus on centennial-scale climate responses, and therefore neither meet the needs of near-term (daily to decadal) environmental decision-making nor allow comparison of specific, quantitative predictions to new observational data, one of the strongest tests of scientific theory. Near-term forecasts provide the opportunity to iteratively cycle between performing analyses and updating predictions in light of new evidence. This iterative process of gaining feedback, building experience, and correcting models and methods is critical for improving forecasts. Iterative, near-term forecasting will accelerate ecological research, make it more relevant to society, and inform sustainable decision-making under high uncertainty and adaptive management. Here, we identify the immediate scientific and societal needs, opportunities, and challenges for iterative near-term ecological forecasting. Over the past decade, data volume, variety, and accessibility have greatly increased, but challenges remain in interoperability, latency, and uncertainty quantification. Similarly, ecologists have made considerable advances in applying computational, informatic, and statistical methods, but opportunities exist for improving forecast-specific theory, methods, and cyberinfrastructure. Effective forecasting will also require changes in scientific training, culture, and institutions. The need to start forecasting is now; the time for making ecology more predictive is here, and learning by doing is the fastest route to drive the science forward.
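
    The iterative forecast cycle described above can be pictured as a forecast-observe-update loop. The sketch below uses a scalar Kalman-style update as one minimal instance of that pattern; the dynamics, error terms, and observations are entirely hypothetical.

```python
# Iterative near-term forecasting as a forecast/observe/update loop,
# illustrated with a scalar Kalman-style filter. All numbers are invented.
observations = [10.2, 10.8, 11.5, 11.9, 12.6]  # hypothetical new data per cycle
growth, process_var, obs_var = 0.5, 0.4, 1.0   # assumed model and error terms

estimate, variance = 10.0, 2.0  # initial state estimate and its uncertainty
for cycle, obs in enumerate(observations, start=1):
    # Forecast step: project the state forward; uncertainty inflates.
    forecast = estimate + growth
    forecast_var = variance + process_var
    # Update step: weigh forecast against observation by their uncertainties.
    gain = forecast_var / (forecast_var + obs_var)
    estimate = forecast + gain * (obs - forecast)
    variance = (1 - gain) * forecast_var
    print(f"cycle {cycle}: forecast={forecast:.2f}, obs={obs:.2f}, "
          f"updated={estimate:.2f} (var={variance:.2f})")
```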