469 research outputs found

    Opportunity Brief | Beyond Original Intent: Environmental Data Stewardship for Diverse Uses

    Environmental data is collected and used by researchers, regulatory agencies, communities, and businesses for a variety of purposes, though much of it is used only in single projects or to check regulatory boxes. While more data is being shared than ever before, spurred by recent open data policies, increased availability does not guarantee that those who might use it can find, access, understand, or apply it in new contexts, nor that such data will be governed ethically. Open Environmental Data Project convened stakeholders from government, academia, environmental nonprofits, journalism, community organizations, and the private sector to discuss the challenges and promising solutions they have encountered in sharing, using, and reusing environmental data. Four major needs emerged: findability and accessibility; data formats and infrastructure that enable interoperability; data of high enough quality or detail to answer different questions; and user capacity to understand and analyze the data. This brief offers nine opportunities to address these needs, each building on or leveraging existing efforts.

    Framework to Automatically Determine the Quality of Open Data Catalogs

    Data catalogs play a crucial role in modern data-driven organizations by facilitating the discovery, understanding, and utilization of diverse data assets. However, ensuring their quality and reliability is complex, especially in open and large-scale data environments. This paper proposes a framework to automatically determine the quality of open data catalogs, addressing the need for efficient and reliable quality assessment mechanisms. Our framework can analyze core quality dimensions such as accuracy, completeness, consistency, scalability, and timeliness; offer several alternatives for assessing compatibility and similarity across catalogs; and implement a set of non-core quality dimensions such as provenance, readability, and licensing. The goal is to empower data-driven organizations to make informed decisions based on trustworthy and well-curated data assets. The source code that illustrates our approach can be downloaded from https://www.github.com/jorge-martinez-gil/dataq/.
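    A dimension such as completeness can be assessed automatically by checking which expected metadata fields are filled in. The sketch below is illustrative only, assuming a hypothetical `CatalogEntry` record; the paper's actual data model and scoring functions may differ.

```python
from dataclasses import dataclass

# Hypothetical catalog record; the framework's real model may differ.
@dataclass
class CatalogEntry:
    title: str = ""
    description: str = ""
    license: str = ""
    last_modified: str = ""  # ISO 8601 date string

def completeness(entry: CatalogEntry) -> float:
    """Fraction of expected metadata fields that are non-empty."""
    expected = (entry.title, entry.description, entry.license, entry.last_modified)
    return sum(1 for value in expected if value) / len(expected)

def catalog_completeness(entries: list) -> float:
    """Average field completeness across all entries in a catalog."""
    return sum(completeness(e) for e in entries) / len(entries) if entries else 0.0
```

    Other dimensions (e.g. timeliness via `last_modified`, consistency via cross-field checks) would follow the same pattern of per-entry measures aggregated over the catalog.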

    The best of both worlds: highlighting the synergies of combining manual and automatic knowledge organization methods to improve information search and discovery.

    Research suggests organizations across all sectors waste a significant amount of time looking for information and often fail to leverage the information they have. In response, many organizations have deployed some form of enterprise search to improve the 'findability' of information. Debates persist as to whether thesauri and manual indexing or automated machine learning techniques should be used to enhance discovery of information. In addition, the extent to which a knowledge organization system (KOS) enhances discoveries or indeed blinds us to new ones remains a moot point. The oil and gas industry was used as a case study using a representative organization. Drawing on prior research, a theoretical model is presented which aims to overcome the shortcomings of each approach. This synergistic model could help to re-conceptualize the 'manual' versus 'automatic' debate in many enterprises, accommodating a broader range of information needs. This may enable enterprises to develop more effective information and knowledge management strategies and ease the tension between what are often perceived as mutually exclusive competing approaches. Certain aspects of the theoretical model may be transferable to other industries, which is an area for further research.

    BBMRI-ERIC Negotiator: Implementing Efficient Access to Biobanks

    Various biological resources, such as biobanks and disease-specific registries, have become indispensable resources to better understand the epidemiology and biological mechanisms of disease and are fundamental for advancing medical research. Nevertheless, biobanks and similar resources still face significant challenges to become more findable and accessible by users on both national and global scales. One of the main challenges for users is to find relevant resources using cataloging and search services such as the BBMRI-ERIC Directory, operated by the European Research Infrastructure on Biobanking and Biomolecular Resources (BBMRI-ERIC), as these often do not contain the information needed by researchers to decide if a resource has relevant material/data; the resources are only weakly characterized. Hence, the researcher is typically left with too many resources to explore and investigate. In addition, resources often have complex procedures for accessing holdings, particularly for depletable biological materials. This article focuses on designing a system for effective negotiation of access to holdings, in which a researcher can approach many resources simultaneously, while giving each resource team the ability to implement their own mechanisms to check if the material/data are available and to decide if access should be provided. BBMRI-ERIC has developed and implemented an access and negotiation tool called the BBMRI-ERIC Negotiator. The Negotiator enables access negotiation with more than 600 biobanks from the BBMRI-ERIC Directory and other discovery services such as GBA/BBMRI-ERIC Locator or RD-Connect Finder. This article summarizes the principles that guided the design of the tool, the terminology used and underlying data model, request workflows, authentication and authorization mechanism(s), and the mechanisms and monitoring processes to stimulate the desired behavior of the resources: to effectively deliver access to biological material and data.
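    The core negotiation pattern described above — one request fanned out to many resources, each applying its own availability check — can be sketched as follows. All class and method names here are hypothetical illustrations, not the Negotiator's actual API.

```python
from abc import ABC, abstractmethod

class Resource(ABC):
    """One biobank or registry; each implements its own availability check."""
    def __init__(self, name: str):
        self.name = name

    @abstractmethod
    def has_material(self, request: dict) -> bool:
        """Decide locally whether the requested material/data is available."""

class SetBackedBiobank(Resource):
    """Toy resource whose holdings are a simple set of material types."""
    def __init__(self, name: str, holdings: set):
        super().__init__(name)
        self.holdings = holdings

    def has_material(self, request: dict) -> bool:
        return request.get("material") in self.holdings

def negotiate(request: dict, resources: list) -> list:
    """Fan one request out to many resources; list those that respond positively."""
    return [r.name for r in resources if r.has_material(request)]
```

    The key design choice the article describes is preserved in the abstract base class: the negotiation layer never inspects holdings directly, so each resource team keeps full control over how availability and access decisions are made.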

    Use of Optional Data Curation Features by Users of Harvard Dataverse Repository

    Objective: Investigate how different groups of depositors vary in their use of optional data curation features that provide support for FAIR research data in the Harvard Dataverse repository. Methods: A numerical score based upon the presence or absence of characteristics associated with the use of optional features was assigned to each of the 29,295 datasets deposited in Harvard Dataverse between 2007 and 2019. Statistical analyses were performed to investigate patterns of optional feature use amongst different groups of depositors and their relationship to other dataset characteristics. Results: Members of groups make greater use of Harvard Dataverse's optional features than individual researchers. Datasets that undergo a data curation review before submission to Harvard Dataverse, are associated with a publication, or contain restricted files also make greater use of optional features. Conclusions: Individual researchers might benefit from increased outreach and improved documentation about the benefits and use of optional features to improve their datasets' level of curation beyond the FAIR-informed support that the Harvard Dataverse repository provides by default. Platform designers, developers, and managers may also use the numerical scoring approach to explore how different user groups use optional application features.
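    Presence/absence scoring of the kind the study applies can be sketched in a few lines. The feature names below are illustrative stand-ins, not the study's actual checklist or weighting.

```python
# Hypothetical optional-feature checklist (not the study's exact list).
OPTIONAL_FEATURES = (
    "description",          # dataset-level description provided
    "keywords",             # subject keywords assigned
    "related_publication",  # linked publication
    "terms_of_use",         # custom terms/licensing filled in
    "file_descriptions",    # per-file descriptions present
)

def curation_score(dataset: dict) -> int:
    """One point per optional curation feature the dataset uses."""
    return sum(1 for feature in OPTIONAL_FEATURES if dataset.get(feature))
```

    Scores computed this way can then be compared across depositor groups, as the study does with its statistical analyses.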

    The Search for a New OPAC: Selecting an Open Source Discovery Layer

    In early 2011, an Indiana University Libraries task force was charged with selecting an open source discovery layer to serve as the public interface for IU's online catalog, IUCAT. This process included creating a rubric of core functionality and rating two discovery layers based on criteria in four main categories: general features and functionality; authentication and account management; export and share; and search functionality and results display. The article includes information about our rubric and the two discovery layers reviewed, Blacklight and VuFind, as well as a discussion of the priorities of the task force. The article concludes with future steps and anticipated highlights for IUCAT.

    The FAIR Cookbook - the essential resource for and by FAIR doers

    The notion that data should be Findable, Accessible, Interoperable and Reusable, according to the FAIR Principles, has become a global norm for good data stewardship and a prerequisite for reproducibility. Nowadays, FAIR guides data policy actions and professional practices in the public and private sectors. Despite such global endorsements, however, the FAIR Principles are aspirational, remaining elusive at best, and intimidating at worst. To address the lack of practical guidance, and help with capability gaps, we developed the FAIR Cookbook, an open, online resource of hands-on recipes for "FAIR doers" in the Life Sciences. Created by researchers and data management professionals in academia, (bio)pharmaceutical companies and information service industries, the FAIR Cookbook covers the key steps in a FAIRification journey, the levels and indicators of FAIRness, the maturity model, the technologies, the tools and the standards available, as well as the skills required, and the challenges to achieve and improve data FAIRness. Part of the ELIXIR ecosystem, and recommended by funders, the FAIR Cookbook is open to contributions of new recipes.
    We thank all book dash participants and recipe authors, as well as the FAIRplus fellows, all partners, and the members of the FAIRplus Scientific Advisory Board, and the management team. We acknowledge a number of colleagues for their role in the FAIRplus project, in particular: Ebitsam Alharbi (0000-0002-3887-3857), Oya Deniz Beyan (0000-0001-7611-3501), Ola Engkvist (0000-0003-4970-6461), Laura Furlong (0000-0002-9383-528X), Carole Goble (0000-0003-1219-2137), Mark Ibberson (0000-0003-3152-5670), Manfred Kohler, Nick Lynch (0000-0002-8997-5298), Scott Lusher (0000-0003-2401-4223), Jean-Marc Neefs, George Papadotas, Manuela Pruess (0000-0002-6857-5543), Ratnesh Sahay, Rudi Verbeeck (0000-0001-5445-6095), Bryn Williams-Jones, and Gesa Witt (0000-0003-2344-706X).
    This work and the authors were primarily funded by FAIRplus (IMI 802750). PRS and SAS also acknowledge contributions from the following grants (in which the FAIR Cookbook is also embedded or to which it is connected): ELIXIR Interoperability Platform, EOSC-Life (H2020-EU 824087), FAIRsharing (Wellcome 212930/Z/18/Z), NIH CFDE Coordinating Center (NIH Common Fund OT3OD025459-01), Precision Toxicology (H2020-EU 965406), UKRI DASH grant (MR/V038966/1), BY-COVID (Horizon-EU 101046203), AgroServ (Horizon-EU 101058020). Peer reviewed. Article signed by 33 authors: "Philippe Rocca-Serra, Wei Gu, Vassilios Ioannidis, Tooba Abbassi-Daloii, Salvador Capella-Gutierrez, Ishwar Chandramouliswaran, Andrea Splendiani, Tony Burdett, Robert T. Giessmann, David Henderson, Dominique Batista, Ibrahim Emam, Yojana Gadiya, Lucas Giovanni, Egon Willighagen, Chris Evelo, Alasdair J. G. Gray, Philip Gribbon, Nick Juty, Danielle Welter, Karsten Quast, Paul Peeters, Tom Plasterer, Colin Wood, Eelke van der Horst, Dorothy Reilly, Herman van Vlijmen, Serena Scollen, Allyson Lister, Milo Thurston, Ramon Granell, the FAIR Cookbook Contributors & Susanna-Assunta Sansone". Postprint (published version).

    Challenges and requirements of heterogenous research data management in environmental sciences: a qualitative study

    Get PDF
    The research focuses on the challenges and requirements of heterogeneous research data management in environmental sciences. Environmental research involves diverse data types, and the effective management and integration of these data sets are crucial. The issue at hand is the lack of specific guidance on how to select and plan an appropriate data management practice to address the challenges of handling and integrating diverse data types in environmental research. The objective of the research is to identify the issues associated with the current data storage approach in research data management and determine the requirements for an appropriate system to address these challenges. The research adopts a qualitative approach, utilizing semi-structured interviews to collect data. Content analysis is employed to analyze the gathered data and identify relevant issues and requirements. The study reveals various issues in the current data management process, including inconsistencies in data treatment, the risk of unintentional data deletion, loss of knowledge due to staff turnover, lack of guidelines, and data scattered across multiple locations. The requirements identified through interviews emphasize the need for a data management system that integrates automation, open access, centralized storage, online electronic lab notes, systematic data management, secure repositories, reduced hardware storage, and version control with metadata support. The research identifies the current challenges faced by researchers in heterogeneous data management and compiles a list of requirements for an effective solution. The findings contribute to existing knowledge on research-related problems and provide a foundation for developing tailored solutions to meet the specific needs of researchers in environmental sciences.