
    A call for public archives for biological image data

    Public data archives are the backbone of modern biological and biomedical research. While archives for biological molecules and structures are well-established, resources for imaging data do not yet cover the full range of spatial and temporal scales or application domains used by the scientific community. In the last few years, the technical barriers to building such resources have been overcome and the first examples of scientific outputs from public image data resources, often through linkage to existing molecular resources, have been published. Using the successes of existing biomolecular resources as a guide, we present the rationale and principles for the construction of image data archives and databases that will be the foundation of the next revolution in biological and biomedical informatics and discovery. Comment: 13 pages, 1 figure

    Enforcing public data archiving policies in academic publishing: A study of ecology journals

    To improve the quality and efficiency of research, groups within the scientific community seek to exploit the value of data sharing. Funders, institutions, and specialist organizations are developing and implementing strategies to encourage or mandate data sharing within and across disciplines, with varying degrees of success. Academic journals in ecology and evolution have adopted several types of public data archiving policies requiring authors to make data underlying scholarly manuscripts freely available. Yet anecdotes from the community and studies evaluating data availability suggest that these policies have not obtained the desired effects, both in terms of quantity and quality of available datasets. We conducted a qualitative, interview-based study with journal editorial staff and other stakeholders in the academic publishing process to examine how journals enforce data archiving policies. We specifically sought to establish who editors and other stakeholders perceive as responsible for ensuring data completeness and quality in the peer review process. Our analysis revealed little consensus with regard to how data archiving policies should be enforced and who should hold authors accountable for dataset submissions. Themes in interviewee responses included hopefulness that reviewers would take the initiative to review datasets and trust in authors to ensure the completeness and quality of their datasets. We highlight problematic aspects of these thematic responses and offer potential starting points for improvement of the public data archiving process. Comment: 35 pages, 1 figure, 1 table

    Enabling Web-scale data integration in biomedicine through Linked Open Data

    The biomedical data landscape is fragmented with several isolated, heterogeneous data and knowledge sources, which use varying formats, syntaxes, schemas, and entity notations, existing on the Web. Biomedical researchers face severe logistical and technical challenges to query, integrate, analyze, and visualize data from multiple diverse sources in the context of available biomedical knowledge. Semantic Web technologies and Linked Data principles may aid toward Web-scale semantic processing and data integration in biomedicine. The biomedical research community has been one of the earliest adopters of these technologies and principles to publish data and knowledge on the Web as linked graphs and ontologies, hence creating the Life Sciences Linked Open Data (LSLOD) cloud. In this paper, we provide our perspective on some opportunities proffered by the use of LSLOD to integrate biomedical data and knowledge in three domains: (1) pharmacology, (2) cancer research, and (3) infectious diseases. We will discuss some of the major challenges that hinder the wide-spread use and consumption of LSLOD by the biomedical research community. Finally, we provide a few technical solutions and insights that can address these challenges. Eventually, LSLOD can enable the development of scalable, intelligent infrastructures that support artificial intelligence methods for augmenting human intelligence to achieve better clinical outcomes for patients, to enhance the quality of biomedical research, and to improve our understanding of living systems.

    I'll take that to go: Big data bags and minimal identifiers for exchange of large, complex datasets

    Big data workflows often require the assembly and exchange of complex, multi-element datasets. For example, in biomedical applications, the input to an analytic pipeline can be a dataset consisting of thousands of images and genome sequences assembled from diverse repositories, requiring a description of the contents of the dataset in a concise and unambiguous form. Typical approaches to creating datasets for big data workflows assume that all data reside in a single location, requiring costly data marshaling and permitting errors of omission and commission because dataset members are not explicitly specified. We address these issues by proposing simple methods and tools for assembling, sharing, and analyzing large and complex datasets that scientists can easily integrate into their daily workflows. These tools combine a simple and robust method for describing data collections (BDBags), data descriptions (Research Objects), and simple persistent identifiers (Minids) to create a powerful ecosystem of tools and services for big data analysis and sharing. We present these tools and use biomedical case studies to illustrate their use for the rapid assembly, sharing, and analysis of large datasets.

    Scientific Digital Data Repositories: Needs and Challenges for Cancer Researchers

    The purpose of this study is to understand the varied data needs of molecular level cancer researchers who use light, fluorescent, and electron microscopy to obtain knowledge about cancer on a molecular level. It explores what data tools a sample of researchers are currently using to preserve their data for future access, and the needs of these researchers for depositing their digital research data into digital repositories. Data from the researchers suggest that they understand the need to preserve their raw and compiled data in places outside their laboratory, but they have not fully embraced the idea of depositing it in a repository. This seems most likely due to them not fully understanding what repositories are and what they provide. To increase the use of repositories by this research community, repositories need to promote themselves better and to offer additional services that are specific to the needs of this community. Master of Science in Library Science

    FAIRsharing, a cohesive community approach to the growth in standards, repositories and policies

    In this modern, data-driven age, governments, funders and publishers expect greater transparency and reuse of research data, as well as greater access to and preservation of the data that supports research findings. Community-developed standards, such as those for the identification and reporting of data, underpin reproducible and reusable research, aid scholarly publishing, and drive both the discovery and evolution of scientific practice. The number of these standardization efforts, driven by large organizations or at the grass root level, has been on the rise since the early 2000s. Thousands of community-developed standards are available (across all disciplines), many of which have been created and/or implemented by several thousand data repositories. Nevertheless, their uptake by the research community has been slow and uneven. This is mainly because investigators lack incentives to follow and adopt standards. The situation is exacerbated if standards are not promptly implemented by databases, repositories and other research tools, or endorsed by infrastructures. Furthermore, the fragmentation of community efforts results in the development of arbitrarily different, incompatible standards. In turn, this leads to standards becoming rapidly obsolete in fast-evolving research areas. As with any other digital object, standards, databases and repositories are dynamic in nature, with a life cycle that encompasses formulation, development and maintenance; their status in this cycle may vary depending on the level of activity of the developing group or community. There is an urgent need for a service that enhances the information available on the evolving constellation of heterogeneous standards, databases and repositories, guides users in the selection of these resources, and that works with developers and maintainers of these resources to foster collaboration and promote harmonization. 
Such an informative and educational service is vital to reduce the knowledge gap among those involved in producing, managing, serving, curating, preserving, publishing or regulating data. A diverse set of stakeholders, representing academia, industry, funding agencies, standards organizations, infrastructure providers and scholarly publishers, both national and domain-specific as well as global and general organizations, have come together as a community, representing the core adopters, advisory board members, and/or key collaborators of the FAIRsharing resource. Here, we introduce its mission and community network. We present an evaluation of the standards landscape, focusing on those for reporting data and metadata (the most diverse and numerous of the standards) and their implementation by databases and repositories. We report on the ongoing challenge to recommend resources, and we discuss the importance of making standards invisible to the end users. We present guidelines that highlight the role each stakeholder group must play to maximize the visibility and adoption of standards, databases and repositories.

    The Impact of Research Data Sharing and Reuse on Data Citation in STEM Fields

    Despite the open science movement and mandates for the sharing of research data by major funding agencies and influential journals, the citation of data sharing and reuse has not become standard practice in the various science, technology, engineering and mathematics (STEM) fields. Advances in technology have lowered some barriers to data sharing, but it is a socio-technical phenomenon and the impact of the ongoing evolution in scholarly communication practices has yet to be quantified. Furthermore, there is a need for a deeper and more nuanced understanding of author self-citation and recitation, the most often cited types of data, disciplinary differences regarding data citation and the extent of interdisciplinarity in data citation. This study employed a mixed methods approach that combined coding with semi-automatic text-searching techniques in order to assess the impact of data sharing and reuse on data citation in STEM fields. The research considered over 500,000 open research data entities, such as datasets, software and data studies, from over 350 repositories worldwide. I also examined 705 bibliographic publications with a total of 15,261 instances of data sharing, reuse, and citation at the data, article, discipline and interdisciplinary levels. More specifically, I measured the phenomenon of data sharing in terms of formal data citation, frequently cited data types, and author self-citation, and I explored recitation at both the data and bibliography levels, and data reuse practices in bibliographies, associations of disciplines, and interdisciplinary contexts. The results of this research revealed, to begin with, disciplinary differences with regard to the impact of data sharing and reuse on data citation in STEM fields. 
This research also yielded the following additional findings regarding the citation of data by STEM researchers: 1) data sharing practices were diverse across disciplines; 2) data sharing has been increasing in recent years; 3) each discipline made use of major digital repositories; 4) these repositories took various forms depending on the discipline; 5) certain data types were more often cited in each discipline, so that the frequency distribution of the data types was highly skewed; 6) author self-citation and recitation followed similar trends at the data and bibliographic levels, but specific practices varied within each discipline; 7) associations between and across data and author self-citation and recitation at the bibliographic level were observed, with the self-citation rate differing significantly among disciplines; 8) data reuse in bibliographies was rare yet diverse; 9) informal citation of data sharing and reuse at the bibliographic level was more common in certain fields, with astronomy/physics showing the highest amount (98%) and technology the lowest (69%); 10) within bibliographic publications, the documentation of data sharing and reuse occurred mainly in the main text; 11) publications in certain disciplines, such as chemistry, computing and engineering, did not attract citations from more than one field (i.e., showed no diversity); and, on the other hand, 12) publications in other fields attracted a wide range of interdisciplinary data citations. This dissertation, then, contributes to the understanding of two key aspects of the current citation systems. First, the findings have practical implications for individual researchers, decision makers, funding agencies and publishers with regard to giving due credit to those who share their data. Second, this research has methodological implications in terms of reducing the labor required to analyze the full text of associated articles in order to identify evidence of data citation.

    Raman spectral imaging in tissue engineering & regenerative medicine applications

    The label-free nature of Raman spectroscopy makes it a valuable tool for cellular and tissue characterisation. Its ability to probe molecular vibrations within biological structures without affecting their biochemistry offers an advantage over conventional histological and biochemical assays. Providing a pure investigation of unperturbed biological processes, without the need for introduction of exogenous molecules for labelling, makes the information Raman spectroscopy offers very valuable in deciphering complex biological functions and mechanisms. Raman spectral signatures are unique "fingerprints" of each biomolecule probed and can be used for cellular phenotype characterisation, tissue composition, disease development at a cellular or tissue level and much more. This thesis focuses on the use of Raman spectral imaging in novel biological applications displaying its flexibility across the fields of tissue engineering and regenerative medicine. Bone regeneration was the first biological process investigated, where Raman spectral imaging was used to characterise bioactive glass-assisted bone repair using standard and novel glass compositions. Newly-formed bone quality was assessed using multivariate analysis, showing similar quality between glass compositions and existing bone. Morphological analysis after in vivo implantation of bioactive glass particles showed distinct spectral zones confirming results from existing in vitro models. The second application focused on the development of a novel Raman-based gene delivery tracking methodology. Viral particles containing modified viral nucleotides with alkyne bonds were produced and successfully detected using Raman spectral imaging in cells after infection. The implications of this technology offer a new cell screening methodology for gene therapy. Finally, the potential of Raman spectral imaging as a complementary technique for 3D cell culture systems was explored. 
A computational framework was developed which allows for the visualisation and quantification of subcellular structures. The accurate 3D reconstruction of whole cells of known architecture from a volumetric hyperspectral Raman dataset was reported here for the first time. Moreover, using spectral unmixing algorithms to quantify subcellular components revealed an unprecedented molecular specificity. This allowed imaging of cells within hydrogel-based 3D cell culture systems. The synergy of Raman spectral imaging, multivariate and image analysis to answer complex biological questions offers objective biomolecular characterisation, quantification and visualisation of molecular architecture. This work demonstrates the potential of Raman spectroscopy as a valuable complementary tool in tissue engineering and regenerative medicine applications. Open Access