
    Using nanopublications as a distributed ledger of digital truth

    With the increase in the volume of research publications, it is very difficult for researchers to keep abreast of all work in their area. Additionally, the claims in classical publications are not machine-readable, making it challenging to retrieve, integrate, and link prior work. Several semantic publishing approaches have been proposed to address these challenges, including Research Objects, Executable Papers, Micropublications, and Nanopublications. Nanopublications are a granular way of publishing research-based claims, their associated provenance, and publication information (metadata of the nanopublication) in a machine-readable form. To date, over 10 million nanopublications have been published, covering a wide range of topics, predominantly in the life sciences. Nanopublications are immutable, decentralised/distributed, uniformly structured, granular, and authentic. These features allow nanopublications to be used as a Distributed Ledger of Digital Truth. Such a ledger enables detecting conflicting claims and generating the timeline of discussion on a particular topic. However, the inability to identify all nanopublications related to a given topic prevents existing nanopublications from forming a ledger. In this dissertation, we make the following contributions: (i) identify quality issues regarding the misuse of authorship properties and linkrot, which impact the quality of the digital ledger, and argue that the Nanopub community needs to develop a set of guidelines for publishing nanopublications; (ii) provide a framework for generating a timeline of discourse over a collection of nanopublications, retrieving and combining nanopublications on a particular topic to provide interoperability between them; (iii) detect contradictory claims between nanopublications, automatically highlighting the conflicts and providing explanations based on the provenance information in the nanopublications. Through these contributions, we show that nanopublications can form a distributed ledger of digital truth, providing key benefits such as citability, timelines of discourse, and conflict detection, to users of the ledger.
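    The abstract describes a nanopublication as bundling a claim, its provenance, and its publication metadata in machine-readable (RDF) form. A minimal sketch of that three-part structure as named graphs, using Python's rdflib; the URIs and the example claim are purely hypothetical illustrations, not taken from the dissertation:

    ```python
    from rdflib import Dataset, Literal, Namespace
    from rdflib.namespace import RDF, XSD

    NP = Namespace("http://www.nanopub.org/nschema#")   # nanopublication schema
    PROV = Namespace("http://www.w3.org/ns/prov#")
    EX = Namespace("http://example.org/np1#")           # hypothetical example URIs

    ds = Dataset()

    # Head graph: ties the three parts of the nanopublication together.
    head = ds.graph(EX.Head)
    head.add((EX.np1, RDF.type, NP.Nanopublication))
    head.add((EX.np1, NP.hasAssertion, EX.assertion))
    head.add((EX.np1, NP.hasProvenance, EX.provenance))
    head.add((EX.np1, NP.hasPublicationInfo, EX.pubinfo))

    # Assertion graph: the machine-readable claim itself.
    assertion = ds.graph(EX.assertion)
    assertion.add((EX.geneA, EX.isAssociatedWith, EX.diseaseB))

    # Provenance graph: where the claim came from.
    provenance = ds.graph(EX.provenance)
    provenance.add((EX.assertion, PROV.wasDerivedFrom, EX.experiment42))

    # Publication-info graph: metadata about the nanopublication itself.
    pubinfo = ds.graph(EX.pubinfo)
    pubinfo.add((EX.np1, PROV.generatedAtTime,
                 Literal("2020-01-01T00:00:00Z", datatype=XSD.dateTime)))

    print(ds.serialize(format="trig"))
    ```

    Because each part lives in its own named graph, the provenance and metadata can be queried and verified independently of the claim, which is what makes features such as conflict detection and timelines of discourse tractable.
    
    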

    Serviços de integração de dados para aplicações biomédicas (Data integration services for biomedical applications)

    Doctoral Programme in Informatics (MAP-i). In recent decades, the field of biomedical science has fostered unprecedented scientific advances. Research is stimulated by the constant evolution of information technology, delivering novel and diverse bioinformatics tools. Nevertheless, the proliferation of new and disconnected solutions has resulted in massive amounts of resources spread over heterogeneous and distributed platforms. Distinct data types and formats are generated and stored in miscellaneous repositories, posing data interoperability challenges and delaying discoveries. Data sharing and integrated access to these resources are key features for successful knowledge extraction. In this context, this thesis makes contributions towards accelerating the semantic integration, linkage, and reuse of biomedical resources. The first contribution addresses the connection of distributed and heterogeneous registries. The proposed methodology creates a holistic view over the different registries, supporting semantic data representation, integrated access, and querying. The second contribution addresses the integration of heterogeneous information across scientific research, aiming to enable adequate data-sharing services. The third contribution presents a modular architecture to support the extraction and integration of textual information, enabling the full exploitation of curated data. The last contribution provides a platform to accelerate the deployment of enhanced semantic information systems. All the proposed solutions were deployed and validated in the scope of rare diseases.

    Affordances and limitations of algorithmic criticism

    Humanities scholars currently have access to unprecedented quantities of machine-readable texts, and, at the same time, the tools and methods with which we can analyse and visualise these texts are becoming increasingly sophisticated. As numerous studies have shown, many of the new technical possibilities that emerge from fields such as text mining and natural language processing have useful applications within literary research. Computational methods can help literary scholars to discover interesting trends and correlations within massive text collections, and they can enable a thoroughly systematic examination of the stylistic properties of literary works. While such computer-assisted forms of reading have proven invaluable for research in the field of literary history, relatively few studies have applied these technologies to expand or transform the ways in which we can interpret literary texts. Based on a comparative analysis of digital and traditional scholarship, this thesis critically examines the possibilities and limitations of computer-based literary criticism. It argues that quantitative analyses of data about literary techniques can often reveal surprising qualities of works of literature, which can, in turn, lead to new interpretative readings.

    Out of cite, out of mind: the current state of practice, policy, and technology for the citation of data

    The growth in the capacity of the research community to collect and distribute data presents huge opportunities. It is already transforming old methods of scientific research and permitting the creation of new ones. However, the exploitation of these opportunities depends upon more than computing power, storage, and network connectivity. Among the promises of our growing universe of online digital data are the ability to integrate data into new forms of scholarly publishing, allowing peer examination and review of conclusions or analysis of experimental and observational data, and the ability for subsequent researchers to make new analyses of the same data, including their combination with other data sets and uses that may have been unanticipated by the original producer or collector. The use of published digital data, like the use of digitally published literature, depends upon the ability to identify, authenticate, locate, access, and interpret them. Data citations provide necessary support for these functions, as well as for others such as attribution of credit and establishment of provenance. References to data, however, present challenges not encountered in references to literature. For example, how can one specify a particular subset of data in the absence of familiar conventions such as page numbers or chapters? The traditions and good practices for maintaining the scholarly record through proper references to a work are well established and understood for journal articles and other literature, but attributing credit via bibliographic references to data is not yet so broadly implemented.

    Contributions towards understanding and building sustainable science

    This dissertation focuses on either understanding and detecting threats to the epistemology of science (chapters 1-6) or making practical advances to remedy epistemological threats (chapters 7-9). Chapter 1 reviews the literature on responsible conduct of research, questionable research practices, and research misconduct. Chapter 2 reanalyzes the claims of Head et al. (2015) about widespread p-hacking, testing their robustness. Chapter 3 examines 258,050 test results across 30,710 articles from eight high-impact journals to investigate whether there is a peculiar prevalence of p-values just below .05 (i.e., a bump) in the psychological literature, and a potential increase thereof over time. Chapter 4 examines evidence for false negatives in nonsignificant results throughout psychology, in gender effects, and in the Reproducibility Project: Psychology. Chapter 5 describes a dataset that is the result of content mining 167,318 published articles for statistical test results reported according to the standards prescribed by the American Psychological Association (APA). In Chapter 6, I test the validity of statistical methods to detect fabricated data in two studies. Chapter 7 tackles the issue of data extraction from figures in scholarly publications. In Chapter 8, I argue that "after-the-fact" research papers do not help alleviate issues of access, selective publication, and reproducibility, but actually cause some of these threats, because the chronology of the research cycle is lost in a research paper; I therefore propose giving up the academic paper in favour of a digitally native "as-you-go" alternative. In Chapter 9, I propose a technical design for this alternative.

    Bioinformatic approaches to identify genomic, proteomic and metabolomic biomarkers for the metabolic syndrome

    Advances in technology have turned modern biology into a data-intensive enterprise. The advent of high-throughput technologies such as microarrays and next-generation sequencing has resulted in researchers grappling not just with huge volumes but also with multiple types of data. While the generation and storage of high-quality data are an important research focus, it is increasingly recognized that translating data into actionable information and insight is a critical research challenge. To infer reliable conclusions from the data, it is often necessary to integrate large amounts of heterogeneous data with different formats and semantics. Given the breadth and volume of data involved, this goal is best achieved through automated methods and tools for data integration and workflow management. This thesis presents automated strategies that combine bioinformatics and statistical methods to identify novel biomarkers in high-throughput OMICs datasets pertaining to the metabolic syndrome and to gain mechanistic insight into the underlying biological processes. An underlying theme in this thesis is data-driven approaches that generate plausible hypotheses, which are then verified experimentally.

    Converting Scholarly Journals to Open Access: A Review of Approaches and Experiences

    This report identifies ways through which subscription-based scholarly journals have converted their publishing models to open access (OA). The major goal was to identify specific scenarios that have been used or proposed for transitioning subscription journals to OA so that these scenarios can provide options for others seeking to “flip” their journals to OA. The report is based on the published literature as well as “gray” literature such as blog posts and press releases. In addition, interviews were conducted with eight experts in scholarly publishing. The report identifies a variety of goals for converting a journal to OA. While there are altruistic goals of making scholarship more accessible, the literature review and interviews suggest that there are also many practical reasons for transitioning to an OA model. In some instances, an OA business model is simply more economically viable. Also, it is not unusual for a society or editorial board to transition to an OA business model as a means of gaining independence from the current publisher. Increasing readership, the number and quality of submissions, and impact as measured in citations are important goals for most journals that are considering flipping. Goals and their importance often differ across regions of the world and across disciplines. Each journal’s situation is unique, and it is important for those seeking to flip a journal to carefully consider exactly what they hope to achieve, what barriers they are likely to face, and how the changes being implemented will further the goals intended for their journal.
