
    ProteoClade: A taxonomic toolkit for multi-species and metaproteomic analysis

    We present ProteoClade, a Python toolkit that performs taxa-specific peptide assignment, protein inference, and quantitation for multi-species proteomics experiments. ProteoClade scales to hundreds of millions of protein sequences, requires minimal computational resources, and is open source, multi-platform, and accessible to non-programmers. We demonstrate its utility for processing quantitative proteomic data derived from patient-derived xenografts, and its speed and scalability enable a novel de novo proteomic workflow for complex microbiota samples.
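
    The core of taxa-specific peptide assignment can be illustrated in a few lines of Python. The sketch below is a minimal, hypothetical illustration of the idea, not ProteoClade's actual API: peptides from an in-silico digest are indexed against the protein database, and a peptide is assigned to a taxon only when every protein containing it comes from that one taxon.

```python
# Minimal sketch of taxa-specific peptide assignment (hypothetical helper
# names, not ProteoClade's actual API): a peptide counts as taxon-specific
# only if every database protein containing it belongs to that one taxon.
from collections import defaultdict

def tryptic_digest(sequence, min_len=5):
    """Naive in-silico trypsin digest: cleave after K or R (proline rule ignored)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return [p for p in peptides if len(p) >= min_len]

def build_peptide_index(proteins):
    """proteins: iterable of (protein_id, taxon, sequence). Returns peptide -> taxa."""
    index = defaultdict(set)
    for _protein_id, taxon, sequence in proteins:
        for peptide in tryptic_digest(sequence):
            index[peptide].add(taxon)
    return index

def assign_taxa(observed_peptides, index):
    """Keep only peptides unique to a single taxon; shared peptides stay ambiguous."""
    return {p: next(iter(index[p]))
            for p in observed_peptides
            if p in index and len(index[p]) == 1}

# Two near-identical proteins from different species (toy sequences).
proteins = [
    ("P1", "Homo sapiens", "MKTAYIAKQRQISFVKSHFSR"),
    ("P2", "Mus musculus", "MKTAYIAKQRLLSFVKSHFSR"),
]
index = build_peptide_index(proteins)
# "QISFVK" occurs only in the human protein; "SHFSR" is shared, so it is dropped.
print(assign_taxa(["QISFVK", "SHFSR"], index))  # {'QISFVK': 'Homo sapiens'}
```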

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives that measure the performance of multimedia search engines. From a socio-economic perspective, we survey the impact and legal consequences of these technical advances and point out future directions of research.

    OntONeo: The Obstetric and Neonatal Ontology

    This paper presents the Obstetric and Neonatal Ontology (OntONeo). This ontology has been created to provide a consensus representation of salient electronic health record (EHR) data and to serve the interoperability of the associated data and information systems. More generally, it will serve the interoperability of clinical and translational data, for example data deriving from genomics disciplines and from clinical trials. Interoperability of EHR data is important for ensuring continuity of care during the prenatal and postnatal periods for both mother and child. As a strategy to advance such interoperability, we use an approach based on ontological realism and on the ontology development principles of the Open Biomedical Ontologies Foundry, including reuse of reference ontologies wherever possible. We describe the structure and domain coverage of OntONeo and the process of creating and maintaining the ontology.
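
    As a rough illustration of the reuse principle, the rdflib sketch below (IRIs and class names are illustrative placeholders, not OntONeo's real identifiers) declares a new domain class and positions it under a reused reference-ontology term instead of redefining an equivalent class locally:

```python
# rdflib sketch of reference-ontology reuse (the OntONeo IRI and class name
# here are illustrative placeholders, not the ontology's real identifiers).
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import OWL, RDF, RDFS

ONTONEO = Namespace("http://example.org/ontoneo/")   # placeholder namespace
OBO = Namespace("http://purl.obolibrary.org/obo/")

g = Graph()
g.bind("ontoneo", ONTONEO)
g.bind("obo", OBO)

# Declare a new domain-specific class with a human-readable label...
g.add((ONTONEO.PrenatalCareEncounter, RDF.type, OWL.Class))
g.add((ONTONEO.PrenatalCareEncounter, RDFS.label,
       Literal("prenatal care encounter", lang="en")))
# ...and subordinate it to a reused upper-level term (BFO 'process')
# rather than duplicating the reference ontology's definition.
g.add((ONTONEO.PrenatalCareEncounter, RDFS.subClassOf, OBO.BFO_0000015))

print(g.serialize(format="turtle"))
```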

    Annotation of existing databases using Semantic Web technologies: Making data more FAIR

    Making data FAIR is an elaborate task. Hospitals and departments have to invest in technologies that are usually unfamiliar to them, and often do not have the resources to make data FAIR. Our work aims to provide a framework and tooling with which users can easily make their data (more) FAIR. This framework uses RDF and OWL-based inferencing to annotate existing databases or comma-separated files. For every database, a custom ontology is built based on the database schema, which can then be annotated with matching terms from standardized terminologies. In this work, we describe the tooling developed and its current implementation in an institutional data warehouse covering over 3,000 rectal cancer patients. We report on the performance (time) of the extraction and annotation process of the developed tooling. We show that annotation of existing databases using OWL2-based reasoning is possible, and that the ontology extracted from an existing database can provide a description framework for describing and annotating existing data sources. This targets mostly the “Interoperable” aspect of FAIR.
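
    To make the annotation step concrete, here is a hedged sketch in Python with rdflib and the owlrl reasoner (the schema-derived class and instance IRIs are hypothetical, not the paper's actual implementation): a class extracted from the database schema is declared equivalent to a standardized terminology concept, after which OWL RL reasoning classifies existing row data under the standard term.

```python
# Sketch of schema-derived annotation with OWL RL reasoning (the local class
# and instance IRIs are hypothetical, not the paper's actual schema).
from rdflib import Graph, Namespace
from rdflib.namespace import OWL, RDF
from owlrl import DeductiveClosure, OWLRL_Semantics

DB = Namespace("http://example.org/hospital-db/")         # schema-derived ontology
NCIT = Namespace("http://purl.obolibrary.org/obo/NCIT_")  # NCI Thesaurus terms

g = Graph()
# One class per column value, extracted automatically from the database schema.
g.add((DB.Patient_Gender_Male, RDF.type, OWL.Class))
# Annotation step: assert equivalence with the standardized terminology concept.
g.add((DB.Patient_Gender_Male, OWL.equivalentClass, NCIT["C20197"]))  # NCIT "Male"
# A row value is typed only against the local, schema-derived class.
g.add((DB.patient_42_gender, RDF.type, DB.Patient_Gender_Male))

# OWL RL inference materializes the standardized typing for the row value.
DeductiveClosure(OWLRL_Semantics).expand(g)
print((DB.patient_42_gender, RDF.type, NCIT["C20197"]) in g)  # True
```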

    Templates as a method for implementing data provenance in decision support systems

    Decision support systems are used as a method of promoting consistent, guideline-based diagnosis, supporting clinical reasoning at the point of care. However, despite the availability of numerous commercial products, the wider acceptance of these systems has been hampered by concerns about diagnostic performance and a perceived lack of transparency in the process of generating clinical recommendations. This resonates with the Learning Health System paradigm, which promotes data-driven medicine relying on routine data capture and transformation and also stresses the need for trust in an evidence-based system. Data provenance is a way of automatically capturing the trace of a research task and its resulting data, thereby facilitating trust and the principles of reproducible research. While computational domains have started to embrace this technology through provenance-enabled execution middleware, traditionally non-computational disciplines, such as medical research, that do not rely on a single software platform are still struggling with its adoption. In order to address these issues, we introduce provenance templates: abstract provenance fragments representing meaningful domain actions. Templates can be used to generate a model-driven service interface for domain software tools to routinely capture the provenance of their data and tasks. This paper specifies the requirements for a decision support tool based on the Learning Health System, introduces the theoretical model for provenance templates, and demonstrates the resulting architecture. Our methods were tested and validated on the provenance infrastructure for a diagnostic decision support system that was developed as part of the EU FP7 TRANSFoRm project.
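
    The template mechanism itself can be sketched compactly. The toy Python below captures only the idea (the identifiers and relations are illustrative, and the paper's actual template model is richer than plain substitution): an abstract PROV-like fragment over variables is bound to concrete identifiers each time the decision support system runs, yielding a concrete provenance trace.

```python
# Toy sketch of the provenance-template idea (identifiers are illustrative;
# the paper's template model is richer than plain substitution): an abstract
# fragment over variables is bound per execution into a concrete trace.
from datetime import datetime, timezone

# Abstract fragment: (subject, relation, object) triples over template variables.
DIAGNOSIS_TEMPLATE = [
    ("var:recommendation", "prov:wasGeneratedBy", "var:dss_run"),
    ("var:dss_run", "prov:used", "var:patient_record"),
    ("var:dss_run", "prov:wasAssociatedWith", "var:clinician"),
    ("var:dss_run", "prov:endedAtTime", "var:timestamp"),
]

def instantiate(template, bindings):
    """Substitute var:* placeholders with concrete identifiers."""
    return [(bindings.get(s, s), rel, bindings.get(o, o)) for s, rel, o in template]

trace = instantiate(DIAGNOSIS_TEMPLATE, {
    "var:recommendation": "ex:recommendation-881",
    "var:dss_run": "ex:dss-run-17",
    "var:patient_record": "ex:record-5509",
    "var:clinician": "ex:dr-jones",
    "var:timestamp": datetime(2024, 3, 5, 14, 30, tzinfo=timezone.utc).isoformat(),
})
for triple in trace:
    print(triple)
```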

    Browser-based Data Annotation, Active Learning, and Real-Time Distribution of Artificial Intelligence Models: From Tumor Tissue Microarrays to COVID-19 Radiology.

    BACKGROUND: Artificial intelligence (AI) is fast becoming the tool of choice for scalable and reliable analysis of medical images. However, constraints on sharing medical data outside the institutional or geographical space, as well as difficulties in getting AI models and modeling platforms to work across different environments, have led to a "reproducibility crisis" in digital medicine. METHODS: This study details the implementation of a web platform that can be used to mitigate these challenges by orchestrating a digital pathology AI pipeline, from raw data to model inference, entirely on the local machine. We discuss how this federated platform provides governed access to data by consuming the Application Programming Interfaces exposed by cloud storage services, allows the addition of user-defined annotations, facilitates active learning for training models iteratively, and provides model inference computed directly in the web browser at practically zero cost. The latter is of particular relevance to clinical workflows because the code, including the AI model, travels to the user's data, which stays private to the governance domain where it was acquired. RESULTS: We demonstrate that the web browser can be a means of democratizing AI and advancing data socialization in medical imaging backed by consumer-facing cloud infrastructure such as Box.com. As a case study, we test the accompanying platform end-to-end on a large dataset of digital breast cancer tissue microarray core images. We also showcase how it can be applied in contexts separate from digital pathology by applying it to a radiology dataset containing COVID-19 computed tomography images. CONCLUSIONS: The platform described in this report resolves the challenges to the findable, accessible, interoperable, and reusable (FAIR) stewardship of data and AI models by integrating with cloud storage to maintain user-centric governance over the data. It also enables distributed, federated computation for AI inference over those data and proves the viability of client-side AI in medical imaging. AVAILABILITY: The open-source application is publicly available at , with a short video demonstration at
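
    The active-learning step described above typically amounts to uncertainty sampling. The generic numpy sketch below (not the platform's actual code) selects, after each inference round, the images the current model is least confident about and queues them for expert annotation.

```python
# Uncertainty sampling, the core of an active-learning loop (a generic numpy
# sketch, not the platform's actual code): the images with the least confident
# predictions are queued for expert annotation before the next training round.
import numpy as np

def least_confident(probabilities, k):
    """probabilities: (n_images, n_classes) softmax output.
    Returns indices of the k images with the lowest top-class probability."""
    top_class_prob = probabilities.max(axis=1)
    return np.argsort(top_class_prob)[:k]

# Stand-in for a model's softmax output over 100 unlabeled images, 3 classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(100, 3))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

to_annotate = least_confident(probs, k=5)
print("queue these image indices for expert annotation:", to_annotate)
```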

    Supporting personalised content management in smart health information portals

    Information portals are seen as an appropriate platform for personalised healthcare and wellbeing information provision. Efficient content management is a core capability of a successful smart health information portal (SHIP), and domain expertise is a vital input to content management when it comes to matching user profiles with the appropriate resources. The rate of generation of new health-related content far exceeds the volume that can be manually examined by domain experts for relevance to a specific topic and audience. In this paper we investigate automated content discovery as a plausible solution to this shortcoming, one that capitalises on the existing database of expert-endorsed content as an implicit store of knowledge to guide such a solution. We propose a novel content discovery technique based on a text analytics approach that utilises an existing content repository to acquire new and relevant content. We also highlight the contribution of this technique towards the realisation of smart content management for SHIPs.
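
    One plausible way to realise such a technique, sketched here under the assumption of a simple bag-of-words relevance model (not necessarily the paper's exact method), is to score candidate documents by their TF-IDF cosine similarity to the expert-endorsed repository:

```python
# Sketch of expert-endorsed content as an implicit relevance model (an
# assumption-level illustration, not the paper's exact technique): candidates
# are ranked by TF-IDF cosine similarity to the endorsed repository.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

endorsed = [
    "managing type 2 diabetes through diet and exercise",
    "blood glucose monitoring for diabetic patients",
]
candidates = [
    "smartphone app tracks daily glucose readings",
    "city council approves parking regulations",
]

vectorizer = TfidfVectorizer(stop_words="english")
endorsed_matrix = vectorizer.fit_transform(endorsed)   # fit on the endorsed corpus
candidate_matrix = vectorizer.transform(candidates)

# Each candidate's relevance = best similarity to any endorsed document.
scores = cosine_similarity(candidate_matrix, endorsed_matrix).max(axis=1)
for text, score in sorted(zip(candidates, scores), key=lambda x: -x[1]):
    print(f"{score:.2f}  {text}")
```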