12 research outputs found
Establishing a consensus for the hallmarks of cancer based on gene ontology and pathway annotations
Background:
The hallmarks of cancer provide a highly cited and well-used conceptual framework for describing the processes involved in cancer cell development and tumourigenesis. However, methods for translating these high-level concepts into data-level associations between hallmarks and genes (for high throughput analysis) vary widely between studies. The examination of different strategies to associate and map cancer hallmarks reveals significant differences, but also consensus.
Results:
Here we present the results of a comparative analysis of cancer hallmark mapping strategies, based on Gene Ontology and biological pathway annotation, from different studies. By analysing the semantic similarity between annotations, and the resulting gene set overlap, we identify emerging consensus knowledge. In addition, we analyse the differences between hallmark and gene set associations using Weighted Gene Co-expression Network Analysis and enrichment analysis.
Conclusions:
Reaching a community-wide consensus on how to identify cancer hallmark activity from research data would enable more systematic data integration and comparison between studies. These results highlight the current state of the consensus and offer a starting point for further convergence. In addition, we show how a lack of consensus can lead to large differences in the biological interpretation of downstream analyses and discuss the challenges of annotating changing and accumulating biological data, using intermediate knowledge resources that are also changing over time.
Large-scale zero-shot learning in the wild: classifying zoological illustrations
In this paper we analyse the classification of zoological illustrations. Historically, zoological illustrations were the modus operandi for the documentation of new species, and now serve as crucial sources for long-term ecological and biodiversity research. By employing computational methods for classification, the data can be made amenable to research. Automated species identification is challenging due to the long-tailed nature of the data, and the millions of possible classes in the species taxonomy. Success commonly depends on large training sets with many examples per class, but images from only a subset of classes are digitally available, and many images are unlabelled, since labelling requires domain expertise. We explore zero-shot learning to address the problem, where features are learned from classes with medium to large samples, which are then transferred to recognise classes with few or no training samples. We specifically explore how distributed, multi-modal background knowledge from data providers, such as the Global Biodiversity Information Facility (GBIF), iNaturalist, and the Biodiversity Heritage Library (BHL), can be used to share knowledge between classes for zero-shot learning. We train a prototypical network for zero-shot classification, and introduce fused prototypes (FP) and hierarchical prototype loss (HPL) to optimise the model. Finally, we analyse the performance of the model for use in real-world applications. The experimental results are encouraging, indicating potential for use of such models in an expert support system, but also express the difficulty of our task, showing a necessity for research into computer vision methods that are able to learn from small samples.
FAIR high content screening in bioimaging
The Minimum Information for High Content Screening Microscopy Experiments (MIHCSME) is a metadata model and reusable tabular template for sharing and integrating high content imaging data. It has been developed by combining the ISA (Investigations, Studies, Assays) metadata standard with a semantically enriched instantiation of REMBI (Recommended Metadata for Biological Images). The tabular template provides an easy-to-use practical implementation of REMBI, specifically for High Content Screening (HCS) data. In addition, ISA compliance enables broader integration with other types of experimental data, paving the way for visual omics and multi-Omics integration. We show the utility of MIHCSME for HCS data using multiple examples from the Leiden FAIR Cell Observatory, a Euro-Bioimaging flagship node for high content screening and the pilot node for implementing Findable, Accessible, Interoperable and Reusable (FAIR) bioimaging data throughout the Netherlands Bioimaging network.
Semantic annotation of natural history collections
Large collections of historical biodiversity expeditions are housed in natural history museums throughout the world. Potentially they can serve as rich sources of data for cultural historical and biodiversity research. However, they exist as only partially catalogued specimen repositories and images of unstructured, non-standardised, hand-written text and drawings. Although many archival collections have been digitised, disclosing their content is challenging. They refer to historical place names and outdated taxonomic classifications and are written in multiple languages. Efforts to transcribe the hand-written text can make the content accessible, but semantically describing and interlinking the content would further facilitate research. We propose a semantic model that serves to structure the named entities in natural history archival collections. In addition, we present an approach for the semantic annotation of these collections whilst documenting their provenance. This approach serves as an initial step for an adaptive learning approach for semi-automated extraction of named entities from natural history archival collections. The applicability of the semantic model and the annotation approach is demonstrated using image scans from a collection of 8,000 field book pages gathered by the Committee for Natural History of the Netherlands Indies between 1820 and 1850, and evaluated together with domain experts from the field of natural and cultural history.
Design of a FAIR digital data health infrastructure in Africa for COVID-19 reporting and research
The limited volume of COVID-19 data from Africa raises concerns for global genome research, which requires a diversity of genotypes for accurate disease prediction, including on the provenance of the new SARS-CoV-2 mutations. The Virus Outbreak Data Network (VODAN)-Africa studied the possibility of increasing the production of clinical data, finding concerns about data ownership, and the limited use of health data for quality treatment at point of care. To address this, VODAN-Africa developed an architecture to record clinical health data and research data collected on the incidence of COVID-19, producing these as human- and machine-readable data objects in a distributed architecture of locally governed, linked, human- and machine-readable data. This architecture supports analytics at the point of care and, through data visiting across facilities, generic analytics. An algorithm was run across FAIR Data Points to visit the distributed data and produce aggregate findings. The FAIR data architecture is deployed in Uganda, Ethiopia, Liberia, Nigeria, Kenya, Somalia, Tanzania, Zimbabwe, and Tunisia.
Tocilizumab in patients admitted to hospital with COVID-19 (RECOVERY): a randomised, controlled, open-label, platform trial
Background:
In this study, we aimed to evaluate the effects of tocilizumab in adult patients admitted to hospital with COVID-19 with both hypoxia and systemic inflammation.
Methods:
This randomised, controlled, open-label, platform trial (Randomised Evaluation of COVID-19 Therapy [RECOVERY]) is assessing several possible treatments in patients hospitalised with COVID-19 in the UK. Those trial participants with hypoxia (oxygen saturation <92% on air or requiring oxygen therapy) and evidence of systemic inflammation (C-reactive protein ≥75 mg/L) were eligible for random assignment in a 1:1 ratio to usual standard of care alone versus usual standard of care plus tocilizumab at a dose of 400 mg–800 mg (depending on weight) given intravenously. A second dose could be given 12–24 h later if the patient's condition had not improved. The primary outcome was 28-day mortality, assessed in the intention-to-treat population. The trial is registered with ISRCTN (50189673) and ClinicalTrials.gov (NCT04381936).
Findings:
Between April 23, 2020, and Jan 24, 2021, 4116 adults of 21 550 patients enrolled into the RECOVERY trial were included in the assessment of tocilizumab, including 3385 (82%) patients receiving systemic corticosteroids. Overall, 621 (31%) of the 2022 patients allocated tocilizumab and 729 (35%) of the 2094 patients allocated to usual care died within 28 days (rate ratio 0·85; 95% CI 0·76–0·94; p=0·0028). Consistent results were seen in all prespecified subgroups of patients, including those receiving systemic corticosteroids. Patients allocated to tocilizumab were more likely to be discharged from hospital within 28 days (57% vs 50%; rate ratio 1·22; 1·12–1·33; p<0·0001). Among those not receiving invasive mechanical ventilation at baseline, patients allocated tocilizumab were less likely to reach the composite endpoint of invasive mechanical ventilation or death (35% vs 42%; risk ratio 0·84; 95% CI 0·77–0·92; p<0·0001).
Interpretation:
In hospitalised COVID-19 patients with hypoxia and systemic inflammation, tocilizumab improved survival and other clinical outcomes. These benefits were seen regardless of the amount of respiratory support and were additional to the benefits of systemic corticosteroids.
Funding:
UK Research and Innovation (Medical Research Council) and National Institute of Health Research
Convalescent plasma in patients admitted to hospital with COVID-19 (RECOVERY): a randomised controlled, open-label, platform trial
Background:
Many patients with COVID-19 have been treated with plasma containing anti-SARS-CoV-2 antibodies. We aimed to evaluate the safety and efficacy of convalescent plasma therapy in patients admitted to hospital with COVID-19.
Methods:
This randomised, controlled, open-label, platform trial (Randomised Evaluation of COVID-19 Therapy [RECOVERY]) is assessing several possible treatments in patients hospitalised with COVID-19 in the UK. The trial is underway at 177 NHS hospitals from across the UK. Eligible and consenting patients were randomly assigned (1:1) to receive either usual care alone (usual care group) or usual care plus high-titre convalescent plasma (convalescent plasma group). The primary outcome was 28-day mortality, analysed on an intention-to-treat basis. The trial is registered with ISRCTN, 50189673, and ClinicalTrials.gov, NCT04381936.
Findings:
Between May 28, 2020, and Jan 15, 2021, 11558 (71%) of 16287 patients enrolled in RECOVERY were eligible to receive convalescent plasma and were assigned to either the convalescent plasma group or the usual care group. There was no significant difference in 28-day mortality between the two groups: 1399 (24%) of 5795 patients in the convalescent plasma group and 1408 (24%) of 5763 patients in the usual care group died within 28 days (rate ratio 1·00, 95% CI 0·93–1·07; p=0·95). The 28-day mortality rate ratio was similar in all prespecified subgroups of patients, including in those patients without detectable SARS-CoV-2 antibodies at randomisation. Allocation to convalescent plasma had no significant effect on the proportion of patients discharged from hospital within 28 days (3832 [66%] patients in the convalescent plasma group vs 3822 [66%] patients in the usual care group; rate ratio 0·99, 95% CI 0·94–1·03; p=0·57). Among those not on invasive mechanical ventilation at randomisation, there was no significant difference in the proportion of patients meeting the composite endpoint of progression to invasive mechanical ventilation or death (1568 [29%] of 5493 patients in the convalescent plasma group vs 1568 [29%] of 5448 patients in the usual care group; rate ratio 0·99, 95% CI 0·93–1·05; p=0·79).
Interpretation:
In patients hospitalised with COVID-19, high-titre convalescent plasma did not improve survival or other prespecified clinical outcomes.
Funding:
UK Research and Innovation (Medical Research Council) and National Institute of Health Research
The evolution of standards and data management practices in systems biology