9 research outputs found

Domain generalization across tumor types, laboratories, and species -- insights from the 2022 edition of the Mitosis Domain Generalization Challenge

Recognition of mitotic figures in histologic tumor specimens is highly relevant to patient outcome assessment. This task is challenging for algorithms and human experts alike, with deterioration of algorithmic performance under shifts in image representations. Considerable covariate shifts occur when assessment is performed on different tumor types, images are acquired using different digitization devices, or specimens are produced in different laboratories. This observation motivated the inception of the 2022 challenge on MItosis Domain Generalization (MIDOG 2022). The challenge provided annotated histologic tumor images from six different domains and evaluated the algorithmic approaches for mitotic figure detection provided by nine challenge participants on ten independent domains. Ground truth for mitotic figure detection was established in two ways: a three-expert consensus and an independent, immunohistochemistry-assisted set of labels. This work represents an overview of the challenge tasks, the algorithmic strategies employed by the participants, and potential factors contributing to their success. With an F1 score of 0.764 for the top-performing team, we summarize that domain generalization across various tumor domains is possible with today's deep learning-based recognition pipelines. However, we also found that domain characteristics not present in the training set (feline as new species, spindle cell shape as new morphology and a new scanner) led to small but significant decreases in performance. When assessed against the immunohistochemistry-assisted reference standard, all methods resulted in reduced recall scores, but with only minor changes in the order of participants in the ranking.
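The F1 score and recall quoted in this abstract can be illustrated with a minimal sketch. It assumes detections have already been matched to reference annotations, so that only counts of true positives, false positives and false negatives remain. The function names and the counts are hypothetical and are not taken from the challenge.

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of reference mitotic figures that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical counts for illustration only.
print(round(f1_from_counts(tp=80, fp=20, fn=30), 3))  # 0.762
# A stricter reference standard that adds missed figures (more FN) lowers recall.
print(recall(tp=80, fn=30) > recall(tp=80, fn=45))    # True
```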

arXiv.org e-Print Archive

Multi-label classification for biomedical literature: an overview of the BioCreative VII LitCovid Track for COVID-19 literature topic annotations

The coronavirus disease 2019 (COVID-19) pandemic has been severely impacting global society since December 2019. The related findings such as vaccine and drug development have been reported in biomedical literature at a rate of about 10 000 articles on COVID-19 per month. Such rapid growth significantly challenges manual curation and interpretation. For instance, LitCovid is a literature database of COVID-19-related articles in PubMed, which has accumulated more than 200 000 articles with millions of accesses each month by users worldwide. One primary curation task is to assign up to eight topics (e.g. Diagnosis and Treatment) to the articles in LitCovid. The annotated topics have been widely used for navigating the COVID literature, rapidly locating articles of interest and other downstream studies. However, annotating the topics has been the bottleneck of manual curation. Despite the continuing advances in biomedical text-mining methods, few have been dedicated to topic annotations in COVID-19 literature. To close the gap, we organized the BioCreative LitCovid track to call for a community effort to tackle automated topic annotation for COVID-19 literature. The BioCreative LitCovid dataset, consisting of over 30 000 articles with manually reviewed topics, was created for training and testing. It is one of the largest multi-label classification datasets in biomedical scientific literature. Nineteen teams worldwide participated and made 80 submissions in total. Most teams used hybrid systems based on transformers. The highest performing submissions achieved 0.8875, 0.9181 and 0.9394 for macro-F1-score, micro-F1-score and instance-based F1-score, respectively. Notably, these scores are substantially higher (e.g. 12% higher for macro F1-score) than the corresponding scores of the state-of-the-art multi-label classification method. The level of participation and results demonstrate a successful track and help close the gap between dataset curation and method development. The dataset is publicly available via https://ftp.ncbi.nlm.nih.gov/pub/lu/LitCovid/biocreative/ for benchmarking and further development.
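The three averages named in this abstract weight errors differently, which a small sketch can make concrete. It uses the common textbook definitions (macro: mean of per-label F1; micro: F1 over pooled counts; instance-based: mean of per-article F1), and the track's official scorer may differ in details. The function name and the toy labels are assumptions made for illustration.

```python
from typing import List, Set, Tuple


def _f1(tp: int, fp: int, fn: int) -> float:
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def multilabel_f1(gold: List[Set[str]], pred: List[Set[str]]) -> Tuple[float, float, float]:
    """Return (macro, micro, instance-based) F1 for non-empty, aligned inputs."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and the same length")
    labels = sorted(set().union(*gold, *pred))
    counts = {lab: [0, 0, 0] for lab in labels}  # [tp, fp, fn] per label
    for g, p in zip(gold, pred):
        for lab in g & p:
            counts[lab][0] += 1
        for lab in p - g:
            counts[lab][1] += 1
        for lab in g - p:
            counts[lab][2] += 1
    macro = sum(_f1(*c) for c in counts.values()) / len(labels) if labels else 0.0
    tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))
    micro = _f1(tp, fp, fn)
    instance = sum(_f1(len(g & p), len(p - g), len(g - p))
                   for g, p in zip(gold, pred)) / len(gold)
    return macro, micro, instance


# Toy example with made-up annotations for three articles.
gold = [{"Treatment", "Diagnosis"}, {"Prevention"}, {"Treatment"}]
pred = [{"Treatment"}, {"Prevention", "Treatment"}, {"Treatment"}]
print(tuple(round(x, 3) for x in multilabel_f1(gold, pred)))  # (0.6, 0.75, 0.778)
```

In the toy data the rare label "Diagnosis" is never predicted, which pulls the macro average down more than the micro average.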

Enlighten

Text and Network-Mining for COVID-19 Intervention Studies

Author: Aditya Rao
Naveen Sivadasan
Rajgopal Srinivasan
Sujatha Kotte
Thomas Joseph
VG Saipradeep
Publication date: 06/05/2020

Background: The COVID-19 pandemic has led to a massive and collective pursuit by the research community to find effective diagnostics, drugs and vaccines. The large and growing body of literature present in MEDLINE and other online resources, including various self-archive sites, is invaluable for these efforts. MEDLINE has more than 30 million abstracts and an additional corpus related to COVID-19, SARS and MERS has more than 40,000 literature articles, and these numbers are growing. Automated extraction of useful information from literature and automated generation of novel insights is crucial for accelerated discovery of drug/vaccine targets and re-purposing drug candidates. Methods: We applied text-mining on MEDLINE abstracts and the CORD-19 corpus to extract a rich set of pair-wise correlations between various biomedical entities. We built a comprehensive pair-wise entity association network involving 15 different entity types using both text-mined associations as well as novel associations obtained using link prediction. The resulting network, which we call CoNetz, also contains a specialized COVID-19 subnetwork that provides a network view of COVID-19 related literature. Additionally, we developed a set of network exploration utilities and user-friendly network visualization utilities using NetworkX and PyVis. Results: CoNetz consisted of pair-wise associations involving 174,000 entities covering 15 different entity types. The specialized COVID-19 subnetwork consisted of 7.8 million pair-wise associations involving 43,000 entities. The network captured several of the well-known COVID-19 drug re-purposing candidates and also predicted novel candidates including ingavirin, laninamivir, nevirapine, paritaprevir, pranlukast and peficitinib. Conclusions: Our automated text and network-mining approach builds an up-to-date and comprehensive knowledge network from literature for COVID-19 studies. The wide range of entity types captured in CoNetz provides a rich neighborhood context around the relations of interest. The approach avoids multiple drawbacks associated with manual curation including cost and effort involved, lack of up-to-date information and limited coverage. Amongst the novel repurposing drugs predicted, laninamivir and paritaprevir are possible COVID-19 anti-viral drugs while pranlukast was postulated to be a candidate for managing severe respiratory symptoms in COVID-19 patients. CoNetz is available for download and use from https://web.rniapps.net/tcn/tcn.tar.gz
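The idea of a pair-wise association network extended by link prediction can be sketched in a few lines. The abstract does not say which link-prediction method was used, so the Jaccard neighbourhood score below is only a stand-in. Plain dictionaries replace NetworkX to keep the sketch dependency-free, and all entity names are placeholders.

```python
from collections import defaultdict
from itertools import combinations


def build_network(docs):
    """Link every pair of entities that are co-mentioned in the same abstract."""
    neighbors = defaultdict(set)
    for entities in docs:
        for a, b in combinations(sorted(entities), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def jaccard_candidates(neighbors):
    """Score unlinked pairs by shared neighbours / union of neighbours."""
    scores = {}
    for a, b in combinations(sorted(neighbors), 2):
        if b in neighbors[a]:
            continue  # already a text-mined association
        shared = neighbors[a] & neighbors[b]
        if shared:
            scores[(a, b)] = len(shared) / len(neighbors[a] | neighbors[b])
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Placeholder entity sets, one per abstract (not real extractions).
docs = [{"drugA", "geneX"}, {"geneX", "diseaseY"}, {"drugB", "geneX", "diseaseY"}]
net = build_network(docs)
for pair, score in jaccard_candidates(net):
    print(pair, round(score, 2))
```

Here "drugA" and "diseaseY" never co-occur, yet both are linked to "geneX", so the pair surfaces as a candidate association.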

ChemRxiv

Phenotype-driven gene prioritization for rare diseases using graph convolution on heterogeneous networks

Author: Aditya Rao
Naveen Sivadasan
Rajgopal Srinivasan
Saipradeep VG
Sujatha Kotte
Thomas Joseph
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/07/2018

Background: One of the major goals of genomic medicine is the identification of causal genomic variants in a patient and their relation to the observed clinical phenotypes. Prioritizing the genomic variants by considering only the genotype information usually identifies a few hundred potential variants. Narrowing it down further to find the causal disease genes and relating them to the observed clinical phenotypes remains a significant challenge, especially for rare diseases. Methods: We propose a phenotype-driven gene prioritization approach using heterogeneous networks in the context of rare diseases. Towards this, we first built a heterogeneous network consisting of ontological associations as well as curated associations involving genes, diseases, phenotypes and pathways from multiple sources. Motivated by the recent progress in spectral graph convolutions, we developed a graph convolution based technique to infer new phenotype-gene associations from this initial set of associations. We included these inferred associations in the initial network and termed this integrated network HANRD (Heterogeneous Association Network for Rare Diseases). We validated this approach on 230 recently published rare disease clinical cases using the case phenotypes as input. Results: When HANRD was queried with the case phenotypes as input, the causal genes were captured within Top-50 for more than 31% of the cases and within Top-200 for more than 56% of the cases. The results showed improved performance when compared to other state-of-the-art tools. Conclusions: In this study, we showed that the heterogeneous network HANRD, consisting of curated, ontological and inferred associations, helped improve causal gene identification in rare diseases. HANRD allows future enhancements by supporting incorporation of new entity types and additional information sources.
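As a rough illustration of the two ideas in this abstract, the sketch below spreads scores from query phenotypes over a small network and then measures a Top-k capture rate. It is not the HANRD method. It assumes the renormalized adjacency commonly used in graph convolution layers, D^{-1/2}(A + I)D^{-1/2}, with no learned weights, and the node names and toy graph are hypothetical.

```python
import math


def propagate(adj, scores, steps=2):
    """Apply s <- D^{-1/2} (A + I) D^{-1/2} s for a fixed number of steps."""
    deg = {n: len(adj[n]) + 1 for n in adj}  # +1 accounts for the self-loop
    for _ in range(steps):
        new = {}
        for n in adj:
            total = scores.get(n, 0.0) / deg[n]  # self-loop term
            for m in adj[n]:
                total += scores.get(m, 0.0) / math.sqrt(deg[n] * deg[m])
            new[n] = total
        scores = new
    return scores


def top_k_rate(rankings, causal_genes, k):
    """Fraction of cases whose causal gene appears in the first k ranked genes."""
    hits = sum(1 for ranked, gene in zip(rankings, causal_genes) if gene in ranked[:k])
    return hits / len(causal_genes)


# Toy undirected graph: phenotype P1 - disease D1 - gene G1, and a separate D2 - G2.
adj = {"P1": {"D1"}, "D1": {"P1", "G1"}, "G1": {"D1"}, "D2": {"G2"}, "G2": {"D2"}}
scores = propagate(adj, {"P1": 1.0}, steps=2)
ranked = sorted(["G1", "G2"], key=lambda g: -scores[g])
print(ranked, top_k_rate([ranked], ["G1"], k=1))
```

Two propagation steps are needed here because the gene G1 sits two hops away from the query phenotype P1.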

Directory of Open Access Journals

Additional file 2 of Phenotype-driven gene prioritization for rare diseases using graph convolution on heterogeneous networks

Author: Aditya Rao (5489900)
Naveen Sivadasan (5489894)
Rajgopal Srinivasan (5489906)
Saipradeep VG (5489897)
Sujatha Kotte (5489903)
Thomas Joseph (408377)

Experimental results in tabular format. (PDF 66 kb)

FigShare

Additional file 1 of Phenotype-driven gene prioritization for rare diseases using graph convolution on heterogeneous networks

Author: Aditya Rao (5489900)
Naveen Sivadasan (5489894)
Rajgopal Srinivasan (5489906)
Saipradeep VG (5489897)
Sujatha Kotte (5489903)
Thomas Joseph (408377)

Rare disease clinical cases from recent publications. (PDF 139 kb)

FigShare

Benchmarked approaches for reconstruction of in vitro cell lineages and in silico models of C. elegans and M. musculus developmental trees

Author: Chow K-HK
Chung V
Elowitz MB
Garry DJ
Gong W
Granados AA
Guan Y
Han L
Hu J
Jones MG
Joseph T
Khodaverdian A
Kwak I-Y
Liu Z
Mason M
Meyer P
Peng J
Prusokas A
Prusokas A
Rao A
Rao S
Raz O
Rennert P
Retkute R
Saipradeep VG
Salvador-Martinez I
Shang X
Shapiro E
Shendure J
Sivadasan N
Srinivasan R
Telford MJ
Wang R
Yosef N
Yu T
Zhang H
Zhang R
Publication venue: Cell Press
Publication date: 18/08/2021

Newcastle University E-Prints

Multi-label classification for biomedical literature: an overview of the BioCreative VII LitCovid Track for COVID-19 literature topic annotations

Author: Allot Alexis
Bagherzadeh Parsa
Bergler Sabine
Bhatnagar Aakash
Bhavsar Nidhir
Chang Yung-Chun
Chatterjee Niladri
Chen Qingyu
Chersoni Emmanuele
Chizhikova Mariia
Dong Hang
Du Jingcheng
Dufour Richard
Fang Li
Friedrich Annemarie
Gu Jinghang
Islamaj Rezarta
Labrak Yanis
Laleye Fréjus
Leaman Robert
Lin Sheng-Jie
Lu Zhiyong
Otmakhova Yulia
Pollak Senja
Pujari Subhash Chandra
Rakotoson Loïc
Sivadasan Naveen
Tandon Kushagri
Tang Wentai
Tavchioski Ilija
Tian Shubo
Vg Saipradeep
Wang Kai
Wu Honghan
Xu Shuo
Yepes Antonio Jimeno
Zhang Hongtong
Zhang Jinfeng
Zhang Yuefu
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2022


OPUS Augsburg

INRIA a CCSD electronic archive server

HAL Descartes

PubMed Central

UCL Discovery

Edinburgh Research Explorer

Oxford University Research Archive

Enlighten

Hal-Diderot

Phenotype-driven gene prioritization for rare diseases using graph convolution on heterogeneous networks

Author: Aditya Rao
Naveen Sivadasan
Rajgopal Srinivasan
Saipradeep VG
Sujatha Kotte
Thomas Joseph
Publication venue: 'Springer Science and Business Media LLC'

Crossref