110 research outputs found

An explainable framework for drug repositioning from disease information network

Author: Duan Lei
He Chengxin
Huang Menglin
Song Linlin
Zheng Huiru
Publication venue: Elsevier BV
Publication date: 28/10/2022

Deep learning in clinical natural language processing: a methodical review.

Author: Datta Surabhi
Du Jingcheng
Ji Zongcheng
Roberts Kirk
Si Yuqi
Soni Sarvesh
Wang Qiong
Wei Qiang
Wu Stephen
Xiang Yang
Xu Hua
Zhao Bo
Publication venue: DigitalCommons@TMC
Publication date: 01/03/2020

OBJECTIVE: This article methodically reviews the literature on deep learning (DL) for natural language processing (NLP) in the clinical domain, providing quantitative analysis to answer 3 research questions concerning methods, scope, and context of current research. MATERIALS AND METHODS: We searched MEDLINE, EMBASE, Scopus, the Association for Computing Machinery Digital Library, and the Association for Computational Linguistics Anthology for articles using DL-based approaches to NLP problems in electronic health records. After screening 1,737 articles, we collected data on 25 variables across 212 papers. RESULTS: DL in clinical NLP publications more than doubled each year, through 2018. Recurrent neural networks (60.8%) and word2vec embeddings (74.1%) were the most popular methods; the information extraction tasks of text classification, named entity recognition, and relation extraction were dominant (89.2%). However, there was a long tail of other methods and specific tasks. Most contributions were methodological variants or applications, but 20.8% were new methods of some kind. The earliest adopters were in the NLP community, but the medical informatics community was the most prolific. DISCUSSION: Our analysis shows growing acceptance of deep learning as a baseline for NLP research, and of DL-based NLP in the medical community. A number of common associations were substantiated (eg, the preference of recurrent neural networks for sequence-labeling named entity recognition), while others were surprisingly nuanced (eg, the scarcity of French language clinical NLP with deep learning). CONCLUSION: Deep learning has not yet fully penetrated clinical NLP and is growing rapidly. This review highlighted both the popular and unique trends in this active field

DigitalCommons@The Texas Medical Center

Prediction of microbial communities for urban metagenomics using neural network approach.

Author: Jiang Jyun-Yu
Ju Chelsea J-T
Wang Wei
Zhou Guangyu
Publication venue: eScholarship, University of California
Publication date: 01/10/2019

BACKGROUND: Microbes are closely associated with human health and disease, especially in densely populated cities. Understanding the microbial ecosystem in an urban environment is essential for cities to monitor the transmission of infectious diseases and detect potentially urgent threats. To achieve this goal, DNA sample collection and analysis have been conducted at subway stations in major cities. However, city-scale sampling with fine-grained geo-spatial resolution is expensive and laborious. In this paper, we introduce MetaMLAnn, a neural network based approach to infer microbial communities at unsampled locations given information reflecting different factors, including subway line networks, sampling material types, and microbial composition patterns. RESULTS: We evaluate the effectiveness of MetaMLAnn on the public metagenomics dataset collected from multiple locations in the New York and Boston subway systems. The experimental results suggest that MetaMLAnn consistently performs better than five other conventional classifiers under different taxonomic ranks. At the genus level, MetaMLAnn achieves F1 scores of 0.63 and 0.72 on the New York and Boston datasets, respectively. CONCLUSIONS: By exploiting heterogeneous features, MetaMLAnn captures the hidden interactions between microbial compositions and the urban environment, which enables precise predictions of microbial communities at unmeasured locations.

eScholarship - University of California

Digital empathy secures Frankenstein's monster

Author: Bond RR
Engel F
Fuchs M
Hemmje Matthias
McKevitt PM
McTear Michael
Mulvenna Maurice
Walsh Paul
Zheng Huiru
Publication venue: Collaborative European Research Conference
Publication date: 30/03/2019

Ulster University's Research Portal

Predicting the artificial immunity induced by RUTI® vaccine against tuberculosis using universal immune system simulator (UISS)

Author: Amat Merce
Bonaccorso Angela
Cardona Pere-Joan
Fichera Epifanio
Mitra Dipendra Kumar
Pappalardo Francesco
Parasiliti Palumbo Giuseppe Alessandro
Pennisi Marzio
Russo Giulia
Sgroi Giuseppe
Viceconti Marco
Walker Kenneth B.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

BACKGROUND: Tuberculosis (TB) is a worldwide cause of mortality (it infects one third of the world's population), affecting mostly developing countries, including India, and recently also developed ones, owing to the increased mobility of the world population and the evolution of new bacterial strains capable of provoking multi-drug resistance. Currently, antitubercular drugs are unable to eradicate subpopulations of Mycobacterium tuberculosis (MTB) bacilli, and therapeutic vaccinations have been postulated to overcome some of the critical issues related to the increase of drug-resistant forms and the difficult clinical and public health management of tuberculosis patients. The Horizon 2020 EC-funded project "In Silico Trial for Tuberculosis Vaccine Development" (STriTuVaD) aims to support the identification of new therapeutic interventions against tuberculosis through novel in silico modelling of human immune responses to disease and vaccines, thereby drastically reducing the cost of clinical trials in this critical sector of public healthcare.

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Diposit Digital de Documents de la UAB

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale

DeepEP: A Deep Learning Framework for Identifying Essential Proteins

Author: Li Min
Li Yaohang
Pan Yi
Wu Fang-Xiang
Zeng Min
Publication venue: ODU Digital Commons
Publication date: 01/12/2019

Background: Essential proteins are crucial for cellular life, so identifying them is an important topic and a challenging problem for researchers. Many computational approaches have recently been proposed to handle this problem. However, traditional centrality methods cannot fully represent the topological features of biological networks. In addition, identifying essential proteins is an imbalanced learning problem, but few current shallow machine learning-based methods are designed to handle the imbalanced characteristics. Results: We develop DeepEP, based on a deep learning framework that uses the node2vec technique, multi-scale convolutional neural networks, and a sampling technique to identify essential proteins. In DeepEP, the node2vec technique is applied to automatically learn topological and semantic features for each protein in the protein-protein interaction (PPI) network. Gene expression profiles are treated as images, and multi-scale convolutional neural networks are applied to extract their patterns. In addition, DeepEP uses a sampling method to alleviate the imbalanced characteristics. The sampling method draws the same number of majority and minority samples in each training epoch, so training is not biased toward either class. The experimental results show that DeepEP outperforms traditional centrality methods. Moreover, DeepEP is better than shallow machine learning-based methods. Detailed analyses show that the dense vectors generated by the node2vec technique contribute substantially to the improved performance. The node2vec technique effectively captures the topological and semantic properties of the PPI network. The sampling method also improves the performance of identifying essential proteins. Conclusion: We demonstrate that DeepEP improves the prediction performance by integrating multiple deep learning techniques and a sampling method. DeepEP is more effective than existing methods.
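
The per-epoch balanced sampling mentioned in this abstract can be illustrated with a minimal sketch. It is not the DeepEP implementation: the function name `balanced_epoch`, the toy protein identifiers, and the choice to down-sample the majority class to the size of the minority class are all assumptions made for illustration.

```python
import random


def balanced_epoch(majority, minority, rng):
    """Draw an equal number of majority and minority samples for one epoch.

    Assumption: the majority class is down-sampled to the minority size.
    """
    k = min(len(majority), len(minority))
    batch = rng.sample(majority, k) + rng.sample(minority, k)
    rng.shuffle(batch)  # mix the two classes before training on the epoch
    return batch


# Hypothetical toy data: (protein_id, label) pairs, label 1 = essential.
non_essential = [(f"P{i}", 0) for i in range(20)]
essential = [(f"E{i}", 1) for i in range(5)]

rng = random.Random(0)
epoch = balanced_epoch(non_essential, essential, rng)
n_pos = sum(label for _, label in epoch)
print(len(epoch), n_pos)  # 10 samples in the epoch, 5 of them essential
```

Calling `balanced_epoch` again for each epoch draws a fresh subset of the majority class, so over many epochs most majority samples are still seen while every epoch stays balanced.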

Old Dominion University

Interpretable Deep Neural Network for Cancer Survival Analysis by Integrating Genomic and Clinical Data

Author: Hao Jie
Kang Mingon
Kim Youngsoon
Mallavarapu Tejaswini
Oh Jung Hun
Publication venue: Digital Scholarship@UNLV
Publication date: 23/12/2019

Background: Understanding the complex biological mechanisms of cancer patient survival using genomic and clinical data is vital, not only to develop new treatments for patients, but also to improve survival prediction. However, highly nonlinear and high-dimension, low-sample size (HDLSS) data pose computational challenges for conventional survival analysis. Results: We propose a novel biologically interpretable pathway-based sparse deep neural network, named Cox-PASNet, which integrates high-dimensional gene expression data and clinical data on a simple neural network architecture for survival analysis. Cox-PASNet is biologically interpretable: nodes in the neural network correspond to biological genes and pathways, while the network captures the nonlinear and hierarchical effects of biological pathways associated with cancer patient survival. We also propose a heuristic optimization solution to train Cox-PASNet with HDLSS data. Cox-PASNet was intensively evaluated by comparing its predictive performance with that of current state-of-the-art methods on glioblastoma multiforme (GBM) and ovarian serous cystadenocarcinoma (OV) cancer. In the experiments, Cox-PASNet outperformed the benchmarking methods. Moreover, the neural network architecture of Cox-PASNet was biologically interpreted, and several significant prognostic factors of genes and biological pathways were identified. Conclusions: Cox-PASNet models biological mechanisms in the neural network by incorporating biological pathway databases and sparse coding. The neural network of Cox-PASNet can identify nonlinear and hierarchical associations of genomic and clinical data to cancer patient survival. The open-source code of Cox-PASNet in PyTorch implemented for training, evaluation, and model interpretation is available at: https://github.com/DataX-JieHao/Cox-PASNet

University of Nevada, Las Vegas Repository

Medinoid: computer-aided diagnosis and localization of glaucoma using deep learning

Author: De Neve Wesley
Han Jong Chul
Hyun Seung Hyup
Janssens Olivier
Kee Changwon
Kim Mijung
Van Hoecke Sofie
Publication venue: MDPI AG
Publication date: 01/01/2019

Glaucoma is a leading eye disease, causing vision loss by gradually affecting peripheral vision if left untreated. Current diagnosis of glaucoma is performed by ophthalmologists, human experts who typically need to analyze different types of medical images generated by different types of medical equipment: fundus, Retinal Nerve Fiber Layer (RNFL), Optical Coherence Tomography (OCT) disc, OCT macula, perimetry, and/or perimetry deviation. Capturing and analyzing these medical images is labor intensive and time consuming. In this paper, we present a novel approach for glaucoma diagnosis and localization, only relying on fundus images that are analyzed by making use of state-of-the-art deep learning techniques. Specifically, our approach towards glaucoma diagnosis and localization leverages Convolutional Neural Networks (CNNs) and Gradient-weighted Class Activation Mapping (Grad-CAM), respectively. We built and evaluated different predictive models using a large set of fundus images, collected and labeled by ophthalmologists at Samsung Medical Center (SMC). Our experimental results demonstrate that our most effective predictive model is able to achieve a high diagnosis accuracy of 96%, as well as a high sensitivity of 96% and a high specificity of 100% for Dataset-Optic Disc (OD), a set of center-cropped fundus images highlighting the optic disc. Furthermore, we present Medinoid, a publicly-available prototype web application for computer-aided diagnosis and localization of glaucoma, integrating our most effective predictive model in its back-end

Multidisciplinary Digital Publishing Institute

Ghent University Academic Bibliography

Graph Theoretic and Pearson Correlation-Based Discovery of Network Biomarkers for Cancer

Author: Al Mamun Abdullah
Aqila Tasmia
Maharjan Mona
Mondal Ananda Mohan
Tanvir Raihanul Bari
Publication venue: FIU Digital Commons
Publication date: 05/06/2019

Two graph-theoretic concepts, cliques and bipartite graphs, are explored to identify network biomarkers for cancer at the gene network level. The rationale is that a group of genes works together by forming a cluster or clique-like structure to initiate a cancer. After initiation, the disease signal goes to the next group of genes related to the second stage of the cancer, which can be represented as a bipartite graph. In other words, bipartite graphs represent the cross-talk among the genes between two disease stages. To test this hypothesis, gene expression values for three cancers, breast invasive carcinoma (BRCA), colorectal adenocarcinoma (COAD), and glioblastoma multiforme (GBM), are used for analysis. First, a co-expression gene network is generated from highly correlated gene pairs with a Pearson correlation coefficient ≥ 0.9. Second, clique structures of all sizes are isolated from the co-expression network. Then, by combining these cliques, three different biomarker modules are developed: maximal clique-like modules, 2-clique-1-bipartite modules, and 3-clique-2-bipartite modules. The list of biomarker genes discovered from these network modules is validated as the essential genes for causing a cancer in terms of network properties and survival analysis. This list of biomarker genes will help biologists to design wet lab experiments for further elucidating the complex mechanism of cancer.
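
The first two steps described in this abstract (thresholding Pearson correlation at 0.9, then isolating cliques) can be sketched as follows. This is an illustrative sketch only, not the authors' pipeline: the toy expression values, the function names, the restriction to cliques of at least two genes, and the use of the basic Bron-Kerbosch algorithm for maximal cliques are assumptions. It also assumes no gene has constant expression, since that would make the correlation undefined.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_graph(expr, threshold=0.9):
    """Connect two genes when their correlation is >= threshold."""
    adj = {gene: set() for gene in expr}
    for g1, g2 in combinations(expr, 2):
        if pearson(expr[g1], expr[g2]) >= threshold:
            adj[g1].add(g2)
            adj[g2].add(g1)
    return adj


def maximal_cliques(adj, min_size=2):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            if len(r) >= min_size:
                cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}  # v is done: exclude it from further candidates
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


# Hypothetical toy expression profiles (one value per sample).
expr = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 6, 8, 10],
    "C": [1, 2, 3, 4, 6],
    "D": [5, 1, 4, 2, 3],
}
graph = coexpression_graph(expr)
print(maximal_cliques(graph))  # [['A', 'B', 'C']]
```

In the toy data, genes A, B, and C are pairwise correlated above 0.9 and form one clique, while D correlates negatively with the others and stays isolated.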

DigitalCommons@Florida International University

The RareDis corpus: A corpus annotated with rare diseases, their signs and symptoms

Author: Chacon Solano Esteban Gonzalo
Guerrero Aspizua Sara
Martinez De Miguel Claudia
Segura-Bedmar Isabel
Publication venue: Elsevier BV
Publication date: 01/01/2022

Rare diseases affect a small number of people compared to the general population. However, more than 6,000 different rare diseases exist and, in total, they affect more than 300 million people worldwide. A central problem shared by rare diseases is the delay in diagnosis and the sparse information available to researchers, clinicians, and patients. Finding a diagnosis can be a very long and frustrating experience for patients and their families. The average diagnostic delay is between 6 and 8 years. Many of these diseases manifest differently among patients, which further hampers their detection and the choice of the correct treatment. Therefore, there is an urgent need to increase the scientific and medical knowledge about rare diseases. Natural Language Processing (NLP) can help to extract relevant information about rare diseases to facilitate their diagnosis and treatment, but most NLP techniques require manually annotated corpora. Therefore, our goal is to create a gold standard corpus annotated with rare diseases and their clinical manifestations. It could be used to train and test NLP approaches, and the information extracted through NLP could enrich the knowledge of rare diseases and thereby help to reduce the diagnostic delay and improve the treatment of rare diseases. The paper describes the selection of 1,041 texts to be included in the corpus, the annotation process, and the annotation guidelines. The entities (disease, rare disease, symptom, sign and anaphor) and the relationships (produces, is a, is acron, is synon, increases risk of, anaphora) were annotated. The RareDis corpus contains annotations for more than 5,000 rare diseases and almost 6,000 clinical manifestations. Moreover, the Inter Annotator Agreement evaluation shows a relatively high agreement (F1-measure equal to 83.5% under exact match criteria for the entities and equal to 81.3% for the relations).
Based on these results, this corpus is of high quality and represents a significant step for the field, since there is a scarcity of available corpora annotated with rare diseases. This could open the door to further NLP applications, which would facilitate the diagnosis and treatment of these rare diseases and, therefore, would dramatically improve the quality of life of these patients. This work was supported by the Madrid Government (Comunidad de Madrid) under the Multiannual Agreement with UC3M in the line of "Fostering Young Doctors Research" (NLP4RARE-CM-UC3M) and in the context of the V PRICIT (Regional Programme of Research and Technological Innovation); the Multiannual Agreement with UC3M in the line of "Excellence of University Professors (EPUC3M17)"; and a grant from Spanish Ministry of Economy and Competitiveness (SAF2017-86810-R).

Universidad Carlos III de Madrid e-Archivo