47 research outputs found

    Explainable and Lightweight Model for COVID-19 Detection Using Chest Radiology Images

    Deep learning (DL) analysis of chest X-ray (CXR) and computed tomography (CT) images has garnered a lot of attention in recent times due to the COVID-19 pandemic. Convolutional neural networks (CNNs) are well suited to image analysis tasks when trained on very large amounts of data. Applications developed for medical image analysis require high sensitivity and precision compared to other fields. Most of the tools proposed for detection of COVID-19 claim high sensitivity and recall but have failed to generalize and perform well when tested on unseen datasets. This encouraged us to develop a CNN model and to analyze and understand its performance by visualizing the model's predictions with class activation maps generated using the Gradient-weighted Class Activation Mapping (Grad-CAM) technique. This study provides a detailed discussion of the success and failure of the proposed model at the image level. The performance of the model is compared with state-of-the-art DL models and shown to be comparable. The data and code used are available at https://github.com/aleesuss/c19
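For readers unfamiliar with Grad-CAM, the following is a minimal sketch of the map computation itself, not the authors' implementation. It assumes the feature maps of the last convolutional layer and the gradients of the class score with respect to those maps have already been extracted from a trained network; the function name `grad_cam` and the nested-list input format are hypothetical choices made for illustration.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heat map from precomputed tensors.

    activations: list of K feature maps, each an H x W nested list
    gradients:   list of K maps of the same shape, holding the gradient
                 of the class score with respect to each activation
    Both inputs are assumed to come from a trained CNN (not shown here).
    """
    k = len(activations)
    h = len(activations[0])
    w = len(activations[0][0])

    # Channel weights: global-average-pool the gradients of each channel.
    weights = [sum(sum(row) for row in g) / (h * w) for g in gradients]

    # Weighted sum of feature maps, followed by ReLU.
    cam = [
        [
            max(0.0, sum(weights[c] * activations[c][i][j] for c in range(k)))
            for j in range(w)
        ]
        for i in range(h)
    ]

    # Scale to [0, 1] for display, unless the map is all zeros.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam
```

In practice the resulting map is upsampled to the input image size and overlaid on the radiograph, so that regions with high values indicate where the evidence for the predicted class was found.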

    Evaluating Generalizability of Deep Learning Models Using Indian-COVID-19 CT Dataset

    Computed tomography (CT) has been routinely used for the diagnosis of lung diseases and recently, during the pandemic, for detecting the infectivity and severity of COVID-19 disease. One of the major concerns in using machine learning (ML) approaches for automatic processing of CT scan images in a clinical setting is that these methods are trained on limited and biased subsets of publicly available COVID-19 data. This has raised concerns regarding the generalizability of these models to external datasets not seen by the model during training. To address some of these issues, in this work CT scan images from confirmed COVID-19 data obtained from one of the largest public repositories, COVIDx CT 2A, were used for training and internal validation of machine learning models. For external validation we generated the Indian-COVID-19 CT dataset, an open-source repository containing 3D CT volumes and 12096 chest CT images from 288 COVID-19 patients from India. A comparative performance evaluation of four state-of-the-art machine learning models, viz., a lightweight convolutional neural network (CNN) and three other CNN-based deep learning (DL) models, VGG-16, ResNet-50 and Inception-v3, in classifying CT images into three classes, viz., normal, non-COVID pneumonia, and COVID-19, is carried out on these two datasets. Our analysis showed that the performance of all the models is comparable on the hold-out COVIDx CT 2A test set, with accuracies of 90% to 99% (96% for CNN), while on the external Indian-COVID-19 CT dataset a drop in performance is observed for all the models (8% to 19%). The traditional machine learning model, the lightweight CNN, performed the best on the external dataset (accuracy 88%) in comparison to the deep learning models, indicating that a lightweight CNN generalizes better to unseen data. The data and code are made available at https://github.com/aleesuss/c19

    SeqVItA: Sequence Variant Identification and Annotation Platform for Next Generation Sequencing Data

    The current trend in clinical data analysis is to understand how individuals respond to therapies and drug interactions based on their genetic makeup. This has led to a paradigm shift in healthcare; caring for patients is now 99% information and 1% intervention. The falling cost of next generation sequencing (NGS) technologies has made it possible to take genetic profiling to the clinical setting. This requires not just fast and accurate algorithms for variant detection, but also a knowledge base for variant annotation and prioritization to facilitate tailored therapeutics based on an individual's genetic profile. Here we show that it is possible to provide fast and easy access to all available information about a variant and its impact on the gene, its protein product, associated pathways and drug-variant interactions by integrating previously reported knowledge from various databases. With this objective, we have developed a pipeline, Sequence Variants Identification and Annotation (SeqVItA), that provides an end-to-end solution for detection, annotation and prioritization of small sequence variants on a single platform. With a parallelized variant detection step and numerous resources incorporated to infer functional impact, clinical relevance and drug-variant associations, SeqVItA will benefit the clinical and research communities alike. Its open-source platform and modular framework allow for easy customization of the workflow depending on the data type (single, paired, or pooled samples), variant type (germline and somatic), and variant annotation and prioritization. A performance comparison of SeqVItA on simulated data, and the detection, interpretation and analysis of somatic variants on real data (24 liver cancer patients), are carried out. We demonstrate the efficacy of the annotation module in facilitating personalized medicine based on a patient's mutational landscape. SeqVItA is freely available at https://bioinf.iiit.ac.in/seqvita

    Meta-analysis of drought-tolerant genotypes in Oryza sativa: A network-based approach.

    Background: Drought is a severe environmental stress. It is estimated that about 50% of the world rice production is affected mainly by drought. Apart from conventional breeding strategies to develop drought-tolerant crops, innovative computational approaches may provide insights into the underlying molecular mechanisms of stress response and identify drought-responsive markers. Here we propose a network-based computational approach involving a meta-analytic study of seven drought-tolerant rice genotypes under drought stress. Results: Co-expression networks enable large-scale analysis of gene-pair associations and tightly coupled clusters that may represent coordinated biological processes. Considering differentially expressed genes in the co-expressed modules and supplementing external information such as resistance/tolerance QTLs, transcription factors, and network-based topological measures, we identify and prioritize drought-adaptive co-expressed gene modules and potential candidate genes. Using the candidate genes that are well-represented across the datasets as 'seed' genes, two drought-specific protein-protein interaction networks (PPINs) are constructed with up- and down-regulated genes. Cluster analysis of the up-regulated PPIN revealed the ABA signalling pathway as a central process in drought response with a probable crosstalk with energy metabolic processes. Tightly coupled gene clusters representing up-regulation of core cellular respiratory processes and enhanced degradation of branched chain amino acids and cell wall metabolism are identified. Cluster analysis of the down-regulated PPIN provides a snapshot of major processes associated with photosynthesis, growth, development and protein synthesis, most of which are shut down during drought. Differential regulation of phytohormones, e.g., jasmonic acid, cell wall metabolism, signalling and posttranslational modifications associated with biotic stress are elucidated. Functional characterization of topologically important, drought-responsive uncharacterized genes that may play a role in important processes such as ABA signalling, calcium signalling, photosynthesis and cell wall metabolism is discussed. Further transgenic studies on these genes may help in elucidating their biological role under stress conditions. Conclusion: Currently, a large number of resources for rice functional genomics exist which are mostly underutilized by the scientific community. In this study, a computational approach integrating information from various resources such as gene co-expression networks, protein-protein interactions and pathway-level information is proposed to provide a systems-level view of complex drought-responsive processes across the drought-tolerant genotypes.

    Velocity selection in coupled-map lattices

    We investigate the phenomenon of velocity selection for traveling wave fronts in a class of coupled-map lattices, derived by discretizations of the Fisher equation [Ann. Eugenics 7, 355 (1937)]. We find that the velocity selection can be understood in terms of a discrete analog of the marginal-stability hypothesis. A perturbative approach also enables us to estimate the selected velocity accurately for small values of the discretization mesh sizes
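As an illustration of what a discrete marginal-stability estimate can look like, here is a hedged sketch. It assumes one particular discretization, forward Euler in time with a centered second difference in space, applied to the Fisher equation u_t = d u_xx + r u(1 - u); the abstract does not specify the exact class of maps studied, so the formula and the names `front_velocity` and `selected_velocity` are assumptions made for this example. Linearizing the map about u = 0 and inserting a leading-edge profile exp(-k(i dx - v n dt)) gives exp(k v dt) = 1 + dt [r + 2 d (cosh(k dx) - 1) / dx^2], and the marginal-stability hypothesis picks the minimum of v over k.

```python
import math


def front_velocity(k, dt, dx, r=1.0, d=1.0):
    """Velocity of a leading-edge mode exp(-k * (x - v * t)) under the
    linearized forward-Euler, centered-difference map (an assumed scheme)."""
    growth = 1.0 + dt * (r + 2.0 * d * (math.cosh(k * dx) - 1.0) / dx ** 2)
    return math.log(growth) / (k * dt)


def selected_velocity(dt, dx, r=1.0, d=1.0, k_min=0.01, k_max=5.0, steps=5000):
    """Marginal-stability estimate: minimize the mode velocity over the
    spatial decay rate k with a simple grid search."""
    ks = [k_min + (k_max - k_min) * i / steps for i in range(steps + 1)]
    return min(front_velocity(k, dt, dx, r, d) for k in ks)


if __name__ == "__main__":
    # Mesh sizes chosen so that d * dt / dx**2 = 0.25, inside the usual
    # stability range of the explicit scheme.
    print(selected_velocity(dt=0.01, dx=0.2))
```

For small mesh sizes the estimate approaches the continuum value 2 * sqrt(r * d) of the Fisher equation, and the deviation from it grows as dt and dx are made coarser.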

    Sequence and Structure-Based Analyses of Human Ankyrin Repeats

    Ankyrin is one of the most abundant protein repeat families found across all forms of life. In humans, ankyrin repeats are found in a variety of multi-domain and single-domain proteins with diverse numbers of repeating units. They occur in several functionally diverse proteins, such as transcriptional initiators, cell cycle regulators, cytoskeletal organizers, ion transporters, signal transducers, developmental regulators, and toxins, and, consequently, defects in ankyrin repeat proteins have been associated with a number of human diseases. In this study, we have classified the human ankyrin proteins into clusters based on the sequence similarity in their ankyrin repeat domains. We analyzed the amino acid compositional bias and consensus ankyrin motif sequence of the clusters to understand the diversity of the human ankyrin proteins. We carried out network-based structural analysis of human ankyrin proteins across different clusters and showed the association of conserved residues with topologically important residues identified by network centrality measures. The analysis of conserved and structurally important residues helps in understanding their role in the structural stability and function of these proteins. In this paper, we also discuss the significance of these conserved residues in disease association across the human ankyrin protein clusters.

    In Silico Identification and Functional Characterization of Genetic Variations across DLBCL Cell Lines

    Diffuse large B-cell lymphoma (DLBCL) is the most common form of non-Hodgkin lymphoma and frequently develops through the accumulation of several genetic variations. With the advancement in high-throughput techniques, in addition to mutations and copy number variations, structural variations have gained importance for their role in genome instability leading to tumorigenesis. In this study, in order to understand the genetics of DLBCL pathogenesis, we carried out a whole-genome mutation profile analysis of eleven human cell lines from germinal-center B-cell-like (GCB-7) and activated B-cell-like (ABC-4) subtypes of DLBCL. Analysis of genetic variations including small sequence variants and large structural variations across the cell lines revealed distinct variation profiles indicating the heterogeneous nature of DLBCL and the need for novel patient stratification methods to design potential intervention strategies. Validation and prognostic significance of the variants was assessed using annotations provided for DLBCL samples in cBioPortal for Cancer Genomics. Combining genetic variations revealed new subgroups between the subtypes and associated enriched pathways, viz., PI3K-AKT signaling, cell cycle, TGF-beta signaling, and WNT signaling. Mutation landscape analysis also revealed drug–variant associations and possible effectiveness of known and novel DLBCL treatments. From the whole-genome-based mutation analysis, our findings suggest putative molecular genetics of DLBCL lymphomagenesis and potential genomics-driven precision treatments