84 research outputs found

    Graph Representation Learning in Biomedicine

    Biomedical networks are universal descriptors of systems of interacting elements, from protein interactions to disease networks, all the way to healthcare systems and scientific knowledge. With the remarkable success of representation learning in providing powerful predictions and insights, we have witnessed a rapid expansion of representation learning techniques into modeling, analyzing, and learning with such networks. In this review, we put forward an observation that long-standing principles of networks in biology and medicine -- while often unspoken in machine learning research -- can provide the conceptual grounding for representation learning, explain its current successes and limitations, and inform future advances. We synthesize a spectrum of algorithmic approaches that, at their core, leverage graph topology to embed networks into compact vector spaces, and capture the breadth of ways in which representation learning is proving useful. Areas of profound impact include identifying variants underlying complex traits, disentangling behaviors of single cells and their effects on health, assisting in diagnosis and treatment of patients, and developing safe and effective medicines.

    Machine Learning Methods for Effectively Discovering Complex Relationships in Graph Data

    Graphs are widely used because they capture the interactions (edges) among data points (nodes) in many real-life scenarios. Social networks, biological networks and molecular graphs are some of the domains where data have inherent graph structure. Once built, graphs can support Machine Learning (ML) predictions such as node classification, link prediction and graph classification. However, conventional ML algorithms assume that data instances are independent of each other, which prevents them from incorporating graph information. The irregular, variable-sized nature of non-Euclidean data also makes learning the underlying patterns of a graph harder. One approach is to convert the graph information into a lower-dimensional space and apply traditional learning methods in the reduced space. Meanwhile, Deep Learning often outperforms traditional ML thanks to convolutional and recurrent layers, which capture simple correlations in spatial and temporal data, respectively. This indicates the importance of taking data interrelationships into account, and Graph Convolutional Networks (GCNs) build on this idea, exploiting graph structure to improve inference in both node-centric and graph-centric applications. In this dissertation, graph-based ML prediction is addressed in terms of both node classification and link prediction tasks. First, the GCN is thoroughly studied and compared with other graph embedding methods specific to biological networks. Next, we present several new GCN algorithms to improve prediction performance on biomedical networks and medical imaging tasks. A circular RNA (circRNA) and disease association network is modeled for both node classification and link prediction tasks to predict diseases relevant to circRNAs and so demonstrate the effectiveness of graph convolutional learning. A GCN-based chest X-ray image classifier outperforms state-of-the-art transfer learning methods. Next, the graph representation is used to analyze the feature dependencies of the data and select an optimal feature subset that respects the original data structure. Finally, the usability of this algorithm is discussed for identifying disease-specific genes by exploiting gene-gene interactions.
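
As a rough illustration of the graph-convolution idea this abstract refers to, the sketch below applies one propagation step to a toy graph. The symmetric normalization with self-loops is the commonly used formulation and is an assumption here, as are the function names `matmul` and `gcn_layer` and the toy data; the abstract does not state which GCN variants the dissertation uses.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, features, weights):
    """One propagation step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adj)
    # Add self-loops so each node keeps a share of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization by the square roots of the node degrees.
    a_norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    pre_activation = matmul(matmul(a_norm, features), weights)
    # ReLU non-linearity.
    return [[max(v, 0.0) for v in row] for row in pre_activation]

# Hypothetical toy data: path graph 0 - 1 - 2 with one-hot node features.
adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weights = [[0.5, -1.0], [1.0, 0.25], [-0.5, 2.0]]
hidden = gcn_layer(adj, identity, weights)  # 3 nodes x 2 hidden units
```

Each row of `hidden` mixes a node's own features with those of its neighbours, which is what lets a downstream classifier use graph structure rather than treating instances as independent.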

    A Deep Learning Approach for Multi-Omics Data Integration to Diagnose Early-Onset Colorectal Cancer

    Colorectal cancer is one of the most common cancers and is a leading cause of death worldwide. It starts in the colon or the rectum; cancers of the two sites are often grouped together because they have many features in common. Colorectal cancer has recently been affecting patients younger than 50 years of age (young-onset patients) at increasing rates. Rapid developments in omics technologies have made them highly regarded in biomedical research for the early detection of cancer. Omics data reveal how different molecules and clinical features work together in disease progression. However, omics data sources are varied in nature and require careful preprocessing before they can be integrated. A convolutional neural network is a class of deep neural network commonly applied to analyze visual imagery. In this thesis, we propose a model that converts one-dimensional omics vectors into RGB images to be integrated into the hidden layers of the convolutional neural network. The prediction model allows all of the different omics to contribute to the decision by extracting the hidden interactions among them. These subsets of interacting omics can serve as potential biomarkers for young-onset colorectal cancer.
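
The abstract does not say how the one-dimensional omics vectors are mapped to RGB images. One plausible reading, sketched below purely as an assumption, is to min-max scale three equal-length omics vectors to 0..255, assign each to one colour channel, and fold the result into a zero-padded square grid. The names `scale_to_bytes` and `omics_to_rgb` and the sample values are hypothetical.

```python
import math

def scale_to_bytes(values):
    """Min-max scale a list of numbers to integers in 0..255."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]  # a constant vector carries no contrast
    return [round(255 * (v - lo) / (hi - lo)) for v in values]

def omics_to_rgb(red, green, blue):
    """Stack three equal-length omics vectors into a square grid of (R, G, B) pixels."""
    if not red or not (len(red) == len(green) == len(blue)):
        raise ValueError("need three non-empty vectors of equal length")
    n = len(red)
    side = math.isqrt(n - 1) + 1  # smallest side with side * side >= n
    channels = [scale_to_bytes(v) + [0] * (side * side - n) for v in (red, green, blue)]
    pixels = list(zip(*channels))  # one (R, G, B) tuple per feature position
    return [pixels[r * side:(r + 1) * side] for r in range(side)]

# Hypothetical values standing in for three omics layers measured on five features.
image = omics_to_rgb(
    [0.0, 1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 40.0, 50.0],
    [5.0, 5.0, 5.0, 5.0, 5.0],
)
```

Placing each omics type in its own channel keeps the same feature position aligned across channels, so a convolutional filter can respond to combinations of omics at that position.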

    Advancing Biomedicine with Graph Representation Learning: Recent Progress, Challenges, and Future Directions

    Graph representation learning (GRL) has emerged as a pivotal field that has contributed significantly to breakthroughs in various fields, including biomedicine. The objective of this survey is to review the latest advancements in GRL methods and their applications in the biomedical field. We also highlight key challenges currently faced by GRL and outline potential directions for future research. Comment: Accepted by 2023 IMIA Yearbook of Medical Informatics

    DTiGNN: Learning drug-target embedding from a heterogeneous biological network based on a two-level attention-based graph neural network

    Motivation: In vitro experiment-based drug-target interaction (DTI) exploration demands more human, financial and data resources. In silico approaches have been recommended for predicting DTIs to reduce time and cost. During the drug development process, one can analyze the therapeutic effect of a drug for a particular disease by identifying how the drug binds to the target for treating that disease. Hence, DTI plays a major role in drug discovery. Many computational methods have been developed for DTI prediction. However, the existing methods have limitations in capturing the interactions via multiple semantics between drug and target nodes in a heterogeneous biological network (HBN). Methods: In this paper, we propose the DTiGNN framework for identifying unknown drug-target pairs. DTiGNN first calculates the similarity between the drug and target from multiple perspectives. Then, the features of drugs and targets from each perspective are learned separately by using a novel method termed an information entropy-based random walk. Next, all of the learned features from the different perspectives are integrated into a single drug and target similarity network by using a multi-view convolutional neural network. The HBN is constructed from the integrated similarity networks, drug interactions, drug-disease associations, protein interactions and protein-disease associations. Next, a novel embedding algorithm called a meta-graph guided graph neural network is used to learn the embeddings of drugs and targets. Then, a convolutional neural network is employed to infer new DTIs after balancing the samples using oversampling techniques. Results: DTiGNN is applied to various datasets, and the results show better performance in terms of the area under the receiver operating characteristic curve (AUC) and the area under the precision-recall curve (AUPR), with scores of 0.98 and 0.99, respectively. There are 23,739 newly predicted DTI pairs in total.
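
The abstract mentions balancing the samples with oversampling but does not name the technique. The sketch below shows plain random oversampling with replacement, which is an assumption and not necessarily the DTiGNN authors' choice; the function name `random_oversample` and the drug-target pair labels are hypothetical.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until all classes are equal in size."""
    rng = random.Random(seed)  # seeded for reproducibility
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for label, count in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == label]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))  # draw with replacement
            out_labels.append(label)
    return out_samples, out_labels

# Hypothetical pairs: label 1 = known interaction, label 0 = no known interaction.
pairs = ["d1-t1", "d1-t2", "d2-t1", "d2-t3", "d3-t2", "d3-t3"]
labels = [1, 0, 0, 0, 0, 1]
balanced_pairs, balanced_labels = random_oversample(pairs, labels)
```

Known interactions are typically far rarer than unlabeled pairs, so some form of rebalancing keeps a classifier from simply predicting "no interaction" everywhere.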

    Functionally Relevant Macromolecular Interactions of Disordered Proteins

    Disordered proteins are relatively recent newcomers in protein science. They were first described in detail by Wright and Dyson, in their J. Mol. Biol. paper in 1999. For more than a decade it was generally thought that disordered proteins, or disordered parts of proteins, have different amino acid compositions than folded proteins, and various prediction methods were developed based on this principle. These methods were suitable for distinguishing between the disordered (unstructured) and structured proteins known at that time. In addition, they could predict the site where a folded protein binds to the disordered part of a protein, shaping the latter into a well-defined 3D structure. Recently, however, evidence has emerged for a new type of disordered protein family whose members can undergo coupled folding and binding without the involvement of any folded proteins. Instead, they interact with each other, stabilizing their structure via "mutual synergistic folding" and, surprisingly, they exhibit the same residue composition as folded proteins. An increasing number of examples have been found where disordered proteins interact with non-protein macromolecules, adding to the already large variety of protein–protein interactions. There is also a very new phenomenon in which proteins are involved in phase separation, which can represent a weak but functionally important macromolecular interaction. These phenomena are presented and discussed in the chapters of this book.

    RNA, the Epicenter of Genetic Information

    The origin story and emergence of molecular biology is muddled. The early triumphs in bacterial genetics and the complexity of animal and plant genomes complicate an intricate history. This book documents the many advances, as well as the prejudices and founder fallacies. It highlights the premature relegation of RNA to simply an intermediate between gene and protein, the underestimation of the amount of information required to program the development of multicellular organisms, and the dawning realization that RNA is the cornerstone of cell biology, development, brain function and probably evolution itself. Key personalities, their hubris as well as prescient predictions are richly illustrated with quotes, archival material, photographs, diagrams and references to bring the people, ideas and discoveries to life, from the conceptual cradles of molecular biology to the current revolution in the understanding of genetic information. Key Features: Documents the confused early history of DNA, RNA and proteins - a transformative history of molecular biology like no other. Integrates the influences of biochemistry and genetics on the landscape of molecular biology. Chronicles the important discoveries, preconceptions and misconceptions that retarded or misdirected progress. Highlights major pioneers and contributors to molecular biology, with a focus on RNA and noncoding DNA. Summarizes the mounting evidence for the central roles of non-protein-coding RNA in cell and developmental biology. Provides a thought-provoking retrospective and forward-looking perspective for advanced students and professional researchers.

    Dissecting the genetic bases of severe malaria resistance using genome-wide and post genomewide study approaches

    P. falciparum malaria remains one of the leading public health problems worldwide. The global tally of malaria in 2018 was estimated at 228 million cases and 405,000 deaths. African countries disproportionately carry the global burden of malaria, accounting for 93% and 94% of cases and deaths, respectively. Even though most infected children recover from P. falciparum malaria, a small subset (~1%) of cases progresses to severe disease and death. Over the last decade, several genome-wide association studies (GWASs) have been conducted in diverse malaria-endemic populations to understand the natural host protective immunity against severe malaria, which can provide clues for the development of new vaccines and therapeutics. However, beyond identifying associated variants, conventional GWAS approaches cannot reveal the underlying biological functions. To bridge this gap, we applied various contemporary statistical genetic analytic approaches to malaria GWAS datasets of diverse malaria-endemic populations. First, we accessed malaria resistance GWAS datasets of three African populations (N=~11,000), from Kenya, Gambia and Malawi, from the European Genome Phenome Archive (EGA) through the MalariaGEN consortium's standard data access procedures. We explored the challenges of GWAS approaches in the genetically diverse African populations and determined how various advanced statistical genetic methods can be implemented to address these challenges. We investigated the single nucleotide polymorphism (SNP) heritability ($h^2_g$) of malaria resistance in the Gambian populations and determined appropriate quality control (QC) thresholds to accurately estimate $h^2_g$ in our dataset. Second, we estimated $h^2_g$ in the three populations and partitioned it by chromosome, allele frequency and annotation using genetic relationship matrix restricted maximum likelihood approaches. We further created an African-specific reference panel from African population datasets obtained from the 1000 Genomes Project and the African Genome Variation Project and computed linkage disequilibrium (LD). We used LD information obtained from these reference panels to compute cell-type-specific and non-cell-type-specific enrichments for GWAS summary statistics meta-analyzed across the three populations. Our results showed for the first time that malaria resistance is a polygenic trait with $h^2_g$ of ~20% and that the causal variants are overrepresented around protein-coding regions of the genome. We further showed that $h^2_g$ is disproportionately concentrated on three chromosomes (chr 5, 11 and 20), suggesting the cost-effectiveness of targeting these chromosomes in future malaria genomic sequencing studies. Third, we systematically predicted plausible candidate genes and pathways from functional analysis of severe malaria resistance GWAS summary statistics (N = 17,000) meta-analyzed across eleven populations in malaria-endemic regions in Africa, Asia and Oceania. We applied positional mapping, expression quantitative trait locus (eQTL) mapping, chromatin interaction mapping and gene-based association analyses to identify candidate severe malaria resistance genes. We performed network and pathway analyses to investigate their shared biological functions. We further applied rare variant analysis to raw GWAS datasets of three malaria-endemic populations (Kenya, Malawi and Gambia) and performed various population genetic structure analyses of the identified genes in the three endemic populations and 20 worldwide ethnic groups. Our functional mapping analysis identified 57 genes located in the known malaria genomic loci, while our gene-based GWAS analysis identified an additional 125 genes across the genome. The identified genes were significantly enriched in malaria pathogenic pathways, including multiple overlapping pathways in erythrocyte-related functions, blood coagulation, ion channels, adhesion molecules, membrane signaling elements and neuronal systems. Furthermore, our population genetic analysis revealed that the minor allele frequencies (MAF) of the SNPs residing in the identified genes are generally higher in the three malaria-endemic populations than in global populations. Overall, our results suggest that the severe malaria resistance trait is attributable to multiple genes that are enriched in pathways linked to severe malaria pathogenesis. This highlights the possibility of developing new malaria therapeutics that can simultaneously target multiple malaria-protective host molecular pathways. In conclusion, this project showed that malaria resistance is mainly a polygenic trait, influenced by genes and pathways linked to the blood-stage lifecycle of P. falciparum. These findings constitute the foundations for future experimental studies that can potentially lead to translational medicine, including the development of new vaccines and therapeutics. However, '-omics' studies, including those implemented in this study, are limited to single-datatype analysis, lack adequate power to explain the complexity of molecular processes, and usually identify correlations rather than causation. Thus, beyond single-locus analysis, the future direction of malaria resistance research requires a paradigm shift from single-omics to multi-stage and multi-dimensional integrative multi-omics studies that combine multiple data types from the human host, the parasite, and the environment. The current biotechnological and statistical advances may eventually make systems biology studies feasible and revolutionize malaria research.
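
The minor allele frequency comparison mentioned in this abstract rests on a simple calculation, sketched here under the assumption that genotypes are coded as 0, 1 or 2 copies of the alternate allele, with `None` for missing calls. The function name `minor_allele_frequency` and the sample genotypes are hypothetical and are not taken from the study's data or pipeline.

```python
def minor_allele_frequency(genotypes):
    """Return the MAF of one biallelic SNP from per-individual alternate-allele counts."""
    called = [g for g in genotypes if g is not None]  # drop missing calls
    if not called:
        raise ValueError("no called genotypes")
    # Each diploid individual contributes two alleles.
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is less common.
    return min(alt_freq, 1.0 - alt_freq)

# Hypothetical genotype vectors for the same SNP in two populations.
maf_population_a = minor_allele_frequency([0, 1, 2, 1, None])
maf_population_b = minor_allele_frequency([2, 2, 2, 1])
```

Comparing such per-SNP values between endemic and global populations is the kind of contrast the abstract reports, although the study's actual tooling is not described there.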

    PROGRAM, THE NEBRASKA ACADEMY OF SCIENCES: One Hundred-Thirty-First Annual Meeting, APRIL 23-24, 2021. ONLINE

    AFFILIATED SOCIETIES OF THE NEBRASKA ACADEMY OF SCIENCES, INC. 1. American Association of Physics Teachers, Nebraska Section: Web site: http://www.aapt.org/sections/officers.cfm?section=Nebraska 2. Friends of Loren Eiseley: Web site: http://www.eiseley.org/ 3. Lincoln Gem & Mineral Club: Web site: http://www.lincolngemmineralclub.org/ 4. Nebraska Chapter, National Council for Geographic Education 5. Nebraska Geological Society: Web site: http://www.nebraskageologicalsociety.org Sponsors of a $50 award to the outstanding student paper presented at the Nebraska Academy of Sciences Annual Meeting, Earth Science / Nebraska Chapter, National Council Sections 6. Nebraska Graduate Women in Science 7. Nebraska Junior Academy of Sciences: Web site: http://www.nebraskajunioracademyofsciences.org/ 8. Nebraska Ornithologists' Union: Web site: http://www.noubirds.org/ 9. Nebraska Psychological Association: http://www.nebpsych.org/ 10. Nebraska-Southeast South Dakota Section Mathematical Association of America: Web site: http://sections.maa.org/nesesd/ 11. Nebraska Space Grant Consortium: Web site: http://www.ne.spacegrant.org/ CONTENTS AERONAUTICS & SPACE SCIENCE ANTHROPOLOGY APPLIED SCIENCE & TECHNOLOGY BIOLOGICAL & MEDICAL SCIENCES COLLEGIATE ACADEMY: BIOLOGY COLLEGIATE ACADEMY: CHEMISTRY & PHYSICS EARTH SCIENCES ENVIRONMENTAL SCIENCES GENERAL CHEMISTRY GENERAL PHYSICS TEACHING OF SCIENCE & MATHEMATICS 2020-2021 PROGRAM COMMITTEE 2020-2021 EXECUTIVE COMMITTEE FRIENDS OF THE ACADEMY NEBRASKA ACADEMY OF SCIENCES FRIEND OF SCIENCE AWARD WINNERS FRIEND OF SCIENCE AWARD TO DR PAUL KAR