4 research outputs found

    Modern Computer Science Approaches in Biology: From Predicting Molecular Functions to Modeling Protein Structure

    Computational machines have become an inseparable part of human lives during the last three decades. One of the crucial enabling technologies of this technological boom is Artificial Intelligence (AI), the field dedicated to simulating human-like behavior in machines. It takes many shapes and forms; however, one particular direction, Machine Learning (ML), has been especially impactful in the era of constant data aggregation. The goal of ML is automated pattern inference and reasoning based solely on the input data. Having become a household name, machine learning has revolutionized the natural sciences, aiding physicists working on quantum mechanics, helping astronomers filter noisy data, and accelerating molecular and cellular discoveries made by chemists and biologists. One crucial aspect of everyone’s life affected by ML technology is medical care. Perhaps most notably in this area, precision medicine offers a direct opportunity to improve patients’ quality of life. The field of precision medicine is dedicated to identifying the reasons for patients’ different treatment responses and designing the best-suited diagnostics and intervention strategy for each individual. In recent years, the available data pool has been expanded by the emergence of high-throughput ‘omics’ experimental techniques, making it intractable for conventional manual analysis by a clinician or a biomedical researcher. The omics field emerged in the early 2000s, when next-generation sequencing (NGS) methods first made it possible to study individual genomes. The next big breakthrough happened in 2008, when the second generation of NGS came into play, drastically decreasing the cost of conducting experiments. However, genomics is not the only field that experienced a revolutionary leap.
Other quantitative methods that describe molecular processes taking place in the organism advanced rapidly: epigenomics, transcriptomics, proteomics, and metabolomics. Transcriptomics and proteomics are particularly interesting when studying diseases, as they provide a snapshot of the organism’s current state, allowing us to search for the root cause of a particular ailment. Furthermore, transcriptomics provides information on an important regulatory process, alternative splicing (AS). AS increases the versatility of the organism’s molecular arsenal and allows more complex systems to be built using the same number of genes. This feat is achieved by combinatorially shuffling selected protein-coding parts (exons) of the mRNA molecule prior to its translation into a protein. Thus, AS is a crucial intermediate stage between gene expression and protein translation. My work focuses on the computational analysis of biological data and encompasses structural genomics, transcriptomics, and proteomics. Individual projects range from elucidating disease etiology and uncovering the molecular mechanisms of action of alternative splicing to searching for protein expression-based biomarkers of treatment response and studying potential drug targets on the surface of the SARS-CoV-2 viral particle. Over the course of these studies I designed a machine learning model that estimates the effect of AS on protein-protein interactions; developed a novel quantitative measure that gauges the impact that alternatively spliced isoforms have on the biological system; predicted isoform stability using proteogenomic data and transfer learning; identified biomarkers of response to acupuncture treatment in Gulf War veterans affected by one of the most complex known acquired syndromes; and modeled protein complexes of the SARS-CoV-2 virus and simulated its entire envelope in solvent using molecular dynamics methods.
This work brings together two important aspects of modern omics studies: transcriptomics and proteomics. It highlights the importance of developing computational methods for the modern field of precision medicine.

    Structural Genomics of SARS-CoV-2 Indicates Evolutionary Conserved Functional Regions of Viral Proteins

    During its first two and a half months, the recently emerged 2019 novel coronavirus, SARS-CoV-2, has already infected over one hundred thousand people worldwide and has taken more than four thousand lives. However, the swiftly spreading virus has also prompted an unprecedentedly rapid response from the research community, which faces an unknown health challenge of potentially enormous proportions. Unfortunately, the experimental research needed to understand the molecular mechanisms behind the viral infection and to design a vaccine or antivirals is costly and takes months. To expedite the advancement of our knowledge, we leveraged data on related coronaviruses that are readily available in public databases and integrated them into a single computational pipeline. As a result, we provide comprehensive structural genomics and interactomics roadmaps of SARS-CoV-2 and use this information to infer possible functional differences from, and similarities to, the related SARS coronavirus. All data are made publicly available to the research community.

    Molecular architecture and dynamics of SARS-CoV-2 envelope by integrative modeling

    Despite tremendous efforts, the exact structure of SARS-CoV-2 and related betacoronaviruses remains elusive. The SARS-CoV-2 envelope is a key structural component of the virion that encapsulates viral RNA. It is composed of three structural proteins, spike, membrane (M), and envelope, which interact with each other and with the lipids acquired from the host membranes. Here, we developed and applied an integrative multi-scale computational approach to model the envelope structure of SARS-CoV-2 in near-atomistic detail, focusing on the dynamic nature and molecular interactions of its most abundant, but largely understudied, M protein. The molecular dynamics simulations allowed us to test the envelope stability under different configurations and revealed that the M dimers agglomerated into large, filament-like, macromolecular assemblies with distinct molecular patterns. These results are in good agreement with current experimental data, demonstrating a generic and versatile approach to modeling the structure of a virus de novo.

    Assessment of network module identification across complex diseases
