66 research outputs found

Recommended from our members

Mathematical Modeling of Viral Evolution and Epidemiology

Author: Moshiri Alexander Niema
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Phylogenetic trees can be used to study the evolution of any sequence that evolves, including viruses. In a viral epidemic, the history of transmission events defines constraints on the evolutionary history of the viral population. The spread of many viruses is driven by social and sexual networks, and because of the relationship between their evolutionary and transmission histories, phylogenetic inference from viral sequences can be used to improve the inference of patterns of the epidemic, which in turn may be able to enhance epidemiological intervention. The simultaneous simulation of viral transmission networks, phylogenetic trees, and sequences can provide a method to observe the effects of virus model parameters on the epidemic as well as to study the accuracies and errors of transmission inference tools, but the success of such simulations relies on the existence of appropriate models. Further, the development of massively-scalable tools to analyze ultra-large datasets of viral sequences can aid epidemiologists in the real-time surveillance of the spread of disease. To enable viral epidemic simulation analyses, I developed FAVITES: a novel framework to simulate viral transmission networks, phylogenetic trees, and sequences, and I used FAVITES to study the effects of model parameters on epidemic outcomes. In an effort to better capture the unbalanced topologies commonly observed in retroviral phylogenies, I developed a novel evolutionary model (dual-birth), derived probabilistic distributions and theoretical expectations of trees sampled under the model, developed an approach to estimate model parameters given real data, and used the model to analyze Alu retrotransposons in the human genome. In order to potentially aid public health officials, I developed a scalable and non-parametric phylogenetic method of viral transmission risk prioritization, which I evaluated against current best-practice methods via simulation and real data. 
Lastly, I contributed to Bioinformatics education by developing multiple publicly-accessible adaptive online interactive texts

eScholarship - University of California

Ten Simple Rules for Reproducible Research in Jupyter Notebooks

Author: Altintas Ilkay
Birmingham Amanda
Huang Shih-Cheng
Knight Rob
Moshiri Niema
Nguyen Mai H.
Pérez Fernando
Rose Peter W.
Rosenthal Sara Brin
Rule Adam
Zuniga Cristal
Publication date: 13/10/2018

Reproducibility of computational studies is a hallmark of scientific methodology. It enables researchers to build with confidence on the methods and findings of others, reuse and extend computational pipelines, and thereby drive scientific progress. Since many experimental studies rely on computational analyses, biologists need guidance on how to set up and document reproducible data analyses or simulations. In this paper, we address several questions about reproducibility. For example, what are the technical and non-technical barriers to reproducible computational studies? What opportunities and challenges do computational notebooks offer to overcome some of these barriers? What tools are available and how can they be used effectively? We have developed a set of rules to serve as a guide to scientists with a specific focus on computational notebook systems, such as Jupyter Notebooks, which have become a tool of choice for many applications. Notebooks combine detailed workflows with narrative text and visualization of results. Combined with software repositories and open source licensing, notebooks are powerful tools for transparent, collaborative, reproducible, and reusable data analyses

arXiv.org e-Print Archive

eScholarship - University of California

HD-Bind: Encoding of Molecular Structure with Low Precision, Hyperdimensional Binary Representations

Author: Allen Jonathan E.
Jones Derek
Kang Jaeyoung
Khaleghi Behnam
Moshiri Niema
Rosing Tajana S.
Xu Weihong
Zhang Xiaohua
Publication date: 27/03/2023

Publicly available collections of drug-like molecules have grown to comprise tens of billions of possibilities in recent years due to advances in chemical synthesis. Traditional methods for identifying "hit" molecules from a large collection of potential drug-like candidates have relied on biophysical theory to compute approximations to the Gibbs free energy of the binding interaction between a drug and its protein target. A major drawback of these approaches is that they require exceptional computing capabilities to screen even relatively small collections of molecules. Hyperdimensional Computing (HDC) is a recently proposed learning paradigm that leverages low-precision binary vector arithmetic to build efficient representations of the data, without the gradient-based optimization required by many conventional machine learning and deep learning approaches. This algorithmic simplicity allows for hardware acceleration, which has previously been demonstrated for a range of application areas. We consider existing HDC approaches for molecular property classification and introduce two novel encoding algorithms that leverage the extended connectivity fingerprint (ECFP) algorithm. We show that HDC-based inference methods are as much as 90 times more efficient than more complex representative machine learning methods and achieve an acceleration of nearly 9 orders of magnitude compared to inference with molecular docking. We demonstrate multiple approaches for the encoding of molecular data for HDC and examine their relative performance on a range of challenging molecular property prediction and drug-protein binding classification tasks. Our work thus motivates further investigation into molecular representation learning to develop ultra-efficient pre-screening tools.

arXiv.org e-Print Archive
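
The HD-Bind abstract above describes HDC as learning with low-precision binary vectors and no gradient-based optimization. The following is a minimal sketch of that general idea, not the encoding algorithms introduced in the paper: each feature gets a random binary hypervector, examples are encoded by bitwise majority vote, class prototypes are built the same way, and prediction picks the prototype with the most matching bits. The class name `HDClassifier`, the dimension `DIM`, and the feature labels such as `"f1"` (standing in for substructure identifiers) are all assumptions made for illustration.

```python
import random

DIM = 2048  # hypervector length; an illustrative choice, not a value from the paper


def random_hv(rng):
    """Draw a random binary hypervector as a list of 0/1 bits."""
    return [rng.getrandbits(1) for _ in range(DIM)]


def bundle(hvs):
    """Combine hypervectors by bitwise majority vote (ties resolve to 1)."""
    n = len(hvs)
    return [1 if 2 * sum(bits) >= n else 0 for bits in zip(*hvs)]


def similarity(a, b):
    """Fraction of matching bits (1 minus the normalized Hamming distance)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


class HDClassifier:
    """Hypothetical toy classifier: training is counting and voting only."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.item_memory = {}  # feature -> random hypervector
        self.prototypes = {}   # label -> bundled class hypervector

    def encode(self, features):
        """Encode a non-empty feature list by bundling its item hypervectors."""
        hvs = []
        for f in features:
            if f not in self.item_memory:
                self.item_memory[f] = random_hv(self.rng)
            hvs.append(self.item_memory[f])
        return bundle(hvs)

    def fit(self, examples):
        """Build one prototype per label from (features, label) pairs."""
        by_label = {}
        for features, label in examples:
            by_label.setdefault(label, []).append(self.encode(features))
        self.prototypes = {lbl: bundle(hvs) for lbl, hvs in by_label.items()}

    def predict(self, features):
        """Return the label whose prototype shares the most bits with the query."""
        query = self.encode(features)
        return max(self.prototypes,
                   key=lambda lbl: similarity(query, self.prototypes[lbl]))


# Made-up training data: feature labels are placeholders, not real fingerprints.
clf = HDClassifier(seed=0)
clf.fit([
    (["f1", "f2", "f3"], "binder"),
    (["f1", "f2", "f4"], "binder"),
    (["f1", "f3", "f5"], "binder"),
    (["f7", "f8", "f9"], "non-binder"),
    (["f7", "f8", "f10"], "non-binder"),
    (["f7", "f9", "f11"], "non-binder"),
])
print(clf.predict(["f1", "f2", "f3"]))
```

Because both training and inference reduce to bit counting and comparison, a scheme of this kind maps naturally onto simple hardware, which is the property the abstract highlights.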

The molecular epidemiology of multiple zoonotic origins of SARS-CoV-2

Author: Andersen Kristian G.
Ching Zi Yan Katherine
Crits-Christoph Alexander
Gangavarapu Karthik
Garry Robert F.
Havens Jennifer L.
Holmes Edward C.
Hughes Scott
Izhikevich Katherine
Lee Jungmin
Levy Joshua I.
Lin Raymond Tzer Pin
Magee Andrew
Malpica Serrano Lorena Mariana
Mat Isa Mohd Noor
Matteson Nathaniel L.
Moshiri Niema
Noor Yusuf Muhammad
Park Heedo
Park Man Seong
Parker Edyth
Pekar Jonathan E.
Rambaut Andrew
Suchard Marc A.
Vasylyeva Tetyana I.
Wang Jade C.
Wertheim Joel O.
Worobey Michael
Zeller Mark
Publication venue: American Association for the Advancement of Science (AAAS)
Publication date: 26/07/2022

Understanding the circumstances that lead to pandemics is important for their prevention. Here, we analyze the genomic diversity of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) early in the coronavirus disease 2019 (COVID-19) pandemic. We show that SARS-CoV-2 genomic diversity before February 2020 likely comprised only two distinct viral lineages, denoted A and B. Phylodynamic rooting methods, coupled with epidemic simulations, reveal that these lineages were the result of at least two separate cross-species transmission events into humans. The first zoonotic transmission likely involved lineage B viruses around 18 November 2019 (23 October–8 December), while the separate introduction of lineage A likely occurred within weeks of this event. These findings indicate that it is unlikely that SARS-CoV-2 circulated widely in humans prior to November 2019 and define the narrow window between when SARS-CoV-2 first jumped into humans and when the first cases of COVID-19 were reported. As with other coronaviruses, SARS-CoV-2 emergence likely resulted from multiple zoonotic events

PubMed Central

Edinburgh Research Explorer

eScholarship - University of California

ViralConsensus: a fast and memory-efficient tool for calling viral consensus genome sequences directly from read alignment data

Author: Moshiri Niema
Publication venue: eScholarship, University of California
Publication date: 04/05/2023

Motivation: In viral molecular epidemiology, reconstruction of consensus genomes from sequence data is critical for tracking mutations and variants of concern. However, as the number of samples that are sequenced grows rapidly, compute resources needed to reconstruct consensus genomes can become prohibitively large. Results: ViralConsensus is a fast and memory-efficient tool for calling viral consensus genome sequences directly from read alignment data. ViralConsensus is orders of magnitude faster and more memory-efficient than existing methods. Further, unlike existing methods, ViralConsensus can pipe data directly from a read mapper via standard input and performs viral consensus calling on-the-fly, making it an ideal tool for viral sequencing pipelines. Availability and implementation: ViralConsensus is freely available at https://github.com/niemasd/ViralConsensus as an open-source software project.

eScholarship - University of California
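
The ViralConsensus abstract above refers to calling a consensus genome on-the-fly from read alignment data. As a rough illustration of consensus calling in general, and not the tool's actual algorithm or interface, the sketch below tallies bases per reference position as observations stream in and then emits the majority base at each position. The function name `call_consensus`, the `(position, base)` input format, and the `min_depth` and `min_freq` thresholds are assumptions for this example; insertions and deletions are ignored.

```python
from collections import Counter


def call_consensus(genome_length, observations, min_depth=2, min_freq=0.5, ambig="N"):
    """Call a consensus sequence from streamed (position, base) observations.

    Hypothetical simplification: each observation is one aligned base at a
    0-based reference position. Only per-position counts are kept in memory,
    so memory use grows with genome length rather than with read count.
    """
    counts = [Counter() for _ in range(genome_length)]
    for pos, base in observations:  # single pass; works with any iterator
        counts[pos][base.upper()] += 1

    consensus = []
    for c in counts:
        depth = sum(c.values())
        if depth < min_depth:
            consensus.append(ambig)  # too little coverage to make a call
            continue
        base, n = c.most_common(1)[0]
        # Require a strict majority above min_freq; otherwise stay ambiguous.
        consensus.append(base if n / depth > min_freq else ambig)
    return "".join(consensus)


# Made-up observations over a 4-base genome.
obs = [(0, "A"), (0, "A"), (1, "C"), (1, "T"), (1, "T"),
       (2, "G"), (3, "T"), (3, "T"), (3, "C")]
print(call_consensus(4, obs))  # prints "ATNT"
```

Since `observations` can be any iterator, the same function could consume bases parsed from a read mapper's output as it arrives, which mirrors the streaming use case the abstract describes.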

NiemaGraphGen: A memory-efficient global-scale contact network simulation toolkit.

Author: Moshiri Niema
Publication date: 28/06/2023

Ezid

Niema Moshiri: Real-time phylogenetic inference and transmission cluster analysis of COVID-19

Author: Moshiri Niema
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2020

Description of this presentation: This presentation was given by Niema Moshiri, University of California San Diego. The title of the presentation is: "Real-time phylogenetic inference and transmission cluster analysis of COVID-19." - Description of the CIC webinars: Each month, the COVID Information Commons (CIC) team, together with the Northeast Big Data Innovation Hub, brings together a group of researchers who study various aspects of the current pandemic to share their research and answer questions from our community. The events showcase scientists' ongoing efforts in the fight against COVID-19, including opportunities for collaboration.

Columbia University Academic Commons

FAVITES: simultaneous simulation of transmission networks, phylogenetic trees and sequences

Author: Moshiri Niema
Publication date: 05/06/2023

Ezid