22 research outputs found

A-Lister: a tool for analysis of differentially expressed omics entities across multiple pairwise comparisons.

Author: Listopad Stanislav A
Norden-Krichmar Trina M
Publication venue: eScholarship, University of California
Publication date: 01/11/2019

Background: Researchers commonly analyze lists of differentially expressed entities (DEEs), such as differentially expressed genes (DEGs), differentially expressed proteins (DEPs), and differentially methylated positions/regions (DMPs/DMRs), across multiple pairwise comparisons. Large biological studies can involve multiple conditions, tissues, and timepoints that result in dozens of pairwise comparisons. Manually filtering and comparing lists of DEEs across multiple pairwise comparisons, typically done by writing custom code, is a cumbersome task that can be streamlined and standardized.

Results: A-Lister is a lightweight command line and graphical user interface tool written in Python. It can be executed in a differential expression mode or generic name list mode. In differential expression mode, A-Lister accepts as input delimited text files that are output by differential expression tools such as DESeq2, edgeR, Cuffdiff, and limma. To allow for the most flexibility in input ID types, to avoid database installation requirements, and to allow for secure offline use, A-Lister does not validate or impose restrictions on entity ID names. Users can specify thresholds to filter the input file(s) by column(s) such as p-value, q-value, and fold change. Additionally, users can filter the pairwise comparisons within the input files by fold change direction (sign). Queries composed of intersection, fuzzy intersection, difference, and union set operations can also be performed on any number of pairwise comparisons. Thus, the user can filter and compare any number of pairwise comparisons within a single A-Lister differential expression command. In generic name list mode, A-Lister accepts delimited text files containing lists of names as input. Queries composed of intersection, fuzzy intersection, difference, and union set operations can then be performed across these lists of names.

Conclusions: A-Lister is a flexible tool that enables the user to rapidly narrow down large lists of DEEs to a small number of most significant entities. These entities can then be further analyzed using visualization, pathway analysis, and other bioinformatics tools.
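
The filter-then-compare workflow described in this abstract can be illustrated with a minimal Python sketch. This is not A-Lister's code or command-line interface: the function names, the tuple layout `(entity_id, log2_fold_change, q_value)`, the thresholds, and the gene IDs are all hypothetical. In particular, "fuzzy intersection" is assumed here to mean "present in at least a given fraction of the lists", which may differ from the tool's exact definition.

```python
from collections import Counter

def filter_entities(rows, max_q=0.05, min_abs_log2fc=1.0, direction=None):
    """Return IDs passing the q-value and fold-change thresholds.

    rows: iterable of (entity_id, log2_fold_change, q_value) tuples (hypothetical layout).
    direction: None, "up" (positive fold change) or "down" (negative fold change).
    """
    kept = set()
    for entity_id, log2fc, q_value in rows:
        if q_value > max_q or abs(log2fc) < min_abs_log2fc:
            continue
        if direction == "up" and log2fc <= 0:
            continue
        if direction == "down" and log2fc >= 0:
            continue
        kept.add(entity_id)
    return kept

def fuzzy_intersection(sets, min_fraction):
    """IDs present in at least min_fraction of the sets (assumed definition)."""
    counts = Counter(entity for s in sets for entity in s)
    needed = min_fraction * len(sets)
    return {entity for entity, count in counts.items() if count >= needed}

# Made-up results for three pairwise comparisons.
comp_a = [("GENE1", 2.1, 0.001), ("GENE2", -1.5, 0.02),
          ("GENE3", 0.3, 0.01), ("GENE4", 1.8, 0.2)]
comp_b = [("GENE1", 1.7, 0.003), ("GENE2", 1.2, 0.04), ("GENE5", -2.4, 0.001)]
comp_c = [("GENE1", 3.0, 0.0001), ("GENE5", -1.1, 0.03), ("GENE6", 2.2, 0.01)]

sig_a, sig_b, sig_c = (filter_entities(c) for c in (comp_a, comp_b, comp_c))

in_all = sig_a & sig_b & sig_c                              # intersection
in_any = sig_a | sig_b | sig_c                              # union
only_b = sig_b - sig_a                                      # difference
in_most = fuzzy_intersection([sig_a, sig_b, sig_c], 0.6)    # in at least 2 of 3 lists
up_in_a = filter_entities(comp_a, direction="up")           # filter by sign
```

Plain Python sets are used because entity IDs are treated as opaque strings, which mirrors the abstract's point that IDs are neither validated nor restricted.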

eScholarship - University of California

Identification of integrated proteomics and transcriptomics signature of alcohol-associated liver disease using machine learning.

Author: Aliya Asghar
Andrew Stolz
Christophe Magnan
John A Tayek
Jon M Jacobs
Le Z Day
Stanislav Listopad
Timothy R Morgan
Trina M Norden-Krichmar
Zhang-Xu Liu
Publication venue: Public Library of Science (PLoS)
Publication date: 01/02/2024

Distinguishing between alcohol-associated hepatitis (AH) and alcohol-associated cirrhosis (AC) remains a diagnostic challenge. In this study, we used machine learning with transcriptomics and proteomics data from liver tissue and peripheral blood mononuclear cells (PBMCs) to classify patients with alcohol-associated liver disease. The conditions in the study were AH, AC, and healthy controls. We processed 98 PBMC RNAseq samples, 55 PBMC proteomic samples, 48 liver RNAseq samples, and 53 liver proteomic samples. First, we built separate classification and feature selection pipelines for transcriptomics and proteomics data. The liver tissue models were validated in independent liver tissue datasets. Next, we built integrated gene and protein expression models that allowed us to identify combined gene-protein biomarker panels. For liver tissue, we attained 90% nested cross-validation accuracy in our dataset and 82% accuracy in the independent validation dataset using transcriptomic data. We attained 100% nested cross-validation accuracy in our dataset and 61% accuracy in the independent validation dataset using proteomic data. For PBMCs, we attained 83% and 89% accuracy with transcriptomic and proteomic data, respectively. The integration of the two data types resulted in improved classification accuracy for PBMCs, but not liver tissue. We also identified the following gene-protein matches within the gene-protein biomarker panels: CLEC4M-CLC4M, GSTA1-GSTA2 for liver tissue and SELENBP1-SBP1 for PBMCs. In this study, machine learning models had high classification accuracy for both transcriptomics and proteomics data, across liver tissue and PBMCs. The integration of transcriptomics and proteomics into a multi-omics model yielded improvement in classification accuracy for the PBMC data. The set of integrated gene-protein biomarkers for PBMCs shows promise toward developing a liquid biopsy for alcohol-associated liver disease.
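
The nested cross-validation mentioned in this abstract can be sketched in plain Python. This is only an illustration of the general technique, not the authors' pipeline: the 1-D synthetic data, the k-nearest-neighbour classifier, the candidate values of `k`, and all function names are assumptions. The key idea is that the hyperparameter is chosen by an inner cross-validation run on the outer training fold only, so the outer test fold never influences model selection.

```python
import random

def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def knn_predict(train, x, k):
    """Majority vote among the k nearest 1-D training points (labels are 0 or 1)."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def accuracy(train, test, k):
    """Fraction of test points classified correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)

def split(data, fold):
    """Return (train, test) lists for one fold of indices."""
    held_out = set(fold)
    test = [data[i] for i in fold]
    train = [d for i, d in enumerate(data) if i not in held_out]
    return train, test

def cross_val_accuracy(data, k_neighbors, n_folds):
    """Mean accuracy over an ordinary (non-nested) cross-validation."""
    scores = []
    for fold in k_fold_indices(len(data), n_folds):
        train, test = split(data, fold)
        scores.append(accuracy(train, test, k_neighbors))
    return sum(scores) / len(scores)

def nested_cv(data, candidate_ks, outer_folds=5, inner_folds=3):
    """Outer loop estimates accuracy; inner loop selects k on training data only."""
    outer_scores = []
    for fold in k_fold_indices(len(data), outer_folds):
        outer_train, outer_test = split(data, fold)
        best_k = max(candidate_ks,
                     key=lambda k: cross_val_accuracy(outer_train, k, inner_folds))
        outer_scores.append(accuracy(outer_train, outer_test, best_k))
    return sum(outer_scores) / len(outer_scores)

# Synthetic two-class data: (feature value, label). Purely illustrative.
rng = random.Random(0)
data = [(rng.gauss(0, 1), 0) for _ in range(30)] + [(rng.gauss(3, 1), 1) for _ in range(30)]
rng.shuffle(data)
score = nested_cv(data, candidate_ks=[1, 3, 5])
```

In a pipeline that also performs feature selection, that step would sit inside the inner loop as well, for the same reason: any choice informed by the outer test fold would make the accuracy estimate optimistic.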

Directory of Open Access Journals

eScholarship - University of California

Identification of integrated proteomics and transcriptomics signature of alcohol-associated liver disease using machine learning

Author: Listopad Stanislav,
Publication date: 28/02/2024

Ezid

Towards integrated genomics data analyses to facilitate identification of diagnostic biomarkers

Author: Listopad Stanislav
Publication venue: eScholarship, University of California
Publication date: 01/01/2022

While the total amount of genomic data has rapidly increased over the past decade, most individual biomedical research studies are still limited to small numbers of participant samples due to the high costs of recruitment, sequencing, data storage, and data analysis. This results in many data sets with a low number of samples, but a very large number of features across multiple genomic data types. Appropriately handling these small sample size data sets and integrating multiple genomic data types is essential for identifying actionable diagnostic biomarkers. The overarching goal of my dissertation is to address some of these challenges using software engineering, bioinformatics, and machine learning methods. In this document, I will cover the three major projects of my dissertation. First, I will describe A-Lister, a software tool that I developed to filter, compare, and combine items across multiple differential expression files, to facilitate data integration and feature selection. Second, I implemented a multiclass machine learning approach to classify liver disease and identify gene expression biomarkers using a transcriptomics liver disease dataset. As part of this analysis, I implemented a variety of bioinformatic pipelines, feature selection techniques, and machine learning classifiers to classify small sample size RNAseq data. Third, I created an integrated model using both transcriptomics and proteomics data to identify a combined gene and protein biomarker panel to classify liver disease. The tools and methods developed in my dissertation are not specific to liver disease, but are intended for use with any small sample size genomics dataset to aid in biomarker discovery.

eScholarship - University of California

Towards integrated genomics data analyses to facilitate identification of diagnostic biomarkers

Author: Listopad Stanislav
Publication date: 01/01/2022

Ezid

eScholarship - University of California

Differentiating between liver diseases by applying multiclass machine learning approaches to transcriptomics of liver tissue or blood-based samples.

Author: Listopad Stanislav,
Publication date: 18/10/2022

Ezid

A-Lister: a tool for analysis of differentially expressed omics entities across multiple pairwise comparisons

Author: Listopad Stanislav A,
Publication date: 08/01/2020

Ezid

Evolving Simple Models of Diverse Intrinsic Dynamics in Hippocampal Neuron Types

Author: Alexander O. Komendantov
Eric O. Scott
Giorgio A. Ascoli
Jeffrey L. Krichmar
Kenneth De Jong
Siva Venkadesh
Stanislav Listopad
Publication venue: Frontiers Media SA
Publication date: 01/01/2018

The diversity of intrinsic dynamics observed in neurons may enhance the computations implemented in the circuit by enriching network-level emergent properties such as synchronization and phase locking. Large-scale spiking network models of entire brain regions offer a platform to test theories of neural computation and cognitive function, providing useful insights on information processing in the nervous system. However, a systematic in-depth investigation requires network simulations to capture the biological intrinsic diversity of individual neurons at a sufficient level of accuracy. The computationally efficient Izhikevich model can reproduce a wide range of neuronal behaviors qualitatively. Previous studies using optimization techniques, however, were less successful in quantitatively matching experimentally recorded voltage traces. In this article, we present an automated pipeline based on evolutionary algorithms to quantitatively reproduce features of various classes of neuronal spike patterns using the Izhikevich model. Employing experimental data from Hippocampome.org, a comprehensive knowledgebase of neuron types in the rodent hippocampus, we demonstrate that our approach reliably fits Izhikevich models to nine distinct classes of experimentally recorded spike patterns, including delayed spiking, spiking with adaptation, stuttering, and bursting. Importantly, by leveraging the parameter-exploration capabilities of evolutionary algorithms, and by representing qualitative spike pattern class definitions in the error landscape, our approach creates several suitable models for each neuron type, exhibiting appropriate feature variabilities among neurons. Moreover, we demonstrate the flexibility of our methodology by creating multi-compartment Izhikevich models for each neuron type in addition to single-point versions. Although the results presented here focus on hippocampal neuron types, the same strategy is broadly applicable to any neural system.
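
For readers unfamiliar with the Izhikevich model named in this abstract, the following is a minimal sketch of a single-point neuron in the original four-parameter form (Izhikevich, 2003), integrated with a simple forward-Euler step. It is an assumption that this form is adequate for illustration: the article may use a different parameterization, and the evolutionary fitting pipeline itself is not shown. The default parameter values are the commonly quoted "regular spiking" settings, and the function name, step size, and input currents are hypothetical.

```python
def simulate_izhikevich(current, a=0.02, b=0.2, c=-65.0, d=8.0,
                        duration_ms=1000.0, dt=0.25):
    """Simulate one Izhikevich neuron and return its spike times in ms.

    v is the membrane potential (mV) and u the recovery variable:
        dv/dt = 0.04*v^2 + 5*v + 140 - u + I
        du/dt = a*(b*v - u)
    When v reaches 30 mV, a spike is recorded, v is reset to c and u is raised by d.
    """
    v = c
    u = b * v
    spike_times = []
    steps = int(duration_ms / dt)
    for step in range(steps):
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + current
        du = a * (b * v - u)
        v += dt * dv          # forward-Euler update of the membrane potential
        u += dt * du          # forward-Euler update of the recovery variable
        if v >= 30.0:         # spike threshold reached: record and reset
            spike_times.append(step * dt)
            v = c
            u += d
    return spike_times

resting = simulate_izhikevich(current=0.0)    # no input current
driven = simulate_izhikevich(current=10.0)    # constant depolarizing input
```

A fitting procedure of the kind the abstract describes would compare features of such simulated spike trains (for example, spike counts or inter-spike intervals) against recorded ones and adjust `a`, `b`, `c`, and `d` accordingly.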

Crossref

Directory of Open Access Journals

Frontiers - Publisher Connector

eScholarship - University of California

Differentiating between liver diseases by applying multiclass machine learning approaches to transcriptomics of liver tissue or blood-based samples.

Author: Asghar Aliya
Listopad Stanislav
Liu Zhang-Xu
Magnan Christophe
Morgan Timothy R
Norden-Krichmar Trina M
Stolz Andrew
Tayek John A
Publication venue: eScholarship, University of California
Publication date: 18/08/2022

Background & aims: Liver disease carries significant healthcare burden and frequently requires a combination of blood tests, imaging, and invasive liver biopsy to diagnose. Distinguishing between inflammatory liver diseases, which may have similar clinical presentations, is particularly challenging. In this study, we implemented a machine learning pipeline for the identification of diagnostic gene expression biomarkers across several alcohol-associated and non-alcohol-associated liver diseases, using either liver tissue or blood-based samples.

Methods: We collected peripheral blood mononuclear cells (PBMCs) and liver tissue samples from participants with alcohol-associated hepatitis (AH), alcohol-associated cirrhosis (AC), non-alcohol-associated fatty liver disease, chronic HCV infection, and healthy controls. We performed RNA sequencing (RNA-seq) on 137 PBMC samples and 67 liver tissue samples. Using gene expression data, we implemented a machine learning feature selection and classification pipeline to identify diagnostic biomarkers which distinguish between the liver disease groups. The liver tissue results were validated using a public independent RNA-seq dataset. The biomarkers were computationally validated for biological relevance using pathway analysis tools.

Results: Utilizing liver tissue RNA-seq data, we distinguished between AH, AC, and healthy conditions with overall accuracies of 90% in our dataset, and 82% in the independent dataset, with 33 genes. Distinguishing 4 liver conditions and healthy controls yielded 91% overall accuracy in our liver tissue dataset with 39 genes, and 75% overall accuracy in our PBMC dataset with 75 genes.

Conclusions: Our machine learning pipeline was effective at identifying a small set of diagnostic gene biomarkers and classifying several liver diseases using RNA-seq data from liver tissue and PBMCs. The methodologies implemented and genes identified in this study may facilitate future efforts toward a liquid biopsy diagnostic for liver diseases.

Lay summary: Distinguishing between inflammatory liver diseases without multiple tests can be challenging due to their clinically similar characteristics. To lay the groundwork for the development of a non-invasive blood-based diagnostic across a range of liver diseases, we compared samples from participants with alcohol-associated hepatitis, alcohol-associated cirrhosis, chronic hepatitis C infection, and non-alcohol-associated fatty liver disease. We used a machine learning computational approach to demonstrate that gene expression data generated from either liver tissue or blood samples can be used to discover a small set of gene biomarkers for effective diagnosis of these liver diseases.

PubMed Central

eScholarship - University of California