191 research outputs found

    Host Transcriptomics as a Tool to Identify Diagnostic and Mechanistic Immune Signatures of Tuberculosis

    Tuberculosis (TB) is a major infectious disease worldwide, and its control and eradication face several challenges. First, more accurate diagnostic tools that better represent the spectrum of infection states are required, in particular tools that identify latently TB-infected individuals at high risk of developing active TB. Second, we need to better understand, from a mechanistic point of view, why the immune system in some cases fails to control and eliminate the pathogen. Host transcriptomics is a powerful approach to identify both diagnostic and mechanistic immune signatures of diseases. We have recently reported that optimal study design for these two purposes should be guided by different sets of criteria. Here, based on already published transcriptomic signatures of tuberculosis, we further develop these guidelines and identify additional factors to consider for obtaining diagnostic vs. mechanistic signatures in terms of cohorts, samples, data generation and analysis. Diagnostic studies should aim to identify small disease signatures with high discriminatory power across all affected populations, and against pathologies similar to TB. Specific focus should be placed on improving the diagnosis of infected individuals at risk of developing active disease. Conversely, mechanistic studies should focus on tissue biopsies, immune-relevant cell subsets, state-of-the-art transcriptomic techniques and bioinformatics tools to understand the biological meaning of identified gene signatures that could facilitate therapeutic interventions. Finally, investigators should ensure their data are made publicly available along with complete annotations to facilitate metadata and cross-study analyses.

    Defining immune system function in humans

    In this thesis I design two novel methods for the analysis of human immunological data from blood and solid tissue respectively. I have developed a method capable of identifying poorly characterised disease-associated genes and defining mechanisms controlling their expression. By exploring multiple cohorts of public transcriptome data, I showed that CST7 is upregulated in the blood during a diverse set of inflammatory conditions. Interestingly, this upregulation was neutrophil-specific and was not induced by microbial products or cytokines commonly associated with inflammation, but was associated with type I interferon signalling. In this chapter, I demonstrate the value of publicly available transcriptome data in knowledge generation and potential biomarker discovery. I have also developed a protocol for the single-cell analysis of lesions in multiplex images. I developed a metric called tCPI that can quantify how tight or loose the distribution of cells is in the lesion centre. An additional metric, called immCPI, measures the overall distribution of each immune cell population relative to the lesion centre. I developed a final method, called lesion neighbourhood analysis, that can quantify the relative location of hundreds of individual lesions within a tissue section and can identify patterns in their distributions. Together this workflow provides a robust tool for spatial quantification at the lesion level. I applied this lesion analysis protocol to lung tissue resection samples from patients with tuberculosis. Interestingly, lesions did not fall into distinct clusters based on composition. Using both principal component data and tCPI, I grouped lesions into 4 types based on composition and cell distribution. Using immCPI revealed that the intra-lesion location of macrophages/monocytes and B cells depends on lesion type. Lesion neighbourhood analysis revealed that B cell-enriched lesions tend to lie twice as close to a necrotising lesion as other lesions do. The robust analytic power of these new approaches offers important tools in the study of human immunology and will contribute to furthering our understanding of inflammatory diseases.

    Resolving Biological Trajectories in Single-cell Data using Feature Selection and Multi-modal Integration

    Single-cell technologies can readily measure the expression of thousands of molecular features from individual cells undergoing dynamic biological processes, such as cellular differentiation, immune response, and disease progression. While computational trajectory inference methods and RNA velocity approaches have been developed to study how subtle changes in gene or protein expression impact cell fate decision-making, identifying characteristic features that drive continuous biological processes remains difficult due to the inherent biological or technical challenges associated with single-cell data. Here, we developed two data representation-based approaches for improving inference of cellular dynamics. First, we present DELVE, an unsupervised feature selection method for identifying a representative subset of dynamically expressed molecular features that resolve cellular trajectories in noisy data. In contrast to previous work, DELVE uses a bottom-up approach to mitigate the effect of unwanted sources of variation confounding inference and models cell states from dynamic feature modules that constitute core regulatory complexes. Using simulations, single-cell RNA sequencing data, and iterative immunofluorescence imaging data in the context of cell cycle and cellular differentiation, we demonstrate that DELVE selects genes or proteins that more accurately characterize cell populations and improve the recovery of cell type transitions. Next, we present the first task-oriented benchmarking study that investigates integration of temporal gene expression modalities for dynamic cell state prediction. We benchmark ten multi-modal integration approaches on ten datasets spanning different biological contexts, sequencing technologies, and species. This study illustrates how temporal gene expression modalities can be optimally combined to improve inference of cellular trajectories and more accurately predict sample-associated perturbation and disease phenotypes. Lastly, we illustrate an application of these approaches and perform an integrative analysis of gene expression and RNA velocity data to study the crosstalk between signaling pathways that govern the mesendoderm fate decision during directed definitive endoderm differentiation. Results of this study suggest that lineage-specific, temporally expressed genes within the primitive streak may serve as a potential target for increasing definitive endoderm efficiency. Collectively, this work uses scalable data-driven approaches to effectively manage the inherent biological or technical challenges associated with single-cell data in order to improve inference of cellular dynamics. Doctor of Philosoph

    Incorporating standardised drift-tube ion mobility to enhance non-targeted assessment of the wine metabolome (LC×IM-MS)

    Liquid chromatography with drift-tube ion mobility spectrometry-mass spectrometry (LC×IM-MS) is emerging as a powerful addition to existing LC-MS workflows for addressing a diverse range of metabolomics-related questions [1,2]. Importantly, the excellent precision of drift-tube IM separations under repeatability and reproducibility conditions [3] supports the development of non-targeted approaches for complex metabolome assessment such as wine characterisation [4]. In this work, the fundamentals of this new analytical metabolomics approach are introduced and its application to the analysis of 90 authentic red and white wine samples originating from Macedonia is presented. Following measurements, intersample alignment of metabolites using non-targeted extraction and three-dimensional alignment of molecular features (retention time, collision cross section, and high-resolution mass spectra) provides confidence for metabolite identity confirmation. Applying a fingerprinting metabolomics workflow allows statistical assessment of the influence of geographic region, variety, and age. This approach is a state-of-the-art tool to assess wine chemodiversity and is particularly beneficial for the discovery of wine biomarkers and for establishing product authenticity based on the development of fingerprint libraries.

    Mass Spectrometry and Nuclear Magnetic Resonance in the Chemometric Analysis of Cellular Metabolism

    The development and awareness of machine learning and “big data” have led to a growing interest in applying these methods to bioanalytical research. Methods such as mass spectrometry (MS) and nuclear magnetic resonance (NMR) can now obtain tens of thousands to millions of data points from a single sample, due to fundamental instrumental advances and ever-increasing resolution. Simple pairwise comparisons on datasets of this magnitude can obscure more complex underlying trends, and do a disservice to the richness of information contained within. This necessitates multivariate approaches that can more fully take advantage of the complexity of these datasets. Performing these multivariate analyses takes a high degree of expertise, requiring knowledge of such disparate areas as chemistry, physics, mathematics, statistics, software development and signal processing. This barrier to entry prevents many investigators from fully utilizing all the tools available to them; they instead rely on a mix of commercial and free software, chained together with in-house developed solutions, just to perform a single analysis. While there are numerous methods in the published literature for statistical analysis of these larger datasets, most are still confined to the realm of theory because they have not been implemented in publicly available software for the research community. This dissertation outlines the development of routines for handling LC-MS data with freely available tools, including the Octave programming language. In combination with our previously developed software MVAPACK, this presents a unified platform for metabolomics data analysis that will encourage the wider adoption of multi-instrument investigations and multiblock statistical analyses. Advisor: Robert Power

    The 26th Annual Boston University Undergraduate Research (UROP) Abstracts

    Abstracts for the 2023 UROP Symposium, held at Boston University on October 20, 2023 at GSU Metcalf Ballroom. Cover and logo design by Morgan Danna. Booklet compiled by Molly Power

    Exploring the Genomic Basis of Antibiotic Resistance in Wastewater E. coli: Positive Selection, GWAS, and AI Language Model Analyses

    Antibiotic resistance is a critical threat to global health. This thesis examines the relationship between antibiotic resistance and genomic variations in E. coli from wastewater. E. coli is of interest as it causes urinary tract and other infections. Wastewater is a good source because it is a melting pot for E. coli from diverse origins. The research addresses two key aspects: whether antibiotic resistance data are included or excluded, and the level of granularity at which genomic variations are represented. The former is important because there is more genomic data than antibiotic resistance data. Consequently, relying solely on genomic data, this thesis studies positive selection in E. coli to identify mutations and genes favored by evolution. This study demonstrates the preferential selection of known antibiotic resistance genes and mutations, particularly mutations at functionally important sites of outer membrane porins, which may hence have a direct effect on structure and function. Encouraged by these results, the study was expanded to include antibiotic resistance data and to examine genomic variations at three resolution levels: single mutations; unitigs (genome words) that may contain multiple mutations; and the whole coding genome, using machine learning classifier models that capture dependencies among multiple mutations and other genomic variations. Representation of single mutations detects well-known resistance mutations as well as potentially novel mechanisms related to biofilm formation and translation. By exploring larger genomic units such as genome words, the analysis confirms the findings from single mutations and additionally uncovers joint mutations in both known and novel genes. Finally, machine learning models, including AI language models, were trained to predict antibiotic resistance based on the whole coding genome. This achieved an accuracy of over 90% in predicting antibiotic resistance when sufficient data were available. Overall, this thesis unveils new antibiotic resistance mechanisms, conducts one of the largest studies of positive selection in E. coli, and stands out as one of the pioneering studies that utilize AI language models for antibiotic resistance prediction.

    Washington University Senior Undergraduate Research Digest (WUURD), Spring 2018

    From the Washington University Office of Undergraduate Research Digest (WUURD), Vol. 13, 05-01-2018. Published by the Office of Undergraduate Research. Joy Zalis Kiefer, Director of Undergraduate Research and Associate Dean in the College of Arts & Scien