11 research outputs found

    Cautionary Tales of Inapproximability

    Modeling biology as classical problems in computer science allows researchers to leverage the wealth of theoretical advancements in this field. Despite countless studies presenting heuristics that report improvement on specific benchmarking data, there has been comparatively little focus on exploring the theoretical bounds on the performance of practical (polynomial-time) algorithms. Conversely, theoretical studies tend to overstate the generalizability of their conclusions to physical biological processes. In this article we provide a fresh perspective on the concepts of NP-hardness and inapproximability in the computational biology domain, using popular sequence assembly and alignment (mapping) algorithms as illustrative examples. These algorithms exemplify how computer science theory can both (a) lead to substantial improvement in practical performance and (b) highlight areas ripe for future innovation. Importantly, we discuss caveats that seemingly allow the performance of heuristics to exceed their provable bounds

    Tailoring bioinformatics strategies for the characterization of the human microbiome in health and disease

    The human microbiome is a very active area of research due to its potential to explain health and disease. Advances in high-throughput DNA sequencing in the last decade have catalyzed the growth of microbiome research; DNA sequencing offers a cost-effective method to characterize entire microbial communities directly, including unculturable microbes, which were previously difficult to study. 16S rRNA sequencing and shotgun metagenomics, coupled with bioinformatics methods, have powered the characterization of the human microbiome in different parts of the body. This has led to the discovery of novel links between the microbiome and diseases such as allergies, cancer, and autoimmune diseases. This thesis focuses on the application of both 16S rRNA sequencing and shotgun metagenomics for the characterization of the human microbiome and its relationship with health and disease. We established two methodologies to address these questions. The first methodology is a bench-to-bioinformatics pipeline to discover putative viral pathogens involved in disease using shotgun metagenomics technology. In Paper I, we apply the proposed pipeline to explore the hypothesis of viral infection as a putative cause of childhood Acute Lymphoblastic Leukemia. In Paper II, we propose a complementary method to the pipeline to improve the detection of unknown viruses, especially those with little or no homology to currently known viruses. We applied this method to a collection of viral-enriched libraries, which resulted in the characterization of a new viral-like genome. The second methodology was developed to explore and generate hypotheses from a human skin microbiome dataset of Psoriasis and Atopic Dermatitis patients. The results of the analysis are presented in Paper III and Paper IV. Paper III is a purely data-driven exploration of the dataset to discover different aspects of how the microbiome is linked to both diseases. Paper IV follows up on the results of Paper III but focuses on characterizing the skin site microbiome variability in Atopic Dermatitis.

    Analysis of bronchoalveolar lavage transcriptome profiles of asthmatic horses by single-cell mRNA sequencing

    Severe equine asthma (SEA) is a common respiratory condition of horses, whose underlying immune mechanisms remain to be elucidated. In this thesis project, we took advantage of the recently developed single-cell mRNA sequencing (scRNA-seq) technology to investigate the immunological landscape of equine bronchoalveolar lavage fluid (BALF) cells in both health and disease. Initially, we conducted a pilot experiment involving three horses to demonstrate the feasibility of scRNA-seq on cryopreserved equine BALF samples. Although the experiment was successful, the proportion of reads aligning to the annotated equine reference transcriptome was suboptimal. To address this, we generated a custom equine BALF transcriptome using long-read sequencing, aiming to improve the quality of 3'-UTR annotation and document BALF-specific isoforms. While we identified several novel isoforms, the read mapping percentage did not improve when aligning our scRNA-seq transcripts to the custom transcriptome. By extending the 3'-UTRs of the existing reference annotation, we achieved a satisfactory read mapping percentage, enabling subsequent qualitative downstream analysis. Our scRNA-seq dataset encompassed six major cell populations: monocytes-macrophages, neutrophils, T cells, B cells and dendritic cells. Within the monocyte-macrophage and T cell groups, we identified previously uncharacterized cell subtypes. Encouraged by these findings, we applied our optimized experimental protocol and analysis pipeline to study SEA. ScRNA-seq analysis of cryopreserved BALF cells from 6 asthmatic horses and 5 healthy controls revealed the same major cell populations as observed in the pilot study. In addition to T cells and monocytes-macrophages, we characterized several cell subtypes within the B cell, dendritic cell and neutrophil populations. Differential gene expression analysis revealed a strong T helper (Th)17 signature in SEA, primarily driven by monocytes-macrophages and T cells. Notably, BALF from SEA horses was enriched in B cells, with a lower proportion of activated plasma cells. Neutrophils in the SEA group displayed increased migratory capacity and a heightened propensity to form neutrophil extracellular traps (NETs). An intriguing finding in both scRNA-seq experiments was the detection of a dual monocyte-lymphocyte population, potentially representing genuine cellular complexes engaged in an immunological synapse. In summary, this thesis project represents pioneering work employing scRNA-seq in the field of equine pulmonology. Our findings support a predominant Th17 immune pathway in SEA, necessitating further investigation to improve diagnostic tools and therapeutic management of severely asthmatic horses.

    Role of the antagonistic histone methylation marks H3K4me3 and H3K27me3 in the cold stress response of Arabidopsis thaliana

    As sessile organisms, plants need to adapt to their changing environment, including temperature fluctuations. As low temperatures can have major noxious consequences on their development and survival, plants need to establish the proper defences in order to endure the stress. This requires a massive and very fast transcriptome reprogramming involving, among others, the induction of hundreds of cold-responsive (COR) genes. Following the immediate response to chilling stress, plants are also able to memorize cold spells, leading to an improved survival during a second stress episode. This process is associated with a revised transcriptomic response also called transcriptional memory. Overall, both the response to cold and the memory of this stress rely on the tight transcriptional regulation of the COR genes. While numerous transcription factors necessary for their induction were already identified, the role of chromatin modifications in this process remains largely undiscovered. As the combination of chromatin modifications (the “chromatin state”) is a key determinant of gene expression, this study aimed at uncovering the potential role of histone modifications in the transcriptional regulation of COR genes before, during and after a cold episode. First, a comprehensive in silico analysis of the chromatin state of COR genes prior to any cold occurrence revealed that a majority of those genes carry both the activating mark H3K4me3 and the silencing mark H3K27me3, forming a specific chromatin state called bivalency. The in vivo characterization of bivalent genes revealed that this chromatin state decorates not only cold-inducible genes but numerous reversibly silenced stress-responsive genes and might poise them for expression by maintaining them in an open chromatin conformation. 
Furthermore, the putative bivalency reader DEK2 was shown to prevent the over-induction of bivalent COR genes during a cold episode, suggesting that bivalency can also participate in transcriptional regulation in trans through the action of specific readers. In a second stage, the dynamics of H3K4me3 and H3K27me3 during cold stress were analysed using genome-wide approaches, revealing that both marks underwent intensive redistribution after as little as three hours of low temperature. Those changes partially correlated with expression changes: in particular, the induction of COR genes was associated with a loss of the repressive mark H3K27me3 or a gain of the activating mark H3K4me3. However, each mark displayed different targets and dynamics, suggesting that they hold distinct roles in the cold response: H3K4me3 was associated with immediate stress responses, while H3K27me3 correlated rather with longer-term adaptation. Upon return to ambient temperature, the cold-induced variations reverted at a different pace depending on the gene, and some changes were maintained for up to seven days. The maintenance of both H3K4me3 and H3K27me3 changes was linked to transcriptional memory: higher levels of H3K4me3 were associated with sustained induction, while lower levels of H3K27me3 were correlated with a faster re-induction during a second stress exposure. Finally, the H3K27me3 demethylase ELF6 was shown to be essential for cold stress memory. This led to the hypothesis that cold stress memory might rely on the maintained loss of H3K27me3 on specific COR genes, allowing a faster re-establishment of defences during a second stress episode. In conclusion, this study demonstrates that the antagonistic marks H3K4me3 and H3K27me3 jointly participate in the transcriptional regulation of COR genes and reveals a new role of bivalency in the plant cold stress response and memory.

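    Development of bioinformatics techniques for the analysis of high-throughput sequencing data in systematics and evolutionary genomics: Application to the analysis of the chemosensory system in arthropods [Desarrollo de técnicas bioinformáticas para el análisis de datos de secuenciación masiva en sistemática y genómica evolutiva: Aplicación en el análisis del sistema quimiosensorial en artrópodos]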

    Next Generation Sequencing (NGS) technologies are providing powerful data to investigate fundamental biological and evolutionary questions, including phylogenetic and adaptive genomic topics. Currently, it is possible to carry out complex genomic projects analyzing complete genomes and/or transcriptomes, even in non-model organisms. In this thesis, we have performed two complementary studies using NGS data. Firstly, we have analyzed the transcriptome (RNAseq) of the main chemosensory organs of the chelicerate Macrothele calpeiana, Walckenaer, 1805, the only spider protected in Europe, to investigate the origin and evolution of the Chemosensory System (CS) in arthropods. The CS is an essential physiological process for the survival of organisms, and it is involved in vital biological processes, such as the detection of food, partners or predators and oviposition sites. This system is relatively well characterized in hexapods but is completely unknown in other arthropod lineages. Our transcriptome analysis allowed us to detect some genes expressed in the putative chemosensory organs of chelicerates, such as five NPC2s and two IRs. Furthermore, we detected 29 additional transcripts after including new CS members from recently available arthropod genomes in the HMM profiles, such as genes of the SNMP, ENaC, TRP and GR families and one OBP-like. Unfortunately, many of them were partial fragments. Secondly, we have also developed some bioinformatics tools to analyze RNAseq data and to develop molecular markers. Researchers interested in the biological application of NGS data may lack the bioinformatic expertise required for the treatment of the large amount of data generated. In this context, the development of user-friendly tools for common data processing and the integration of utilities to perform downstream analysis is much needed. In this thesis, we have developed two bioinformatics tools with an easy-to-use graphical interface to perform all the basic steps of NGS data processing and some of the main downstream analyses: i) TRUFA (TRanscriptome User-Friendly Analysis), which allows the analysis of RNAseq data from non-model organisms, including functional annotation and differential gene expression analysis; and ii) DOMINO (Development Of Molecular markers In Non-model Organisms), which allows the identification and selection of molecular markers appropriate for evolutionary biology analysis. These tools have been validated using computer simulations and experimental data, mainly from spiders.

    Exploring interactions between host and gut microbiota in ulcerative colitis and primary sclerosing cholangitis associated inflammatory bowel disease: An appraisal through faecal microbiota transplantation and systems biology

    Inflammatory bowel disease (IBD) has progressively become a global epidemic and now affects nearly 0.5% of the Western population. The aetiological factors that initiate and drive mechanisms associated with IBD remain unclear. A cure has been even more elusive. Changes in gut microbial diversity and profiles in individuals with this disease are a characteristic feature; however, a causal relationship has yet to be proven. In my PhD I have attempted to explore host-microbiota interactions and their influence on mechanisms of ulcerative colitis (UC) and primary sclerosing cholangitis associated inflammatory bowel disease (PSC-IBD). Patients with UC have a greater abundance of Clostridiaceae at inflamed compared to non-inflamed sites. Immunophenotyping demonstrated significantly higher proportions of colonic mucosal Th17 and IL-17 producing CD4 cells in patients with UC and PSC-IBD compared to healthy controls. Through an open label study (STOP-Colitis pilot phase), I demonstrated that faecal microbiota transplantation (FMT) resulted in a clinical response in 47% of patients (8/17; intention to treat). This response was associated with a significant increase in colonic mucosal regulatory T cell (Treg), effector memory Treg, gut homing Treg and IL-10 producing CD4 T cell populations, along with a concurrent decrease in Th17, IL-17 producing CD4 T cell and CD8 populations. Colonic mucosal transcriptomics revealed that responders to FMT had significant downregulation of antimicrobial defence and proinflammatory immunological pathways and an increase in butanoate metabolic pathways compared to both baseline and non-responders. Finally, through a multi-omic exploration of colonic mucosal biology, I demonstrated that the gene expression profiles in patients with PSC-IBD were significantly different from those in UC and were associated with dysregulation of bile acid homeostasis and signalling in association with colonic dysbiosis.