46 research outputs found

    Representing and analysing molecular and cellular function in the computer

    Determining the biological function of a myriad of genes, and understanding how they interact to yield a living cell, is the major challenge of the post-genome-sequencing era. The complexity of biological systems is such that this cannot be envisaged without the help of powerful computer systems capable of representing and analysing the intricate networks of physical and functional interactions between the different cellular components. In this review we try to provide the reader with an appreciation of where we stand in this regard. We discuss some of the inherent problems in describing the different facets of biological function, give an overview of how information on function is currently represented in the major biological databases, and describe different systems for organising and categorising the functions of gene products. In the second part, we present a new general data model, currently under development, which describes information on molecular function and cellular processes in a rigorous manner. The model is capable of representing a large variety of biochemical processes, including metabolic pathways, regulation of gene expression and signal transduction. It also incorporates taxonomies for categorising molecular entities, interactions and processes, and it offers means of viewing the information at different levels of resolution and of dealing with incomplete knowledge. The data model has been implemented in 'aMAZE', a database on protein function and cellular processes (http://www.ebi.ac.uk/research/pfbp/), which presently covers metabolic pathways and their regulation. Several tools for querying, displaying, and analysing such pathways are briefly described in order to illustrate the practical applications enabled by the model.

    Variation and evolution of venom composition in the parasitoid wasps Psyttalia (Hymenoptera, Braconidae) and Leptopilina (Hymenoptera, Figitidae): a possible cause of failure and success in biological control?

    Endoparasitoid wasps lay their eggs in arthropod hosts and develop inside them, leading to the host's death. They have evolved various strategies to ensure parasitism success, notably the injection, together with the eggs, of venom that suppresses host immunity. Although venom composition has been characterized in a growing number of parasitoid families and recent studies suggest that parasitoid virulence can evolve rapidly, the intraspecific variation of venom and its short-term evolvability remained to be investigated. This information is, however, essential for understanding the evolution of parasitoid host range, a key parameter in biological control. This thesis demonstrated the occurrence of inter-individual variability of venom and developed a method, based on the analysis of 1D electrophoretic profiles with "R" functions, that allows statistical comparison of protein quantities across numerous individuals. To study the effects of this variability in venom composition under an abrupt change of environment, experimental evolution studies were then performed on Psyttalia lounsburyi and Leptopilina boulardi. Overall, the thesis showed that parasitoid venom composition (i) is variable at all biological levels studied, (ii) changes rapidly, confirming its high evolvability, and (iii) influences key parameters of parasitoid biology. This may have important implications for biocontrol and raises the question of the mechanisms that maintain this variability in nature.

    Complex genetic approaches to neurodegenerative diseases.

    Neurodegenerative diseases are fatal disorders in which disease pathogenesis results in the progressive degeneration of the central and/or the peripheral nervous systems. These diseases currently affect approximately 2% of the population but are expected to increase in prevalence as average life expectancy increases. The majority of these diseases have a complex genetic basis. The work presented in this thesis aimed to investigate the genetic basis of two neurodegenerative diseases, amyotrophic lateral sclerosis (ALS) and the human prion diseases kuru and sporadic Creutzfeldt-Jakob disease (sCJD), using novel complex genetic approaches. ALS is a fatal neurodegenerative disease in which motor neurons degenerate. It is a complex disease, with 10% of individuals having a family history and the remaining 90% of non-familial cases having some genetic component. The gene DYNC1H1 is involved in retrograde axonal transport and is a good candidate for ALS. In this thesis the genetic architecture of DYNC1H1 was elucidated and a mutation screen of exons 8, 13 and 14 was undertaken in familial forms of ALS and other motor neuron diseases. No mutations were found. A linkage disequilibrium (LD) based association study was conducted using two tagging single nucleotide polymorphisms (tSNPs), which were identified as sufficient to represent genetic variation across DYNC1H1. These tSNPs were tested for an association with sporadic ALS (SALS) in 261 cases and 225 matched controls, but no association was identified. Kuru is a devastating epidemic prion disease which affected a highly geographically restricted area of the Papua New Guinea highlands, predominantly affecting adult women and children. Its incidence has steadily declined since the cessation of its route of transmission, endocannibalism, in the late 1950s. Kuru imposed strong balancing selection on codon 129 of the prion gene (PRNP). Analysis of kuru-exposed and unexposed populations showed significant deviations from Hardy-Weinberg equilibrium (HWE) consistent with the known protective effect of codon 129 heterozygosity. Signatures of selection were investigated in the surviving populations, such as deviations from HWE and an increasing cline in codon 129 valine allele frequency, which covaried with disease exposure. A novel PRNP G127V polymorphism was detected which, while common in the area of highest kuru incidence, was absent from kuru patients and unexposed population groups. Genealogical analysis revealed that the heterozygous PRNP G127V genotype confers strong prion disease resistance, which has been selected for by the kuru epidemic. Finally, PRNP copy number was investigated as a possible genetic mechanism for susceptibility to kuru and sCJD. No conclusive copy number changes were identified.
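
The abstract above refers to deviations from Hardy-Weinberg equilibrium at a biallelic locus. As a rough illustration of what such a test can look like, here is a minimal sketch of the textbook chi-square goodness-of-fit test with one degree of freedom. It is not the method used in the thesis, which the abstract does not specify. The function name `hwe_chi_square` and the genotype counts are hypothetical and are not taken from the study.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test for Hardy-Weinberg equilibrium at a biallelic locus.

    Arguments are observed genotype counts (two homozygotes and the
    heterozygote). Returns (chi2, p_value, het_excess), where het_excess
    is observed minus expected heterozygote count.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")

    # Allele frequencies estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p

    # Expected genotype counts under HWE: p^2, 2pq, q^2.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)

    # Skip classes with zero expectation (monomorphic locus).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Survival function of the chi-square distribution with 1 degree of
    # freedom: P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))

    return chi2, p_value, n_ab - expected[1]


# Hypothetical counts with a strong heterozygote excess (illustration only).
chi2, p_value, het_excess = hwe_chi_square(10, 80, 10)
print(f"chi2={chi2:.2f}, p={p_value:.3g}, heterozygote excess={het_excess:.1f}")
```

A positive heterozygote excess together with a small p-value is the kind of pattern that would be consistent with heterozygote advantage. With small samples an exact test is usually preferred over the chi-square approximation.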

    Functional genomics analysis of the fibroblast growth factor Branchless in the fruitfly Drosophila melanogaster

    The embryonic development of the Drosophila tracheal system serves as a model for genetic and molecular analyses of branching morphogenesis. The efforts of many laboratories have identified BRANCHLESS (BNL, a homologue of the fibroblast growth factor) and BREATHLESS (BTL, its receptor) as essential factors for many steps in the development of the tracheal system. However, the transcriptional targets of the BNL/BTL signalling cascade have remained mostly unknown. This thesis presents the morphological and physiological characterisation of embryos after loss of BNL activity and after tissue-specific tracheal over-expression of BNL. Transcriptional profiles of wildtype, bnl mutant and bnl over-expressing embryos were obtained. Computational methods (projection of expression data onto chromosomal position, onto functional Gene Ontology annotation, and onto graphs of protein or genetic interactions) were developed in order to identify novel BNL-dependent genes, whose tracheal expression was validated by in situ hybridisation. It was possible to suggest putative cross-talk between the BNL/BTL signalling cascade and other pathways, which may explain the expression of genes of the innate immune system in response to ectopic dosages of BNL. Nitric oxide signalling in the trachea appears to be independent of BNL, whereas there is evidence for an interaction with the JAK/STAT signalling cascade.

    Expanding the SnoRNA Interaction Network: Conservation of Guiding Function in Vertebrates

    Small nucleolar RNAs (snoRNAs) are one of the most abundant and evolutionarily ancient groups of small non-coding RNAs. Their main function is to target chemical modifications of ribosomal RNAs (rRNAs) and small nuclear RNAs (snRNAs). They fall into two classes, box C/D snoRNAs and box H/ACA snoRNAs, which are clearly distinguished by conserved sequence motifs and by the type of modification they govern. The box H/ACA snoRNAs are responsible for targeting pseudouridylation sites and the box C/D snoRNAs for directing 2'-O-methylation of ribonucleotides. A subclass that localizes to the Cajal bodies, termed scaRNAs, is responsible for methylation and pseudouridylation of snRNAs. In addition, an amazing diversity of non-canonical functions of individual snoRNAs has emerged. The modification patterns in rRNAs and snRNAs are retained during evolution, making it possible even to project them from yeast onto human. The stringent conservation of modification sites and the slow evolution of rRNAs and snRNAs contrast with the rapid evolution of snoRNA sequences. Recent studies that incorporate high-throughput sequencing experiments still identify previously undetected snoRNAs, even in well-studied organisms such as human. The snoRNAbase, which has been the standard database for human snoRNAs, has not been updated since 2006 and lacks these new data. Together with the lack of a centralized cross-species data collection that also incorporates snoRNA class-specific characteristics, this created the need to integrate distributed data from the literature and databases into a comprehensive snoRNA set. Although several snoRNA studies included pro forma target predictions in individual species, and more and more studies focus on non-canonical functions of subclasses, a systematic survey of the guiding function, and especially of functional homologies of snoRNAs, was not available. To establish a sound set of snoRNAs, a computational snoRNA annotation pipeline named snoStrip, which identifies homologous snoRNAs in related species, was employed. For large-scale investigation of snoRNA function, state-of-the-art target predictions were performed with our software RNAsnoop and PLEXY. Further, a new measure, the Interaction Conservation Index (ICI), was developed to evaluate the conservation of snoRNA function. The snoStrip pipeline was applied to vertebrate species for which a genome sequence was available. In addition, it was used in several ncRNA annotation studies (48 avian, spotted gar) of newly assembled genomes to contribute the snoRNA genes. Detailed target analysis of the new vertebrate snoRNA set revealed that, in general, the functions of homologous snoRNAs are evolutionarily stable; thus, members of the same snoRNA family guide equivalent modifications. The conservation of snoRNA sequences is high at target binding regions, while the remaining sequence varies significantly. In addition to elucidating principles of correlated evolution, it was possible, with the help of the ICI measure, to assign functions to previously orphan snoRNAs and to associate snoRNAs as partners with known but so far unexplained chemical modifications. As a further pattern, redundant guiding became apparent. For many modification sites, more than one snoRNA encodes the appropriate antisense element (ASE), which could ensure constant modification through snoRNAs that have different expression patterns. Furthermore, predictions of snoRNA functions in conjunction with sequence conservation could identify distant homologies. Due to the high overall entropy of snoRNA sequences, such relationships are hard to detect by means of sequence homology search methods alone. The snoRNA interaction network was further expanded through novel snoRNAs that were detected in data from high-throughput experiments in human and mouse. Through subsequent target analysis, the new snoRNAs could immediately explain known modifications that had no appropriate snoRNA guide assigned before. In a further study, a full catalog of expressed snoRNAs in human was provided. Besides canonical snoRNAs, recent findings such as AluACAs, sno-lncRNAs and extraordinarily short SNORD-like transcripts were also taken into account. Again, the target analysis workflow identified previously undetected connections between snoRNA guides and modifications. In particular, some species- or clade-specific interactions of SNORD-like genes emerged that seem to act as bona fide snoRNA guides for rRNA and snRNA modifications. For all high-confidence new snoRNA genes identified during this work, official gene names were requested from the HUGO Gene Nomenclature Committee (HGNC) to avoid further naming confusion.

    Computational methods to create and analyze a digital gene expression atlas of embryo development from microscopy images

    The creation of atlases, or digital models in which information from different subjects can be combined, is a field of increasing interest in biomedical imaging. When a single image does not contain enough information to appropriately describe the organism under study, it becomes necessary to acquire images of several individuals, each containing data complementary to the rest of the cohort. This approach allows the creation of digital prototypes, ranging from anatomical atlases of human patients and organs, obtained for instance from Magnetic Resonance Imaging, to gene expression cartographies of embryo development, typically obtained from Light Microscopy. Within this context, in this PhD Thesis we propose, develop and validate new dedicated image processing methodologies that, based on image registration techniques, bring information from multiple individuals into alignment within a single digital atlas model. We also elaborate a dedicated software visualization platform to explore the resulting wealth of multi-dimensional data, and novel analysis algorithms to automatically mine the generated resource in search of biological insights. In particular, this work focuses on gene expression data from developing zebrafish embryos imaged at cellular resolution with Two-Photon Laser Scanning Microscopy. Having quantitative measurements that relate multiple gene expressions to cell position and their evolution in time is a fundamental prerequisite for understanding the multi-scale processes of embryogenesis. However, the number of gene expressions that can be simultaneously stained in one acquisition is limited by optical and labeling constraints. These limitations motivate the implementation of atlasing strategies that can recreate a virtual gene expression multiplex. The developed computational tools have been tested in two different scenarios. The first is early zebrafish embryogenesis, where the resulting atlas constitutes a link between the phenotype and the genotype at the cellular level. The second is the late zebrafish brain, where the resulting atlas allows studies relating gene expression to brain regionalization and neurogenesis. The proposed computational frameworks have been adapted to the requirements of both scenarios, such as the integration of partial views of the embryo into a whole-embryo model with cellular resolution, or the registration of anatomical traits with deformable transformation models that do not depend on any specific labeling. The software implementation of the atlas generation tool (Match-IT) and the visualization platform (Atlas-IT), together with the gene expression atlas resources developed in this Thesis, are to be made freely available to the scientific community. Lastly, a novel proof-of-concept experiment integrates, for the first time, 3D gene expression atlas resources with cell lineages extracted from live embryos, opening the door to correlating genetic and cellular spatio-temporal dynamics.