98 research outputs found

    Directed Mutations Recode Mitochondrial Genes: From Regular to Stopless Genetic Codes

    Mitochondrial genetic codes evolve as side effects of stop codon ambiguity: suppressor tRNAs with anticodons translating stops transform genetic codes into stopless genetic codes. This produces peptides from frames other than regular ORFs, potentially increasing the number of proteins coded by a single sequence. Previous descriptions of olive ridley sea turtle mitogenomes imply directed stop depletion in noncoding +1 gene frames, while stop creation recodes regular ORFs to stopless genetic codes. In this analysis, directed stop codon depletion in usually noncoding gene frames of the mitogenome of the spiraling whitefly Aleurodicus dispersus produces new ORFs, introduces stops in regular ORFs, and apparently increases coding redundancy between different gene frames. Directed stop codon mutations switch between peptides coded by regular and stopless genetic codes. This process seems opposite to directed stop creation in HIV ORFs within genomes of immunized elite HIV controllers. Unknown DNA replication/editing mechanisms probably direct stop creation/depletion beyond natural selection on stops. Switches between genetic codes regulate translation of different gene frames.
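The notion of stop depletion in a shifted gene frame can be made concrete by counting stop codons in each of the three forward reading frames of a sequence. The following is only a minimal sketch: the function name `stop_counts_by_frame` is hypothetical, and the stop set assumes the invertebrate mitochondrial code (NCBI translation table 5, stops TAA and TAG), which the abstract itself does not specify.

```python
# Stop codons of the invertebrate mitochondrial code (NCBI translation table 5).
# Assumption: this table is chosen for illustration; the abstract names no table.
INVERTEBRATE_MITO_STOPS = frozenset({"TAA", "TAG"})


def stop_counts_by_frame(seq, stops=INVERTEBRATE_MITO_STOPS):
    """Count stop codons in the three forward reading frames of a DNA string.

    Frame 0 starts at the first nucleotide; frames 1 and 2 are shifted by
    one and two nucleotides. Trailing partial codons are ignored.
    """
    seq = seq.upper()
    counts = {}
    for frame in range(3):
        codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
        counts[frame] = sum(codon in stops for codon in codons)
    return counts
```

A frame whose count is unusually low compared with the other shifted frames would be a candidate for the kind of stop depletion the abstract describes, although judging that properly requires a null model that this sketch does not provide.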

    Differential evolution of non-coding DNA across eukaryotes and its close relationship with complex multicellularity on Earth

    Here, I elaborate on the hypothesis that complex multicellularity (CM, sensu Knoll) is a major evolutionary transition (sensu Szathmary), which has convergently evolved a few times in Eukarya only: within red and brown algae, plants, animals, and fungi. Paradoxically, CM seems to correlate with the expansion of non-coding DNA (ncDNA) in the genome rather than with genome size or the total number of genes. Thus, I investigated the correlation between genome and organismal complexities across 461 eukaryotes under a phylogenetically controlled framework. To that end, I introduce the first formal definitions and criteria to distinguish ‘unicellularity’, ‘simple’ (SM) and ‘complex’ multicellularity. Rather than using the limited available estimations of unique cell types, the 461 species were classified according to our criteria by reviewing their life cycle and body plan development from literature. Then, I investigated the evolutionary association between genome size and 35 genome-wide features (introns and exons from protein-coding genes, repeats and intergenic regions) describing the coding and ncDNA complexities of the 461 genomes. To that end, I developed ‘GenomeContent’, a program that systematically retrieves massive multidimensional datasets from gene annotations and calculates over 100 genome-wide statistics. R-scripts coupled to parallel computing were created to calculate >260,000 phylogenetic controlled pairwise correlations. As previously reported, both repetitive and non-repetitive DNA are found to be scaling strongly and positively with genome size across most eukaryotic lineages. Contrasting previous studies, I demonstrate that changes in the length and repeat composition of introns are only weakly or moderately associated with changes in genome size at the global phylogenetic scale, while changes in intron abundance (within and across genes) are either not or only very weakly associated with changes in genome size. 
Our evolutionary correlations are robust to different phylogenetic regression methods, uncertainties in the tree of eukaryotes, variations in genome size estimates, and randomly reduced datasets. Then, I investigated the correlation between the 35 genome-wide features and the cellular complexity of the 461 eukaryotes with phylogenetic Principal Component Analyses. Our results endorse a genetic distinction between SM and CM in Archaeplastida and Metazoa, but not so clearly in Fungi. Remarkably, complex multicellular organisms and their closest ancestral relatives are characterized by high intron-richness, regardless of genome size. Finally, I argue why and how a vast expansion of non-coding RNA (ncRNA) regulators rather than of novel protein regulators can promote the emergence of CM in Eukarya. As a proof of concept, I co-developed a novel ‘ceRNA-motif pipeline’ for the prediction of “competing endogenous” ncRNAs (ceRNAs) that regulate microRNAs in plants. We identified three candidate ceRNA motifs: MIM166, MIM171 and MIM159/319, which were found to be conserved across land plants and to be potentially involved in diverse developmental processes and stress responses. Collectively, the findings of this dissertation support our hypothesis that CM on Earth is a major evolutionary transition promoted by the expansion of two major ncDNA classes, introns and regulatory ncRNAs, which might have boosted the irreversible commitment of cell types in certain lineages by canalizing the timing and kinetics of the eukaryotic transcriptome.

    Table of contents:
    Cover page; Abstract; Acknowledgements; Index
    1. The structure of this thesis: 1.1. Structure of this PhD dissertation; 1.2. Publications of this PhD dissertation; 1.3. Computational infrastructure and resources; 1.4. Disclosure of financial support and information use; 1.5. Acknowledgements; 1.6. Author contributions and use of impersonal and personal pronouns
    2. Biological background: 2.1. The complexity of the eukaryotic genome; 2.2. The problem of counting and defining “genes” in eukaryotes; 2.3. The “function” concept for genes and “dark matter”; 2.4. Increases of organismal complexity on Earth through multicellularity; 2.5. Multicellularity is a “fitness transition” in individuality; 2.6. The complexity of cell differentiation in multicellularity
    3. Technical background: 3.1. The Phylogenetic Comparative Method (PCM); 3.2. RNA secondary structure prediction; 3.3. Some standards for genome and gene annotation
    4. What is in a eukaryotic genome? GenomeContent provides a good answer: 4.1. Background; 4.2. Motivation: an interoperable tool for data retrieval of gene annotations; 4.3. Methods; 4.4. Results; 4.5. Discussion
    5. The evolutionary correlation between genome size and ncDNA: 5.1. Background; 5.2. Motivation: estimating the relationship between genome size and ncDNA; 5.3. Methods; 5.4. Results; 5.5. Discussion
    6. The relationship between non-coding DNA and Complex Multicellularity: 6.1. Background; 6.2. Motivation: How to define and measure complex multicellularity across eukaryotes?; 6.3. Methods; 6.4. Results; 6.5. Discussion
    7. The ceRNA motif pipeline: regulation of microRNAs by target mimics: 7.1. Background; 7.2. A revisited protocol for the computational analysis of Target Mimics; 7.3. Motivation: a novel pipeline for ceRNA motif discovery; 7.4. Methods; 7.5. Results; 7.6. Discussion
    8. Conclusions and outlook: 8.1. Contributions and lessons for the bioinformatics of large-scale comparative analyses; 8.2. Intron features are evolutionarily decoupled among themselves and from genome size throughout Eukarya; 8.3. “Complex multicellularity” is a major evolutionary transition; 8.4. Role of RNA throughout the evolution of life and complex multicellularity on Earth
    9. Supplementary Data
    Bibliography; Curriculum Scientiae; Selbständigkeitserklärung (declaration of authorship)

    La relation entre codes k-circulaires et codes circulaires

    A code X is k-circular if any concatenation of at most k words from X, when read on a circle, admits exactly one partition into words from X. It is circular if it is k-circular for every integer k. While it is not a priori clear from the definition, there exists, for every pair (n, ℓ), an integer k such that every k-circular ℓ-letter code over an alphabet of cardinality n is circular, and we determine the least such integer k for all values of n and ℓ. The k-circular codes may represent an important evolutionary step between the circular codes, such as the comma-free codes, and the genetic code.
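The definition of k-circularity above can be checked by brute force for small codes. The sketch below is not taken from the paper: the function name `is_k_circular` is hypothetical, and it assumes that all words share one length ℓ (as in the ℓ-letter codes the abstract considers). Under that assumption, a circular word built from m words always has the partition at offset 0, so the code fails to be k-circular exactly when some rotation by 1 to ℓ-1 positions also splits into code words.

```python
from itertools import product


def is_k_circular(code, k):
    """Brute-force test of k-circularity for a code of equal-length words.

    Assumption: every word in `code` has the same length ell. The search
    enumerates all sequences of at most k words, so it is exponential in k
    and only suitable for small examples.
    """
    words = set(code)
    lengths = {len(w) for w in words}
    if len(lengths) != 1:
        raise ValueError("this sketch only handles non-empty codes of one word length")
    ell = lengths.pop()
    for m in range(1, k + 1):
        for seq in product(sorted(words), repeat=m):
            circle = "".join(seq)
            # Offset 0 is the known partition; any other offset that also
            # splits into code words is a second partition on the circle.
            for shift in range(1, ell):
                rotated = circle[shift:] + circle[:shift]
                blocks = [rotated[i:i + ell] for i in range(0, len(rotated), ell)]
                if all(block in words for block in blocks):
                    return False
    return True
```

For instance, the code {AB, CD, BC, DA} passes the check for k = 1 but not for k = 2, because the circular word ABCD can be read either as AB·CD or as BC·DA.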

    Dissecting Key Determinants for Calcium and Calmodulin Regulation of GAP Junction and Viral Protein

    Calcium and calmodulin (CaM) are implicated in mediating the Ca2+-dependent regulation of gap junctions, which are essential for the intercellular transmission of molecules such as nutrients, metabolites, metal ions and signal messengers (< 1000 Da) through their specialized cell membrane channels and for communication with the extracellular environment. To understand the key determinants for calcium and calmodulin regulation of gap junctions, in this study we identified a calmodulin binding domain in the second half of the intracellular loop of Connexin50 (Cx50, the major gap junction protein found in the eye lens) using peptide fragments that encompass predicted CaM binding sites and various biophysical methods. Our study provides the first direct evidence that CaM binds to a specific region of the ubiquitous gap junction protein Cx50 in a Ca2+-dependent manner. Furthermore, two novel CaM binding regions in the cytosolic loop and C-terminus of Connexin43 (Cx43, the most ubiquitous connexin) have been shown by high-resolution NMR to interact with CaM with different binding modes in the presence of Ca2+. Our results also elucidate the molecular determinants of regulation of gap junctions by multiple CaM targeting regions and provide insight into the molecular basis of the gap junction gating mechanism and the binding of CaM to the cytosolic region Cx43-3p as the major regulation site. In response to an increase in cytosolic calcium, CaM binds to the cytosolic loop, causing a conformational change of the gap junction that closes the channel. It is possible for CaM to use an adjacent region as an anchor close to the regulation site to allow for a fast response. Since a large number of the connexin (Cx) residues mutated in human diseases reside at the identified CaM binding sites, our studies provide insight into the critical cellular changes and molecular mechanisms contributing to human disease pathogenesis as part of an integrated molecular model for the calcium regulation of gap junctions (GJs).
In addition, we have applied the grafting approach to probe the metal binding capability of predicted EF-hand motifs within the streptococcal hemoprotein receptor (Shr) of Streptococcus pyogenes as well as the nonstructural protein 1 (nsP1) of Sindbis virus and Poxvirus. This fast and robust method allows us to analyze putative EF-hand proteins at genome-wide scale and to further visualize the evolutionary scenario of the EF-hand protein family. Further, mass spectrometry has also been applied to probe modification of proteins, such as CaM labeling by a fluorescent dye and 7E15 by PEG.

    Human Promoter Prediction Using DNA Numerical Representation

    With the emergence of genomic signal processing, numerical representation techniques for the DNA alphabet {A, G, C, T} play a key role in applying digital signal processing and machine learning techniques to the processing and analysis of DNA sequences. The choice of numerical representation affects how well the biological properties of a DNA sequence are reflected in the numerical domain for the detection and identification of the characteristics of regions of interest within the sequence. This dissertation presents a comprehensive study of various DNA numerical and graphical representation methods and their applications in processing and analyzing long DNA sequences. Discussions of the relative merits and demerits of the various methods, experimental results and possible future developments are also included. Another focus of the research is promoter prediction in human (Homo sapiens) DNA sequences with a neural-network-based multi-classifier system using DNA numerical representation methods. In spite of the recent development of several computational methods for human promoter prediction, there is a need for performance improvement. In particular, the high false positive rate of the feature-based approaches decreases the prediction reliability and leads to erroneous results in gene annotation. To improve the prediction accuracy and reliability, DigiPromPred, a numerical-representation-based promoter prediction system, is proposed to characterize DNA alphabets in different regions of a DNA sequence. The DigiPromPred system is found to predict promoters with a sensitivity of 90.8% while reducing the false prediction rate for non-promoter sequences, with a specificity of 90.4%. The comparative study with state-of-the-art promoter prediction systems for human chromosome 22 shows that our proposed system maintains a good balance between prediction accuracy and reliability.
To reduce the system architecture and computational complexity compared to the existing system, a simple feed-forward neural network classifier known as SDigiPromPred is proposed. The SDigiPromPred system is found to predict promoters with sensitivities of 87%, 87% and 99%, while reducing the false prediction rate for non-promoter sequences with specificities of 92%, 94% and 99%, for Human, Drosophila and Arabidopsis sequences respectively, and it offers reconfigurable capability compared to the existing system.
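As an illustration of what a numerical representation of the DNA alphabet can look like, the sketch below maps a sequence to four binary indicator sequences (the Voss representation, a common choice in genomic signal processing) and computes sensitivity and specificity from confusion-matrix counts. This is an assumption-laden example: the abstract does not say which representations DigiPromPred uses, and the function names `voss_indicators` and `sensitivity_specificity` are hypothetical.

```python
def voss_indicators(seq, alphabet="AGCT"):
    """Map a DNA string to one binary indicator sequence per nucleotide.

    Each indicator holds 1 where the nucleotide occurs and 0 elsewhere.
    Symbols outside `alphabet` (for example N) yield 0 in every indicator.
    """
    seq = seq.upper()
    return {base: [1 if symbol == base else 0 for symbol in seq] for base in alphabet}


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of true promoters detected.
    specificity = TN / (TN + FP): fraction of non-promoters rejected.
    """
    return tp / (tp + fn), tn / (tn + fp)
```

The indicator sequences can be fed to signal-processing or neural-network front ends as numeric vectors; other mappings (integer, complex or physicochemical encodings) trade off differently between compactness and the structure they impose on the four nucleotides.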

    Genetic variation within the Daphnia pulex genome

    Genetic variation within the diploid Daphnia pulex genome was examined using a high quality de novo assembly and shotgun reads from two distinct D. pulex clones. Patterns of variation and divergence at single nucleotides were examined in physical and functional regions of the genome using comparative assembly output and available annotations. Additionally, mitochondrial genomes of the same D. pulex clones were assembled and compared for patterns of divergence, and substitutional biases. Intron presence/absence polymorphisms were identified computationally and verified experimentally. Finally, gene duplicate demographics were examined for patterns of divergence and estimates of gene birth rates

    Novel Insights of Viroid Biology and Host Responses to Their Infection

    Thesis by compendium of publications. Viroids are the simplest pathogens with autonomous replication and have only been found naturally infecting higher plants. Since viroids were discovered in the seventies, we have gained considerable knowledge about their nature and replication mechanisms in host plants. However, many aspects of viroid biology are yet to be discovered. Therefore, a deeper understanding of the nature and mode of action of viroids has been the overarching goal of this thesis. For this purpose, simple and efficient procedures for obtaining infectious cDNA clones are essential. A new efficient method for constructing infectious viroid clones was developed and tested with one viroid of each family: eggplant latent viroid (ELVd, Avsunviroidae) and hop stunt viroid (HSVd, Pospiviroidae). This procedure is based on type IIS restriction enzymes, which cut outside of the recognition site, and provides a universal, highly efficient way to obtain infectious clones of a viroid independently of its sequence. Although viroids have been considered plant-pathogenic non-coding RNAs since their discovery, our computational analysis predicted small open reading frames (ORFs) in each of the HSVd and ELVd genomes. No significant similarities with proteins in the database of higher plants were found, but some of these predicted peptides were highly conserved among all HSVd and ELVd variants. Interestingly, the fusion of these conserved sequences to a fluorescent protein revealed a specific subcellular localization in the corresponding organelle where replication/accumulation takes place for each viroid: nucleolus and chloroplast for HSVd and ELVd, respectively. Mutations that truncate the nucleolar domain of HSVd were detrimental for the viroid, while truncating either of the two ELVd ORFs that contain a chloroplast transit signal also diminished (but to a lesser extent) viroid biological efficiency, maybe because of functional redundancy. Circular forms of both HSVd and ELVd RNAs were found in polysome fractions, revealing their physical interaction with the translational machinery of the plant cell. Altogether, these experimental observations indicate that the coding capacity of viroids cannot be ruled out, although the definitive evidence (detection of the circRNA-encoded peptides) is a technological challenge to be addressed in future research. Finally, to study the host changes that are produced during a symptomatic viroid infection, an integrative analysis of the timing and intensity of the genome-wide alterations in cucumber plants infected with HSVd was performed. Differential host transcriptome, sRNAnome and methylome data were integrated to determine the temporal response to viroid infection. Our results support that HSVd promotes the redesign of the cucumber regulatory pathways, predominantly affecting specific regulatory layers at different infection phases. The initial response was characterized by a reconfiguration of the host transcriptome by differential exon usage, followed by a predominant down-regulation of transcriptional activity modulated by the host epigenetic changes associated with infection and characterized by increased hypermethylation. The alterations in host sRNA and microRNA metabolism were marginal and mainly occurred at the late stage. Overall, these data constitute the first comprehensive map of the plant responses to a viroid infection.
    Funding: The Conselleria d’Educació, Investigació, Cultura i Esports (Generalitat Valenciana) and the European Social Fund (FSECV 2014-2020) co-funded the doctoral candidate's contract as predoctoral research staff (ACIF/2017/114) and predoctoral research stays outside the Comunitat Valenciana (BEFPI/2020). This doctoral thesis was also carried out within the framework of two research projects of the Ministerio de Ciencia, Innovación y Universidades, co-funded by FEDER funds [BIO2017-88321-R and AGL2016-79825-R].
    Márquez Molins, J. (2022). Novel Insights of Viroid Biology and Host Responses to Their Infection [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/183479
    Thesis; Extraordinary Doctoral Thesis Awards (Premios Extraordinarios de tesis doctorales); Compendium

    Chromosome ends in Plasmodium falciparum: Dynamics of telomeres, roles of subtelomeres and investigation of telomerase as an antimalarial drug target

    Doctoral dissertation in Biomedical Sciences, specialization in Molecular Biology, presented to the Instituto de Ciências Biomédicas de Abel Salazar of the Universidade do Port