96 research outputs found

A large-scale analysis of mRNA polyadenylation of human and mouse genes

Author: Hu Jun
Lutz Carol S.
Tian Bin
Zhang Haibo
Publication venue: Oxford University Press
Publication date: 01/01/2005

mRNA polyadenylation is a critical cellular process in eukaryotes. It involves 3′ end cleavage of nascent mRNAs and addition of the poly(A) tail, which plays important roles in many aspects of the cellular metabolism of mRNA. The process is controlled by various cis-acting elements surrounding the cleavage site, and their binding factors. In this study, we surveyed genome regions containing cleavage sites [herein called poly(A) sites] for 13 942 human and 11 155 mouse genes. We found that a large proportion of human and mouse genes have alternative polyadenylation (∼54 and 32%, respectively). The conservation of alternative polyadenylation type or polyadenylation configuration between human and mouse orthologs is statistically significant, indicating that alternative polyadenylation is widely employed by these two species to produce alternative gene transcripts. Genes belonging to several functional groups, indicated by their Gene Ontology annotations, are biased with respect to polyadenylation configuration. Many poly(A) sites harbor multiple cleavage sites (51.25% of human and 46.97% of mouse sites), leading to heterogeneous 3′ end formation for transcripts. This implies that the cleavage process of polyadenylation is largely imprecise. Different types of poly(A) sites, with regard to their relative locations in a gene, are found to have distinct nucleotide composition in surrounding genomic regions. This large-scale study provides important insights into the mechanism of polyadenylation in mammalian species and represents a genomic view of the regulation of gene expression by alternative polyadenylation.

CiteSeerX

Crossref

PubMed Central

Analysis and Annotation of Nucleic Acid Sequence

Publication venue: Office of Scientific and Technical Information (OSTI)

Crossref

PIntron: a fast method for detecting the gene structure due to alternative splicing via maximal pairings of a pattern and a text

Author: Bonizzoni Paola
Della Vedova Gianluca
Pesole Graziano
Picardi Ernesto
Pirola Yuri
Rizzi Raffaella
Publication venue: BioMed Central
Publication date: 01/01/2012

Crossref

Springer - Publisher Connector

PubMed Central

A cause for consilience: Utilizing multiple genomic data types to resolve problematic nodes within Arthropoda and Ecdysozoa

Author: Campbell Lahcen I.
Publication date: 01/02/2012

A major turning point in the study of metazoan evolution was the recognition of the existence of the Ecdysozoa in 1997. This is a group of eight animal phyla (Nematoda, Nematomorpha, Loricifera, Kinorhyncha, Priapulida, Tardigrada, Onychophora and Arthropoda). Ecdysozoa is the most speciose clade of animals ever to exist, and the relationships among its eight phyla are still heatedly debated. Similarly, the relationships among the three sub-phyla (Chelicerata, Pancrustacea and Myriapoda) within the most important ecdysozoan phylum (the Arthropoda) are still debated. Indeed, the two major problems in ecdysozoan phylogeny concern the relationships of Myriapoda within Arthropoda, and of Tardigrada within Ecdysozoa. The difficulty in resolving ecdysozoan relationships lies in lineages characterized by rapid, deep divergences followed by long periods of divergent evolution. The phylogenetic signal needed to resolve the relationships of these lineages is diluted, increasing the likelihood of recovering phylogenetic artifacts. In an attempt to resolve the relationships within Ecdysozoa, the consilience of three independent phylogenetic data sets was investigated. EST, rRNA and microRNA (miRNA) data were sampled across all major ecdysozoan phyla. In particular, a major contribution of this thesis is the first-time sequencing of miRNAs for all the panarthropod phyla. MicroRNAs are genome regulatory elements that recently emerged as a source of useful phylogenetic data (Sempere et al. 2006) because of their low homoplasy levels. To understand the evolution of Ecdysozoa, the data sets were analysed using phylogenetic methods and models chosen to minimize the occurrence of phylogenetic reconstruction artifacts.
Analyses of independent data types recovered well-supported and corroborating evidence for the monophyly of Panarthropoda (Arthropoda, Onychophora and Tardigrada), a sister-group relationship between Myriapoda and Pancrustacea within Arthropoda, and the paraphyly of Cycloneuralia (Nematoda, Nematomorpha, Loricifera, Kinorhyncha and Priapulida).

MURAL - Maynooth University Research Archive Library

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive

The Presence and Distribution of Crotoxin in the Rock Rattlesnake (Crotalus lepidus)

Author: Mellor Jade
Publication venue: Clemson University Libraries
Publication date: 01/08/2022

Crotoxin and its homologs (hereafter all referred to as CTx) are highly lethal heterodimeric beta-neurotoxins found in pitvipers (Crotalinae) and are the main drivers of neurotoxic venom phenotypes (Type II). In contrast, hemorrhagic venom phenotypes (Type I) are characterized by high snake venom metalloproteinase expression and low toxicity. Although many rattlesnake species have been classified as either Type I or Type II, population-level variation in venom phenotype has also been documented in several species. The presence or absence of CTx is the main component of this variation in venom phenotype and has been most widely studied in large-bodied lowland rattlesnakes (Crotalus scutulatus, C. helleri, and C. horridus). While CTx has been suspected to occur in C. lepidus, a small-bodied montane rattlesnake, there has been no genetic confirmation. We used genomics and transcriptomics to test for the presence, distribution, and evolution of CTx in C. lepidus. We genomically and transcriptomically confirmed the presence and expression of CTx in C. lepidus and found it in 17 out of 104 samples across the species' range. CTx presence was not significantly associated with longitude, latitude, subspecies, or elevation. However, we did identify several climatic variables associated with CTx presence, including ones that have been identified in previous studies on CTx expression, providing insights on the phylogenetic distribution of CTx across rattlesnakes and the variation in crotoxin expression, and highlighting environments to which CTx may be locally adapted. Our results likely support previous hypotheses of an ancestral origin for crotoxin followed by independent sorting in lineages; therefore, future studies should focus on testing for the presence of CTx in other species of montane rattlesnakes.

Clemson University: TigerPrints

Modélisation et comparaison de la structure de gènes

Author: Jammali Safa
Publication venue: Université de Sherbrooke
Publication date: 01/01/2022

Bioinformatics is a multidisciplinary research field at the intersection of several disciplines: biology, medicine, mathematics, statistics, chemistry, physics, and computer science. Its aim is to design and apply statistical and computational models and tools that advance knowledge in biology and related sciences. In this context, understanding how genes function and evolve is the subject of many bioinformatics studies. These studies are mostly based on gene comparison, and in particular on the alignment of genomic sequences. However, when computing genomic sequence alignments, existing methods rely solely on sequence similarity and do not take gene structure into account. Alignment that takes sequence structure into account offers the opportunity to improve alignment accuracy, as well as the results of methods built on these alignments. This hypothesis frames the objective of this doctoral thesis: to propose models that account for gene structure when aligning the sequences of gene families. Through this thesis, we have contributed to scientific knowledge by developing alignment models for biological sequences that integrate information on the coding and splicing structure of the sequences. We proposed an algorithm and a new scoring function for aligning coding DNA sequences (CDS) that take into account the length of translation frameshifts. We also proposed an algorithm for aligning pairs of sequences from a gene family by considering their splicing structures. We further developed an algorithm for assembling pairwise spliced alignments into multiple sequence alignments.
Finally, we developed a tool for visualizing multiple spliced alignments of gene families. In this thesis, we have highlighted the importance, and demonstrated the usefulness, of taking the structure of the input sequences into account when computing their alignment.

Savoirs UdeS

Advances in faba bean genetics and genomics

Author: Abdelwahd
Afgan
Arun-Chinnappa
Avila
Avila
Avila
Bishop
Bishop
Böttinger
Cernay
Cottage
Crepon
Cruz-Izquierdo
Denton
Diaz-Ruiz
Diaz-Ruiz
Duc
Duc
Duc
Duc
El-Rodeny
Ellwood
Erith
Gaj
Gnanasambandam
Gong
Gressel
Gutierrez
Hanafy
Hanafy
Kaur
Kaur
Kaur
Khamassi
Khan
Khazaei
Khazaei
Koboldt
Kovarova
Krzywinski
Kwon
Li
Link
Ma
Ma
Mao
Maxted
Menke
Milan
Mobini
Multari
Nayak
Ocaña
Patrick
Patto
Preissel
Pérez-de-Luque
Ramsay
Ray
Ray
Rybaczek
Sallam
Sanz
Satovic
Schmutz
Sillero
Sjödin
Sjödin
Sobita
Soltis
Stoddard
Suresh
Tan
Tanno
Tavakkoli
Till
Torres
Torres
van de Wouw
Wang
Wang
Webb
Yang
Young
Zeid
Zeid
Zhang
Zong
Publication venue: Frontiers Media SA
Publication date: 01/01/2016

Vicia faba L. is a globally important grain legume whose main centers of diversity are the Fertile Crescent and the Mediterranean basin. Because of its small number (six) of exceptionally large and easily observed chromosomes, it became a model species for plant cytogenetics in the 1970s and 1980s. It is somewhat ironic, therefore, that the emergence of more genomically tractable model plant species such as Arabidopsis and Medicago coincided with a marked decline in genome research on the formerly favored plant cytogenetic model. Thus, as ever higher density molecular marker coverage and dense genetic and even complete genome sequence maps of key crop and model species emerged through the 1990s and early 2000s, genetic and genome knowledge of Vicia faba lagged far behind other grain legumes such as soybean, common bean and pea. However, cheap sequencing technologies have stimulated the production of deep transcriptome coverage from several tissue types and numerous distinct cultivars in recent years. This has permitted the reconstruction of the faba bean meta-transcriptome and has fueled development of extensive sets of Simple Sequence Repeat and Single Nucleotide Polymorphism (SNP) markers. Genetics of faba bean stretches back to the 1930s, but it was not until 1993 that DNA markers were used to construct genetic maps. A series of Random Amplified Polymorphic DNA-based genetic studies, mainly targeted at quantitative loci underlying resistance to a series of biotic and abiotic stresses, were conducted during the 1990s and early 2000s. More recently, SNP-based genetic maps have permitted chromosome intervals of interest to be aligned to collinear segments of sequenced legume genomes such as the model legume Medicago truncatula, which in turn opens up the possibility for hypotheses on gene content, order and function to be translated from model to crop. Some examples of where knowledge of gene content and function has already been productively exploited are discussed.
The bottleneck in associating genes and their functions has therefore moved from locating gene candidates to validating their function, and the last part of this review covers mutagenesis and genetic transformation, two complementary routes to validating gene function and unlocking novel trait variation for the improvement of this important grain legume.

Central Archive at the University of Reading

Crossref

Directory of Open Access Journals

Frontiers - Publisher Connector

PubMed Central

Assembly and Compositional Analysis of Human Genomic DNA - Doctoral Dissertation, August 2002

Author: Rouchka Eric C.
Publication venue: Washington University Open Scholarship
Publication date: 18/11/2002

In 1990, the United States Human Genome Project was initiated as a fifteen-year endeavor to sequence the approximately three billion bases making up the human genome (Vaughan, 1996). As of December 31, 2001, the public sequencing efforts have sequenced a total of 2.01 billion finished bases representing 63.0% of the human genome (http://www.ncbi.nlm.nih.gov/genome/seq/page.cgi?F=HsProgress.shtml&&ORG=Hs) to a Bermuda quality error rate of 1/10000 (Smith and Carrano, 1996). In addition, 1.11 billion bases representing 34.8% of the human genome have been sequenced to a rough-draft level. Efforts such as UCSC's GoldenPath (Kent and Haussler, 2001) and NCBI's contig assembly (Jang et al., 1999) attempt to assemble the human genome by incorporating both finished and rough-draft sequence. The availability of the human genome data allows us to ask questions concerning the maintenance of specific regions of the human genome. We consider two hypotheses for maintenance of high G+C regions: the presence of specific repetitive elements and compositional mutation biases. Our results rule out the possibility that the G+C content of repetitive elements determines regions of high and low G+C content in the human genome. We determine that there is a compositional bias for mutation rates. However, these biases are not responsible for the maintenance of high G+C regions. In addition, we show that regions of the human genome under less selective pressure will mutate towards a higher A+T composition, regardless of the surrounding G+C composition. We also analyze sequence organization and show that previous studies of isochore regions (Bernardi, 1993) cannot be generalized within the human genome. In addition, we propose a method to assemble only those parts of the human genome that are finished into larger contigs. Analysis of the contigs can lead to the mining of meaningful biological data that can give insights into genetic variation and evolution.
I suggest a method to aid in single nucleotide polymorphism (SNP) detection, which can help to determine differences within a population. I also discuss a dynamic-programming-based approach to sequence assembly validation and detection of large-scale polymorphisms within a population that is made possible through the availability of large human sequence contigs.

Washington University St. Louis: Open Scholarship

The development and application of informatics-based systems for the analysis of the human transcriptome

Author: Kelso Janet
Publication venue: University of the Western Cape Library Service
Publication date: 01/01/2003

Philosophiae Doctor - PhD. Despite the fact that the sequence of the human genome is now complete, it has become clear that the elucidation of the transcriptome is more complicated than previously expected. There is mounting evidence for unexpected and previously underestimated phenomena such as alternative splicing in the transcriptome. As a result, the identification of novel transcripts arising from the genome continues. Furthermore, as the volume of transcript data grows, it is becoming increasingly difficult to integrate expression information that comes from different sources, is stored in disparate locations, and is described using differing terminologies. Determining the function of translated transcripts also remains a complex task. Information about the expression profile (the location and timing of transcript expression) provides evidence that can be used in understanding the role of the expressed transcript in the organ or tissue under study, or in the developmental pathways or disease phenotype observed. In this dissertation I present novel computational approaches with direct biological applications to two distinct but increasingly important areas of gene expression research. The first addresses detection and characterisation of alternatively spliced transcripts. The second is the construction of a hierarchical controlled vocabulary for gene expression data and the annotation of expression libraries with controlled terms from the hierarchies. In the final chapter, the biological questions that can be approached and the discoveries that can be made using these systems are illustrated, with a view to demonstrating how the application of informatics can both enable and accelerate biological insight into the human transcriptome. South Afric

UWC Theses and Dissertations