45 research outputs found
Extraction of Transcript Diversity from Scientific Literature
Transcript diversity generated by alternative splicing and associated mechanisms contributes heavily to the functional complexity of biological systems. The numerous examples of the mechanisms and functional implications of these events are scattered throughout the scientific literature. Thus, it is crucial to have a tool that can automatically extract the relevant facts and collect them in a knowledge base that can aid the interpretation of data from high-throughput methods. We have developed and applied a composite text-mining method for extracting information on transcript diversity from the entire MEDLINE database in order to create a database of genes with alternative transcripts. It contains information on tissue specificity, number of isoforms, causative mechanisms, functional implications, and experimental methods used for detection. We have mined this resource to identify 959 instances of tissue-specific splicing. Our results in combination with those from EST-based methods suggest that alternative splicing is the preferred mechanism for generating transcript diversity in the nervous system. We provide new annotations for 1,860 genes with the potential for generating transcript diversity. We assign the MeSH term “alternative splicing” to 1,536 additional abstracts in the MEDLINE database and suggest new MeSH terms for other events. We have successfully extracted information about transcript diversity and semiautomatically generated a database, LSAT, that can provide a quantitative understanding of the mechanisms behind tissue-specific gene expression. LSAT (Literature Support for Alternative Transcripts) is publicly available at http://www.bork.embl.de/LSAT/
Statistical distributions in the folding of elastic structures
The behaviour of elastic structures undergoing large deformations is the
result of the competition between confining conditions, self-avoidance and
elasticity. This combination of multiple phenomena creates a geometrical
frustration that leads to complex fold patterns. By studying the case of a rod
confined isotropically into a disk, we show that the emergence of the
complexity is associated with a well defined underlying statistical measure
that determines the energy distribution of sub-elements, ``branches'', of the
rod. This result suggests that branches act as the ``microscopic'' degrees of
freedom, laying the foundations for a statistical mechanical theory of this
athermal and amorphous system.
SETD7 regulates the differentiation of human embryonic stem cells
The successful use of specialized cells in regenerative medicine requires optimization of the differentiation protocols currently in use. Understanding the molecular events that take place during the differentiation of human pluripotent cells is essential for the improvement of these protocols and the generation of high-quality differentiated cells. In an effort to understand the molecular mechanisms that govern differentiation, we identify the methyltransferase SETD7 as highly induced during the differentiation of human embryonic stem cells and differentially expressed between induced pluripotent cells and somatic cells. Knock-down of SETD7 causes differentiation defects in human embryonic stem cells, including a delay in both the silencing of pluripotency-related genes and the induction of differentiation genes. We show that SETD7 methylates linker histone H1 in vitro, causing conformational changes in H1. These effects correlate with a decrease in the recruitment of H1 to the pluripotency genes OCT4 and NANOG during differentiation in the SETD7 knockdown, which might affect the proper silencing of these genes during differentiation. M.J.B. was partially supported by the Ramón y Cajal program of MEC (RYC-2007-01510). B.S. was a recipient of a predoctoral fellowship from MEC (BES-2008-009567). C.M. was supported by PT13/0001/0041 PRB2-ISCIII-SGEFI- FEDER-PE I+D+i 2013-2016. J.C. was partially supported by Fundación CELLEX. This work was partially supported by grant RD12/0019/0034 TERCEL-RETICS-ISCIII-MINECO-FEDER, grant SAF2009-08588 from MICINN to M.J.B and grant BFU2014-52237 to A.J.
Analysis of Human and Mouse Reprogramming of Somatic Cells to Induced Pluripotent Stem Cells. What Is in the Plate?
After the hope and controversy brought by embryonic stem cells two decades ago for regenerative medicine, a new turn was taken in pluripotent cell research when, in 2006, Yamanaka's group reported the reprogramming of fibroblasts to pluripotent cells with the transfection of only four transcription factors. Since then, many researchers have managed to reprogram somatic cells from diverse origins into pluripotent cells, though the cellular and genetic consequences of reprogramming remain largely unknown. Furthermore, it is still unclear whether induced pluripotent stem cells (iPSCs) are truly functionally equivalent to embryonic stem cells (ESCs) and whether they demonstrate the same differentiation potential as ESCs. A large number of reprogramming experiments published so far encompass genome-wide transcriptional profiling of the cells of origin, the iPSCs and ESCs, which are used as standards of pluripotent cells, and allow us to provide here an in-depth analysis of transcriptional profiles of human and mouse cells before and after reprogramming. When compared to ESCs, iPSCs, as expected, share a common pluripotency/self-renewal network. Perhaps more importantly, they also show differences in the expression of some genes. We concentrated our efforts on the study of genes that contain bivalent domains in ESCs and are not expressed in ESCs, as they are supposedly important for differentiation and should possess a poised status in pluripotent cells, i.e. be ready to be expressed but not yet expressed. We studied each iPSC line separately to estimate the quality of the reprogramming and saw that a lower number of such genes expressed in an iPSC line correlated with a more stringent pluripotency test passed by that line.
We propose that the study of expression of bivalent domain-containing genes, which are normally silenced in ESCs, gives a valuable indication of the quality of the iPSC line, and could be used to select the best iPSC lines out of a large number of lines generated in each reprogramming experiment.
Transcripts in Space and Time
Molecular biologists aim to understand organisms at the molecular level. The ultimate goal is to be able to safely manipulate cells and/or organisms in order to heal genetic diseases, eradicate contagious diseases or, for example, improve the nutritional qualities of food. Currently, the most accurate and practical way to capture the functioning of an organism is to look at its transcriptome and its spatial and temporal variations. Following this logic, the focus of my PhD thesis has been twofold: (1) to estimate the importance of alternative splicing in the generation of transcript diversity and (2) to study the transcriptomes of two model organisms, Mus musculus and Drosophila melanogaster, in a spatial and in a temporal dimension respectively.
Over these years of research I gathered interesting findings on gene expression and its regulation. First, alternative splicing proved to be an important mechanism both in terms of frequency (alternative transcripts are generated for a vast majority of genes and in many species) and evolution (it seems to allow a gene to evolve with manageable consequences for the organism). Moreover, we were able to prove that levels of gene expression at the transcript level do not automatically imply function: there is a non-negligible amount of neutral expression, which has to be taken into account when inferring function from similarities in expression patterns.
Lastly, we investigated time-series microarray data by applying an innovative technique that allowed genes to be grouped into classes according to an original expression-profile criterion (“consistent changes”). We could show that this grouping makes biological sense, and hence that unknown or poorly characterized genes within these groups might be worth investigating further.
Invaluable insight into molecular biology has been and will be gained thanks to studies of the transcriptomes of different organisms in various conditions. However, the full picture seems to be accessible only with proteomics data, because of the number of regulatory steps that follow transcription, alternative splicing among them.
Theoretical analysis of alternative splice forms using computational methods
Understanding alternative splicing is now one of the greatest challenges in biology, because it is a genetic process much more important than was thought at the time of its discovery. In this paper, we explain how the different available databases and software tools can be used to start a large-scale investigation of alternative splice forms. To collect information about alternative splicing, we investigated known data in the databases using different computational methods. The investigations proceeded from genomic sequence data to structural protein data. We then interpreted those data to find the relationship between alternative splice forms and protein function and structure. We found some interesting features of alternative splicing, which are presented here. We discuss the results for one chosen example. They concern the coverage quality of the protein sequence of a known structure, an EST analysis, the validation of splice variants, the determination of the alternative splice type, and finally the link between alternative splicing and disease.
Recent amplification and impact of MITEs on the genome of grapevine (Vitis vinifera L.)
Miniature inverted-repeat transposable elements (MITEs) are a particular type of defective class II transposons present in genomes as highly homogeneous populations of small elements. Their high copy number and close association with genes make their potential impact on gene evolution particularly relevant. Here, we present a detailed analysis of the MITE families directly related to grapevine "cut-and-paste" transposons. Our results show that grapevine MITEs have transduplicated and amplified genomic sequences, including gene sequences and fragments of other mobile elements. Our results also show that, although some of the MITE families were already present in the ancestor of the European and American Vitis wild species, they have been amplified and have been actively transposing during grapevine domestication and breeding. We show that MITEs are abundant in grapevine and that some of them are frequently inserted within the untranslated regions of grapevine genes. MITE insertions are highly polymorphic among grapevine cultivars and frequently generate transcript variability. The data presented here show that MITEs have greatly contributed to the grapevine genetic diversity that has been used for grapevine domestication and breeding.