
A Re-Annotation of the Saccharomyces cerevisiae Genome

Author: A Ivens
Altschul
B. Barrell
Bairoch
Bairoch
Bateman
Berbee
Birney
Blandin
DeRisi
Dujon
Gaillardin
Goffeau
Hieter
K. M. Rutherford
Lowe
M-A Rajandream
Mackiewicz
Malpertuy
Mewes
Oliver
Oliver
Pearson
Rutherford
Sharp
Sonnhammer
Stoesser
V. Wood
Xiang
Zhang
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2001

Discrepancies in gene and orphan number indicated by previous analyses suggest that S. cerevisiae would benefit from a consistent re-annotation. In this analysis three new genes are identified and 46 alterations to gene coordinates are described. A total of 370 ORFs are classified as totally spurious and should be disregarded. At least a further 193 genes could be described as very hypothetical, based on a number of criteria. Disparate genes with sequence overlaps of more than ten amino acids (especially at the N-terminus) were found to be rare in both S. cerevisiae and Sz. pombe. A new S. cerevisiae gene number estimate with an upper limit of 5804 is proposed; after the removal of very hypothetical genes and pseudogenes this is reduced to 5570. Although this is likely to be closer to the true upper limit, it is still predicted to be an overestimate of gene number. A complete list of revised gene coordinates is available from the Sanger Centre (S. cerevisiae reannotation: ftp://ftp/pub/yeast/SCreannotation).

Megasatellites: a peculiar class of giant minisatellites in genes involved in cell adhesion and pathogenicity in Candida glabrata

Author: A. Thierry
Appelgren
B. Dujon
Bowen
C. Bouchier
Castaño
Crollius
Debrauwère
Dujon
Durrens
Ecker
Frieman
G.-F. Richard
Haber
Jeffreys
Jeffreys
Kobayashi
Kolpakov
Lopes
Malpertuy
Marck
Mathew
Paques
Richard
Rigden
Verstrepen
Welch
Zhang
Zupancic
Publication venue: Oxford University Press

Minisatellites are DNA tandem repeats that are found in all sequenced genomes. In the yeast Saccharomyces cerevisiae, they are frequently encountered in genes encoding cell wall proteins. Minisatellites present in the completely sequenced genome of the pathogenic yeast Candida glabrata were similarly analyzed, and two new types of minisatellites were discovered: minisatellites that are composed of two different intermingled repeats (called compound minisatellites), and minisatellites containing unusually long repeated motifs (126–429 bp). These long-repeat minisatellites may reach an unusual length for such elements (up to 10 kb). Due to these peculiar properties, they have been named ‘megasatellites’. They are found essentially in genes involved in cell–cell adhesion, and could therefore be involved in the ability of this opportunistic pathogen to colonize the human host. In addition to megasatellites, found in large paralogous gene families, there are 93 minisatellites with simple shorter motifs, comparable to those found in S. cerevisiae. In most cases, these minisatellites are not conserved between C. glabrata and S. cerevisiae, although their host genes are well conserved, raising the question of an active mechanism creating minisatellites de novo in hemiascomycetes.
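To make the notion of a tandem repeat concrete, the following is a minimal Python sketch that scans a sequence for perfect, back-to-back copies of a motif of a given length. It is an illustration only and not the method used in the paper: the function name `find_tandem_repeats`, its parameters, and the toy sequence are hypothetical, and real minisatellites usually contain imperfect copies that this sketch would miss.

```python
def find_tandem_repeats(seq, motif_len, min_copies=2):
    """Return (start, motif, copies) for perfect tandem repeats.

    Hypothetical helper for illustration: only exact, adjacent copies
    of a motif of length `motif_len` are reported.
    """
    hits = []
    i = 0
    while i + motif_len * min_copies <= len(seq):
        motif = seq[i:i + motif_len]
        copies = 1
        # Count how many times the motif is repeated immediately after itself.
        while seq[i + copies * motif_len:i + (copies + 1) * motif_len] == motif:
            copies += 1
        if copies >= min_copies:
            hits.append((i, motif, copies))
            i += copies * motif_len  # skip past the whole repeat array
        else:
            i += 1
    return hits


# Toy sequence (made up): three copies of "ACG" flanked by "TT".
dna = "TTACGACGACGTT"
repeats = find_tandem_repeats(dna, motif_len=3)
print(repeats)
```

For megasatellite-sized motifs one would call the same function with `motif_len` in the 126–429 bp range reported above, although a tool that tolerates mismatches between copies would be needed in practice.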

Unravelling the ORFan Puzzle

Author: Alimi
Alm
Andersson
Andrade
Aravind
Balasubramanian
Barabasi
Basrai
Bloom
Boucher
Brenner
Coulson
Doolittle
Doolittle
Doolittle
Dujon
Fischer
Fischer
Fischer
Fraser
Gardner
Goulding
Hayashi
Hirsh
Hurst
Hutchison
Huynen
Iliopoulos
Jain
Jordan
Jordan
Karev
Koonin
Kunin
Lawrence
Mackiewicz
Malpertuy
Mira
Mira
Monchois
Ochman
Pellegrini
Petrov
Qian
Rost
Schmid
Siew
Siew
Skovgaard
Unger
Vitkup
Wolf
Wolfe
Wood
Wren
Yanai
Zdobnov
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2003

ORFans are open reading frames (ORFs) with no detectable sequence similarity to any other sequence in the databases. Each newly sequenced genome contains a significant number of ORFans. Therefore, ORFans entail interesting evolutionary puzzles. However, little can be learned about them using bioinformatics tools, and their study seems to have been underemphasized. Here we present some of the questions that the existence of so many ORFans has raised and review some of the studies aimed at understanding ORFans, their functions and their origins. These works have demonstrated that ORFans are an untapped source of research, requiring further computational and experimental studies.

Elsevier - Publisher Connector

Genomic Exploration of the Hemiascomycetous Yeasts: 1. A set of yeast species for molecular evolution studies

Note: Sequences and annotations are accessible at Génoscope (http://www.genoscope.cns.fr), the FEBS Letters website (http://www.elsevier.nl/febs/show/) and Bordeaux (http://cbi.genopole-bordeaux.fr/Genolevures), and were deposited in the EMBL database (accession numbers AL392203 to AL441602).

Author: Aigle Michel
Artiguenave François
Blandin Gaëlle
Bolotin-Fukuhara Monique
Bon Elisabeth
Brottier Philippe
Casaregola Serge
de Montigny Jacky
Dujon Bernard
Durrens Pascal
Gaillardin Claude
Llorente Bertrand
Lépingle Andrée
Malpertuy Alain
Neuvéglise Cécile
Ozier-Kalogéropoulos Odile
Potier Serge
Saurin William
Souciet Jean-Luc
Tekaia Fredj
Toffano-Nioche Claire
Weissenbach Jean
Wincker Patrick
Wésolowski-Louvel Micheline
Publication venue: Federation of European Biochemical Societies. Published by Elsevier B.V.

The identification of molecular evolutionary mechanisms in eukaryotes is approached by a comparative genomics study of a homogeneous group of species classified as Hemiascomycetes. This group includes Saccharomyces cerevisiae, the first eukaryotic genome entirely sequenced, back in 1996. A random sequencing analysis has been performed on 13 different species sharing a small genome size and a low frequency of introns. Detailed information is provided in the 20 following papers. Additional tables available on websites describe the ca. 20 000 newly identified genes. This wealth of data, so far unique among eukaryotes, allowed us to examine the conservation of chromosome maps, to identify the ‘yeast-specific’ genes, and to review the distribution of gene families into functional classes. This project conducted by a network of seven French laboratories has been designated ‘Génolevures’.

Genomic Exploration of the Hemiascomycetous Yeasts: 19. Ascomycetes-specific genes

Author: Aigle Michel
Artiguenave Francois
Blandin Gaëlle
Bolotin-Fukuhara Monique
Bon Elisabeth
Brottier Philippe
Casarégola Serge
de Montigny Jacky
Dujon Bernard
Durrens Pascal
Gaillardin Claude
Llorente Bertrand
Lépingle Andrée
Malpertuy Alain
Neuvéglise Cécile
Ozier-Kalogeropoulos Odile
Potier Serge
Saurin William
Souciet Jean-Luc
Tekaia Fredj
Toffano-Nioche Claire
Weissenbach Jean
Wincker Patrick
Wésolowski-Louvel Micheline
Publication venue: Federation of European Biochemical Societies. Published by Elsevier B.V.
Publication date: 22/12/2000

Comparisons of the 6213 predicted Saccharomyces cerevisiae open reading frame (ORF) products with sequences from organisms of other biological phyla differentiate genes commonly conserved in evolution from ‘maverick’ genes which have no homologue in phyla other than the Ascomycetes. We show that a majority of the ‘maverick’ genes have homologues among other yeast species and thus define a set of 1892 genes that, from sequence comparisons, appear ‘Ascomycetes-specific’. We estimate, retrospectively, that the S. cerevisiae genome contains 5651 actual protein-coding genes, 50 of which were identified for the first time in this work, and that the present public databases contain 612 predicted ORFs that are not real genes. Interestingly, the sequences of the ‘Ascomycetes-specific’ genes tend to diverge more rapidly in evolution than those of other genes. Half of the ‘Ascomycetes-specific’ genes are functionally characterized in S. cerevisiae, and a few functional categories are over-represented in them.
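The gene counts quoted in this abstract are mutually consistent under one straightforward reading, sketched below in Python. The reading itself is an assumption, not something the abstract states outright: the 612 ORFs judged not to be real genes are taken out of the 6213 predicted ORFs, and the 50 newly identified genes are then added. The variable names are illustrative.

```python
# Figures quoted in the abstract.
predicted_orfs = 6213      # predicted S. cerevisiae ORF products
not_real_genes = 612       # predicted ORFs judged not to be real genes
newly_identified = 50      # genes first identified in this work

# Assumed relationship: remove the spurious ORFs, then add the new genes.
estimated_genes = predicted_orfs - not_real_genes + newly_identified
print(estimated_genes)  # 5651, matching the estimate quoted in the abstract
```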

Elsevier - Publisher Connector

Migraine et contraception orale [Migraine and oral contraception]

Author: DURAND Alain
MALPERTUY Karine
Publication date: 01/01/2002

AIX-MARSEILLE2-BU Pharmacie (130552105) / Sudoc, France

OpenGrey Repository

Influence of microarrays experiments missing values on the stability of gene groups by hierarchical clustering.

Author: de Brevern Alexandre
Hazout Serge
Malpertuy Alain
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2004

12 pages + sup. data. BACKGROUND: Microarray technologies produce large amounts of data. Hierarchical clustering is commonly used to identify clusters of co-expressed genes. However, microarray datasets often contain missing values (MVs), which are a major drawback for the use of clustering methods. Usually the MVs are left untreated, replaced by zero, or estimated by the k-Nearest Neighbor (kNN) approach. This paper studies the stability of gene clusters, defined by various hierarchical clustering algorithms, in microarray experiments with and without MVs. RESULTS: In this study, we show that MVs have important effects on the stability of gene clusters. Moreover, the magnitude of the gene misallocations depends on the aggregation algorithm. The most appropriate aggregation methods (e.g. complete-linkage and Ward) are highly sensitive to MVs, and surprisingly so even for a very small proportion of MVs (e.g. 1%). In most cases, the MVs must be replaced by expected values. Replacing MVs with the kNN approach clearly improves the identification of co-expressed gene clusters. Nevertheless, we observe that the kNN approach is less suitable for extreme values of gene expression. CONCLUSION: The presence of MVs (even at a low rate) is a major factor of gene cluster instability. In addition, the impact depends on the hierarchical clustering algorithm used. Some methods should be used carefully. Nevertheless, the kNN approach constitutes one efficient method for restoring missing gene expression values, with a low error level. Our study highlights the need for statistical treatment of microarray data to avoid misinterpretation.
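The kNN replacement of missing values mentioned in this abstract can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the implementation evaluated in the paper: the function name `knn_impute`, the choice of a root-mean-square distance over shared observed columns, the value of `k`, and the toy expression matrix are all hypothetical.

```python
import math


def knn_impute(matrix, k=2):
    """Fill None entries with the mean of the k nearest rows (genes).

    Distances are computed only over columns observed in both rows
    (an assumption of this sketch), and only rows that have a value in
    the missing column can serve as neighbours.
    """
    filled = [row[:] for row in matrix]  # leave the input untouched
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(matrix):
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                # Root-mean-square difference keeps rows with different
                # numbers of shared columns comparable.
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            if nearest:
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled


# Toy expression matrix (made up): rows are genes, columns are conditions.
expression = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],   # one missing value
    [0.9, 1.9, 2.9],
    [5.0, 6.0, 7.0],
]
imputed = knn_impute(expression, k=2)
print(imputed[1])
```

In this toy case the missing entry is filled from the two rows with similar profiles rather than from the distant fourth row, which is the behaviour the abstract relies on when it recommends kNN over replacement by zero.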

Springer - Publisher Connector

HAL-Inserm

arXiv.org e-Print Archive

Hal-Diderot

Global analysis of VHHs framework regions with a structural alphabet: VHH FRs structures

Author: de Brevern Alexandre
Malpertuy Alain
Noël Floriane
Publication venue: Elsevier BV
Publication date: 01/01/2016

VHHs are the antigen-binding domains of camelid heavy-chain antibodies (HCAbs). They have many interesting biotechnological and biomedical properties due to their small size, high solubility and stability, and high affinity and specificity for their antigens. HCAbs and classical IgGs are evolutionarily related and share a common fold. VHHs are composed of regions considered as constant, called the frameworks (FRs), connected by Complementarity Determining Regions (CDRs), highly variable regions that provide the interaction with the epitope. Until now, no systematic structural analysis had been performed on VHHs despite the significant number of available structures. This work is the first study to analyse the structural diversity of the FRs of VHHs. Using a structural alphabet that approximates the local conformation, we show that each of the four FRs does not have a unique structure but exhibits many structural variant patterns. Moreover, no direct simple link between the local conformational change and amino acid composition can be detected. These results indicate that long-range interactions affect the local conformation of FRs and impact the building of structural models.

HAL-Inserm

Hal-Diderot

Trends in IT Innovation to Build a Next Generation Bioinformatics Solution to Manage and Analyse Biological Big Data Produced by NGS Technologies

Author: Alain Malpertuy
Alexandre G. de Brevern
Cécile Fairhead
Cécile Neuvéglise
Jean-Philippe Meyniel
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2015

Sequencing the human genome began in 1994, and 10 years of work were necessary in order to provide a nearly complete sequence. Nowadays, NGS technologies allow sequencing of a whole human genome in a few days. This deluge of data challenges scientists in many ways, as they are faced with data management issues and analysis and visualization drawbacks due to the limitations of current bioinformatics tools. In this paper, we describe how the NGS Big Data revolution changes the way of managing and analysing data. We present how biologists are confronted with an abundance of methods, tools, and data formats. To overcome these problems, we focus on Big Data Information Technology innovations from web and business intelligence. We underline the interest of NoSQL databases, which are much more efficient than relational databases. Since Big Data leads to the loss of interactivity with data during analysis due to high processing time, we describe solutions from Business Intelligence that allow one to regain interactivity whatever the volume of data. We illustrate this point with a focus on the Amadea platform. Finally, we discuss visualization challenges posed by Big Data and present the latest innovations with JavaScript graphic libraries.