300 research outputs found
Gene expression drives the evolution of dominance.
Dominance is a fundamental concept in molecular genetics and has implications for understanding patterns of genetic variation, evolution, and complex traits. However, despite its importance, the degree of dominance in natural populations is poorly quantified. Here, we leverage multiple mating systems in natural populations of Arabidopsis to co-estimate the distribution of fitness effects and dominance coefficients of new amino acid-changing mutations. We find that more deleterious mutations are more likely to be recessive than less deleterious mutations. Further, this pattern holds across gene categories, but varies with the connectivity and expression patterns of genes. Our work argues that dominance arises as a consequence of the functional importance of genes and their optimal expression levels.
A first version of the Caenorhabditis elegans Promoterome
An important aspect of the development of systems biology approaches in metazoans is the characterization of expression patterns of nearly all genes predicted from genome sequences. Such localizome maps should provide information on where (in what cells or tissues) and when (at what stage of development or under what conditions) genes are expressed. They should also indicate in what cellular compartments the corresponding proteins are localized. Caenorhabditis elegans is particularly suited for the development of a localizome map since all its 959 adult somatic cells can be visualized by microscopy, and its cell lineage has been completely described. Here we address one of the challenges of C. elegans localizome mapping projects: that of obtaining a genome-wide resource of C. elegans promoters needed to generate transgenic animals expressing localization markers such as the green fluorescent protein (GFP). To ensure high flexibility for future uses, we utilized the newly developed MultiSite Gateway system. We generated and validated version 1.1 of the Promoterome: a resource of approximately 6000 C. elegans promoters. These promoters can be transferred easily into various Gateway Destination vectors to drive expression of markers such as GFP, alone (promoter::GFP constructs), or in fusion with protein-encoding open reading frames available in ORFeome resources (promoter::ORF::GFP).
Improved annotation of 3' untranslated regions and complex loci by combination of strand-specific direct RNA sequencing, RNA-seq and ESTs
The reference annotations made for a genome sequence provide the framework for all subsequent analyses of the genome. Correct annotation is particularly important when interpreting the results of RNA-seq experiments where short sequence reads are mapped against the genome and assigned to genes according to the annotation. Inconsistencies in annotations between the reference and the experimental system can lead to incorrect interpretation of the effect on RNA expression of an experimental treatment or mutation in the system under study. Until recently, the genome-wide annotation of 3-prime untranslated regions received less attention than coding regions and the delineation of intron/exon boundaries. In this paper, data produced for samples in Human, Chicken and A. thaliana by the novel single-molecule, strand-specific, Direct RNA Sequencing technology from Helicos Biosciences, which locates 3-prime polyadenylation sites to within +/- 2 nt, were combined with archival EST and RNA-Seq data. Nine examples are illustrated where this combination of data allowed: (1) gene and 3-prime UTR re-annotation (including extension of one 3-prime UTR by 5.9 kb); (2) disentangling of gene expression in complex regions; (3) clearer interpretation of small RNA expression and (4) identification of novel genes. While the specific examples displayed here may become obsolete as genome sequences and their annotations are refined, the principles laid out in this paper will be of general use both to those annotating genomes and those seeking to interpret existing publicly available annotations in the context of their own experimental data. Comment: 44 pages, 9 figures
Methods and strategies for gene structure curation in WormBase
The Caenorhabditis elegans genome sequence was published over a decade ago; this was the first published genome of a multi-cellular organism and now the WormBase project has had a decade of experience in curating this genome's sequence and gene structures. In one of its roles as a central repository for nematode biology, WormBase continues to refine the gene structure annotations using sequence similarity and other computational methods, as well as information from the literature and community-submitted annotations. We describe the various methods of gene structure curation that have been tried by WormBase and the problems associated with each of them. We also describe the current strategy for gene structure curation, and introduce the WormBase ‘curation tool’, which integrates different data sources in order to identify new and correct gene structures.
Human Gene and Protein Database (HGPD): a novel database presenting a large quantity of experiment-based results in human proteomics
Completion of human genome sequencing has greatly accelerated functional genomic research. Full-length cDNA clones are essential experimental tools for functional analysis of human genes. In one of the projects of the New Energy and Industrial Technology Development Organization (NEDO) in Japan, the full-length human cDNA sequencing project (FLJ project), nucleotide sequences of approximately 30 000 human cDNA clones have been analyzed. The Gateway system is a versatile framework to construct a variety of expression clones for various experiments. We have constructed 33 275 human Gateway entry clones from full-length cDNAs, representing to our knowledge the largest collection in the world. Utilizing these clones with a highly efficient cell-free protein synthesis system based on wheat germ extract, we have systematically and comprehensively produced and analyzed human proteins in vitro. Sequence information for both amino acids and nucleotides of open reading frames of cDNAs cloned into Gateway entry clones and in vitro expression data using those clones can be retrieved from the Human Gene and Protein Database (HGPD, http://www.HGPD.jp). HGPD is a unique database that stores the information of a set of human Gateway entry clones and protein expression data and helps the user to search the Gateway entry clones.
A systematic analysis of host factors reveals a Med23-interferon-λ regulatory axis against herpes simplex virus type 1 replication
Herpes simplex virus type 1 (HSV-1) is a neurotropic virus causing vesicular oral or genital skin lesions, meningitis and other diseases particularly harmful in immunocompromised individuals. To comprehensively investigate the complex interaction between HSV-1 and its host we combined two genome-scale screens for host factors (HFs) involved in virus replication. A yeast two-hybrid screen for protein interactions and an RNA interference (RNAi) screen with a druggable genome small interfering RNA (siRNA) library confirmed existing and identified novel HFs which functionally influence HSV-1 infection. Bioinformatic analyses found the 358 HFs were enriched for several pathways and multi-protein complexes. Of particular interest was the identification of Med23 as a strongly anti-viral component of the largely pro-viral Mediator complex, which links specific transcription factors to RNA polymerase II. The anti-viral effect of Med23 on HSV-1 replication was confirmed in gain-of-function gene overexpression experiments, and this inhibitory effect was specific to HSV-1, as a range of other viruses including Vaccinia virus and Semliki Forest virus were unaffected by Med23 depletion. We found Med23 significantly upregulated expression of the type III interferon family (IFN-λ) at the mRNA and protein level by directly interacting with the transcription factor IRF7. The synergistic effect of Med23 and IRF7 on IFN-λ induction suggests this is the major transcription factor for IFN-λ expression. Genotypic analysis of patients suffering recurrent orofacial HSV-1 outbreaks, previously shown to be deficient in IFN-λ secretion, found a significant correlation with a single nucleotide polymorphism in the IFN-λ3 (IL28b) promoter strongly linked to Hepatitis C disease and treatment outcome.
This paper describes a link between Med23 and IFN-λ, provides evidence for the crucial role of IFN-λ in HSV-1 immune control, and highlights the power of integrative genome-scale approaches to identify HFs critical for disease progression and outcome.
A Biomedically Enriched Collection of 7000 Human ORF Clones
We report the production and availability of over 7000 fully sequence-verified plasmid ORF clones representing over 3400 unique human genes. These ORF clones were derived using the human MGC collection as template and were produced in two formats: with and without stop codons. Thus, this collection supports the production of either native protein or proteins with fusion tags added to either or both ends. The template clones used to generate this collection were enriched in three ways. First, gene redundancy was removed. Second, clones were selected to represent the best available GenBank reference sequence. Finally, a literature-based software tool was used to evaluate the list of target genes to ensure that it broadly reflected biomedical research interests. The target gene list was compared with 4000 human diseases and over 8500 biological and chemical MeSH classes in ∼15 million publications recorded in PubMed at the time of analysis. The outcome of this analysis revealed that relative to the genome and the MGC collection, this collection is enriched for the presence of genes with published associations with a wide range of diseases and biomedical terms without displaying a particular bias towards any single disease or concept. Thus, this collection is likely to be a powerful resource for researchers who wish to study protein function in a set of genes with documented biomedical significance.
Arabidopsis thaliana SPF1 and SPF2 are nuclear-located ULP2-like SUMO proteases that act downstream of SIZ1 in plant development
Post-translational modifiers such as the small ubiquitin-like modifier (SUMO) peptide act as fast and reversible protein regulators. Functional characterization of the sumoylation machinery has determined the key regulatory role that SUMO plays in plant development. Unlike components of the SUMO conjugation pathway, SUMO proteases (ULPs) are encoded by a relatively large gene family and are potential sources of specificity within the pathway. This study reports a thorough comparative genomics and phylogenetic characterization of plant ULPs, revealing the presence of one ULP1-like and three ULP2-like SUMO protease subgroups within plant genomes. As representatives of an under-studied subgroup, Arabidopsis SPF1 and SPF2 were subjected to functional characterization. Loss-of-function mutants implicated both proteins in vegetative growth, flowering time, and seed size and yield. Mutants constitutively accumulated SUMO conjugates, and yeast complementation assays associated these proteins with the function of ScUlp2 but not ScUlp1. Fluorescence imaging placed both proteins in the plant cell nucleoplasm. Transcriptomics analysis indicated strong regulatory involvement in secondary metabolism, cell wall remodelling, and nitrate assimilation. Furthermore, developmental defects of the spf1-1 spf2-2 (spf1/2) double-mutant opposed those of the major E3 ligase siz1 mutant and, most significantly, developmental and transcriptomic characterization of the siz1 spf1/2 triple-mutant placed SIZ1 as epistatic to SPF1 and SPF2. We thank Mark Hochstrasser (Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, CT, USA) for kindly providing the ulp1-ts yeast mutant strain. This research was funded by FEDER (through COMPETE), and by Fundacao para a Ciencia e Tecnologia (FCT), within the scope of project SUMOdulator (FCOMP-01-0124-FEDER-028459 and PTDC/BIA-PLA/3850/2012). PHC was supported by FCT (SFRH/BD/44484/2008).
HA and FF were supported by Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (FEDER) (NORTE-01-0145-FEDER-000007 and Norte-01-0145-FEDER-000008, respectively). The work was supported by FEDER through the COMPETE 2020-Operacional Programme for Competitiveness and Internationalisation (POCI), Portugal 2020, and by Portuguese funds through FCT, within the framework of projects 'Rede de Investigacao em Biodiversidade e Biologia Evolutiva' (POCI-01-0145-FEDER-006821) and 'Institute for Research and Innovation in Health Sciences' (POCI-01-0145-FEDER-007274). This research was also supported by a grant from the Spanish Ministerio de Ciencia y Tecnologia (AGL2016-75819-C2-1-R) and FEDER (PCQ, AGC, ERB).
- …