Coding limits on the number of transcription factors
Transcription factor proteins bind specific DNA sequences to control the
expression of genes. They contain DNA binding domains which belong to several
super-families, each with a specific mechanism of DNA binding. The total number
of transcription factors encoded in a genome increases with the number of genes
in the genome. Here, we examined the number of transcription factors from each
super-family in diverse organisms.
We find that the number of transcription factors from most super-families
appears to be bounded. For example, the number of winged helix factors does not
generally exceed 300, even in very large genomes. The magnitude of the maximal
number of transcription factors from each super-family seems to correlate with
the number of DNA bases effectively recognized by the binding mechanism of that
super-family. Coding theory predicts that such upper bounds on the number of
transcription factors should exist, in order to minimize cross-binding errors
between transcription factors. This theory further predicts that factors with
similar binding sequences should tend to have similar biological effect, so
that errors based on mis-recognition are minimal. We present evidence that
transcription factors with similar binding sequences tend to regulate genes
with similar biological functions, supporting this prediction.
The present study suggests limits on the transcription factor repertoire of
cells, and suggests coding constraints that might apply more generally to the
mapping between binding sites and biological function.
Comment: http://www.weizmann.ac.il/complex/tlusty/papers/BMCGenomics2006.pdf
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1590034/
http://www.biomedcentral.com/1471-2164/7/23
Methods for protein complex prediction and their contributions towards understanding the organization, function and dynamics of complexes
Complexes of physically interacting proteins constitute fundamental
functional units responsible for driving biological processes within cells. A
faithful reconstruction of the entire set of complexes is therefore essential
to understand the functional organization of cells. In this review, we discuss
the key contributions of computational methods developed to date
(approximately between 2003 and 2015) for identifying complexes from the
network of interacting proteins (PPI network). We evaluate in depth the
performance of these methods on PPI datasets from yeast, and highlight
challenges faced by these methods, in particular detection of sparse and small
or sub-complexes and discerning of overlapping complexes. We describe methods
for integrating diverse information including expression profiles and 3D
structures of proteins with PPI networks to understand the dynamics of complex
formation, for instance, of time-based assembly of complex subunits and
formation of fuzzy complexes from intrinsically disordered proteins. Finally,
we discuss methods for identifying dysfunctional complexes in human diseases,
an application that is proving invaluable to understand disease mechanisms and
to discover novel therapeutic targets. We hope this review aptly commemorates a
decade of research on computational prediction of complexes and constitutes a
valuable reference for further advancements in this exciting area.
Comment: 1 Tabl
Transcriptional Regulation: a Genomic Overview
The availability of the Arabidopsis thaliana genome sequence allows a comprehensive analysis of transcriptional regulation in plants using novel genomic approaches and methodologies. Such a genomic view of transcription first necessitates the compilation of lists of elements. Transcription factors are the most numerous of the different types of proteins involved in transcription in eukaryotes, and the Arabidopsis genome codes for more than 1,500 of them, or approximately 6% of its total number of genes. A genome-wide comparison of transcription factors across the three eukaryotic kingdoms reveals the evolutionary generation of diversity in the components of the regulatory machinery of transcription. However, as illustrated by Arabidopsis, transcription in plants follows similar basic principles and logic to those in animals and fungi. A global view and understanding of transcription at a cellular and organismal level requires the characterization of the Arabidopsis transcriptome and promoterome, as well as of the interactome, the localizome, and the phenome of the proteins involved in transcription.
Hot and crispy: CRISPR-Cas systems in the hyperthermophile Sulfolobus solfataricus
The CRISPR (clustered regularly interspaced short palindromic repeats) and Cas (CRISPR-associated) genes are widely spread in bacteria and archaea, representing an intracellular defence system against invading viruses and plasmids. In the system, fragments from foreign DNA are captured and integrated into the host genome at the CRISPR locus. The locus is transcribed and the resulting RNAs are processed by Cas6 into small crRNAs (CRISPR RNAs) that guide a variety of effector complexes to degrade the invading genetic elements. Many bacteria and archaea have one major type of effector complex. However, Sulfolobus solfataricus strain P2 has six CRISPR loci with two families of repeats, four cas6 genes and three different types of effector complex. These features make S. solfataricus an important model for studying CRISPR-Cas systems. In the present article, we review our current understanding of crRNA biogenesis and its effector complexes, subtype I-A and subtype III-B, in S. solfataricus. We also discuss the differences in terms of mechanisms between the subtype III-B systems in S. solfataricus and Pyrococcus furiosus.
Structural Prediction of Protein–Protein Interactions by Docking: Application to Biomedical Problems
A huge amount of genetic information is available thanks to the recent advances in sequencing technologies and the larger computational capabilities, but the interpretation of such genetic data at phenotypic level remains elusive. One of the reasons is that proteins are not acting alone, but are specifically interacting with other proteins and biomolecules, forming intricate interaction networks that are essential for the majority of cell processes and pathological conditions. Thus, characterizing such interaction networks is an important step in understanding how information flows from gene to phenotype. Indeed, structural characterization of protein–protein interactions at atomic resolution has many applications in biomedicine, from diagnosis and vaccine design, to drug discovery. However, despite the advances of experimental structural determination, the number of interactions for which there is available structural data is still very small. In this context, a complementary approach is computational modeling of protein interactions by docking, which is usually composed of two major phases: (i) sampling of the possible binding modes between the interacting molecules and (ii) scoring for the identification of the correct orientations. In addition, prediction of interface and hot-spot residues is very useful in order to guide and interpret mutagenesis experiments, as well as to understand functional and mechanistic aspects of the interaction. Computational docking is already being applied to specific biomedical problems within the context of personalized medicine, for instance, helping to interpret pathological mutations involved in protein–protein interactions, or providing modeled structural data for drug discovery targeting protein–protein interactions.
Spanish Ministry of Economy grant number BIO2016-79960-R; D.B.B. is supported by a
predoctoral fellowship from CONACyT; M.R. is supported by an FPI fellowship from the
Severo Ochoa program. We are grateful to the Joint BSC-CRG-IRB Programme in
Computational Biology.
The mRNA-bound proteome of the human malaria parasite Plasmodium falciparum.
Background: Gene expression is controlled at multiple levels, including transcription, stability, translation, and degradation. Over the years, it has become apparent that Plasmodium falciparum exerts limited transcriptional control of gene expression, while at least part of Plasmodium's genome is controlled by post-transcriptional mechanisms. To generate insights into the mechanisms that regulate gene expression at the post-transcriptional level, we undertook complementary computational, comparative genomics, and experimental approaches to identify and characterize mRNA-binding proteins (mRBPs) in P. falciparum.
Results: Close to 1000 RNA-binding proteins are identified by hidden Markov model searches, of which mRBPs encompass a relatively large proportion of the parasite proteome as compared to other eukaryotes. Several abundant mRNA-binding domains are enriched in apicomplexan parasites, while strong depletion of mRNA-binding domains involved in RNA degradation is observed. Next, we experimentally capture 199 proteins that interact with mRNA during the blood stages, 64 of them with high confidence. These captured mRBPs show a significant overlap with the in silico identified candidate RBPs (p < 0.0001). Among the experimentally validated mRBPs are many known translational regulators active in other stages of the parasite's life cycle, such as DOZI, CITH, PfCELF2, Musashi, and PfAlba1-4. Finally, we also detect several proteins with an RNA-binding domain abundant in Apicomplexans (RAP domain) that is almost exclusively found in apicomplexan parasites.
Conclusions: Collectively, our results provide the most complete comparative genomics and experimental analysis of mRBPs in P. falciparum. A better understanding of these regulatory proteins will not only give insight into the intricate parasite life cycle but may also provide targets for novel therapeutic strategies.
The RNA Polymerase II Core Promoter in Drosophila.
Transcription by RNA polymerase II initiates at the core promoter, which is sometimes referred to as the "gateway to transcription." Here, we describe the properties of the RNA polymerase II core promoter in Drosophila. The core promoter is at a strategic position in the expression of genes, as it is the site of convergence of the signals that lead to transcriptional activation. Importantly, core promoters are diverse in terms of their structure and function. They are composed of various combinations of sequence motifs such as the TATA box, initiator (Inr), and downstream core promoter element (DPE). Different types of core promoters are transcribed via distinct mechanisms. Moreover, some transcriptional enhancers exhibit specificity for particular types of core promoters. These findings indicate that the core promoter is a central component of the transcriptional apparatus that regulates gene expression.