138 research outputs found

A Biomedically Enriched Collection of 7000 Human ORF Clones

Author: A Baross
AE Witt
Andreas Hoerlein
Andreas Rolfs
Bernhard Korn
Binghua Shen
Craig DeLoughery
Daniel A. Jepson
Dietmar Hoffmann
Dongmei Zuo
DS Gerhard
E Pennisi
E Taycher
Elena Taycher
Fontina Kelley
G Temple
J Park
Jacob Raphael
JE Collins
JF Rual
Joseph Pearlberg
Joshua LaBaer
KD Pruitt
Lars Ebert
Munira M. A. Baqui
N Ramachandran
Niro Ramachandran
OJ Harrison
P De Los Rios
P Lamesch
R Staden
RL Strausberg
RS Hegde
S Haas
Seamus McCarron
Suzannah Rutherford
T Murthy
Y Hu
Yanhui Hu
Publication venue: Public Library of Science

We report the production and availability of over 7000 fully sequence-verified plasmid ORF clones representing over 3400 unique human genes. These ORF clones were derived using the human MGC collection as template and were produced in two formats: with and without stop codons. Thus, this collection supports the production of either native protein or proteins with fusion tags added to either or both ends. The template clones used to generate this collection were enriched in three ways. First, gene redundancy was removed. Second, clones were selected to represent the best available GenBank reference sequence. Finally, a literature-based software tool was used to evaluate the list of target genes to ensure that it broadly reflected biomedical research interests. The target gene list was compared with 4000 human diseases and over 8500 biological and chemical MeSH classes in ∼15 million publications recorded in PubMed at the time of analysis. This analysis revealed that, relative to the genome and the MGC collection, this collection is enriched for genes with published associations with a wide range of diseases and biomedical terms, without displaying a particular bias towards any single disease or concept. Thus, this collection is likely to be a powerful resource for researchers who wish to study protein function in a set of genes with documented biomedical significance.

Crossref

Directory of Open Access Journals

PubMed Central

Assessment of clusters of transcription factor binding sites in relationship to human promoter, CpG islands and gene expression

Author: A Wagner
AE Kel
B Lenhard
B Shea
BP Berman
DA Papatsenko
DS Prestridge
F Larsen
GD Stormo
GG Loots
JA Warrington
JM Claverie
K Quandt
KD Pruitt
L Ponger
LL Hsiao
M Gardiner-Garden
MC Frith
MI Arnone
MS Halfon
N Rajewsky
O Johansson
R Ihaka
RR Sokal
S Aerts
S Hannenhalli
S Levy
TD Schneider
V Matys
V Solovyev
W Krivan
WH Press
WJ Ewens
WJ Kent
WW Wasserman
Y Suzuki
Publication venue: BioMed Central
Publication date: 01/01/2004

BACKGROUND: Gene expression is regulated mainly by transcription factors (TFs) that interact with regulatory cis-elements on DNA sequences. To identify functional regulatory elements, computer searching can predict TF binding sites (TFBS) using position weight matrices (PWMs) that represent positional base frequencies of collected experimentally determined TFBS. A disadvantage of this approach is the large output of results for genomic DNA. One strategy to identify genuine TFBS is to utilize local concentrations of predicted TFBS. It is unclear whether there is a general tendency for TFBS to cluster at promoter regions, although this is the case for certain TFBS. It is also unclear which TFs have TFBS concentrated in promoters, and to what extent. This study aims to answer some of these questions. RESULTS: We developed the cluster score measure to evaluate the correlation between predicted TFBS clusters and promoter sequences for each PWM. Non-promoter sequences were used as a control. Using the cluster score, we identified a PWM group called PWM-PCP, in which TFBS clusters positively correlate with promoters, and another PWM group called PWM-NCP, in which TFBS clusters negatively correlate with promoters. The PWM-PCP group comprises 47% of the 199 vertebrate PWMs, while the PWM-NCP group comprises 11%. After reducing the effect of CpG islands (CGI) on the clusters using partial correlation coefficients among three properties (promoter, CGI and predicted TFBS cluster), we identified two PWM groups: those strongly correlated with CGI and those not correlated with CGI. CONCLUSION: Not all PWMs predict TFBS correlated with human promoter sequences. Two main PWM groups were identified: (1) those that show TFBS clustered in promoters associated with CGI, and (2) those that show TFBS clustered in promoters independent of CGI. Assessment of PWM matches will allow more positive interpretation of TFBS in regulatory regions.
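As an illustration of the PWM-based site prediction this abstract refers to, the following is a minimal Python sketch of log-odds scoring and scanning. It is not the authors' method or their cluster score: the count matrix, the uniform background, the pseudocount and the threshold are all hypothetical values chosen for the example.

```python
import math

# Hypothetical count matrix for a 4-bp motif (one dict per position).
# These counts are invented for illustration, not taken from the paper.
COUNTS = [
    {"A": 1, "C": 0, "G": 0, "T": 9},
    {"A": 9, "C": 0, "G": 0, "T": 1},
    {"A": 0, "C": 1, "G": 0, "T": 9},
    {"A": 8, "C": 1, "G": 1, "T": 0},
]

# Assumed uniform background base frequencies.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def build_pwm(counts, background, pseudocount=1.0):
    """Convert per-position base counts into log2-odds weights."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + pseudocount * len(column)
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background[base])
            for base in column
        })
    return pwm


def score_site(pwm, site):
    """Sum the position-specific weights for one candidate site."""
    return sum(column[base] for column, base in zip(pwm, site))


def scan(pwm, sequence, threshold):
    """Return (start, window, score) for every window scoring >= threshold.

    Assumes an upper-case sequence containing only A, C, G and T.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = score_site(pwm, window)
        if score >= threshold:
            hits.append((start, window, score))
    return hits


pwm = build_pwm(COUNTS, BACKGROUND)
hits = scan(pwm, "GGTATACC", threshold=4.0)
```

Scanning long genomic sequences this way yields many matches, which is the "large output" problem the abstract mentions; counting hits within a sliding region would be one simple way to look at local concentrations of predicted sites.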

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

ChromaSig: A Probabilistic Approach to Finding Common Chromatin Signatures in the Human Genome

Author: A Barski
A Siepel
AP Boyle
B Ren
BE Bernstein
Bing Ren
CB Millar
CFGA Benner
CL Liu
CL Wei
D Cimini
D Nathan
DK Pokholok
DS Johnson
E Segal
Gary Hon
GE Crawford
H Wang
H Xi
J Kim
KD Pruitt
M Blanchette
M Zheng
ND Heintzman
PA Grant
RJ Sims 3rd
S Bergink
SB Montgomery
T Jenuwein
TH Kim
TY Roh
Uwe Ohler
VR Iyer
W Wang
WE Johnson
Wei Wang
Y Qi
Publication venue: Public Library of Science
Publication date: 01/10/2008

Computational methods to identify functional genomic elements using genetic information have been very successful in determining gene structure and in identifying a handful of cis-regulatory elements. But the vast majority of regulatory elements have yet to be discovered, and it has become increasingly apparent that their discovery will not come from using genetic information alone. Recently, high-throughput technologies have enabled the creation of information-rich epigenetic maps, most notably for histone modifications. However, tools that search for functional elements using this epigenetic information have been lacking. Here, we describe an unsupervised learning method called ChromaSig to find, in an unbiased fashion, commonly occurring chromatin signatures in both tiling microarray and sequencing data. Applying this algorithm to nine chromatin marks across a 1% sampling of the human genome in HeLa cells, we recover eight clusters of distinct chromatin signatures, five of which correspond to known patterns associated with transcriptional promoters and enhancers. Interestingly, we observe that the distinct chromatin signatures found at enhancers mark distinct functional classes of enhancers in terms of transcription factor and coactivator binding. In addition, we identify three clusters of novel chromatin signatures that contain evolutionarily conserved sequences and potential cis-regulatory elements. Applying ChromaSig to a panel of 21 chromatin marks mapped genome-wide by ChIP-Seq reveals 16 classes of genomic elements marked by distinct chromatin signatures. Interestingly, four classes containing enrichment for repressive histone modifications appear to be locally heterochromatic sites and are enriched in quickly evolving regions of the genome. The utility of this approach in uncovering novel, functionally significant genomic elements will aid future efforts of genome annotation via chromatin modifications.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Discovery and characterization of chromatin states for systematic annotation of the human genome

Author: A Barski
A Siepel
AI Su
AP Boyle
B Schuettengruber
BD Strahl
BE Bernstein
C Zang
D Karolchik
DA Benson
DE Schones
DF Gudbjartsson
DS Johnson
G Hon
H O'Geen
J Ernst
Jason Ernst
K Cui
KD Pruitt
KJ Won
L Guelen
L Jia
M Guttman
Manolis Kellis
N Day
ND Heintzman
P Carninci
P Kheradpour
P Kolasinska-Zwierz
R Andersson
RE Thurman
RM Neal
S Schwartz
SE Celniker
SL Schreiber
SP Sripathy
T Kouzarides
TS Furey
W Miller
WJ Kent
X Wang
Y Zhang
Z Wang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

A plethora of epigenetic modifications have been described in the human genome and shown to play diverse roles in gene regulation, cellular differentiation and the onset of disease. Although individual modifications have been linked to the activity levels of various genetic functional elements, their combinatorial patterns are still unresolved and their potential for systematic de novo genome annotation remains untapped. Here, we use a multivariate Hidden Markov Model to reveal 'chromatin states' in human T cells, based on recurrent and spatially coherent combinations of chromatin marks. We define 51 distinct chromatin states, including promoter-associated, transcription-associated, active intergenic, large-scale repressed and repeat-associated states. Each chromatin state shows specific enrichments in functional annotations, sequence motifs and specific experimentally observed characteristics, suggesting distinct biological roles. This approach provides a complementary functional annotation of the human genome that reveals the genome-wide locations of diverse classes of epigenetic function.

National Science Foundation (U.S.) (Award 0905968); National Human Genome Research Institute (U.S.) (Award U54-HG004570); National Human Genome Research Institute (U.S.) (Award RC1-HG005334)

eScholarship - University of California

Sequence-Based Prediction of Type III Secreted Proteins

The type III secretion system (TTSS) is a key mechanism for host cell interaction used by a variety of bacterial pathogens and symbionts of plants and animals including humans. The TTSS represents a molecular syringe with which the bacteria deliver effector proteins directly into the host cell cytosol. Despite the importance of the TTSS for bacterial pathogenesis, recognition and targeting of type III secreted proteins has until now been poorly understood. Several hypotheses are discussed, including an mRNA-based signal, a chaperone-mediated process, or an N-terminal signal peptide. In this study, we systematically analyzed the amino acid composition and secondary structure of N-termini of 100 experimentally verified effector proteins. Based on this, we developed a machine-learning approach for the prediction of TTSS effector proteins, taking into account N-terminal sequence features such as frequencies of amino acids, short peptides, or residues with certain physico-chemical properties. The resulting computational model revealed a strong type III secretion signal in the N-terminus that can be used to detect effectors with sensitivity of ∼71% and selectivity of ∼85%. This signal seems to be taxonomically universal and conserved among animal pathogens and plant symbionts, since we could successfully detect effector proteins if the respective group was excluded from training. The application of our prediction approach to 739 complete bacterial and archaeal genome sequences resulted in the identification of between 0% and 12% putative TTSS effector proteins. Comparison of effector proteins with orthologs that are not secreted by the TTSS showed no clear pattern of signal acquisition by fusion, suggesting convergent evolutionary processes shaping the type III secretion signal. The newly developed program EffectiveT3 (http://www.chlamydiaedb.org) is the first universal in silico prediction program for the identification of novel TTSS effectors.
Our findings will facilitate further studies on and improve our understanding of type III secretion and its role in pathogen–host interactions.

Crossref

University of Birmingham Research Portal

Directory of Open Access Journals

PubMed Central

Permanent Hosting, Archiving and Indexing of Digital Resources and Assets

PuSH

Gene Characterization Index: Assessing the Depth of Gene Annotation

Author: B Lenhard
Boris Lenhard
CH Wu
Claes Wahlestedt
CN Connolly
D Maglott
DA Benson
Danielle Kemmer
DB Resnik
Dimas Yusuf
DS Wishart
E Birney
E Camon
E Pennisi
F Horn
H Frohlich
H Gunes
HM Berman
IG Schulman
JF Rual
JH Friedman
JH Oh
Jochen Brumm
JP Overington
Juan Valcarcel
KD Pruitt
M Ashburner
M Donizelli
M Kanehisa
MA Bogue
MA Heller
ME Curran
N Hulo
NJ Mulder
P Resnik
Raf M. Podowski
RG Adler
S Mizielinska
T Mashimo
TJ Hubbard
TK Attwood
V Gewin
V Vapnik
Warren Cheung
Wyeth W. Wasserman
Y Eisenthal
Publication venue: Public Library of Science
Publication date: 01/01/2008

We introduce the Gene Characterization Index, a bioinformatics method for scoring the extent to which a protein-encoding gene is functionally described. Inherently a reflection of human perception, the Gene Characterization Index is applied for assessing the characterization status of individual genes, thus serving the advancement of both genome annotation and applied genomics research by rapid and unbiased identification of groups of uncharacterized genes for diverse applications such as directed functional studies and delineation of novel drug targets. The scoring procedure is based on a global survey of researchers, who assigned characterization scores from 1 (poor) to 10 (extensive) for a sample of genes based on major online resources. By evaluating the survey as training data, we developed a bioinformatics procedure to assign gene characterization scores to all genes in the human genome. We analyzed snapshots of functional genome annotation over a period of 6 years to assess temporal changes reflected by the increase of the average Gene Characterization Index. Applying the Gene Characterization Index to genes within pharmaceutically relevant classes, we confirmed known drug targets as high-scoring genes and revealed potentially interesting novel targets with low characterization indexes. Removing known drug targets and genes linked to sequence-related patent filings from the entirety of indexed genes, we identified sets of low-scoring genes particularly suited for further experimental investigation. The Gene Characterization Index is intended to serve as a tool to the scientific community and granting agencies for focusing resources and efforts on unexplored areas of the genome. The Gene Characterization Index is available from http://cisreg.ca/gci/

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

University of Miami: Scholarship Miami

Adaptive Evolution in Zinc Finger Transcription Factors

Author: A Bateman
AJ Williams
AT Hamilton
B Gebelein
C Looman
C Underhill
CA Kim
CJ Krebs
CT Workman
D Lugtenberg
D Schmidt
DC Schultz
DS Wuttke
EJ Bellefroid
ES Lander
GE Crooks
GN Filipova
H Peng
James H. Thomas
JC Venter
JD Thompson
JF Margolin
JR Friedman
K Ayyanathan
K Ishihara
KD Pruitt
L Medugno
LC Edelstein
M Anisimova
M Elrod-Erickson
M Horiba
M Shannon
M Wiznerowicz
MJ Garcia-Garcia
MP Foster
O Albagli
O Lespinet
P Dehal
R Urrutia
R Witzgall
RF Ryan
Ryan O. Emerson
S Huntley
S Iuchi
SA Sawyer
SA Shoichet
SB Cannon
SB Carroll
Simon Myers
SP Sripathy
SR Eddy
T Kleefstra
TL Sander
VJ Bardwell
WSW Wong
Y Agata
Y Choo
Z Birtle
Z Yang
ZX Wang
Publication venue: Public Library of Science
Publication date: 01/01/2009

The majority of human genes are conserved among mammals, but some gene families have undergone extensive expansion in particular lineages. Here, we present an evolutionary analysis of one such gene family, the poly–zinc-finger (poly-ZF) genes. The human genome encodes approximately 700 members of the poly-ZF family of putative transcriptional repressors, many of which have associated KRAB, SCAN, or BTB domains. Analysis of the gene family across the tree of life indicates that the gene family arose from a small ancestral group of eukaryotic zinc-finger transcription factors through many repeated gene duplications accompanied by functional divergence. The ancestral gene family has probably expanded independently in several lineages, including mammals and some fishes. Investigation of adaptive evolution among recent paralogs using dN/dS analysis indicates that a major component of the selective pressure acting on these genes has been positive selection to change their DNA-binding specificity. These results suggest that the poly-ZF genes are a major source of new transcriptional repression activity in humans and other primates

Crossref

Directory of Open Access Journals

PubMed Central

Inositol Hexakisphosphate-Induced Autoprocessing of Large Bacterial Protein Toxins

Author: A Eichinger
A Shen
AA Kembhavi
AL Cohen
AT Ma
B Geissler
B Henriques
C Busch
C von Eichel-Streiber
CL Cordero
CP Samlaska
CT Lee
D Lyras
DJ Buttle
DS Kudryashov
E Duchaud
G Pfeifer
Glenn F. Rall
H Barth
I Just
J Pei
J Reineke
J Sanchez
JA Young
JG Bartlett
JH Lee
K Prochazkova
K Sandvig
Karla J. F. Satchell
KJ Chung
KJ Fullner
KJ Satchell
KL Sheahan
KP Wilson
L Li
LA Barroso
M Egerer
M Fischer
M Liu
M Qa'Dan
M Rupnik
Martina Egerer
MM Pearson
MR Baldwin
ND Rawlings
NR Thomson
P Wilkinson
PJ Lupardus
R Seshadri
RF Irvine
RN Pruitt
RP Miech
S Johnson
T Giesemann
V Olivier
VM Gordon
W Lin
Y Belyi
YR Kim
Publication venue: Public Library of Science
Publication date: 01/07/2010

Large bacterial protein toxins autotranslocate functional effector domains to the eukaryotic cell cytosol, resulting in alterations to cellular functions that ultimately benefit the infecting pathogen. Among these toxins, the clostridial glucosylating toxins (CGTs) produced by Gram-positive bacteria and the multifunctional-autoprocessing RTX (MARTX) toxins of Gram-negative bacteria have distinct mechanisms for effector translocation, but a shared mechanism of post-translocation autoprocessing that releases these functional domains from the large holotoxins. These toxins carry an embedded cysteine protease domain (CPD) that is activated for autoprocessing by binding inositol hexakisphosphate (InsP6), a molecule found exclusively in eukaryotic cells. Thus, InsP6-induced autoprocessing represents a unique mechanism for toxin effector delivery specifically within the target cell. This review summarizes recent studies of the structural and molecular events for activation of autoprocessing for both CGT and MARTX toxins, demonstrating both similar and potentially distinct aspects of autoprocessing among the toxins that utilize this method of activation and effector delivery

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Cross-Species Analyses Identify the BNIP-2 and Cdc42GAP Homology (BCH) Domain as a Distinct Functional Subclass of the CRAL_TRIO/Sec14 Superfamily

Author: A Berken
A Fedorov
A Hall
A Roy
A Schaefer
A Stocker
Anjali Bansal Gupta
B Sha
BC Low
BL Lua
Boon Chuan Low
C Cole
C Panagabko
CA Valencia
CJ Mousley
D Chivian
D-F Feng
DE Kim
DM Pirone
DS Marks
G Schaaf
G Wu
GA Tuskan
GB Scott
GP Wagner
I D'angelo
I Letunic
J-S Kang
J-U Hwang
JM Bomar
JP Buschdorf
K Ouahchi
K Saito
KC Min
KD Pruitt
L Aravind
L Cavalier
L Holm
Liang En Wee
M Tassabehji
MA Larkin
Michael Hortsch
MM Ryan
MX Gu
N Saitou
RA Laskowski
S Etienne-Manneville
S Welti
SE Phillips
UJK Soh
VA Bankaitis
Vladimir N. Uversky
W Li
W-G Qiu
X He
X Shang
Y Zhang
Yi Ting Zhou
YT Zhou
Publication venue: Public Library of Science
Publication date: 27/03/2012

The CRAL_TRIO protein domain, which is unique to the Sec14 protein superfamily, binds to a diverse set of small lipophilic ligands. Similar domains are found in a range of different proteins including neurofibromatosis type-1, a Ras GTPase-activating Protein (RasGAP) and Rho guanine nucleotide exchange factors (RhoGEFs). Proteins containing this structural protein domain exhibit a low sequence similarity and ligand specificity while maintaining an overall characteristic three-dimensional structure. We have previously demonstrated that the BNIP-2 and Cdc42GAP Homology (BCH) protein domain, which shares a low sequence homology with the CRAL_TRIO domain, can serve as a regulatory scaffold that binds to Rho, RhoGEFs and RhoGAPs to control various cell signalling processes. In this work, we investigate 175 BCH domain-containing proteins from a wide range of different organisms. A phylogenetic analysis with ∼100 CRAL_TRIO and similar domains from eight representative species indicates a clear distinction of BCH-containing proteins as a novel subclass within the CRAL_TRIO/Sec14 superfamily. BCH-containing proteins contain a hallmark sequence motif R(R/K)h(R/K)(R/K)NL(R/K)xhhhhHPs (‘h’ is a large, hydrophobic residue and ‘s’ is a small, weakly polar residue) and can be further subdivided into three unique subtypes associated with BNIP-2-N, macro- and RhoGAP-type protein domains. A previously unknown group of genes encoding ‘BCH-only’ domains is also identified in plants and arthropod species. Based on an analysis of their gene structure and their protein domain context, we hypothesize that BCH domain-containing genes evolved through gene duplication, intron insertions and domain swapping events. Furthermore, we explore the point of divergence between BCH and CRAL_TRIO proteins in relation to their ability to bind small GTPases, GAPs and GEFs and lipid ligands.
Our study suggests a need for a more extensive analysis of previously uncharacterized BCH, ‘BCH-like’ and CRAL_TRIO-containing proteins and their significance in regulating signaling events involving small GTPases.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

ScholarBank@NUS

FigShare

Mutation of the Zebrafish Nucleoporin elys Sensitizes Tissue Progenitors to Replication Stress

The recessive lethal mutation flotte lotte (flo) disrupts development of the zebrafish digestive system and other tissues. We show that flo encodes the ortholog of Mel-28/Elys, a highly conserved gene that has been shown to be required for nuclear integrity in worms and nuclear pore complex (NPC) assembly in amphibian and mammalian cells. Maternal elys expression sustains zebrafish flo mutants to larval stages when cells in proliferative tissues that lack nuclear pores undergo cell cycle arrest and apoptosis. p53 mutation rescues apoptosis in the flo retina and optic tectum, but not in the intestine, where the checkpoint kinase Chk2 is activated. Chk2 inhibition and replication stress induced by DNA synthesis inhibitors were lethal to flo larvae. By contrast, flo mutants were not sensitized to agents that cause DNA double strand breaks, thus showing that loss of Elys disrupts responses to selected replication inhibitors. Elys binds Mcm2-7 complexes derived from Xenopus egg extracts. Mutation of elys reduced chromatin binding of Mcm2, but not binding of Mcm3 or Mcm4 in the flo intestine. These in vivo data indicate a role for Elys in Mcm2-chromatin interactions. Furthermore, they support a recently proposed model in which replication origins licensed by excess Mcm2-7 are required for the survival of human cells exposed to replication stress

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central