
    Germline-encoded neutralization of a Staphylococcus aureus virulence factor by the human antibody repertoire.

    Staphylococcus aureus is both an important pathogen and a human commensal. To explore this ambivalent relationship between host and microbe, we analysed the memory humoral response against IsdB, a protein involved in iron acquisition, in four healthy donors. Here we show that in all donors a heavily biased use of two immunoglobulin heavy chain germlines generated high-affinity (pM) antibodies that neutralize the two IsdB NEAT domains, IGHV4-39 for NEAT1 and IGHV1-69 for NEAT2. In contrast to typical antibody/antigen interactions, the binding is primarily driven by the germline-encoded hydrophobic CDRH-2 motifs of IGHV1-69 and IGHV4-39, with a binding mechanism nearly identical for each antibody derived from different donors. Our results suggest that IGHV1-69 and IGHV4-39, while part of the adaptive immune system, may have evolved under selection pressure to encode a binding motif innately capable of recognizing and neutralizing a structurally conserved protein domain involved in pathogen iron acquisition.

    Regulatory motif discovery using a population clustering evolutionary algorithm

    This paper describes a novel evolutionary algorithm for regulatory motif discovery in DNA promoter sequences. The algorithm uses data clustering to logically distribute the evolving population across the search space. Mating then takes place within local regions of the population, promoting overall solution diversity and encouraging discovery of multiple solutions. Experiments using synthetic data sets have demonstrated the algorithm's capacity to find position frequency matrix models of known regulatory motifs in relatively long promoter sequences. These experiments have also shown the algorithm's ability to maintain diversity during search and discover multiple motifs within a single population. The utility of the algorithm for discovering motifs in real biological data is demonstrated by its ability to find meaningful motifs within muscle-specific regulatory sequences.
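
The abstract above refers to position frequency matrix models of motifs. As a rough illustration only, the sketch below builds such a matrix from a few aligned sites and scans a sequence for the best-scoring window. It is not the paper's evolutionary algorithm; the function names, the pseudocount, and the uniform background of 0.25 are assumptions made for this example.

```python
import math
from collections import Counter


def position_frequency_matrix(sites, alphabet="ACGT"):
    """Count how often each base occurs at each position of aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        matrix.append({base: counts.get(base, 0) for base in alphabet})
    return matrix


def log_odds_score(pfm, window, pseudocount=1.0, background=0.25):
    """Sum of per-position log2(frequency / background); pseudocount is an assumption."""
    score = 0.0
    for column, base in zip(pfm, window):
        total = sum(column.values()) + pseudocount * len(column)
        freq = (column[base] + pseudocount) / total
        score += math.log2(freq / background)
    return score


def best_window(pfm, sequence):
    """Return (start, score) of the highest-scoring window in the sequence."""
    width = len(pfm)
    scored = [
        (log_odds_score(pfm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    score, start = max(scored)
    return start, score


if __name__ == "__main__":
    pfm = position_frequency_matrix(["TATA", "TATA", "TACA"])
    print(best_window(pfm, "GGTATAGG"))
```

A motif-discovery method would search over candidate matrices; here the matrix is simply given, and only the scoring step is shown.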

    CATHEDRAL: A Fast and Effective Algorithm to Predict Folds and Domain Boundaries from Multidomain Protein Structures

    We present CATHEDRAL, an iterative protocol for determining the location of previously observed protein folds in novel multidomain protein structures. CATHEDRAL builds on the features of a fast secondary-structure–based method (using graph theory) to locate known folds within a multidomain context and a residue-based, double-dynamic programming algorithm, which is used to align members of the target fold groups against the query protein structure to identify the closest relative and assign domain boundaries. To increase the fidelity of the assignments, a support vector machine is used to provide an optimal scoring scheme. Once a domain is verified, it is excised, and the search protocol is repeated in an iterative fashion until all recognisable domains have been identified. We have performed an initial benchmark of CATHEDRAL against other publicly available structure comparison methods using a consensus dataset of domains derived from the CATH and SCOP domain classifications. CATHEDRAL shows superior performance in fold recognition and alignment accuracy when compared with many equivalent methods. If a novel multidomain structure contains a known fold, CATHEDRAL will locate it in 90% of cases, with <1% false positives. For nearly 80% of assigned domains in a manually validated test set, the boundaries were correctly delineated within a tolerance of ten residues. For the remaining cases, previously classified domains were very remotely related to the query chain so that embellishments to the core of the fold caused significant differences in domain sizes and manual refinement of the boundaries was necessary. To put this performance in context, a well-established sequence method based on hidden Markov models was only able to detect 65% of domains, with 33% of the subsequent boundaries assigned within ten residues. 
    Since, on average, 50% of newly determined protein structures contain more than one domain unit, and typically 90% or more of these domains are already classified in CATH, CATHEDRAL will considerably facilitate the automation of protein structure classification.

    Conformational and thermodynamic hallmarks of DNA operator site specificity in the copper sensitive operon repressor from Streptomyces lividans

    Metal ion homeostasis in bacteria relies on metalloregulatory proteins to upregulate metal resistance genes and enable the organism to preclude metal toxicity. The copper sensitive operon repressor (CsoR) family is widely distributed in bacteria and controls the expression of copper efflux systems. CsoR operator sites consist of G-tract-containing pseudopalindromes, but the mechanism of operator binding is poorly understood. Here, we use a structurally characterized CsoR from Streptomyces lividans (CsoRSl) together with three specific operator targets to reveal the salient features pertaining to the mechanism of DNA binding. We reveal that CsoRSl binds to its operator site through a 2-fold axis of symmetry centred on a conserved 5′-TAC/GTA-3′ inverted repeat. Operator recognition is stringently dependent not only on electropositive residues but also on a conserved polar glutamine residue. Thermodynamic and circular dichroic signatures of the CsoRSl-DNA interaction suggest selectivity towards the A-DNA-like topology of the G-tracts at the operator site. Such properties are enhanced on protein binding, thus enabling the symmetrical binding of two CsoRSl tetramers. Finally, differential binding modes may exist in operator sites having more than one 5′-TAC/GTA-3′ inverted repeat, with implications in vivo for a mechanism of modular control. © 2013 The Author(s)

    A unifying framework for seed sensitivity and its application to subset seeds

    We propose a general approach to computing seed sensitivity that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem (a set of target alignments, an associated probability distribution, and a seed model), each specified by a distinct finite automaton. The approach is then applied to a new concept of subset seeds, for which we propose an efficient automaton construction. Experimental results confirm that sensitive subset seeds can be efficiently designed using our approach, and can then be used in similarity search, producing better results than ordinary spaced seeds.
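
To make the notion of seed sensitivity concrete, the sketch below computes it by brute force for an ordinary spaced seed: it enumerates every match/mismatch alignment of a given length and sums the probability of those the seed hits. This is an illustrative assumption-laden example, not the automaton-based method of the abstract; the Bernoulli model with independent matches, the `"1"`/`"0"` encoding, and the function names are all choices made here.

```python
from itertools import product


def seed_hits(seed, alignment):
    """True if the spaced seed hits the alignment at some offset.

    seed: string over {"1", "0"}; "1" requires a match, "0" is a don't-care.
    alignment: sequence over {"1", "0"}; "1" is a match, "0" a mismatch.
    """
    span = len(seed)
    for start in range(len(alignment) - span + 1):
        if all(
            alignment[start + k] == "1"
            for k, symbol in enumerate(seed)
            if symbol == "1"
        ):
            return True
    return False


def sensitivity(seed, length, p_match):
    """Probability that the seed hits a random alignment of the given length.

    Assumes each position is a match independently with probability p_match.
    Exponential in length, so only usable for small examples.
    """
    total = 0.0
    for bits in product("01", repeat=length):
        if seed_hits(seed, bits):
            matches = bits.count("1")
            total += p_match ** matches * (1 - p_match) ** (length - matches)
    return total


if __name__ == "__main__":
    # Compare a contiguous seed and a spaced seed of equal weight.
    for seed in ("111", "1101"):
        print(seed, sensitivity(seed, 12, 0.7))
```

The enumeration costs 2^length evaluations, which is precisely the kind of cost that an automaton-based computation is meant to avoid.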

    The physicist's guide to one of biotechnology's hottest new topics: CRISPR-Cas

    Clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated proteins (Cas) constitute a multi-functional, constantly evolving immune system in bacterial and archaeal cells. A heritable, molecular memory is generated of phage, plasmids, or other mobile genetic elements that attempt to attack the cell. This memory is used to recognize and interfere with subsequent invasions from the same genetic elements. This versatile prokaryotic tool has also been used to advance applications in biotechnology. Here we review a large body of CRISPR-Cas research to explore themes of evolution and selection, population dynamics, horizontal gene transfer, specific and cross-reactive interactions, cost and regulation, non-immunological CRISPR functions that boost host cell robustness, as well as applicable mechanisms for efficient and specific genetic engineering. We offer future directions that can be addressed by the physics community. Physical understanding of the CRISPR-Cas system will advance uses in biotechnology, such as developing cell lines and animal models, cell labeling and information storage, combatting antibiotic resistance, and human therapeutics. Comment: 75 pages, 15 figures, Physical Biology (2018)

    Wide-Scale Analysis of Human Functional Transcription Factor Binding Reveals a Strong Bias towards the Transcription Start Site

    We introduce a novel method to screen the promoters of a set of genes with shared biological function against a precompiled library of motifs, and to find those motifs that are statistically over-represented in the gene set. The gene sets were obtained from the functional Gene Ontology (GO) classification; for each set and motif we optimized the sequence similarity score threshold independently for every location window (measured with respect to the TSS), taking into account the location-dependent nucleotide heterogeneity along the promoters of the target genes. We performed a high-throughput analysis, searching the promoters (from 200 bp downstream to 1000 bp upstream of the TSS) of more than 8000 human and 23,000 mouse genes, for 134 functional Gene Ontology classes and for 412 known DNA motifs. When combined with binding site and location conservation between human and mouse, the method identifies with high probability functional binding sites that regulate groups of biologically related genes. We found many location-sensitive functional binding events and showed that they clustered close to the TSS. Our method and findings were put to several experimental tests. By allowing a "flexible" threshold and combining our functional class and location-specific search method with conservation between human and mouse, we are able to reliably identify functional TF binding sites. This is an essential step towards constructing regulatory networks and elucidating the design principles that govern transcriptional regulation of expression. The promoter region proximal to the TSS appears to be of central importance for regulation of transcription in human and mouse, just as it is in bacteria and yeast. Comment: 31 pages, including Supplementary Information and figure
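
The abstract does not say which statistic is used to call a motif over-represented in a gene set. One common choice, shown here purely as an assumption, is a one-sided hypergeometric test: given a background of promoters, some of which contain the motif, how likely is it to see at least the observed number of motif-containing promoters in the gene set by chance? The function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(population, with_motif, gene_set, hits):
    """One-sided hypergeometric tail probability P(X >= hits).

    population: number of background promoters
    with_motif: how many background promoters contain the motif
    gene_set:   number of promoters in the functional gene set
    hits:       how many promoters in the gene set contain the motif
    """
    upper = min(with_motif, gene_set)
    tail = sum(
        comb(with_motif, k) * comb(population - with_motif, gene_set - k)
        for k in range(hits, upper + 1)
    )
    return tail / comb(population, gene_set)


if __name__ == "__main__":
    # Hypothetical numbers: 8000 promoters, 400 with the motif,
    # a gene set of 50 promoters of which 10 contain the motif.
    print(enrichment_p_value(8000, 400, 50, 10))
```

In a screen over many gene sets, motifs, and location windows, such p-values would also need a multiple-testing correction, which this sketch leaves out.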

    The FEATURE framework for protein function annotation: modeling new functions, improving performance, and extending to novel applications

    Structural genomics efforts contribute new protein structures that often lack significant sequence and fold similarity to known proteins. Traditional sequence and structure-based methods may not be sufficient to annotate the molecular functions of these structures. Techniques that combine structural and functional modeling can be valuable for functional annotation. FEATURE is a flexible framework for modeling and recognition of functional sites in macromolecular structures. Here, we present an overview of the main components of the FEATURE framework, and describe the recent developments in its use. These include automating training set selection to increase functional coverage, coupling FEATURE to methods that generate structural diversity, such as molecular dynamics simulations and loop modeling, to improve performance, and using FEATURE in large-scale modeling and structure determination efforts.