
    Inferring processes underlying B-cell repertoire diversity

    We quantify the VDJ recombination and somatic hypermutation processes in human B-cells using probabilistic inference methods on high-throughput DNA sequence repertoires of human B-cell receptor heavy chains. Our analysis captures the statistical properties of the naive repertoire, first after its initial generation via VDJ recombination and then after selection for functionality. We also infer statistical properties of the somatic hypermutation machinery (exclusive of subsequent effects of selection). Our main results are the following: the B-cell repertoire is substantially more diverse than T-cell repertoires, due to longer junctional insertions; sequences that pass initial selection are distinguished by having a higher probability of being generated in a VDJ recombination event; somatic hypermutations have a non-uniform distribution along the V gene that is well explained by an independent site model for the sequence context around the hypermutation site. Comment: acknowledgement adde

    Three-helix-bundle Protein in a Ramachandran Model

    We study the thermodynamic behavior of a model protein with 54 amino acids that forms a three-helix bundle in its native state. The model contains three types of amino acids and five to six atoms per amino acid and has the Ramachandran torsional angles $\phi_i$, $\psi_i$ as its degrees of freedom. The force field is based on hydrogen bonds and effective hydrophobicity forces. For a suitable choice of the relative strength of these interactions, we find that the three-helix-bundle protein undergoes an abrupt folding transition from an expanded state to the native state. Also shown is that the corresponding one- and two-helix segments are less stable than the three-helix sequence. Comment: 15 pages, 7 figure

    Cold and Warm Denaturation of Proteins

    We introduce a simplified protein model where the water degrees of freedom appear explicitly (although in an extremely simplified fashion). Using this model we are able to recover both the warm and the cold protein denaturation within a single framework, while addressing important issues about the structure of model proteins.

    The Genetic Code as a Periodic Table: Algebraic Aspects

    The systematics of indices of physico-chemical properties of codons and amino acids across the genetic code are examined. Using a simple numerical labelling scheme for nucleic acid bases, data can be fitted as low-order polynomials of the 6 coordinates in the 64-dimensional codon weight space. The work confirms and extends recent studies by Siemion of amino acid conformational parameters. The connections between the present work and recent studies of the genetic code structure using dynamical symmetry algebras are pointed out. Comment: 26 pages Latex, 10 figures (4 ps, 6 Tex). Refereed version, small changes to discussion (conclusion unaltered). Minor alterations to format of figures and tables. To appear in BioSystem

    Selection of sequence motifs and generative Hopfield-Potts models for protein families

    Statistical models for families of evolutionarily related proteins have recently gained interest: in particular, pairwise Potts models, such as those inferred by Direct-Coupling Analysis, have been able to extract information about the three-dimensional structure of folded proteins and about the effect of amino-acid substitutions in proteins. These models are typically required to reproduce the one- and two-point statistics of the amino-acid usage in a protein family, i.e., to capture the so-called residue conservation and covariation statistics of proteins of common evolutionary origin. Pairwise Potts models are the maximum-entropy models achieving this. While successful, these models depend on huge numbers of ad hoc parameters, which have to be estimated from a finite amount of data and whose biophysical interpretation remains unclear. Here we propose an approach to parameter reduction, which is based on selecting collective sequence motifs. It naturally leads to the formulation of statistical sequence models in terms of Hopfield-Potts models. These models can be accurately inferred using a mapping to restricted Boltzmann machines and persistent contrastive divergence. We show that, when applied to protein data, even 20-40 patterns are sufficient to obtain statistically close-to-generative models. The Hopfield patterns form interpretable sequence motifs and may be used to cluster amino-acid sequences into functional sub-families. However, the distributed collective nature of these motifs intrinsically limits the ability of Hopfield-Potts models to predict contact maps, showing the necessity of developing models that go beyond the Hopfield-Potts models discussed here. Comment: 26 pages, 16 figures, to app. in PR

    Thermodynamics of alpha- and beta-structure formation in proteins

    An atomic protein model with a minimalistic potential is developed and then tested on an alpha-helix and a beta-hairpin, using exactly the same parameters for both peptides. We find that melting curves for these sequences to a good approximation can be described by a simple two-state model, with parameters that are in reasonable quantitative agreement with experimental data. Despite the apparent two-state character of the melting curves, the energy distributions are found to lack a clear bimodal shape, which is discussed in some detail. We also perform a Monte Carlo-based kinetic study and find, in accord with experimental data, that the alpha-helix forms faster than the beta-hairpin. Comment: 18 pages, 4 figure