    Fast pseudolikelihood maximization for direct-coupling analysis of protein structure from many homologous amino-acid sequences

    Direct-Coupling Analysis is a group of methods to harvest information about coevolving residues in a protein family by learning a generative model in an exponential family from data. In protein families of realistic size, this learning can only be done approximately, and there is a trade-off between inference precision and computational speed. We here show that an earlier introduced $l_2$-regularized pseudolikelihood maximization method called plmDCA can be modified so as to be easily parallelizable, as well as inherently faster on a single processor, at negligible difference in accuracy. We test the new incarnation of the method on 148 protein families from the Protein Families database (PFAM), one of the largest tests of this class of algorithms to date.
    Comment: 33 pages, 4 figures; M. Ekeberg and T. Hartonen are joint first authors; code and supplementary information on http://plmdca.csc.kth.se
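To make the phrase "$l_2$-regularized pseudolikelihood maximization" concrete, the following is a minimal sketch of a per-site objective for a Potts-type model, written with NumPy. It is not the authors' code: the function name `site_objective`, the parameter layout (`h_r`, `J_r`), and the regularization strengths `lam_h` and `lam_J` are all assumptions chosen for illustration, and sequence reweighting is omitted. The sketch also assumes one plausible route to parallelism, namely that each site gets its own objective that can be minimized independently of the others; the abstract does not spell out how the modification works.

```python
import numpy as np


def site_objective(msa, r, h_r, J_r, lam_h=0.01, lam_J=0.01):
    """l2-regularized negative log-pseudolikelihood for a single site r.

    msa : (B, N) integer array, B aligned sequences of length N,
          states encoded as 0..q-1 (hypothetical encoding).
    h_r : (q,) fields for site r.
    J_r : (N, q, q) couplings; J_r[j, a, b] couples state a at site r
          with state b at site j. The slice J_r[r] is ignored.
    lam_h, lam_J : illustrative regularization strengths (assumed values).
    """
    B, N = msa.shape

    # logits[b, a] = h_r[a] + sum over j != r of J_r[j, a, msa[b, j]]
    logits = np.tile(h_r, (B, 1)).astype(float)
    for j in range(N):
        if j == r:
            continue
        logits += J_r[j][:, msa[:, j]].T

    # Numerically stable log of the conditional normalization constant.
    m = logits.max(axis=1, keepdims=True)
    log_z = m[:, 0] + np.log(np.exp(logits - m).sum(axis=1))

    # Log conditional probability of the observed state at site r.
    log_p = logits[np.arange(B), msa[:, r]] - log_z

    # l2 penalty on the fields and on the couplings to all other sites.
    others = np.arange(N) != r
    penalty = lam_h * np.sum(h_r ** 2) + lam_J * np.sum(J_r[others] ** 2)

    return -log_p.mean() + penalty


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q, N, B = 21, 5, 8  # toy sizes; q = 21 assumes 20 amino acids plus gap
    msa = rng.integers(0, q, size=(B, N))
    h = np.zeros(q)
    J = np.zeros((N, q, q))
    # With all parameters zero, every state is equally likely,
    # so the objective equals log(q).
    print(site_objective(msa, 2, h, J))
```

Under these assumptions, each of the N per-site objectives depends only on its own `h_r` and `J_r`, so the N minimizations could be handed to separate workers and the resulting couplings combined afterwards.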