Fast pseudolikelihood maximization for direct-coupling analysis of protein structure from many homologous amino-acid sequences
Direct-Coupling Analysis is a group of methods to harvest information about
coevolving residues in a protein family by learning a generative model in an
exponential family from data. In protein families of realistic size, this
learning can only be done approximately, and there is a trade-off between
inference precision and computational speed. We show here that a previously
introduced $l_2$-regularized pseudolikelihood maximization method, plmDCA,
can be modified so as to be easily parallelizable, as well as inherently faster
on a single processor, with a negligible difference in accuracy. We test the new
incarnation of the method on 148 protein families from the Protein Families
database (PFAM), one of the largest tests of this class of algorithms to date.
Comment: 33 pages, 4 figures; M. Ekeberg and T. Hartonen are joint first
authors; code and supplementary information on http://plmdca.csc.kth.se
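
To illustrate why a pseudolikelihood objective lends itself to parallel evaluation, the following minimal sketch computes an l2-regularized negative log conditional likelihood for a single alignment column of a Potts-type model. It is not the plmDCA code: the function name, the parameter layout (`h_r`, `J_r`) and the single penalty strength `lam` are hypothetical choices made for this example, and the abstract does not state how the modified method actually splits the problem. The sketch assumes each site is given its own set of parameters, so that the per-site objectives share nothing and could be minimized on separate processors.

```python
import math


def site_neg_log_pseudolikelihood(seqs, r, h_r, J_r, lam):
    """l2-regularized negative log conditional likelihood of column r.

    seqs: list of aligned sequences, each a list of ints in range(q)
    h_r:  list of q fields for site r
    J_r:  J_r[j][a][b] couples state a at site r to state b at site j
          (the entry J_r[r] is ignored)
    lam:  l2 penalty strength (hypothetical: one value for fields and couplings)
    """
    q = len(h_r)
    n_sites = len(seqs[0])
    total = 0.0
    for s in seqs:
        # Energy of each candidate state a at site r, given the rest of s.
        energies = [
            h_r[a] + sum(J_r[j][a][s[j]] for j in range(n_sites) if j != r)
            for a in range(q)
        ]
        # Numerically stable log-sum-exp for the normalizing constant.
        m = max(energies)
        log_z = m + math.log(sum(math.exp(e - m) for e in energies))
        total -= energies[s[r]] - log_z
    total /= len(seqs)
    # l2 penalty on the parameters that belong to site r only.
    penalty = lam * sum(x * x for x in h_r)
    penalty += lam * sum(
        x * x
        for j in range(n_sites) if j != r
        for row in J_r[j]
        for x in row
    )
    return total + penalty
```

Because the function reads only the parameters attached to site `r`, calls for different sites are independent under the stated assumption and could be handed to separate worker processes; how the resulting per-site couplings are combined into contact scores is outside the scope of this sketch.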