Fast pseudolikelihood maximization for direct-coupling analysis of protein structure from many homologous amino-acid sequences

Authors: Aurell Erik; Ekeberg Magnus; Hartonen Tuomo
Publication venue: 'Elsevier BV'
Publication date: 20/01/2014
Direct-Coupling Analysis is a group of methods to harvest information about coevolving residues in a protein family by learning a generative model in an exponential family from data. In protein families of realistic size, this learning can only be done approximately, and there is a trade-off between inference precision and computational speed. We here show that an earlier introduced $\ell_2$-regularized pseudolikelihood maximization method called plmDCA can be modified so as to be easily parallelizable, as well as inherently faster on a single processor, at negligible difference in accuracy. We test the new incarnation of the method on 148 protein families from the Protein Families database (PFAM), one of the largest tests of this class of algorithms to date.

Comment: 33 pages, 4 figures; M. Ekeberg and T. Hartonen are joint first authors; code and supplementary information on http://plmdca.csc.kth.se
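
To make the regularized pseudolikelihood idea in the abstract concrete, here is a minimal sketch of a per-site objective for a Potts-type model. It is not taken from the plmDCA code. The function name `site_objective`, the array layout, and the regularization strengths `lam_h` and `lam_J` are assumptions made for illustration, and the sequence reweighting commonly used in practice is omitted. The sketch assumes a formulation in which each site has its own parameter set, so the objectives for different sites could be minimized independently, for example on separate workers.

```python
import numpy as np

def site_objective(r, h_r, J_r, seqs, lam_h=0.01, lam_J=0.01):
    """l2-regularized negative log pseudolikelihood of site r (illustrative).

    seqs : (B, N) int array, aligned sequences with states coded 0..q-1
    h_r  : (q,) fields of site r
    J_r  : (N, q, q) couplings; J_r[j, a, b] pairs state a at site r
           with state b at site j (the slice J_r[r] is unused)
    lam_h, lam_J : placeholder regularization strengths (assumed values)
    """
    B, N = seqs.shape
    # logits[b, a] = h_r[a] + sum over j != r of J_r[j, a, seqs[b, j]]
    logits = np.tile(np.asarray(h_r, dtype=float), (B, 1))
    for j in range(N):
        if j != r:
            logits += J_r[j][:, seqs[:, j]].T
    # Numerically stable log of the conditional normalization constant
    m = logits.max(axis=1, keepdims=True)
    log_z = m[:, 0] + np.log(np.exp(logits - m).sum(axis=1))
    # Log conditional probability of the observed state at site r
    log_p = logits[np.arange(B), seqs[:, r]] - log_z
    # l2 penalty on the fields and on the couplings to all other sites
    others = np.arange(N) != r
    penalty = lam_h * np.sum(h_r ** 2) + lam_J * np.sum(J_r[others] ** 2)
    return -log_p.mean() + penalty

# Toy usage: a random alignment of 5 sequences, 4 sites, q = 21 states
rng = np.random.default_rng(0)
q, N = 21, 4
toy_seqs = rng.integers(0, q, size=(5, N))
value = site_objective(0, np.zeros(q), np.zeros((N, q, q)), toy_seqs)
```

With all parameters at zero the conditional distribution is uniform, so the objective reduces to log(q). A real run would pass a function like this, together with its gradient, to a numerical optimizer once per site.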

arXiv.org e-Print Archive