Exploiting distant homologues for phasing through the generation of compact fragments, local fold refinement and partial solution combination.

Abstract

Macromolecular structures can be solved by molecular replacement provided that suitable search models are available. Models from distant homologues may deviate too much from the target structure to succeed, despite a similar overall fold or even regions of very close geometry. Successful methods to make the most of such templates usually rely on the degree of conservation to select and improve search models. ARCIMBOLDO_SHREDDER uses fragments derived from distant homologues in a brute-force approach driven by the experimental data, instead of by sequence similarity. The new algorithms implemented in ARCIMBOLDO_SHREDDER are described in detail, illustrating its characteristic aspects in the solution of new and test structures. In an advance from the previously published algorithm, which was based on omitting or extracting contiguous polypeptide spans, model generation now uses three-dimensional volumes respecting structural units. The optimal fragment size is estimated from the expected log-likelihood gain (LLG) values computed assuming that a substructure can be found with a level of accuracy near that required for successful extension of the structure, typically below 0.6 Å root-mean-square deviation (r.m.s.d.) from the target. Better sampling is attempted through model trimming or decomposition into rigid groups and optimization through Phaser's gyre refinement. After model translation, packing filtering and refinement, models are either disassembled into predetermined rigid groups and refined (gimble refinement) or trimmed with Phaser's LLG-guided pruning, which removes residues that do not contribute signal to the LLG at the target r.m.s.d. value. Phase combination among consistent partial solutions is performed in reciprocal space with ALIXE. Finally, density modification and main-chain autotracing in SHELXE serve to expand to the full structure and identify successful solutions. The performance on test data and the solution of new structures are described.
