The Debye-Waller factor (B-factor), a measure of X-ray attenuation, can be experimentally
observed in protein X-ray crystallography. Previous theoretical models have
made strong inroads in the analysis of B-factors by linearly fitting protein
B-factors from experimental data. However, the blind prediction of B-factors
for unknown proteins is an unsolved problem. This work integrates machine
learning and advanced graph theory, namely, multiscale weighted colored graphs
(MWCGs), to blindly predict B-factors of unknown proteins. MWCGs provide local
features that measure the intrinsic flexibility arising from a protein's structure.
Global features that connect the B-factors of different proteins, e.g., the
resolution of X-ray crystallography, are introduced to enable cross-protein
B-factor prediction. Several machine learning approaches, including ensemble
methods and deep learning, are considered in the present work. The proposed
method is validated with hundreds of thousands of experimental B-factors.
Extensive numerical results indicate that the blind B-factor predictions
obtained from the present method are more accurate than the least-squares
fits obtained with traditional methods.

Comment: 5 figures, 23 pages
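
As a rough illustration of what local, multiscale, element-specific graph features might look like, the following is a minimal sketch and not the implementation used in this work. It assumes an exponential distance kernel, three hypothetical length scales in angstroms, and colouring of atoms by the element types C, N, and O; the function name `mwcg_like_features` and all parameter values are chosen here for illustration only.

```python
import numpy as np

def mwcg_like_features(coords, elements, element_types=("C", "N", "O"),
                       scales=(3.0, 8.0, 16.0), kappa=1.0):
    """Per-atom rigidity-style features from element-coloured weighted graphs.

    Assumed kernel (illustrative): w_ij = exp(-(d_ij / eta) ** kappa).
    Returns an array of shape (n_atoms, len(element_types) * len(scales)).
    """
    coords = np.asarray(coords, dtype=float)
    elements = np.asarray(elements)
    # Pairwise Euclidean distances between all atoms.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    columns = []
    for element in element_types:
        # "Colour": keep only edges that end at atoms of this element type.
        mask = elements == element
        for eta in scales:
            # One weighted graph per length scale eta.
            weights = np.exp(-(dist / eta) ** kappa)
            np.fill_diagonal(weights, 0.0)  # exclude self-interaction
            columns.append(weights[:, mask].sum(axis=1))
    return np.column_stack(columns)
```

Each row of the returned array could then be combined with global descriptors such as the crystallographic resolution and passed to a regressor (for example, an ensemble method) trained on B-factors from other proteins; that pairing is likewise an assumption of this sketch.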