Geometric and statistical methods for the analysis and prediction of structural interactions of biomolecules

Abstract

The biological function of macromolecules, such as proteins and nucleic acids, relies heavily on their interactions with their partners. Predicting how molecules interact and how they assemble into large complexes that act as nanomachines is essential for our understanding of biology, and also for therapeutics and nanotechnology design.

Blind challenges in biology, such as the worldwide CAPRI docking experiment, have shown that in silico studies and simulations, mainly using physics-based potentials and techniques, can give structural insights in atomic detail. Their accuracy can, however, be very limited, particularly in predicting the native molecular structure of proteins, RNAs and complexes.

Using simple geometric coarse-grained modelling and machine learning strategies, such as genetic algorithms and support vector machines, we have shown that the scoring of putative complex structures can be greatly improved, reaching the accuracy needed for experiment design and analysis, at least in a semi-rigid body context. Following these proof-of-concept studies, most docking prediction strategies now use machine learning for scoring optimization.

Being able to predict the structure of molecular partners and the way they deform upon binding is also key to obtaining better predictions, in particular for non-coding RNAs that are essential for targeting oncogenes. Our work on RNA structure prediction techniques has shown that data-based parameterization of energy functions and statistical techniques largely improve the accuracy of structure prediction.

Reaching the stage of large assemblies also requires the ability to assess the dynamics of molecules from partial experimental data. We developed an efficient sampling technique based on inverse kinematics that does not rely on constraint counting and implicitly calculates the rigidity of the molecule. Paired with experimental data, it offers an integrative view of the dynamics of non-coding RNAs in biological processes. Combined with clustering techniques, it allows for efficient and flexible cross-docking analysis of protein-RNA complexes.
