50 research outputs found

    Exploration et analyse immersives de données moléculaires guidées par la tâche et la modélisation sémantique des contenus

    In structural biology, the theoretical study of molecular structures comprises four main activities organized in the following sequence: collection of experimental and theoretical data, visualization of 3D structures, molecular simulation, and analysis and interpretation of results. This pipeline allows the expert to develop new hypotheses, verify them experimentally, and produce new data as the starting point for a new cycle. The explosion in the amount of data to handle in this loop raises two problems. First, the resources and time dedicated to transferring and converting data between these four activities increase significantly. Second, the complexity of the molecular data generated by new experimental methodologies greatly increases the difficulty of properly collecting, visualizing and analyzing the data. Immersive environments are often proposed to address the quantity and increasing complexity of the modeled phenomena, especially during the visualization activity. Indeed, virtual reality offers high-quality stereoscopic perception, useful for a better understanding of inherently three-dimensional molecular data. It can also display a large amount of information thanks to large display surfaces, and complete the feeling of immersion with other sensorimotor channels (3D audio, haptic feedback, ...). However, two major factors hinder the use of virtual reality in structural biology. On the one hand, although there is an extensive literature on navigation in realistic virtual scenes, navigation in abstract scientific data is still very little studied. Yet the understanding of complex 3D phenomena depends strongly on the subject’s ability to orient themselves in space. The first objective of this thesis is therefore to propose 3D navigation paradigms adapted to molecular structures of increasing complexity. On the other hand, the interactive context of immersive environments encourages direct interaction with the objects of interest, but the activities of results collection, simulation and analysis assume a working environment based on command-line input or on scripts specific to the tools. As a result, the use of virtual reality is usually restricted to the exploration and visualization of molecular structures. The second objective of the thesis is therefore to bring all these activities, previously carried out in independent interactive and application contexts, into a single homogeneous interactive context. In addition to minimizing the time spent managing data between different work contexts, the aim is also to present molecular structures and their analyses jointly and simultaneously, and to allow their manipulation through direct interaction. Our contribution meets these objectives by building on an approach guided by both the content and the task. More precisely, navigation paradigms have been designed taking into account the molecular content, especially its geometric properties, and the tasks of the expert, to facilitate spatial orientation in molecular complexes and make the exploration of these structures more efficient. In addition, formalizing the nature of molecular data, their analyses and their visual representations makes it possible to propose, interactively, analyses adapted to the nature of the data and to create links between the molecular components and the associated analyses. These features rely on the construction of a unified and powerful semantic representation that makes it possible to integrate these activities in a single interactive context.

    Final report on deployment of consolidated platform and the overall architecture

    This document is the final report on the activities of Work Package 4 (WP4), which aimed to provision a consistent e-infrastructure that gradually integrates the existing isolated software solutions in the structural biology field into a single computing and data processing environment, based on state-of-the-art open source grid and cloud software tools and frameworks. This report follows documents D4.3, MS14, D4.5 and MS15, delivered at project months 15, 24, 26 and 34, respectively. It therefore mostly reports the progress achieved up to project month 36 that was not already described in D4.5 ten months earlier, with references to MS15 where possible. The document starts with an updated description of the resources potentially available to the project from the EGI e-infrastructure, on top of which we built the consolidated West-Life platform. It then presents a detailed view of resource usage and its geographical distribution in the third and last year of the project, as obtained from the EGI Accounting Portal. The remainder of the document reports in detail the final achievements on the three main aspects of the platform: the consolidated job management mechanism, the programmatic access to datasets, and the unified security and accounting model.

    ExaViz: a Flexible Framework to Analyse, Steer and Interact with Molecular Dynamics Simulations

    The amount of data generated by molecular dynamics simulations of large molecular assemblies, and the sheer size and complexity of the systems studied, call for new ways to analyse, steer and interact with such calculations. Traditionally, the analysis is performed off-line once the huge amount of simulation results has been saved to disk, thereby stressing the supercomputer I/O systems and making it increasingly difficult to handle post-processing and analysis from the scientist's office. The ExaViz framework is an alternative approach developed to couple the simulation with analysis tools, so that the data are processed as close as possible to their source and only a reduced, more manageable, pre-processed data set is saved to disk. ExaViz supports a large variety of analysis and steering scenarios. Our framework can be used for live sessions (simulations short enough to be fully followed by the user) as well as batch sessions (long-running batch executions). During interactive sessions, at run time, the user can display plots from the analysis, visualise the molecular system and steer the simulation with a haptic device. We also emphasise how a Cave-like immersive environment could be used to leverage such simulations, offering a large display surface to view and intuitively navigate the molecular system.

    An overview of data‐driven HADDOCK strategies in CAPRI rounds 38-45

    Our information-driven docking approach HADDOCK has demonstrated a sustained performance since the start of its participation in CAPRI. This is due, in part, to its ability to integrate data into the modeling process, and to the robustness of its scoring function. We participated in CAPRI both as server and manual predictors. In CAPRI rounds 38-45, we used various strategies depending on the available information. These ranged from imposing restraints on a few residues identified from the literature as being important for the interaction, to binding pockets identified from homologous complexes, or template-based refinement/CA-CA restraint-guided docking from identified templates. When relevant, symmetry restraints were used to limit the conformational sampling. We also tested, for a large decamer target, a new implementation of the MARTINI coarse-grained force field in HADDOCK. Overall, we obtained acceptable or better predictions for 13 and 11 server and manual submissions, respectively, out of the 22 interfaces. Our server performance (acceptable or higher-quality models when considering the top 10) was better (59%) than the manual one (50%), in which we typically experiment with various combinations of protocols and data sources. Again, our simple scoring function, based on a linear combination of intermolecular van der Waals and electrostatic energies and an empirical desolvation term, demonstrated a good performance in the scoring experiment, with a 63% success rate across all 22 interfaces. An analysis of model quality indicates that, while we are consistently performing well in generating acceptable models, there is room for improvement in generating and identifying higher-quality models.

    Sharing data from molecular simulations

    Given the need for modern researchers to produce open, reproducible scientific output, the lack of standards and best practices for sharing data and workflows used to produce and analyze molecular dynamics (MD) simulations has become an important issue in the field. There are now multiple well-established packages to perform molecular dynamics simulations, often highly tuned for exploiting specific classes of hardware, each with strong communities surrounding them, but with very limited interoperability/transferability options. Thus, the choice of the software package often dictates the workflow for both simulation production and analysis. The level of detail in documenting the workflows and analysis code varies greatly in published work, hindering reproducibility of the reported results and the ability for other researchers to build on these studies. An increasing number of researchers are motivated to make their data available, but many challenges remain in order to effectively share and reuse simulation data. To discuss these and other issues related to best practices in the field in general, we organized a workshop in November 2018 (https://bioexcel.eu/events/workshop-on-sharing-data-from-molecular-simulations/). Here, we present a brief overview of this workshop and topics discussed. We hope this effort will spark further conversation in the MD community to pave the way toward more open, interoperable, and reproducible outputs coming from research studies using MD simulations

    West-Life: A Virtual Research Environment for structural biology

    The West-Life project (https://about.west-life.eu/) is a Horizon 2020 project funded by the European Commission to provide data processing and data management services for the international community of structural biologists, and in particular to support integrative experimental approaches within the field of structural biology. It has developed enhancements to existing web services for structure solution and analysis, created new pipelines to link these services into more complex higher-level workflows, and added new data management facilities. Through this work it has striven to make the benefits of European e-Infrastructures more accessible to life-science researchers in general and structural biologists in particular.

    Visual Analytics for molecular data in immersive environments

    In structural biology, the theoretical study of molecular structures comprises four main activities organized in the following sequence: collection of experimental and theoretical data, visualization of 3D structures, molecular simulation, and analysis and interpretation of results. This pipeline allows the expert to develop new hypotheses, verify them experimentally, and produce new data as the starting point for a new cycle. The explosion in the amount of data to handle in this loop raises two problems. First, the resources and time dedicated to transferring and converting data between these four activities increase significantly. Second, the complexity of the molecular data generated by new experimental methodologies greatly increases the difficulty of properly collecting, visualizing and analyzing the data. Immersive environments are often proposed to address the quantity and increasing complexity of the modeled phenomena, especially during the visualization activity. Indeed, virtual reality offers high-quality stereoscopic perception, useful for a better understanding of inherently three-dimensional molecular data. It can also display a large amount of information thanks to large display surfaces, and complete the feeling of immersion with other sensorimotor channels (3D audio, haptic feedback, ...). However, two major factors hinder the use of virtual reality in structural biology. On the one hand, although there is an extensive literature on navigation in realistic virtual scenes, navigation in abstract scientific data is still very little studied. Yet the understanding of complex 3D phenomena depends strongly on the subject’s ability to orient themselves in space. The first objective of this thesis is therefore to propose 3D navigation paradigms adapted to molecular structures of increasing complexity. On the other hand, the interactive context of immersive environments encourages direct interaction with the objects of interest, but the activities of results collection, simulation and analysis assume a working environment based on command-line input or on scripts specific to the tools. As a result, the use of virtual reality is usually restricted to the exploration and visualization of molecular structures. The second objective of the thesis is therefore to bring all these activities, previously carried out in independent interactive and application contexts, into a single homogeneous interactive context. In addition to minimizing the time spent managing data between different work contexts, the aim is also to present molecular structures and their analyses jointly and simultaneously, and to allow their manipulation through direct interaction. Our contribution meets these objectives by building on an approach guided by both the content and the task. More precisely, navigation paradigms have been designed taking into account the molecular content, especially its geometric properties, and the tasks of the expert, to facilitate spatial orientation in molecular complexes and make the exploration of these structures more efficient. In addition, formalizing the nature of molecular data, their analyses and their visual representations makes it possible to propose, interactively, analyses adapted to the nature of the data and to create links between the molecular components and the associated analyses. These features rely on the construction of a unified and powerful semantic representation that makes it possible to integrate these activities in a single interactive context.

    Protein–Protein Modeling Using Cryo-EM Restraints

    Improvements in cryo-electron microscopy (cryo-EM) over the past few years now make it possible to observe molecular complexes at atomic resolution. As a consequence, numerous structures derived from cryo-EM are now available in the Protein Data Bank. However, while atomic resolution is reached for some complexes, this is not true for all. This is also the case in cryo-electron tomography, where the achievable resolution is still limited. Furthermore, the resolution in a cryo-EM map is not constant: outer regions are often of lower resolution, possibly because of conformational variability. Although such low- to medium-resolution EM maps (or regions thereof) cannot directly provide the atomic structure of large molecular complexes, they provide valuable information for modeling the individual components and their assembly into the complex. Most approaches for this kind of modeling perform rigid fitting of the individual components into the EM density map. While this would appear an obvious option, such approaches ignore key aspects of molecular recognition: the energetics and flexibility of the interfaces. Moreover, this often restricts the modeling to a single source of data, the EM density map. In this chapter, we describe a protocol in which an EM map is used as a restraint in HADDOCK to guide the modeling process. In the first step, rigid-body fitting is performed with PowerFit to identify the most likely locations of the molecules in the map. These are then used as centroids to which distance restraints are defined from the center of mass of the components of the complex for the initial rigid-body docking. The EM density is then used directly as an additional restraint energy term, which can be combined with all the other types of data supported by HADDOCK. This protocol relies on the new version 2.4 of both the HADDOCK webserver and software. The preparation steps, which consist of cropping the EM map and rigid-body fitting of the atomic structure, are explained. Then, the EM-driven docking protocol using HADDOCK is illustrated.

    haddocking/haddock-tools: First stable release of the haddock-tools

    This release includes a variety of script utilities for setting up HADDOCK calculations. They concentrate on PDB file manipulation and on restraint generation and validation.
