22 research outputs found

    BioExcel Building Blocks Workflows (BioBB-Wfs), an integrated web-based platform for biomolecular simulations.

    We present BioExcel Building Blocks Workflows, a web-based graphical user interface (GUI) offering access to a collection of transversal pre-configured biomolecular simulation workflows assembled with the BioExcel Building Blocks library. Available workflows include Molecular Dynamics setup, protein-ligand docking, trajectory analyses and small molecule parameterization. Workflows can be launched in the platform or downloaded to be run on the users' own premises. Long executions can also be launched remotely on High-Performance Computing systems available to the user, which only requires configuring the appropriate access credentials. The GUI offers a high level of interactivity, integrating the NGL viewer to visualize and check 3D structures, MDsrv to visualize trajectories, and Plotly to explore 2D plots. The server requires no login, but logging in is recommended to store the users' projects and manage sensitive information such as remote credentials. Private projects can be made public and shared with colleagues through a simple URL. The tool will help biomolecular simulation users with the most common and repetitive processes by means of a very intuitive and interactive graphical user interface. The server is accessible at https://mmb.irbbarcelona.org/biobb-wfs

    Learning to program. New challenges, new proposals

    In the university context, and despite all the experience accumulated over the past decades, teaching computer programming is still challenging. Several factors add complexity to this task, and the studies and didactic proposals put forward to address them are equally diverse. New challenges have arisen in recent times with the introduction of new, more specialized programs that include programming courses in multidisciplinary fields such as bioinformatics or data science. In this work we describe the experience obtained in two MSc-level courses, "Programming for Bioinformatics" and "Programming for Data Science". Both are introductions to programming in the Python language, but each is oriented towards solving the specific problems that arise in its own field. This is an extremely complex goal, especially considering the heterogeneous background of the incoming students, most of whom have little or no programming experience.

    The BioExcel methodology for developing dynamic, scalable, reliable and portable computational biomolecular workflows

    Developing complex biomolecular workflows is not always straightforward. It requires tedious development work to make the different biomolecular simulation and analysis tools interoperable. Moreover, the need to execute the pipelines on distributed systems increases the complexity of these developments. To address these issues, we propose a methodology to simplify the implementation of these workflows on HPC infrastructures. It combines a library, the BioExcel Building Blocks (BioBBs), which allows scientists to implement biomolecular pipelines as Python scripts, and the PyCOMPSs programming framework, which makes it easy to convert Python scripts into task-based parallel workflows executed on distributed computing systems such as HPC clusters, clouds, containerized platforms, etc. Using this methodology, we have implemented a set of computational molecular workflows and performed several experiments to validate its portability, scalability, reliability and malleability.
    This work has been supported by the Spanish Ministry of Science and Innovation MCIN/AEI/10.13039/501100011033 under contract PID2019-107255GB-C21, by the Generalitat de Catalunya under contracts 2017-SGR-01414 and 2017-SGR1110, and by the European Commission through the BioExcel Center of Excellence (Horizon 2020 Framework program) under contracts 823830 and 675728. This work is also partially supported by the CECH project, which has been 50% co-funded by the European Regional Development Fund under the framework of the ERDF Operative Programme for Catalunya 2014-2020, with a grant of 1.527.637,88€.
    Peer reviewed. Postprint (author's final draft).
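    The idea of turning an ordinary Python pipeline into independent parallel tasks can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard-library `concurrent.futures` module as a stand-in for PyCOMPSs, and the step functions (`prepare_structure`, `run_simulation`, `analyse`) are hypothetical placeholders, not BioBB calls.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor


# Hypothetical pipeline steps; real steps would call simulation tools.
def prepare_structure(system: str) -> str:
    return f"{system}:prepared"


def run_simulation(prepared: str) -> str:
    return f"{prepared}:simulated"


def analyse(simulated: str) -> str:
    return f"{simulated}:analysed"


def pipeline(system: str) -> str:
    """One pipeline: each step consumes the output of the previous step."""
    return analyse(run_simulation(prepare_structure(system)))


def run_all(systems: list[str], workers: int = 4) -> dict[str, str]:
    """Run one pipeline per system; pipelines are independent tasks."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit every pipeline first so they can run concurrently,
        # then collect the results.
        futures = {name: pool.submit(pipeline, name) for name in systems}
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    print(run_all(["wild_type", "mutant_A"]))
```

    The pipeline itself stays a plain sequential script, and only the outer loop over systems is handed to a scheduler. A task-based framework applies the same separation at a finer grain and across many machines.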

    BIGNASim: A NoSQL database structure and analysis portal for nucleic acids simulation data

    Molecular dynamics simulation (MD) is, just behind genomics, the bioinformatics tool that generates the largest amounts of data and uses the largest amount of CPU time in supercomputing centres. MD trajectories are obtained after months of calculations, analysed in situ, and in practice forgotten. Several projects to generate stable trajectory databases have been developed for proteins, but no equivalent exists in the nucleic acids world. We present here a novel database system to store MD trajectories and analyses of nucleic acids. The initial data set available consists mainly of the benchmark of the new molecular dynamics force-field, parmBSC1. It contains 156 simulations, with over 120 μs of total simulation time. A deposition protocol is available to accept the submission of new trajectory data. The database is based on the combination of two NoSQL engines, Cassandra for storing trajectories and MongoDB to store analysis results and simulation metadata. The analyses available include backbone geometries, helical analysis, NMR observables and a variety of mechanical analyses. Individual trajectories and combined meta-trajectories can be downloaded from the portal. The system is accessible through http://mmb.irbbarcelona.org/BIGNASim/. Supplementary Material is also available on-line at http://mmb.irbbarcelona.org/BIGNASim/SuppMaterial/

    High-Throughput Prediction of the Impact of Genetic Variability on Drug Sensitivity and Resistance Patterns for Clinically Relevant Epidermal Growth Factor Receptor Mutations from Atomistic Simulations

    Mutations in the kinase domain of the epidermal growth factor receptor (EGFR) can be drivers of cancer and also trigger drug resistance in patients receiving chemotherapy treatment based on kinase inhibitors. A priori knowledge of the impact of EGFR variants on drug sensitivity would help to optimize chemotherapy and design new drugs that are effective against resistant variants before they emerge in clinical trials. To this end, we explored a variety of in silico methods, from sequence-based to "state-of-the-art" atomistic simulations. We did not find any sequence signal that can provide clues on when a drug-related mutation appears or the impact of such mutations on drug activity. Low-level simulation methods provide limited qualitative information on regions where mutations are likely to cause alterations in drug activity, and they can predict around 70% of the impact of mutations on drug efficiency. High-level simulations based on nonequilibrium alchemical free energy calculations show predictive power. The integration of these "state-of-the-art" methods into a workflow implementing an interface for parallel distribution of the calculations allows its automatic and high-throughput use, even for researchers with moderate experience in molecular simulations

    BioExcel Building Blocks, a software library for interoperable biomolecular simulation workflows.

    In recent years, the improvement of software and hardware performance has made biomolecular simulations a mature tool for the study of biological processes. Simulation length and the size and complexity of the analyzed systems make simulations both complementary and compatible with other bioinformatics disciplines. However, the characteristics of the software packages used for simulation have prevented the adoption of technologies accepted in other bioinformatics fields, such as automated deployment systems, workflow orchestration, or software containers. We present here a comprehensive exercise to bring biomolecular simulations to the "bioinformatics way of working". The exercise has led to the development of the BioExcel Building Blocks (BioBB) library. BioBBs are built as Python wrappers to provide an interoperable architecture. They have been integrated into a chain of common software management tools to generate data ontologies, documentation, installation packages, software containers and means of integration with workflow managers, which make them usable in most computational environments.
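    The "Python wrapper" pattern mentioned above can be illustrated with a minimal sketch. The class below is hypothetical and is not the BioBB API: it only shows how a wrapper might expose a uniform interface (input path, output path, properties dictionary) around a command-line tool. The wrapped "tool" here is a Python one-liner, chosen purely so that the example runs anywhere.

```python
from __future__ import annotations

import subprocess
import sys
import tempfile
from pathlib import Path


class ToolWrapper:
    """Hypothetical wrapper giving a CLI tool a uniform Python interface."""

    def __init__(self, input_path: str, output_path: str,
                 properties: dict | None = None):
        self.input_path = input_path
        self.output_path = output_path
        self.properties = properties or {}

    def build_command(self) -> list[str]:
        # Stand-in "tool": copies input to output, optionally upper-casing it.
        script = (
            "import sys, pathlib; "
            "text = pathlib.Path(sys.argv[1]).read_text(); "
            "text = text.upper() if sys.argv[3] == '1' else text; "
            "pathlib.Path(sys.argv[2]).write_text(text)"
        )
        upper = "1" if self.properties.get("uppercase") else "0"
        return [sys.executable, "-c", script,
                self.input_path, self.output_path, upper]

    def launch(self) -> int:
        """Run the wrapped tool and return its exit code."""
        return subprocess.run(self.build_command(), check=False).returncode


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp) / "in.txt", Path(tmp) / "out.txt"
        src.write_text("atom")
        code = ToolWrapper(str(src), str(dst), {"uppercase": True}).launch()
        print(code, dst.read_text())
```

    Because every wrapper shares the same constructor and `launch` shape, steps can be chained by passing one step's output path as the next step's input path, whatever tool sits underneath.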

    BIGNASim: a NoSQL database structure and analysis portal for nucleic acids simulation data.

    Molecular dynamics simulation (MD) is, just behind genomics, the bioinformatics tool that generates the largest amounts of data and uses the largest amount of CPU time in supercomputing centres. MD trajectories are obtained after months of calculations, analysed in situ, and in practice forgotten. Several projects to generate stable trajectory databases have been developed for proteins, but no equivalent exists in the nucleic acids world. We present here a novel database system to store MD trajectories and analyses of nucleic acids. The initial data set available consists mainly of the benchmark of the new molecular dynamics force-field, parmBSC1. It contains 156 simulations, with over 120 μs of total simulation time. A deposition protocol is available to accept the submission of new trajectory data. The database is based on the combination of two NoSQL engines, Cassandra for storing trajectories and MongoDB to store analysis results and simulation metadata. The analyses available include backbone geometries, helical analysis, NMR observables and a variety of mechanical analyses. Individual trajectories and combined meta-trajectories can be downloaded from the portal. The system is accessible through http://mmb.irbbarcelona.org/BIGNASim/. Supplementary Material is also available on-line at http://mmb.irbbarcelona.org/BIGNASim/SuppMaterial/.
    Funding: Spanish Ministry of Science [BIO2012-32868, SEV-2011-00067, TIN2012-34557]; Catalan Government [2014-SGR-134, 2014-SGR-1051]; Institut Català de Recerca i Estudis Avançats, ICREA Academia [to M.O.]; Instituto de Salud Carlos III-Instituto Nacional de Bioinformática [PT13/0001/0019, PT13/0001/0028]; European Research Council [ERC SimDNA]; European Union, H2020 programme [Elixir-Excellerate: 676559; BioExcel: 674728, MuG: 676566]; PEDECIBA and SNI (ANII, Uruguay) [to P.D.D.]. Funding for open access charge: European Union [MuG: 676566].
    Peer reviewed. Postprint (published version).

    Parmbsc1: A refined force-field for DNA simulations

    We present parmbsc1, a force field for DNA atomistic simulation, which has been parameterized from high-level quantum mechanical data and tested for nearly 100 systems (representing a total simulation time of ∼140 μs) covering most of DNA structural space. Parmbsc1 provides high-quality results in diverse systems. Parameters and trajectories are available at http://mmb.irbbarcelona.org/ParmBSC1/

    BioExcel Building Blocks REST API (BioBB REST API), programmatic access to interoperable biomolecular simulation tools

    The BioExcel Building Blocks (BioBB) library offers a broad collection of wrappers on top of common biomolecular simulation and bioinformatics tools. The possibility to access the library remotely and programmatically increases its usability, allowing individual and sporadic executions and enabling remote workflows. BioBB REST API extends and complements the BioBB library, offering programmatic access to the collection of biomolecular simulation tools included in the BioExcel Building Blocks library. Molecular Dynamics setup, docking, structure modeling, free energy simulations, and flexibility analyses are examples of functionalities included in the endpoints collection. All functionalities are accessible through standard REST API calls, avoiding the need for tool installation. All the information related to the BioBB REST API endpoints is accessible from https://mmb.irbbarcelona.org/biobb-api/. Links to extended documentation, including the OpenAPI endpoints specification and examples, Read-The-Docs documentation and a complete workflow tutorial, can be found in Suppl. Table 1.
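    A "standard REST API call" of the kind described above can be sketched with the Python standard library alone. The base URL comes from the abstract. Everything else is an assumption made for illustration: the endpoint path, the payload keys and the helper name `build_tool_request` are hypothetical, and the real paths and parameters should be taken from the OpenAPI specification. The sketch only builds the request and does not send it.

```python
from __future__ import annotations

import json
from urllib.request import Request

# Base URL as given in the abstract.
BASE_URL = "https://mmb.irbbarcelona.org/biobb-api/"


def build_tool_request(endpoint_path: str, payload: dict) -> Request:
    """Build (but do not send) a JSON POST request for a tool endpoint.

    `endpoint_path` and the payload keys are hypothetical placeholders;
    consult the API's OpenAPI specification for the real ones.
    """
    url = BASE_URL + endpoint_path.lstrip("/")
    body = json.dumps(payload).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical endpoint path and parameters, for illustration only.
    request = build_tool_request("example/tool", {"input": "structure.pdb"})
    print(request.get_method(), request.full_url)
```

    Passing the resulting object to `urllib.request.urlopen` would perform the call. Keeping request construction separate from sending makes the client easy to inspect and test without network access.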

    Using interactive Jupyter Notebooks and BioConda for FAIR and reproducible biomolecular simulation workflows.

    Interactive Jupyter Notebooks in combination with Conda environments can be used to generate FAIR (Findable, Accessible, Interoperable and Reusable/Reproducible) biomolecular simulation workflows. The interactive programming code accompanied by documentation and the possibility to inspect intermediate results with versatile graphical charts and data visualization is very helpful, especially in iterative processes, where parameters might be adjusted to a particular system of interest. This work presents a collection of FAIR notebooks covering various areas of the biomolecular simulation field, such as molecular dynamics (MD), protein-ligand docking, molecular checking/modeling, molecular interactions, and free energy perturbations. Workflows can be launched with myBinder or easily installed in a local system. The collection of notebooks aims to provide a compilation of demonstration workflows, and it is continuously updated and expanded with examples using new methodologies and tools