
    MetaboCloud : A catalog of microservices hosted on a Cloud infrastructure and addressing issues linked to FAIR principles and open science

    Metabolomics, the study of small molecules called metabolites, generates massive and complex data that must be processed and interpreted. Doing so requires overcoming several challenges: data manipulation, where the heterogeneity of technologies makes it difficult to standardize methods and tools, and molecule annotation, which remains a major bottleneck. Bioinformatics therefore plays an important role in addressing these issues, but it also raises questions of its own, for example in terms of interoperability and reproducibility.

    In this context, new resources are needed. Consideration of the FAIR (Findable, Accessible, Interoperable, Reusable) principles and of open science led to the launch of MetaboCloud, a collaborative project between teams from the “Exploration of Metabolism Platform” (PFEM), a member of the MetaboHUB infrastructure, the Auvergne Bioinformatics (AuBi) platform, and the Mesocentre of Clermont Auvergne University (UCA). It aims to provide a set of bioinformatics tools, starting with metabolomics, in the form of microservices hosted on a Cloud infrastructure. It is also intended to serve as a proof of concept and to produce a recipe to share with the bioinformatics community, integrating best practices in code and deployment and thus ensuring high service quality and easy maintainability. The MetaboCloud microservices infrastructure is based on (1) existing bioinformatics tools, (2) API development from scratch where tools are not available, and (3) an advanced CI/CD work environment that manages the construction of a Docker image, (4) all brought together in an OpenStack cloud environment. A roadmap of about ten microservices has been drawn up for this project. Three of them are currently open to the community. Two have been developed in Java with the Spring Boot web framework. 
The first is based on the CDK tool and offers several functionalities that take structural information (InChI, MOL or SDF) as input. It can return chemical properties of a compound, such as masses, SMILES, InChIs, logP and the formula. It can convert the compound’s structural information into another format, such as InChI Key, InChI, MOL or SDF. Finally, it can depict a molecule, that is, return its image in PNG or SVG format. The second is based on the InChI tool and, using the same type of input as the CDK microservice, generates the InChI and the InChI Key of a compound. The third microservice, derived from the Goslin tool, has been implemented in Python with the Falcon web framework. It transforms a common lipid name into a standardized one. All microservice methods are described in the standardized OpenAPI format, which enables anyone to generate code to query them in a wide range of programming languages. Each microservice has its own Docker container.

These microservices can be used in different ways. They can be integrated into an application, used on their own, or combined with other microservices inside a script. Furthermore, web components, also developed within PFEM, are available as clients to query each of the microservices. Web components are a collection of functionalities that establish a standardized component model for the web, enabling the encapsulation and interoperability of individual HTML elements. These web components are available in an npm library.

Developing bioinformatics tools in the form of microservices therefore offers a number of advantages. In particular, it improves interoperability, since the microservices can be queried from any programming language. It also addresses issues of reproducibility, since the versioning of a microservice is controlled by its containerization. 
Moreover, a web portal, referenced in bio.tools, has been created to make all developed applications accessible together with their documentation and metadata, thus addressing the "Accessible" dimension of the FAIR principles.
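
To illustrate how such microservices might be combined inside a script, the following Python sketch prepares JSON requests for two of them without sending anything. It is only an illustration under stated assumptions: the host name, the route segments (`cdk/properties`, `goslin/normalize`), the use of POST and the payload field names are all hypothetical, since the abstract gives none of them; the actual paths and parameters would be taken from each service's OpenAPI description.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request

# ASSUMPTION: placeholder host; the real MetaboCloud URL is not given in the abstract.
BASE_URL = "https://metabocloud.example.org/"


def build_request(service: str, operation: str, payload: dict) -> Request:
    """Build (without sending) a JSON POST request for one microservice.

    The service and operation names are hypothetical route segments.
    """
    url = urljoin(BASE_URL, f"{service}/{operation}")
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    return Request(url, data=body, headers=headers, method="POST")


# Ethanol as a standard InChI string, addressed to a hypothetical CDK "properties" route.
cdk_request = build_request(
    "cdk", "properties", {"inchi": "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"}
)

# A common lipid name, addressed to a hypothetical Goslin "normalize" route.
goslin_request = build_request("goslin", "normalize", {"name": "PC(16:0/18:1)"})
```

Passing either object to `urllib.request.urlopen` would perform the call once a real base URL and the documented routes are substituted. Keeping request construction separate from sending makes the script easy to check offline.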