Running Experiments with Confidence and Sanity
Analyzing data from large experimental suites is a daily task for anyone doing experimental algorithmics. In this paper we report on several approaches we tried for this seemingly mundane task in a similarity search setting, reflecting on the challenges it poses. We conclude by proposing a workflow, implementable with several tools, that allows experimental data to be analyzed with confidence. The extended version of this paper and the supporting code are available at https://github.com/Cecca/running-experiments.
A rigorous model of reflex function indicates that position and force feedback are flexibly tuned to position and force tasks
This study aims to quantify the separate contributions of muscle force feedback, muscle spindle activity, and co-contraction to the performance of voluntary tasks ("reduce the influence of perturbations on maintained force or position"). Most human motion control studies either isolate only one contributor or assume that the relevant reflexive feedback pathways during voluntary disturbance rejection tasks originate mainly from the muscle spindle. Human ankle-control experiments were performed, using three task instructions and three perturbation characteristics to evoke a wide range of responses to force perturbations. During position tasks, subjects (n = 10) resisted the perturbations, becoming stiffer than when relaxed (i.e., the relax task). During force tasks, subjects were instructed to minimize force changes and actively gave way to imposed forces, thus becoming more compliant than during relax tasks. Subsequently, linear physiological models were fitted to the experimental data. Inhibitory as well as excitatory force feedback was needed to account for the full range of measured experimental behaviors. In conclusion, force feedback plays an important role in the studied motion control tasks (excitatory during position tasks and inhibitory during force tasks), implying that spindle-mediated feedback is not the only significant adaptive system contributing to the maintenance of posture or force.
Are Cloud Platforms Ready for Multi-cloud?
Part 2: Cloud Service and Platform Selection. Multi-cloud computing is gaining momentum as it offers various advantages, including vendor lock-in avoidance, better client proximity, and application performance improvement. Accordingly, various multi-cloud platforms have been developed, each with its own strengths and limitations. This paper aims to compare these platforms in order to identify the best one and to ease the selection of the right platform based on user requirements and preferences. Further, it identifies the current gaps in the platforms that must be covered to enable the full potential of multi-cloud computing. Finally, it outlines directions for further research.
Performance comparison: virtual machines and containers running artificial intelligence applications
With the continuous growth of data that can be valuable for companies and scientific research, cloud computing has emerged as a technology that can provide such applications with the right level of computing power and ubiquitous access to it. The base technology of cloud computing is virtualization, which has evolved to provide users with features from which they can benefit. There are different types of virtualization, and each has its own way of carrying out processes and managing computational resources. In this paper, we present a performance comparison between virtual machines and containers, specifically between an OpenStack instance and Docker and Singularity containers. The application used to measure performance is a real artificial intelligence application. We present the obtained results and discuss the findings.
Seamlessly Managing HPC Workloads Through Kubernetes
This paper describes an approach to integrating the job management of High Performance Computing (HPC) infrastructures into cloud architectures by managing HPC workloads seamlessly from the cloud job scheduler. The paper presents hpc-connector, an open-source tool designed to manage the full life cycle of jobs in the HPC infrastructure from the cloud job scheduler by interacting with the workload manager of the HPC system. The key point is that, by running hpc-connector in the cloud infrastructure, the execution of a job running in the HPC infrastructure can be reflected in the cloud infrastructure. If the user cancels the cloud job, hpc-connector catches the corresponding Operating System (OS) signal (for example, SIGINT) and cancels the job in the HPC infrastructure too. Furthermore, it can retrieve logs if requested. Therefore, by using hpc-connector, the cloud job scheduler can manage jobs in the HPC infrastructure without requiring any special privileges, as it needs no changes to the job scheduler. Finally, we report an experiment that trains a neural network for the automated segmentation of neuroblastoma tumours on the Prometheus supercomputer, using hpc-connector as a batch job launched from a Kubernetes infrastructure.

The work presented in this article has been partially funded by the regional government of the Comunitat Valenciana (Spain), co-funded by the European Union ERDF funds (European Regional Development Fund) of the Comunitat Valenciana 2014-2020, with reference IDIFEDER/2018/032 (High-Performance Algorithms for the Modeling, Simulation and early Detection of diseases in Personalized Medicine). The work is also co-funded by PRIMAGE (PRedictive In-silico Multiscale Analytics to support cancer personalised diaGnosis and prognosis, empowered by imaging biomarkers), a Horizon 2020 RIA project funded under the topic SC1-DTH-07-2018 by the European Commission, with grant agreement no. 826494.

López-Huguet, S.; Segrelles Quilis, J.D.; Kasztelnik, M.; Bubak, M.; Blanquer Espert, I. (2020). Seamlessly Managing HPC Workloads Through Kubernetes. Springer. 310-320. https://doi.org/10.1007/978-3-030-59851-8_20
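The signal-forwarding idea described above (catching SIGINT to cancel the mirrored HPC job) can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the job id, the use of Slurm's scancel command, and running it as a local call rather than over SSH are not details from the paper.

```python
import signal
import subprocess
import sys

# Illustrative sketch of the hpc-connector signal-forwarding idea: when the
# cloud scheduler cancels the wrapper job, the wrapper receives SIGINT or
# SIGTERM and cancels the corresponding job on the HPC workload manager.
# The job id and the use of Slurm's `scancel` are assumptions for this sketch.
HPC_JOB_ID = "123456"

def cancel_hpc_job(signum, frame):
    # In a real deployment this command would run on the HPC login node
    # (e.g. over SSH); it is shown as a local call here for brevity.
    subprocess.run(["scancel", HPC_JOB_ID], check=False)
    sys.exit(128 + signum)

# Register the handler for the signals a scheduler typically sends on cancel.
for sig in (signal.SIGINT, signal.SIGTERM):
    signal.signal(sig, cancel_hpc_job)
```

Because the wrapper only forwards signals and queries the workload manager with ordinary user commands, it needs no elevated privileges on either side, which is the property the paper emphasizes.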
A Cloud Architecture for the Execution of Medical Imaging Biomarkers
Digital medical imaging is increasingly being used in clinical routine and research. As a consequence, the workload in hospital medical imaging departments has multiplied by over 20 in the last decade. Medical image processing requires intensive computing resources that are not available at hospitals but could be provided by public clouds. The article analyses the requirements of processing digital medical images and introduces a cloud-based architecture centred on a DevOps approach for deploying resources on demand, adjusting them according to the requested resources and the expected execution time in order to deal with unplanned workloads. The results presented show low overhead and high flexibility when executing a lung-disease biomarker on a public cloud.

The work in this article has been co-funded by project SME Instrument Phase II - 778064, QUIBIM Precision, funded by the European Commission under the INDUSTRIAL LEADERSHIP - Leadership in enabling and industrial technologies - Information and Communication Technologies (ICT), Horizon 2020; and by project ATMOSPHERE, funded jointly by the European Commission under the Cooperation Programme, Horizon 2020 grant agreement No 777154, and the Brazilian Ministerio de Ciencia, Tecnologia e Inovacao (MCTI), number 51119. The authors would also like to thank the Spanish Ministerio de Economia, Industria y Competitividad for the project BigCLOE with reference number TIN2016-79951-R.

López-Huguet, S.; García-Castro, F.; Alberich-Bayarri, A.; Blanquer Espert, I. (2019). A Cloud Architecture for the Execution of Medical Imaging Biomarkers. Springer. 130-144. https://doi.org/10.1007/978-3-030-22744-9_10
Aligning Protein-Coding Nucleotide Sequences with MACSE
Most genomic and evolutionary comparative analyses rely on accurate multiple sequence alignments. With their underlying codon structure, protein-coding nucleotide sequences pose a specific challenge for multiple sequence alignment. Multiple Alignment of Coding Sequences (MACSE) is a multiple sequence alignment program that provided the first automatic solution for aligning protein-coding gene datasets containing both functional and nonfunctional sequences (pseudogenes). Its unique features allow reliable codon alignments to be built in the presence of frameshifts and stop codons, suitable for subsequent analyses of selection based on the ratio of nonsynonymous to synonymous substitutions. Here we offer a practical overview of and guidelines for the use of MACSE v2. This major update of the initial algorithm comes with a graphical interface providing user-friendly access to the different subprograms for handling multiple alignments of protein-coding sequences. We also present new pipelines based on MACSE v2 subprograms that handle large datasets and are distributed as Singularity containers. MACSE and the associated pipelines are available at https://bioweb.supagro.inra.fr/macse/.
Scalable workflows and reproducible data analysis for genomics
Biological, clinical, and pharmacological research now often involves analyses of genomes, transcriptomes, proteomes, and interactomes, within and between individuals and across species. Due to the large volumes involved, the analysis and integration of data generated by such high-throughput technologies have become computationally intensive, and analysis can no longer happen on a typical desktop computer. In this chapter we show how to describe and execute the same analysis using a number of workflow systems and how these follow different approaches to tackle execution and reproducibility issues. We show how any researcher can create a reusable and reproducible bioinformatics pipeline that can be deployed and run anywhere. We show how to create a scalable, reusable, and shareable workflow using four different workflow engines: the Common Workflow Language (CWL), the Guix Workflow Language (GWL), Snakemake, and Nextflow, each of which can run tasks in parallel. We show how to bundle a number of tools used in evolutionary biology using the Debian, GNU Guix, and Bioconda software distributions, along with container systems such as Docker, GNU Guix, and Singularity. Together these distributions cover the overall majority of software packages relevant for biology, including PAML, Muscle, MAFFT, MrBayes, and BLAST. By bundling software in lightweight containers, these tools can be deployed on a desktop, in the cloud, and, increasingly, on compute clusters. By bundling software through these public software distributions, and by creating reproducible and shareable pipelines with these workflow engines, not only do bioinformaticians spend less time reinventing the wheel, but we also get closer to the ideal of making science reproducible. The examples in this chapter allow a quick comparison of the different solutions.