
    A service-oriented architecture for scientific computing on cloud infrastructures

    This paper describes a service-oriented architecture that eases the deployment and execution of scientific applications in IaaS Clouds, with a focus on High Throughput Computing applications. The system integrates (i) a catalogue and repository of Virtual Machine Images, (ii) an application deployment and configuration tool, and (iii) a meta-scheduler for job execution management and monitoring. The developed system significantly reduces the time required to port a scientific application to these computational environments. This is exemplified by a case study with a computationally intensive protein design application on both a private Cloud and a hybrid three-level infrastructure (Grid, private and public Cloud).
    The authors wish to thank the Generalitat Valenciana for the financial support received for the project GV/2012/076 and the Ministerio de Economía y Competitividad for the project CodeCloud (TIN2010-17804).
    Moltó, G.; Calatrava Arroyo, A.; Hernández García, V. (2013). A service-oriented architecture for scientific computing on cloud infrastructures. In High Performance Computing for Computational Science - VECPAR 2012. Springer Verlag (Germany). 163-176. doi:10.1007/978-3-642-38718-0_18
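The abstract above names an image catalogue and a deployment tool as separate components. As a rough illustration of how the two might cooperate, the following minimal sketch picks the catalogued image that already covers most of an application's requirements and reports what a deployment tool would still have to install. Everything here (`VMImage`, `plan_deployment`, the package names) is hypothetical and is not taken from the paper's actual system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VMImage:
    """Hypothetical catalogue entry: an image name, its OS, and preinstalled packages."""
    name: str
    os: str
    packages: frozenset


def plan_deployment(catalogue, os, required_packages):
    """Pick the image on the requested OS that covers the most required packages.

    Returns (image, missing_packages), or None if no image matches the OS.
    This selection rule is an assumption made for illustration only.
    """
    needed = frozenset(required_packages)
    candidates = [img for img in catalogue if img.os == os]
    if not candidates:
        return None
    # max() keeps the first candidate on ties, so catalogue order breaks ties.
    best = max(candidates, key=lambda img: len(needed & img.packages))
    # Whatever the image lacks is left for the deployment tool to install.
    return best, sorted(needed - best.packages)
```

Under these assumptions, a richer preconfigured image shortens the list of packages left to install at deployment time, which is one plausible way a catalogue could reduce porting effort.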

    A platform to deploy customized scientific virtual infrastructures on the cloud

    This paper presents a software platform to dynamically deploy complex scientific virtual computing infrastructures on top of Infrastructure as a Service (IaaS) Clouds. The platform orchestrates different services to provision the virtual computing resources. It dynamically installs the appropriate software to satisfy the requirements of a researcher, both on public and on-premise Clouds. The platform provides a web interface that lets users easily manage the lifecycle of virtual infrastructures. It enables users to define infrastructures, share them with other users, deploy and relinquish them, add or remove resources dynamically, create and share application recipes, etc. The paper also describes three case studies that deploy complex infrastructures, namely a Hadoop cluster, a single node to perform NGS sequencing, and a gateway for users to access the European Grid Infrastructure (EGI). This platform promotes a better use of the on-premise hardware resources of a research center by allocating the computing resources just in time for the specific lifetime of the virtual infrastructures, and it also supports the deployment of the very same infrastructures on a public Cloud.
    The authors would like to thank the Spanish "Ministerio de Economia y Competitividad" for the project "Clusters Virtuales Elasticos y Migrables sobre Infraestructuras Cloud Hibridas" with reference TIN2013-44390-R.
    Caballer Fernández, M.; Segrelles Quilis, JD.; Moltó, G.; Blanquer Espert, I. (2015). A platform to deploy customized scientific virtual infrastructures on the cloud. Concurrency and Computation: Practice and Experience. 27(16):4318-4329. https://doi.org/10.1002/cpe.3518

    Grid Infrastructure for Domain Decomposition Methods in Computational ElectroMagnetics

    The accurate and efficient solution of Maxwell's equations is the problem addressed by the scientific discipline called Computational ElectroMagnetics (CEM). Many macroscopic phenomena in a great number of fields are governed by this set of differential equations: electronics, geophysics, medical and biomedical technologies, and virtual EM prototyping, besides the traditional antenna and propagation applications. Therefore, many efforts are focused on the development of new and more efficient approaches to solving Maxwell's equations. Interest in CEM applications is growing. Several problems that were hard to solve a few years ago can now be easily addressed thanks to the reliability and flexibility of new technologies, together with increased computational power. This technological evolution opens the possibility of addressing large and complex tasks. Many of these applications aim to simulate electromagnetic behavior, for example in terms of input impedance and radiation pattern in antenna problems, or Radar Cross Section in scattering applications. Problems whose solution requires high accuracy, by contrast, need full-wave analysis techniques; one example is virtual prototyping, where the objective is to obtain reliable simulations in order to minimize the number of measurements and, as a consequence, their cost. Other tasks require the analysis of complete structures (which include a high number of details) by directly simulating a CAD model. This approach relieves the researcher of the burden of removing useless details, while maintaining the original complexity and taking all details into account. Unfortunately, this reduction in manual effort implies: (a) high computational effort, due to the increased number of degrees of freedom, and (b) worsening of the spectral properties of the linear system during complex analysis.
    These considerations underline the need to identify appropriate information technologies that ease the achievement of a solution and speed up the required computations. The authors' analysis and expertise suggest that Grid Computing techniques can be very useful for these purposes. Grids appear mainly in high performance computing environments. In this context, hundreds of off-the-shelf nodes are linked together and work in parallel to solve problems that previously could be addressed only sequentially or by using supercomputers. Grid Computing is a technique developed to process enormous amounts of data, and it enables large-scale resource sharing to solve problems by exploiting distributed scenarios. The main advantage of the Grid comes from parallel computing: if a problem can be split into smaller tasks that can be executed independently, its solution can be computed considerably faster. To exploit this advantage, it is necessary to identify a technique able to split the original electromagnetic task into a set of smaller subproblems. The Domain Decomposition (DD) technique, based on the block generation algorithm introduced in Matekovits et al. (2007) and Francavilla et al. (2011), perfectly addresses our requirements (see Section 3.4 for details). In this chapter, a Grid Computing infrastructure is presented. This architecture allows parallel block execution by distributing tasks to nodes that belong to the Grid. The set of nodes is composed of physical machines and virtualized ones. This feature enables great flexibility and increases the available computational power. Furthermore, the presence of virtual nodes allows full and efficient Grid usage, since the presented architecture can be used by different users who run different applications
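The idea of splitting a task into independent blocks and running them in parallel can be sketched in a few lines. The example below is only a toy illustration: it partitions a list into consecutive blocks and uses a local thread pool in place of Grid nodes. It does not implement the block generation algorithm cited in the abstract, and `solve_block` is a hypothetical placeholder for the real per-block electromagnetic computation.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(items, block_size):
    """Partition a list into consecutive blocks of at most block_size items."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [items[i:i + block_size] for i in range(0, len(items), block_size)]


def solve_block(block):
    """Hypothetical per-block work: a sum of squares stands in for a real solver."""
    return sum(x * x for x in block)


def solve_in_parallel(items, block_size, max_workers=4):
    """Run solve_block on every block concurrently and return (blocks, results)."""
    blocks = split_into_blocks(items, block_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the input blocks.
        results = list(pool.map(solve_block, blocks))
    return blocks, results
```

Because the blocks share no state, the worker pool could in principle be replaced by remote nodes without changing the splitting logic, which is the property the abstract relies on.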

    High-Performance Cloud Computing: A View of Scientific Applications

    Scientific computing often requires the availability of a massive number of computers for performing large-scale experiments. Traditionally, these needs have been addressed by using high-performance computing solutions and installed facilities such as clusters and supercomputers, which are difficult to set up, maintain, and operate. Cloud computing provides scientists with a completely new model of utilizing the computing infrastructure. Compute resources, storage resources, and applications can be dynamically provisioned (and integrated within the existing infrastructure) on a pay-per-use basis. These resources can be released when they are no longer needed. Such services are often offered within the context of a Service Level Agreement (SLA), which ensures the desired Quality of Service (QoS). Aneka, an enterprise Cloud computing solution, harnesses the power of compute resources by relying on private and public Clouds and delivers the desired QoS to users. Its flexible, service-based infrastructure supports multiple programming paradigms, which allow Aneka to address a variety of scenarios, from finance applications to computational science. As examples of scientific computing in the Cloud, we present a preliminary case study on using Aneka for the classification of gene expression data and the execution of an fMRI brain imaging workflow.
    Comment: 13 pages, 9 figures, conference paper

    Enhancing Job Scheduling of an Atmospheric Intensive Data Application

    Nowadays, e-Science applications involve a great deal of data in order to enable more accurate analysis. One such application domain is Radio Occultation, which processes satellite data. Grid Processing Management is a geographically distributed physical infrastructure based on Grid Computing, implemented for the overall processing of Radio Occultation analysis. After a brief description of the algorithms adopted to characterize atmospheric profiles, the paper presents an improvement of job scheduling in order to decrease processing time and optimize resource utilization. The grid computing capacity is extended by adding virtual machines to the existing physical Grid in order to satisfy temporary job requests. Scheduling also plays an important role in the infrastructure; it is handled by a pair of schedulers developed to manage data automatically
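Extending a physical Grid with virtual machines to absorb temporary job requests can be illustrated with a simple sizing rule. The sketch below is an assumption made for illustration, not the scheduling policy from the paper: it starts just enough virtual machines to cover the jobs that do not fit in the free physical slots, up to a fixed cap. The function name and all parameters are hypothetical.

```python
import math


def extra_vms_needed(pending_jobs, free_physical_slots, slots_per_vm, max_vms):
    """Return how many VMs to start so that pending jobs fit, capped at max_vms.

    Assumes one job per slot and identical VMs; both are simplifications.
    """
    if slots_per_vm <= 0:
        raise ValueError("slots_per_vm must be positive")
    # Jobs that cannot be placed on the physical nodes that are currently free.
    overflow = max(0, pending_jobs - free_physical_slots)
    return min(max_vms, math.ceil(overflow / slots_per_vm))
```

A real scheduler would also weigh VM start-up time and expected job duration before scaling out. This sketch ignores both.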