
    Meta-brokering solution for establishing Grid Interoperability


    Distributed Computing in a Pandemic

    The current COVID-19 global pandemic, caused by the SARS-CoV-2 betacoronavirus, has resulted in over a million deaths and is having a grave socio-economic impact, so there is urgency in finding solutions to key research challenges. Much of this COVID-19 research depends on distributed computing. In this article, I review distributed architectures -- various types of clusters, grids and clouds -- that can be leveraged to perform these tasks at scale and at high throughput, with a high degree of parallelism, and that can also be used for collaborative work. High-performance computing (HPC) clusters will carry out much of this work. Several big-data processing tasks used in reducing the spread of SARS-CoV-2 require high-throughput approaches and a variety of tools, which Hadoop and Spark offer even on commodity hardware. Extremely large-scale COVID-19 research has also utilised some of the world's fastest supercomputers, such as IBM's SUMMIT -- for ensemble-docking high-throughput screening against SARS-CoV-2 targets for drug repurposing, and for high-throughput gene analysis -- and Sentinel, an XPE-Cray based system used to explore natural products. Grid computing has facilitated the formation of the world's first exascale grid computer, which has accelerated COVID-19 research in molecular-dynamics simulations of SARS-CoV-2 spike-protein interactions through massively parallel computation across over 1 million volunteer computing devices on the Folding@home platform. Grids and clouds can both also be used for international collaboration by enabling access to important datasets and providing services that allow researchers to focus on research rather than on time-consuming data-management tasks.
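    As a concrete illustration of the high-throughput, embarrassingly parallel style of analysis that Hadoop and Spark enable on commodity hardware, here is a minimal PySpark sketch that selects the best-scoring candidates from a virtual-screening result set. The input path, the record layout, and the idea of ranking compounds by docking score are illustrative assumptions, not details from any of the studies cited above.

```python
# Minimal PySpark sketch of a high-throughput, embarrassingly parallel
# aggregation. The HDFS path and record layout (one pose per line:
# "compound_id,docking_score") are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ht-screening-sketch").getOrCreate()
sc = spark.sparkContext

# Each line is parsed independently, so the work distributes evenly
# across however many commodity nodes the cluster provides.
records = sc.textFile("hdfs:///screening/scores.csv")

best = (
    records
    .map(lambda line: line.split(","))
    .map(lambda f: (f[0], float(f[1])))      # (compound_id, score)
    .reduceByKey(min)                        # best (lowest) score per compound
    .takeOrdered(10, key=lambda kv: kv[1])   # top-10 candidates overall
)

for compound, score in best:
    print(compound, score)

spark.stop()
```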

    A Chemistry-Inspired Workflow Management System for a Decentralized Composite Service Execution

    With the recent widespread adoption of service-oriented architecture, the dynamic composition of services has become a crucial issue in distributed computing. The coordination and execution of composite Web services are today typically conducted by heavyweight centralized workflow engines, leading to an increasing probability of processing and communication bottlenecks and failures. In addition, centralization induces higher deployment costs, such as the computing infrastructure needed to support the workflow engine, which is not affordable for many small businesses and end-users. Last but not least, central workflow engines have undesirable consequences for privacy and energy consumption. In a world where platforms are increasingly dynamic and elastic, as promised by cloud computing, decentralized and dynamic interaction schemes are required. Addressing the characteristics of such platforms, nature-inspired analogies have recently regained attention as a means of providing autonomous service coordination on top of dynamic large-scale platforms. In this report, we propose a decentralized approach for the execution of composite Web services based on an unconventional programming paradigm that relies on the chemical metaphor. It provides a high-level execution model that allows composite services to be executed in a fully decentralized manner. In our architecture, services communicate through a persistent shared space containing the control and data flows between them, which allows the composition to be distributed among nodes without any centralized coordination. A proof of concept is given through the deployment of a software prototype implementing these concepts, showing the viability of an autonomic vision of service composition.
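    To make the chemical metaphor concrete, the following toy Python sketch models the shared space as a multiset of (tag, value) "molecules" and the composition as reaction rules that each consume one molecule, invoke a service, and release the result as a new molecule; any node holding the matching molecules could fire a rule, so no central engine is needed. The molecule tags and the stand-in service functions are hypothetical, and the single-process multiset stands in for the distributed persistent space the report describes.

```python
from collections import Counter

# The shared space is a multiset of (tag, value) "molecules"; in the
# report's architecture this space is persistent and shared by all nodes.
space = Counter({("input", 21): 1})

# A rule is a reaction: it consumes a molecule carrying its input tag,
# applies a service to the value, and releases a molecule carrying the
# result, so control flow travels with the data itself.
rules = [
    ("input",   "doubled", lambda v: v * 2),  # stand-in "doubling" service
    ("doubled", "output",  lambda v: v + 1),  # stand-in "increment" service
]

changed = True
while changed:                    # react until the solution is inert
    changed = False
    for consumed, produced, service in rules:
        for tag, value in list(space):
            if tag == consumed and space[(tag, value)] > 0:
                space[(tag, value)] -= 1                # molecule consumed
                space[(produced, service(value))] += 1  # product released
                changed = True

print([m for m, n in space.items() if n > 0])  # [('output', 43)]
```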

    Virtual Cluster Management for Analysis of Geographically Distributed and Immovable Data

    Thesis (Ph.D.) - Indiana University, Informatics and Computing, 2015

    Scenarios exist in the era of Big Data where computational analysis needs to utilize widely distributed and remote compute clusters, especially when the data sources are sensitive or extremely large and thus cannot be moved. A large dataset in Malaysia could be ecologically sensitive, for instance, and unable to be moved outside the country's boundaries. Controlling an analysis experiment in this virtual cluster setting can be difficult on multiple levels: with setup and control, with managing the behavior of the virtual cluster, and with interoperability issues across the compute clusters. Further, datasets can be distributed among clusters, or even across data centers, so it becomes critical to utilize data-locality information to optimize the performance of data-intensive jobs. Finally, datasets are increasingly sensitive and tied to certain administrative boundaries, though once the data has been processed, the aggregated or statistical results can be shared across those boundaries. This dissertation addresses the management and control of a widely distributed virtual cluster holding sensitive or otherwise immovable datasets through a controller. The Virtual Cluster Controller (VCC) gives control back to the researcher. It creates virtual clusters across multiple cloud platforms and, in recognition of sensitive data, can establish a single network overlay over widely distributed clusters. We define a novel class of data, immovable data that we call "pinned data", which is treated as a first-class citizen instead of being moved to where it is needed. We draw from our earlier work on a hierarchical data processing model, Hierarchical MapReduce (HMR), to process geographically distributed data, some of which is pinned data. Applications implemented in HMR use an extended MapReduce model in which computations are expressed as three functions: Map, Reduce, and GlobalReduce. Further, by facilitating information sharing among resources, applications, and data, overall performance is improved. Experimental results show that the overhead of VCC is minimal and that HMR outperforms the traditional MapReduce model for a particular class of applications. The evaluations also show that information sharing between resources and applications through the VCC shortens the hierarchical data processing time while satisfying the constraints on the pinned data.
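    A minimal, single-process sketch of the Map/Reduce/GlobalReduce pattern may help: each "cluster" runs an ordinary local MapReduce over its own pinned records, and only the small aggregated partials cross administrative boundaries to a GlobalReduce step. The word-count workload and the in-memory cluster lists are assumptions for illustration, not the dissertation's actual HMR implementation.

```python
from collections import defaultdict

def local_mapreduce(records, map_fn, reduce_fn):
    """Ordinary MapReduce confined to one cluster's pinned data."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    return {k: reduce_fn(k, vs) for k, vs in groups.items()}

map_fn = lambda line: [(word, 1) for word in line.split()]
reduce_fn = lambda key, values: sum(values)

# Sensitive datasets stay inside their own cluster (e.g. one per country).
clusters = [
    ["grid cloud grid", "cloud"],    # cluster A's local (pinned) data
    ["grid hpc", "hpc hpc cloud"],   # cluster B's local (pinned) data
]

# Phase 1 (Map + Reduce): each cluster reduces locally; raw records never move.
partials = [local_mapreduce(data, map_fn, reduce_fn) for data in clusters]

# Phase 2 (GlobalReduce): merge only the aggregated, shareable results.
def global_reduce(partials):
    total = defaultdict(int)
    for partial in partials:
        for key, count in partial.items():
            total[key] += count
    return dict(total)

print(global_reduce(partials))  # {'grid': 3, 'cloud': 3, 'hpc': 3}
```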