14 research outputs found

    January 1 - December 31, 2012

    This report summarizes training, education, and outreach activities for calendar year 2012 of PTI and affiliated organizations, including the School of Informatics and Computing, the Office of the Vice President for Information Technology, and the Maurer School of Law. Reported activities include those led by PTI Research Centers (Center for Applied Cybersecurity Research, Center for Research in Extreme Scale Technologies, Data to Insight Center, Digital Science Center) and Service and Cyberinfrastructure Centers (Research Technologies Division of University Information Technology Services, National Center for Genome Assembly Support

    Virtual Cluster Management for Analysis of Geographically Distributed and Immovable Data

    Thesis (Ph.D.) - Indiana University, Informatics and Computing, 2015.
    Scenarios exist in the era of Big Data where computational analysis needs to utilize widely distributed and remote compute clusters, especially when the data sources are sensitive or extremely large and thus cannot be moved. A large dataset in Malaysia could be ecologically sensitive, for instance, and unable to be moved outside the country's borders. Controlling an analysis experiment in this virtual cluster setting can be difficult on multiple levels: setup and control, managing the behavior of the virtual cluster, and interoperability across the compute clusters. Further, datasets can be distributed among clusters, or even across data centers, so it becomes critical to utilize data locality information to optimize the performance of data-intensive jobs. Finally, datasets are increasingly sensitive and tied to certain administrative boundaries, though once the data has been processed, the aggregated or statistical results can be shared across those boundaries. This dissertation addresses the management and control of a widely distributed virtual cluster holding sensitive or otherwise immovable data sets through a controller. The Virtual Cluster Controller (VCC) gives control back to the researcher. It creates virtual clusters across multiple cloud platforms and, in recognition of sensitive data, can establish a single network overlay over widely distributed clusters. We define a novel class of data, immovable data that we call "pinned data", where the data is treated as a first-class citizen instead of being moved to where it is needed. We draw from our earlier work on a hierarchical data processing model, Hierarchical MapReduce (HMR), to process geographically distributed data, some of which is pinned data. Applications implemented in HMR use an extended MapReduce model in which computations are expressed as three functions: Map, Reduce, and GlobalReduce. Further, by facilitating information sharing among resources, applications, and data, overall performance is improved. Experimental results show that the overhead of VCC is minimal, and that HMR outperforms the traditional MapReduce model for a particular class of applications. The evaluations also show that information sharing between resources and applications through the VCC shortens the hierarchical data processing time while satisfying the constraints on the pinned data.
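
    The three-function model is easy to see in miniature. The sketch below is a minimal, hypothetical illustration of the Map/Reduce/GlobalReduce pattern described above: each cluster runs a local MapReduce over its pinned partition, and only the aggregated local results cross administrative boundaries to the GlobalReduce. The function names are invented for illustration and are not HMR's actual API.

        # Minimal sketch of the hierarchical Map/Reduce/GlobalReduce pattern.
        # Names are illustrative; this is not the HMR API.
        from collections import defaultdict

        def map_fn(record):
            # Emit (key, value) pairs from one input record.
            for word in record.split():
                yield word, 1

        def reduce_fn(key, values):
            # Local reduce: runs inside the cluster that holds the pinned data.
            return key, sum(values)

        def global_reduce_fn(key, values):
            # Global reduce: combines per-cluster aggregates; only these
            # aggregates, not the raw (possibly sensitive) data, are moved.
            return key, sum(values)

        def local_mapreduce(records):
            groups = defaultdict(list)
            for record in records:
                for key, value in map_fn(record):
                    groups[key].append(value)
            return [reduce_fn(k, vs) for k, vs in groups.items()]

        def hierarchical_mapreduce(clusters):
            # Each entry in `clusters` is the data pinned to one site.
            groups = defaultdict(list)
            for records in clusters:
                for key, value in local_mapreduce(records):  # stays on-site
                    groups[key].append(value)                # only aggregates cross sites
            return dict(global_reduce_fn(k, vs) for k, vs in groups.items())

        print(hierarchical_mapreduce([["a b a"], ["b b c"]]))
        # {'a': 2, 'b': 3, 'c': 1}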

    Contributions to Desktop Grid Computing: From High Throughput Computing to Data-Intensive Sciences on Hybrid Distributed Computing Infrastructures

    Since the mid-1990s, Desktop Grid Computing - i.e., the idea of using a large number of remote PCs distributed across the Internet to execute large parallel applications - has proved to be an efficient paradigm for providing large computational power at a fraction of the cost of a dedicated computing infrastructure. This document presents my contributions over the last decade to broadening the scope of Desktop Grid Computing. My research has followed three directions. The first established new methods to observe and characterize Desktop Grid resources and developed experimental platforms to test and validate our approach under conditions close to reality. The second focused on integrating Desktop Grids into e-science Grid infrastructures (e.g., EGI), which requires addressing many challenges such as security, scheduling, and quality of service. The third investigated how to support large-scale data management and data-intensive applications on such infrastructures, including support for new and emerging data-oriented programming models. This manuscript reports not only on the scientific achievements and the technologies developed to support these objectives, but also on the international collaborations and projects I have been involved in, as well as the scientific mentoring that motivates my candidature for the Habilitation à Diriger des Recherches.

    XSEDE: eXtreme Science and Engineering Discovery Environment Third Quarter 2012 Report

    The Extreme Science and Engineering Discovery Environment (XSEDE) is the most advanced, powerful, and robust collection of integrated digital resources and services in the world. It is an integrated cyberinfrastructure ecosystem with singular interfaces for allocations, support, and other key services that researchers can use to interactively share computing resources, data, and expertise. This is a report of project activities and highlights from the third quarter of 2012. National Science Foundation, OCI-105357

    Congestion Control Mechanism for Sensor-Cloud Infrastructure

    This thesis develops a sensor-cloud system that integrates WBANs (wireless body area networks) with cloud computing to enable real-time sensor data collection, storage, processing, sharing, and management. As the main contribution of this study, a congestion detection and control protocol is proposed to ensure that acceptable data flows are maintained during the network lifetime.
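
    The abstract does not detail the protocol itself. As a point of reference, the sketch below shows one common shape such a mechanism can take: a gateway that detects congestion from buffer occupancy and applies AIMD-style rate adjustment. The class, names, and thresholds are all hypothetical and are not the protocol proposed in the thesis.

        # Generic congestion-detection/control loop for a sensor gateway.
        # Hypothetical sketch; not the thesis's protocol.

        class GatewayRateController:
            def __init__(self, capacity=100, high_water=0.8, low_water=0.3):
                self.capacity = capacity      # buffer size in packets
                self.high_water = high_water  # occupancy ratio signalling congestion
                self.low_water = low_water    # occupancy ratio considered safe
                self.rate = 50.0              # current sending rate, packets/s

            def congested(self, queue_len):
                return queue_len / self.capacity >= self.high_water

            def update(self, queue_len):
                # AIMD: halve the rate under congestion, probe upward otherwise.
                if self.congested(queue_len):
                    self.rate = max(1.0, self.rate / 2)   # multiplicative decrease
                elif queue_len / self.capacity <= self.low_water:
                    self.rate += 1.0                      # additive increase
                return self.rate

        ctrl = GatewayRateController()
        for q in [10, 20, 85, 90, 40, 15]:    # sampled queue lengths
            print(f"queue={q:3d} -> rate={ctrl.update(q):.1f} pkt/s")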

    Exploring Programming-Level Elasticity in the Development and Execution of Scientific Applications

    Advisor: Prof. Dr. Luis Carlos Erpen de Bona. Thesis (Ph.D.) - Universidade Federal do Paraná, Setor de Ciências Exatas, Graduate Program in Informatics. Defended: Curitiba, 28/04/2014. Includes references.
    Abstract: Elasticity is defined as the ability to adaptively scale resources up and down in order to meet varying application demands. Although several mechanisms to provide this feature are offered by public cloud providers and in some academic works, these solutions have limitations in providing elasticity for scientific applications, since they were not developed for this purpose and cannot account for the particularities of this class of applications. This thesis proposes an approach for exploring elasticity in scientific applications in which the elasticity control is embedded within the application code and the elasticity actions (allocation and deallocation of resources) are performed by the application itself, based on its runtime requirements or internal events. The development of embedded elasticity controllers rests on the concept of elasticity primitives: basic functions that allow an application to issue allocation and deallocation requests directly to the cloud, as well as to collect information about the virtual environment and the cloud. This makes it possible to develop tailor-made elasticity controllers that let applications adjust their own resources according to their demands or to satisfy specific criteria such as cost or performance, and to build elasticity-aware parallel processing middleware that supports application elasticity transparently. To enable the construction of elastic applications with this approach, we developed the Cloudine framework. Cloudine provides the primitive set and a runtime environment that supports the execution of elastic applications in the cloud. The approach is validated by a set of experiments using Cloudine. The framework was successfully used to provide elasticity to a number of applications, among which we highlight a genome assembler (SAND) and a climate model (OLAM). Cloudine is also used to extend GCC's OpenMP library (libgomp) to provide automatic and transparent allocation of resources.
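
    The elasticity-primitive idea is concrete enough to sketch. Below is a minimal, hypothetical example of an embedded elasticity controller in the style the abstract describes: the application itself calls primitives to allocate or release resources based on an internal signal (here, its work backlog). The primitive names and the cloud client are invented for illustration; they are not Cloudine's actual API.

        # Hypothetical elasticity primitives and an embedded controller.
        # Names are illustrative; this is not Cloudine's API.

        class CloudClient:
            """Stand-in for a cloud provider's API used by the primitives."""
            def __init__(self):
                self.nodes = 1

            def allocate(self, n):      # primitive: request n more workers
                self.nodes += n

            def deallocate(self, n):    # primitive: release n workers
                self.nodes = max(1, self.nodes - n)

            def info(self):             # primitive: observe the virtual environment
                return {"nodes": self.nodes}

        def run_elastic(tasks, cloud, high=20, low=5):
            """The application controls its own elasticity: it scales out
            when its backlog grows and scales in when the backlog drains."""
            pending = list(tasks)
            while pending:
                backlog = len(pending)
                if backlog > high:
                    cloud.allocate(1)                              # scale out
                elif backlog < low and cloud.info()["nodes"] > 1:
                    cloud.deallocate(1)                            # scale in
                # consume one "round" of work: each node processes one task
                done = min(cloud.info()["nodes"], backlog)
                del pending[:done]
                print(f"backlog={backlog:3d} nodes={cloud.info()['nodes']}")

        run_elastic(range(60), CloudClient())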

    17th SC@RUG 2020 proceedings 2019-2020
