Search CORE

10 research outputs found

Coscheduling techniques and monitoring tools for non-dedicated cluster computing

Author: Giné de Solà Francesc
Hernández Budé Porfidio
Luque Fadón Emilio
Solsona Theás Francesc
Publication venue
Publication date: 15/03/2004
Field of study

Our efforts are directed towards the understanding of the coscheduling mechanism in a NOW system when a parallel job is executed jointly with local workloads, balancing parallel perfor-mance against the local interactive response. Explicit and implicit coscheduling techniques in a PVM-Linux NOW (or cluster) have been implemented. Furthermore, dynamic coscheduling remains an open question when parallel jobs are executed in a non-dedicated Cluster. A basis model for dynamic coscheduling in Cluster systems is presented in this paper. Also, one dynamic coscheduling algorithm for this model is proposed. The applicability of this algorithm has been proved and its performance ana-lyzed by simulation. Finally, a new tool (named Monito) for monitoring the different queues of messages in such an environments is presented. The main aim of implementing this facility is to provide a mean of capturing the bottlenecks and overheads of the communication system in a PVM-Linux cluster.Facultad de Informátic

Servicio de Difusión de la Creación Intelectual

Coscheduling techniques and monitoring tools for non-dedicated cluster computing

Author: Giné de Solà Francesc
Hernández Budé Porfidio
Luque Fadón Emilio
Solsona Theás Francesc
Publication venue
Publication date: 15/03/2004
Field of study

Servicio de Difusión de la Creación Intelectual

Coscheduling under Memory Constraints in a NOW Environment

Author: D. Burger
D.G. Feitelson
D.G. Feitelson
F. Solsona
K.Y. Wang
W. Leinberger
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Crossref

Planificación de aplicaciones best-effort y soft real-time en NOWs

Author: García Gutiérrez José Ramón
Hernández Budé Porfidio
Universitat Autònoma de Barcelona. Departament d'Arquitectura de Computadors i Sistemes Operatius
Universitat Autònoma de Barcelona. Escola d'Enginyeria
Publication venue
Publication date: 01/01/2007
Field of study

La aparición de nuevos tipos de aplicaciones, como vídeo bajo demanda, realidad virtual y videoconferencias entre otras, caracterizadas por la necesidad de cumplir sus deadlines. Este tipo de aplicaciones, han sido denominadas en la literatura aplicaciones soft-real time (SRT) periódicas. Este trabajo se centra en el problema de la planificación temporal de este nuevo tipo de aplicaciones en clusters no dedicados.L'aparició de nous tipus d'aplicacions, com vídeo sota demanda, realitat virtual i videoconferències entre unes altres, caracteritzades per la necessitat de complir les seves deadlines. Aquest tipus d'aplicacions, han estat denominades en la literatura aplicacions soft-real time (SRT) periòdiques. Aquest treball es centra en el problema de la planificació temporal d'aquest nou tipus d'aplicacions en clusters no dedicats

Diposit Digital de Documents de la UAB

G-LOMARC-TS: Lookahead group matchmaking for time/space sharing on multi-core parallel machines

Author: Zeng Xijie
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2009
Field of study

Parallel machines with multi-core nodes are becoming increasingly popular. The performances of applications running on these machines are improved gradually due to the resource competition in each node. Researches have found that coscheduling different applications with complementary resource characteristics on the same set of nodes (semi time sharing) may improve the performance. We propose a scheduling algorithm G-LOMARC-TS which incorporates both space and semi time sharing scheduling methods and matches groups of jobs if possible for coscheduling. Since matchmaking may select jobs further down the waiting queue and the jobs in front of the queue may be delayed subsequently, fairness for each individual job will be watched and the delay will be kept within a limited bound. Several heuristics are used to solve the NP-complete problem of forming groups. Our experiment results show both utilization gain and average relative response time improvements of G-LOMARC-TS over other several scheduling policies

Scholarship at UWindsor

Adaptive Resource Relocation in Virtualized Heterogeneous Clusters

Author: Atif Muhammad
Publication venue
Publication date: 01/01/2010
Field of study

Cluster computing has recently gone through an evolution from single processor systems to multicore/multi-socket systems. This has resulted in lowering the cost/performance ratio of the compute machines. Compute farms that host these machines tend to become heterogeneous over time due to incremental extensions, hardware upgrades and/or nodes being purchased for users with particular needs. This heterogeneity is not surprising given the wide range of processor, memory and network technologies that become available and the relatively small price difference between these various options. Different CPU architectures, memory capacities, communication and I/O interfaces of the participating compute nodes present many challenges to job scheduling and often result in under or over utilization of the compute resources. In general, it is not feasible for the application programmers to specifically optimize their programs for such a set of differing compute n odes, due to the difficulty and time-intensiveness of such a task. The trend of heterogeneous compute farms has coincided with resurgence in the virtualization technology. Virtualization technology is receiving widespread adoption, mainly due to the benefits of server consolidation and isolation, load balancing, security and fault tolerance. Virtualization has also generated considerable interest in the High Performance Computing (HPC) community, due to the resulting high availability, fault tolerance, cluster partitioning and accommodation of conflicting user requirements. However, the HPC community is still wary of the potential overheads associated withâ€˜ virtualization, as it results in slower network communications and disk I/O, which need to be addressed. The live migration feature, available to most virtualization technologies, can be leveraged to improve the throughput of a heterogeneous compute farm (HC) used for HPC applications. For this we mitigated the slow network communication in Xen; an open source virtual machine monitor. We present a detailed analysis of the communication framework of Xen and propose communication configurations that give 50% improvement over the conventional Xen network configuration. From a detailed study of the migration facility in Xen, we propose an improvement in the live migration facility specifically targeting HPC applications. This optimization gives around 50% improvement over the default migration facility of Xen. In this thesis, we also investigate resource scheduling in heterogeneous compute farm with the perspective of dynamic resource re-mapping. Our approach is to profile each job in the compute farm at runtime, and propose a better resource mapping compared to the initial allocation. We then migrate the job(s) to the best-suited homogeneous sub-cluster to improve overall throughput of the HC. For this, we develop a novel heterogeneity and virtualization-aware profiling framework, which is able to predict the CPU and communication characteristics of high performance scientific applications. The prediction accuracy of our performance estimation model is over 80%. The framework implementation is lightweight, with an overhead of 3%. Our experiments show that we are able to improve the throughput of the compute farm by 25% and the time saved by the HC with our framework is over 30%. The framework can be readily extended to HCs supporting a cloud computing environment

The Australian National University

Portable Checkpointing for Parallel Applications

Author: Bronevetsky Grigory
Publication venue
Publication date: 30/08/2006
Field of study

High Performance Computing (HPC) systems represent the peak of modern computational capability. As ever-increasing demands for computational power have fuelled the demand for ever-larger computing systems, modern HPC systems have grown to incorporate hundreds, thousands or as many as 130,000 processors. At these scales, the huge number of individual components in a single system makes the probability that a single component will fail quite high, with today's large HPC systems featuring mean times between failures on the order of hours or a few days. As many modern computational tasks require days or months to complete, fault tolerance becomes critical to HPC system design. The past three decades have seen significant amounts of research on parallel system fault tolerance. However, as most of it has been either theoretical or has focused on low-level solutions that are embedded into a particular operating system or type of hardware, this work has had little impact on real HPC systems. This thesis attempts to address this lack of impact by describing a high-level approach for implementing checkpoint/restart functionality that decouples the fault tolerance solution from the details of the operating system, system libraries and the hardware and instead connects it to the APIs implemented by the above components. The resulting solution enables applications that use these APIs to become self-checkpointing and self-restarting regardless of the the software/hardware platform that may implement the APIs. The particular focus of this thesis is on the problem of checkpoint/restart of parallel applications. It presents two theoretical checkpointing protocols, one for the message passing communication model and one for the shared memory model. The former is the first protocol to be compatible with application-level checkpointing of individual processes, while the latter is the first protocol that is compatible with arbitrary shared memory models, APIs, implementations and consistency protocols. These checkpointing protocols are used to implement checkpointing systems for applications that use the MPI and OpenMP parallel APIs, respectively, and are first in providing checkpoint/restart to arbitrary implementations of these popular APIs. Both checkpointing systems are extensively evaluated on multiple software/hardware platforms and are shown to feature low overheads

eCommons@Cornell

A self-mobile skeleton in the presence of external loads

Author: Alsalkini Turkey
Publication venue: Mathematical and Computer Sciences
Publication date: 01/10/2017
Field of study

Multicore clusters provide cost-eﬀective platforms for running CPU-intensive and data-intensive parallel applications. To eﬀectively utilise these platforms, sharing their resources is needed amongst the applications rather than dedicated environments. When such computational platforms are shared, user applications must compete at runtime for the same resource so the demand is irregular and hence the load is changeable and unpredictable. This thesis explores a mechanism to exploit shared multicore clusters taking into account the external load. This mechanism seeks to reduce runtime by ﬁnding the best computing locations to serve the running computations. We propose a generic algorithmic data-parallel skeleton which is aware of its computations and the load state of the computing environment. This skeleton is structured using the Master/Worker pattern where the master and workers are distributed on the nodes of the cluster. This skeleton divides the problem into computations where all these computations are initiated by the master and coordinated by the distributed workers. Moreover, the skeleton has built-in mobility to implicitly move the parallel computations between two workers. This mobility is data mobility controlled by the application, the skeleton. This skeleton is not problem-speciﬁc and therefore it is able to execute diﬀerent kinds of problems. Our experiments suggest that this skeleton is able to eﬃciently compensate for unpredictable load variations. We also propose a performance cost model that estimates the continuation time of the running computations locally and remotely. This model also takes the network delay, data size and the load state as inputs to estimate the transfer time of the potential movement. Our experiments demonstrate that this model takes accurate decisions based on estimates in diﬀerent load patterns to reduce the total execution time. This model is problem-independent because it considers the progress of all current computations. Moreover, this model is based on measurements so it is not dependent on the programming language. Furthermore, this model takes into account the load state of the nodes on which the computation run. This state includes the characteristics of the nodes and hence this model is architecture-independent. Because the scheduling has direct impact on system performance, we support the skeleton with a cost-informed scheduler that uses a hybrid scheduling policy to improve the dynamicity and adaptivity of the skeleton. This scheduler has agents distributed over the participating workers to keep the load information up to date, trigger the estimations, and facilitate the mobility operations. On runtime, the skeleton co-schedules its computations over computational resources without interfering with the native operating system scheduler. We demonstrate that using a hybrid approach the system makes mobility decisions which lead to improved performance and scalability over large number of computational resources. Our experiments suggest that the adaptivity of our skeleton in shared environment improves the performance and reduces resource contention on nodes that are heavily loaded. Therefore, this adaptivity allows other applications to acquire more resources. Finally, our experiments show that the load scheduler has a low incurred overhead, not exceeding 0.6%, compared to the total execution time

ROS: The Research Output Service. Heriot-Watt University Edinburgh

Anales del XIII Congreso Argentino de Ciencias de la Computación (CACIC)

Author: De Giusti Armando Eduardo
Red de Universidades con Carreras en Informática (RedUNCI)
Simari Guillermo Ricardo
Publication venue: 'Universidad Nacional del Nordeste'
Publication date: 05/11/2012
Field of study

Contenido: Arquitecturas de computadoras Sistemas embebidos Arquitecturas orientadas a servicios (SOA) Redes de comunicaciones Redes heterogéneas Redes de Avanzada Redes inalámbricas Redes móviles Redes activas Administración y monitoreo de redes y servicios Calidad de Servicio (QoS, SLAs) Seguridad informática y autenticación, privacidad Infraestructura para firma digital y certificados digitales Análisis y detección de vulnerabilidades Sistemas operativos Sistemas P2P Middleware Infraestructura para grid Servicios de integración (Web Services o .Net)Red de Universidades con Carreras en Informática (RedUNCI

Servicio de Difusión de la Creación Intelectual

Anales del XIII Congreso Argentino de Ciencias de la Computación (CACIC)

Author: De Giusti Armando Eduardo
Red de Universidades con Carreras en Informática (RedUNCI)
Simari Guillermo Ricardo
Publication venue: 'Universidad Nacional del Nordeste'
Publication date: 05/11/2012
Field of study

Servicio de Difusión de la Creación Intelectual