742 research outputs found
Distributed simulation optimization and parameter exploration framework for the cloud
Simulation models are becoming an increasingly popular tool for the analysis and optimization of complex real systems in different fields. Finding an optimal system design requires performing a large sweep over the parameter space in an organized way. Hence, the model optimization process is extremely demanding from a computational point of view, as it requires careful, time-consuming, complex orchestration of coordinated executions. In this paper, we present the design of SOF (Simulation Optimization and exploration Framework in the cloud), a framework which exploits the computing power of a cloud computational environment in order to carry out effective and efficient simulation optimization strategies. SOF offers several attractive features. Firstly, SOF requires “zero configuration” as it does not require any additional software installed on the remote node; only standard Apache Hadoop and SSH access are sufficient. Secondly, SOF is transparent to the user, since the user is totally unaware that the system operates on a distributed environment. Finally, SOF is highly customizable and programmable, since it enables the running of different simulation optimization scenarios using diverse programming languages – provided that the hosting platform supports them – and different simulation toolkits, as developed by the modeler. The tool has been fully developed and is available on a public repository under the terms of the open source Apache License. It has been tested and validated on several private platforms, such as a dedicated cluster of workstations, as well as on public platforms, including the Hortonworks Data Platform and Amazon Web Services Elastic MapReduce solution.
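As a rough illustration of the "large sweep over the parameter space" mentioned in the abstract above, the sketch below evaluates every point of a small grid and keeps the best one. It is not SOF's API: `simulate`, `parameter_grid`, `sweep`, and the toy objective are hypothetical names chosen for illustration, and a local thread pool stands in for the cloud back end.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor


def simulate(params):
    """Hypothetical stand-in for one simulation run; returns a score to minimize."""
    x, y = params["x"], params["y"]
    return (x - 2) ** 2 + (y + 1) ** 2


def parameter_grid(space):
    """Yield every combination of the listed parameter values as a dict."""
    names = sorted(space)
    for values in itertools.product(*(space[n] for n in names)):
        yield dict(zip(names, values))


def sweep(space, workers=4):
    """Evaluate all grid points concurrently; return (best_params, best_score)."""
    points = list(parameter_grid(space))
    # Assumption: a local thread pool replaces the distributed executions.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(simulate, points))
    best = min(range(len(points)), key=scores.__getitem__)
    return points[best], scores[best]


space = {"x": [0, 1, 2, 3], "y": [-2, -1, 0]}
best_params, best_score = sweep(space)
print(best_params, best_score)
```

An exhaustive grid is only the simplest strategy; the number of runs grows as the product of the value counts per parameter, which is why such sweeps are usually distributed.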
Behavioral types in programming languages
A recent trend in programming language research is to use behavioral type theory to ensure various correctness properties of large-scale, communication-intensive systems. Behavioral types encompass concepts such as interfaces, communication protocols, contracts, and choreography. The successful application of behavioral types requires a solid understanding of several practical aspects, from their representation in a concrete programming language, to their integration with other programming constructs such as methods and functions, to design and monitoring methodologies that take behaviors into account. This survey provides an overview of the state of the art of these aspects, which we summarize as the pragmatics of behavioral types.
Agent-based resource management for grid computing
A computational grid is a hardware and software infrastructure that provides
dependable, consistent, pervasive, and inexpensive access to high-end
computational capability. An ideal grid environment should provide access to the
available resources in a seamless manner. Resource management is an important
infrastructural component of a grid computing environment. The overall aim of
resource management is to efficiently schedule applications that need to utilise the
available resources in the grid environment. Such goals within the high
performance community will rely on accurate performance prediction capabilities.
An existing toolkit, known as PACE (Performance Analysis and Characterisation
Environment), is used to provide quantitative data concerning the performance of
sophisticated applications running on high performance resources. In this thesis an
ASCI (Accelerated Strategic Computing Initiative) kernel application, Sweep3D,
is used to illustrate the PACE performance prediction capabilities. The validation
results show that a reasonable accuracy can be obtained, cross-platform
comparisons can be easily undertaken, and the process benefits from a rapid
evaluation time. While extremely well-suited for managing a locally distributed
multi-computer, the PACE functions do not map well onto a wide-area
environment, where heterogeneity, multiple administrative domains, and communication irregularities dramatically complicate the job of resource
management. Scalability and adaptability are two key challenges that must be
addressed.
In this thesis, an A4 (Agile Architecture and Autonomous Agents) methodology is
introduced for the development of large-scale distributed software systems with
highly dynamic behaviours. An agent is considered to be both a service provider
and a service requestor. Agents are organised into a hierarchy with service
advertisement and discovery capabilities. There are four main performance
metrics for an A4 system: service discovery speed, agent system efficiency,
workload balancing, and discovery success rate.
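The hierarchy with service advertisement and discovery described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the `Agent` class, its methods, the `known` table, and the service name `"cpu"` are all hypothetical.

```python
class Agent:
    """Hypothetical agent acting as both service provider and service requestor."""

    def __init__(self, name, services=(), parent=None):
        self.name = name
        self.services = set(services)  # services this agent provides itself
        self.parent = parent
        self.children = []
        self.known = {}                # advertised service -> provider name
        if parent is not None:
            parent.children.append(self)

    def advertise(self):
        """Push this agent's service information up to every ancestor."""
        node = self.parent
        while node is not None:
            for service in self.services:
                node.known.setdefault(service, self.name)
            node = node.parent

    def discover(self, service):
        """Look locally, then in received advertisements, then ask the parent.

        Returns (provider_name_or_None, number_of_upward_hops).
        """
        node, hops = self, 0
        while node is not None:
            if service in node.services:
                return node.name, hops
            if service in node.known:
                return node.known[service], hops
            node, hops = node.parent, hops + 1
        return None, hops


root = Agent("root")
a = Agent("a", parent=root)
b = Agent("b", services={"cpu"}, parent=root)
c = Agent("c", parent=a)

b.advertise()
provider, hops = c.discover("cpu")
print(provider, hops)
```

In this sketch the hop count is a crude proxy for discovery speed, and a `None` provider counts against the discovery success rate.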
Coupling the A4 methodology with PACE functions results in an Agent-based
Resource Management System (ARMS), which is implemented for grid
computing. The PACE functions supply accurate performance information (e.g.
execution time) as input to a local resource scheduler on the fly. At a meta-level,
agents advertise their service information and cooperate with each other to
discover available resources for grid-enabled applications. A Performance
Monitor and Advisor (PMA) is also developed in ARMS to optimise the
performance of the agent behaviours.
The PMA is capable of performance modelling and simulation of the agents in
ARMS and can be used to improve overall system performance. The PMA can
monitor agent behaviours in ARMS and reconfigure them with optimised
strategies, which include the use of ACTs (Agent Capability Tables), limited
service lifetime, limited scope for service advertisement and discovery, agent
mobility and service distribution, etc.
The main contribution of this work is that it provides a methodology and
prototype implementation of a grid Resource Management System (RMS). The
system includes a number of original features that cannot be found in existing
research solutions.
Follow-the-leader Formation Marching Through a Scalable O(log2n) Parallel Architecture.
An important topic in the field of Multi Robot Systems focuses on motion coordination and synchronization for formation keeping. Although several works have addressed this problem, little attention has been devoted to studying computational complexity in large-scale systems. This paper presents our current work on achieving high computational performance for systems composed of a large number of robots that must fulfill a marching and formation task. A scalable Multi-Processor Parallel Architecture is introduced with the purpose of achieving scalability, i.e., computation time of O(log2n) for an n-robot system. Our architecture has been tested on a multi-processor system and validated against several simulation tests.
A distributed analysis and monitoring framework for the compact Muon solenoid experiment and a pedestrian simulation
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The design of a parallel and distributed computing system is a very complicated task. It requires a detailed understanding of the design issues and of the theoretical and practical aspects of their solutions. Firstly, this thesis discusses in detail the major concepts and components required to make parallel and distributed computing a reality. A multithreaded and distributed framework capable of analysing the simulation data produced by pedestrian simulation software was developed. Secondly, this thesis discusses the origins and fundamentals of Grid computing and the motivations for its use in High Energy Physics. Access to the data produced by the Large Hadron Collider (LHC) has to be provided for more than five thousand scientists all over the world. Users who run analysis jobs on the Grid do not necessarily have expertise in Grid computing. Simple, user-friendly and reliable monitoring of the analysis jobs is one of the key components of the operations of the distributed analysis; reliable monitoring is one of the crucial components of the Worldwide LHC Computing Grid for providing the functionality and performance that is required by the LHC experiments. The CMS Dashboard Task Monitoring and the CMS Dashboard Job Summary monitoring applications were developed to serve the needs of the CMS community.