A Fault Tolerant, Dynamic and Low Latency BDII Architecture for Grids
The current BDII model relies on information gathering from agents that run
on each core node of a Grid. This information is then published into a Grid
wide information resource known as Top BDII. The Top level BDIIs are updated
typically in cycles of a few minutes each. A new BDII architecture is proposed
and described in this paper based on the hypothesis that only a few attribute
values change in each BDII information cycle and consequently it may not be
necessary to update each parameter in a cycle. It has been demonstrated that
significant performance gains can be achieved by exchanging only the
information about records that changed during a cycle. Our investigations have
led us to implement a low latency and fault tolerant BDII system that involves
only minimal data transfer and facilitates secure transactions in a Grid
environment.
Comment: 18 pages; 10 figures; 4 tables
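The changed-records-only idea in the abstract above can be illustrated with a minimal sketch. It assumes each information cycle is a dictionary keyed by record identifier; the function names, record keys, and attribute names are hypothetical and are not taken from the paper or from any BDII implementation.

```python
def compute_delta(previous, current):
    """Return only the records that differ between two information cycles."""
    # Records that are new or whose attribute values changed.
    upserts = {
        key: value
        for key, value in current.items()
        if key not in previous or previous[key] != value
    }
    # Records that disappeared since the previous cycle.
    removed = sorted(key for key in previous if key not in current)
    return {"upserts": upserts, "removed": removed}


def apply_delta(snapshot, delta):
    """Rebuild the current cycle from the previous snapshot plus a delta."""
    updated = dict(snapshot)
    updated.update(delta["upserts"])
    for key in delta["removed"]:
        updated.pop(key, None)
    return updated


# Hypothetical records for two consecutive cycles.
cycle_1 = {
    "ce01": {"free_cpus": 12, "status": "Production"},
    "se01": {"free_gb": 500},
}
cycle_2 = {
    "ce01": {"free_cpus": 9, "status": "Production"},
    "se01": {"free_gb": 500},
    "ce02": {"free_cpus": 4, "status": "Production"},
}

delta = compute_delta(cycle_1, cycle_2)
rebuilt = apply_delta(cycle_1, delta)
```

Here only `ce01` and `ce02` travel in the delta; the unchanged `se01` record is never retransmitted, which is the source of the reduced data transfer the abstract describes.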
Experimental Study of Remote Job Submission and Execution on LRM through Grid Computing Mechanisms
Remote job submission and execution is a fundamental requirement of distributed
computing done using Cluster computing. However, Cluster computing limits usage
to a single organization. A Grid computing environment allows remote jobs to
execute on resources that are available in other organizations.
This paper discusses the concepts of batch-job execution using an LRM and using a Grid.
It describes two ways of preparing the test Grid computing environment that
we use for experimental testing of these concepts, and presents experimental
testing of remote job submission and execution mechanisms through the LRM-specific
way and the Grid computing ways. Moreover, the paper discusses various
problems faced while working with a Grid computing environment and how to
troubleshoot them. The understanding and experimental testing presented
in this paper should be useful to researchers who are new to the field
of job management in Grid.
Comment: Fourth International Conference on Advanced Computing & Communication
Technologies (ACCT), 201
Grid-Brick Event Processing Framework in GEPS
Experiments like ATLAS at LHC involve a scale of computing and data
management that greatly exceeds the capability of existing systems, making it
necessary to resort to Grid-based Parallel Event Processing Systems (GEPS).
Traditional Grid systems concentrate the data in central data servers which
have to be accessed by many nodes each time an analysis or processing job
starts. These systems require very powerful central data servers and make
little use of the distributed disk space that is available in commodity
computers. The Grid-Brick system, which is described in this paper, follows a
different approach. Data storage is split among all grid nodes, each one
holding a piece of the whole data set. Users submit queries, and the system
distributes the tasks across all the nodes and retrieves the results, merging
them together in the Job Submit Server. The main advantage of this system
is the huge scalability it provides, while its biggest disadvantage appears in
the case of failure of one of the nodes. A workaround for this problem involves
data replication or backup.
Comment: 6 pages; document for CHEP'03 conference
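The distribute-and-merge pattern described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: each "brick" is modeled as an in-memory list of events, threads stand in for grid nodes, and the names `run_on_node` and `submit_query` are hypothetical rather than part of GEPS.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_node(brick, predicate):
    """Evaluate the query against the piece of data held by one node."""
    return [event for event in brick if predicate(event)]


def submit_query(bricks, predicate):
    """Send the query to every node and merge the partial results."""
    workers = max(1, len(bricks))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns partial results in the same order as the bricks.
        partial_results = pool.map(run_on_node, bricks, [predicate] * len(bricks))
        merged = []
        for part in partial_results:
            merged.extend(part)
    return merged


# Hypothetical event data split across three nodes.
bricks = [
    [{"id": 1, "energy": 40}, {"id": 2, "energy": 95}],
    [{"id": 3, "energy": 120}],
    [{"id": 4, "energy": 10}, {"id": 5, "energy": 91}],
]

selected = submit_query(bricks, lambda event: event["energy"] > 90)
selected_ids = [event["id"] for event in selected]
```

If any one brick is unavailable, its events are simply missing from the merged result, which is why the abstract points to replication or backup as the workaround.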
Managing a Fleet of Autonomous Mobile Robots (AMR) using Cloud Robotics Platform
In this paper, we provide details of implementing a system for managing a
fleet of autonomous mobile robots (AMR) operating in a factory or warehouse
premises. While the robots are themselves autonomous in their motion and obstacle
avoidance capability, the target destination for each robot is provided by a
global planner. The global planner and the ground vehicles (robots) constitute
a multi-agent system (MAS) whose agents communicate with each other over a wireless
network. Three different approaches are explored for implementation. The first
two approaches make use of the distributed-computing-based Networked Robotics
architecture and communication framework of the Robot Operating System (ROS) itself,
while the third approach uses the Rapyuta Cloud Robotics framework for this
implementation. The comparative performance of these approaches is analyzed
through simulation as well as real-world experiments with actual robots. These
analyses provide an in-depth understanding of the inner workings of the Cloud
Robotics Platform in contrast to the usual ROS framework. The insight gained
through this exercise will be valuable for students as well as practicing
engineers interested in implementing similar systems elsewhere. In the
process, we also identify a few critical limitations of the current Rapyuta
platform and provide suggestions to overcome them.
Comment: 14 pages, 15 figures, journal paper
Many-Task Computing and Blue Waters
This report discusses many-task computing (MTC) generically and in the
context of the proposed Blue Waters system, which is planned to be the largest
NSF-funded supercomputer when it begins production use in 2012. The aim of this
report is to inform the BW project about MTC, including understanding aspects
of MTC applications that can be used to characterize the domain and
understanding the implications of these aspects for middleware and policies.
Many MTC applications do not neatly fit the stereotypes of high-performance
computing (HPC) or high-throughput computing (HTC) applications. Like HTC
applications, by definition MTC applications are structured as graphs of
discrete tasks, with explicit input and output dependencies forming the graph
edges. However, MTC applications have significant features that distinguish
them from typical HTC applications. In particular, different engineering
constraints for hardware and software must be met in order to support these
applications. HTC applications have traditionally run on platforms such as
grids and clusters, through either workflow systems or parallel programming
systems. MTC applications, in contrast, will often demand a short time to
solution, may be communication intensive or data intensive, and may comprise
very short tasks. Therefore, hardware and software for MTC must be engineered
to support the additional communication and I/O and must minimize task dispatch
overheads. The hardware of large-scale HPC systems, with its high degree of
parallelism and support for intensive communication, is well suited for MTC
applications. However, HPC systems often lack a dynamic resource-provisioning
feature, are not ideal for task communication via the file system, and have an
I/O system that is not optimized for MTC-style applications. Hence, additional
software support is likely to be required to gain full benefit from the HPC
hardware.
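The abstract above describes MTC applications as graphs of discrete tasks whose edges are input and output dependencies. A minimal sketch of dispatching such a graph in dependency order is given below, using the standard-library `graphlib` module (Python 3.9 or later). The task names, the `run_task_graph` helper, and the toy tasks are hypothetical; a real MTC dispatcher would run ready tasks in parallel and minimize per-task overhead.

```python
from graphlib import TopologicalSorter


def run_task_graph(tasks, dependencies):
    """Run each task once all of the tasks it depends on have produced output."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if the graph is not acyclic
    outputs = {}
    order = []
    while sorter.is_active():
        # get_ready() yields every task whose inputs are now available.
        for name in sorter.get_ready():
            inputs = [outputs[dep] for dep in dependencies.get(name, ())]
            outputs[name] = tasks[name](*inputs)
            order.append(name)
            sorter.done(name)
    return outputs, order


# Hypothetical four-task graph: one producer, two consumers, one final merge.
tasks = {
    "load": lambda: [1, 2, 3, 4],
    "total": lambda values: sum(values),
    "largest": lambda values: max(values),
    "report": lambda total, largest: (total, largest),
}
dependencies = {
    "load": [],
    "total": ["load"],
    "largest": ["load"],
    "report": ["total", "largest"],
}

outputs, order = run_task_graph(tasks, dependencies)
```

In this sketch `total` and `largest` become ready at the same time, which is the point at which a many-task system would dispatch them concurrently.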
- …