Enabling Adaptive Grid Scheduling and Resource Management
Wider adoption of the Grid concept has led to an increasing amount of federated
computational, storage and visualisation resources being available to scientists and
researchers. The distributed and heterogeneous nature of these resources renders most of the
legacy cluster monitoring and management approaches inappropriate, and poses new
challenges in workflow scheduling on such systems. Effective resource utilisation monitoring
and highly granular yet adaptive measurements are prerequisites for a more efficient Grid
scheduler. We present a suite of measurement applications able to monitor per-process
resource utilisation, and a customisable tool for emulating observed utilisation models. We
also outline our future work on a predictive and probabilistic Grid scheduler. The research is
undertaken as part of the UK e-Science EPSRC-sponsored project SO-GRM (Self-Organising
Grid Resource Management) in cooperation with BT.
Resource and Application Models for Advanced Grid Schedulers
As Grid computing is becoming inevitable, managing, scheduling and monitoring dynamic, heterogeneous resources will present new challenges. Solutions will have to be agile and adaptive, and support self-organization and autonomous management while maintaining optimal resource utilisation. This paper presents basic principles and architectural concepts for efficient resource allocation in heterogeneous Grid environments.
Job Monitoring in an Interactive Grid Analysis Environment
The Grid is emerging as a great computational resource, but
its dynamic behavior makes the Grid environment unpredictable. Systems and networks can fail, and the
introduction of more users can result in resource starvation.
Once a job has been submitted for execution on the grid,
monitoring becomes essential for a user to see that the job is completed in an efficient way, and to detect any problems
that occur while the job is running. In current environments,
once a user submits a job, he loses direct control over the job and the system behaves like a batch system: the user
submits the job and later gets a result back. The only
information a user can obtain about a job is whether it is
scheduled, running, cancelled or finished. Today users are
becoming increasingly interested in analysis grid
environments in which they can check the progress of the
job, obtain intermediate results, terminate the job based on
its progress or intermediate results, steer the job to
other nodes to achieve better performance and check the
resources consumed by the job. In order to fulfill their
requirements of interactivity, a mechanism is needed that
can provide the user with real time access to information
about different attributes of a job. In this paper we present
the design of a Job Monitoring Service, a web service that
will provide interactive remote job monitoring by allowing
users to access different attributes of a job once it has been submitted to the interactive Grid Analysis Environment.
Development of Grid e-Infrastructure in South-Eastern Europe
Over the period of 6 years and three phases, the SEE-GRID programme has
established a strong regional human network in the area of distributed
scientific computing and has set up a powerful regional Grid infrastructure. It
attracted a number of user communities and applications from diverse fields
from countries throughout South-Eastern Europe. From the infrastructure
point of view, the first project phase has established a pilot Grid infrastructure
with more than 20 resource centers in 11 countries. During the subsequent two
phases of the project, the infrastructure has grown to currently 55 resource
centers with more than 6600 CPUs and 750 TBs of disk storage, distributed in 16
participating countries. Inclusion of new resource centers to the existing
infrastructure, as well as a support to new user communities, has demanded
setup of regionally distributed core services, development of new monitoring
and operational tools, and close collaboration of all partner institutions in
managing such a complex infrastructure. In this paper we give an overview of
the development and current status of SEE-GRID regional infrastructure and
describe its transition to the NGI-based Grid model in EGI, with the strong SEE
regional collaboration. Comment: 22 pages, 12 figures, 4 tables
Performance testing of distributed computational resources in the software development phase
Grid software harmonization is possible through the adoption of standards, i.e. common protocols and interfaces. During the development of a standard's implementation, performance testing of grid subsystems can detect hidden software issues that other testing procedures do not reveal. A simple software solution is proposed, consisting of a communication layer, resource consumption agents hosted on the computational resources (clients or servers), a database of performance results and a web interface to visualize the results. The agents monitoring the resources and the main control Python script (the supervisor) communicate through a communication layer based on the secure XML-RPC protocol. The resource monitoring agent is the key element of performance testing; it provides information about all monitored processes, including their child processes. The agent is a simple Python script based on the Python psutil library. A second agent, run after the resource monitoring phase, records data from the resources in a central MySQL database. The results can be queried and visualized using a web interface. The database and data visualization scripts could be offered as a service, so that testers do not need to install them to run their own tests.
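The agent and supervisor roles described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain (unsecured) XML-RPC from the Python standard library, and it substitutes `os.times()` on the agent's own process for the psutil-based sampling of monitored processes and their children. The names `sample_self`, `start_agent` and `collect` are hypothetical, and the database and web interface are omitted.

```python
import os
import threading
import time
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def sample_self():
    """Return a resource snapshot of the agent's own process.

    Assumption: a stand-in for the psutil-based sampling in the abstract,
    which would also cover other processes and their children.
    """
    t = os.times()
    return {
        "pid": os.getpid(),
        "cpu_user_s": t.user,
        "cpu_system_s": t.system,
        "timestamp": time.time(),
    }


def start_agent(host="127.0.0.1", port=0):
    """Start an XML-RPC agent in a background thread.

    Port 0 lets the OS pick a free port; the chosen port is returned.
    """
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(sample_self, "sample")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, server.server_address[1]


def collect(port, host="127.0.0.1"):
    """Supervisor side: request one sample from the agent over XML-RPC."""
    with ServerProxy(f"http://{host}:{port}") as proxy:
        return proxy.sample()


if __name__ == "__main__":
    server, port = start_agent()
    print(collect(port))   # a dict with pid, CPU times and a timestamp
    server.shutdown()
    server.server_close()
```

In a setup closer to the one described, the supervisor would poll several agents periodically and a second step would write the collected samples to the central database.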
Resource monitoring with Globus Toolkit 4
The past few years have seen the Grid rapidly evolving towards a service-oriented computing infrastructure. With the OGSA facilitating this evolution, it is expected that WSRF will act as the main enabling technology to drive the Grid further. Resource monitoring plays a critical role in managing a large-scale Grid system. This paper presents GREMO, a lightweight resource monitor developed with Globus Toolkit 4 (GT4) for monitoring the CPU and memory of computing nodes in Windows and Linux environments.