Bulk Scheduling with the DIANA Scheduler
Results from the research and development of a Data Intensive and Network
Aware (DIANA) scheduling engine, to be used primarily for data intensive
sciences such as physics analysis, are described. In Grid analyses, tasks can
involve thousands of computing, data handling, and network resources. The
central problem in the scheduling of these resources is the coordinated
management of computation and data at multiple locations and not just data
replication or movement. However, this can prove to be a rather costly
operation and efficient scheduling can be a challenge if compute and data resources
are mapped without considering network costs. We have implemented an adaptive
algorithm within the so-called DIANA Scheduler which takes into account data
location and size, network performance and computation capability in order to
enable efficient global scheduling. DIANA is a performance-aware and
economy-guided Meta Scheduler. It iteratively allocates each job to the site
that is most likely to produce the best performance as well as optimizing the
global queue for any remaining jobs. Therefore it is equally suitable whether a
single job is being submitted or bulk scheduling is being performed. Results
indicate that considerable performance improvements can be gained by adopting
the DIANA scheduling approach.
Comment: 12 pages, 11 figures. To be published in the IEEE Transactions on
Nuclear Science, IEEE Press. 200
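As a rough illustration of the kind of cost-based site selection this abstract describes, the Python sketch below places each job on the site with the lowest estimated completion time, combining data transfer, queue wait and compute time. The cost formula, the field names and the numbers are assumptions made for illustration; this is not the DIANA algorithm itself.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    bandwidth_mb_s: float  # assumed throughput for pulling remote data to this site
    cpu_rate: float        # assumed work units processed per second
    queued_work: float = 0.0  # work units already waiting at this site


@dataclass
class Job:
    data_mb: float   # input data that must be moved if it is not local
    work: float      # compute work units
    data_site: str   # name of the site that holds the input data


def estimated_cost(job, site):
    """Hypothetical cost: transfer time + queue wait + compute time (seconds)."""
    transfer = 0.0 if site.name == job.data_site else job.data_mb / site.bandwidth_mb_s
    queue_wait = site.queued_work / site.cpu_rate
    compute = job.work / site.cpu_rate
    return transfer + queue_wait + compute


def schedule(jobs, sites):
    """Greedily place jobs one at a time, updating each site's queue as we go."""
    placement = {}
    for index, job in enumerate(jobs):
        best = min(sites, key=lambda s: estimated_cost(job, s))
        best.queued_work += job.work  # later jobs see the added load
        placement[index] = best.name
    return placement


sites = [Site("A", bandwidth_mb_s=10, cpu_rate=1), Site("B", bandwidth_mb_s=10, cpu_rate=2)]
jobs = [Job(data_mb=100, work=10, data_site="A") for _ in range(2)]
placement = schedule(jobs, sites)
```

In this toy run the first job stays with its data at site A, while the second goes to the faster site B because A's queue has grown. The same loop handles a single job or a bulk submission.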
Job Monitoring in an Interactive Grid Analysis Environment
The Grid is emerging as a great computational resource but
its dynamic behavior makes the Grid environment unpredictable. Systems and networks can fail, and the
introduction of more users can result in resource starvation.
Once a job has been submitted for execution on the grid,
monitoring becomes essential for a user to see that the job is completed in an efficient way, and to detect any problems
that occur while the job is running. In current environments,
once a user submits a job they lose direct control over it and the system behaves like a batch system: the user
submits the job and later gets a result back. The only
information a user can obtain about a job is whether it is
scheduled, running, cancelled or finished. Today users are
becoming increasingly interested in such analysis grid
environments in which they can check the progress of the
job, obtain intermediate results, terminate the job based on
the progress of the job or its intermediate results, steer the job to
other nodes to achieve better performance and check the
resources consumed by the job. In order to fulfill their
requirements for interactivity, a mechanism is needed that
can provide the user with real time access to information
about different attributes of a job. In this paper we present
the design of a Job Monitoring Service, a web service that
will provide interactive remote job monitoring by allowing
users to access different attributes of a job once it has been submitted to the interactive Grid Analysis Environment.
Job Interactivity Using a Steering Service in an Interactive Grid Analysis Environment
Grid computing has been dominated by the execution of batch jobs. Interactive data analysis is a new domain in the area of grid job execution. The Grid-Enabled Analysis Environment (GAE) attempts to address this in HEP grids by the use of a Steering Service. This service will provide physicists with continuous feedback on their jobs and will give them the ability to control and steer the execution of their submitted jobs. It will enable them to move their jobs to different grid nodes when desired. The Steering Service will also act autonomously to make steering decisions on behalf of the user, attempting to optimize the execution of the job. This service will also ensure the optimal consumption of the Grid user's resource quota. The Steering Service will provide a web service interface defined by standard WSDL. In this paper we discuss how the Steering Service will facilitate interactive remote analysis of data generated in the Interactive Grid Analysis Environment.
A database for on-line event analysis on a distributed memory machine
Parallel in-memory databases can enhance the structuring and parallelization of programs used in High Energy Physics (HEP). Efficient database access routines are used as communication primitives which hide the communication topology, in contrast to more explicit communication libraries such as PVM or MPI. A parallel in-memory database, called SPIDER, has been implemented on a 32-node Meiko CS-2 distributed memory machine. The SPIDER primitives generate a lower overhead than PVM or MPI. The event reconstruction program, CPREAD of the CPLEAR experiment, has been used as a test case. Performance measurerate generated by CPLEAR
Distributed Analysis and Load Balancing System for Grid Enabled Analysis on Hand-held devices using Multi-Agents Systems
Handheld devices, while growing rapidly, are inherently constrained and lack
the capability of executing resource hungry applications. This paper presents
the design and implementation of a distributed analysis and load-balancing system
for hand-held devices using a multi-agent system. This system enables
low-resource mobile handheld devices to act as potential clients for Grid enabled
applications and analysis environments. We propose a system, in which mobile
agents will transport, schedule, execute and return results for heavy
computational jobs submitted by handheld devices. Moreover, in this way, our
system provides a high-throughput computing environment for hand-held devices.
Comment: 4 pages, 3 figures. Proceedings of the 3rd International Conference
on Grid and Cooperative Computing (GCC 2004)
Heterogeneous Relational Databases for a Grid-enabled Analysis Environment
Grid based systems require a database access mechanism that can provide seamless homogeneous access to the requested data through a virtual data access system, i.e. a system which can take care of tracking the data that is stored in geographically distributed heterogeneous databases. This system should provide an integrated view of the data that is stored in the different repositories by using a virtual data access mechanism, i.e. a mechanism which can hide the heterogeneity of the backend databases from the client applications. This paper focuses on accessing data stored in disparate relational databases through a web service interface, and exploits the features of a Data Warehouse and Data Marts. We present a middleware that enables applications to access data stored in geographically distributed relational databases without being aware of their physical locations and underlying schema. A web service interface is provided to enable applications to access this middleware in a language and platform independent way. A prototype implementation was created based on Clarens [4], Unity [7] and POOL [8]. This ability to access the data stored in the distributed relational databases transparently is likely to be a very powerful one for Grid users, especially the scientific community wishing to collate and analyze data distributed over the Grid
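The idea of hiding backend location and schema behind a single access layer can be sketched as follows. This is a minimal Python illustration that uses two in-memory SQLite databases as stand-ins for distributed relational backends. The class name `VirtualDataAccess`, the helper `make_backend`, and the table contents are all hypothetical; the sketch does not use or reproduce Clarens, Unity or POOL.

```python
import sqlite3


class VirtualDataAccess:
    """Maps logical table names to whichever backend connection holds them."""

    def __init__(self):
        self._catalog = {}  # logical table name -> database connection

    def register(self, table, connection):
        self._catalog[table] = connection

    def fetch_all(self, table):
        # The client names only the logical table; the catalog picks the backend.
        if table not in self._catalog:
            raise KeyError(f"unknown logical table: {table}")
        connection = self._catalog[table]
        return connection.execute(f'SELECT * FROM "{table}"').fetchall()


def make_backend(table, columns, rows):
    """Create a throwaway in-memory database holding one table (illustrative)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in rows[0])
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    return conn


vda = VirtualDataAccess()
vda.register("runs", make_backend("runs", "run_id INTEGER, site TEXT",
                                  [(1, "site_a"), (2, "site_b")]))
vda.register("events", make_backend("events", "event_id INTEGER, run_id INTEGER",
                                    [(10, 1)]))
runs = sorted(vda.fetch_all("runs"))
```

A client calls `vda.fetch_all("runs")` without knowing which database serves the table. In a real deployment this layer would sit behind a web service interface, as the abstract describes.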
Resource Management Services for a Grid Analysis Environment
Selecting optimal resources for submitting jobs on a computational Grid or
accessing data from a data grid is one of the most important tasks of any Grid
middleware. Most modern Grid software today satisfies this responsibility and
gives a best-effort performance to solve this problem. Almost all decisions
regarding scheduling and data access are made by the software automatically,
giving users little or no control over the entire process. To solve this
problem, a more interactive set of services and middleware is desired that
provides users more information about Grid weather, and gives them more control
over the decision making process. This paper presents a set of services that
have been developed to provide more interactive resource management
capabilities within the Grid Analysis Environment (GAE) being developed
collaboratively by Caltech, NUST and several other institutes. These include a
steering service, a job monitoring service and an estimator service that have
been designed and written using a common Grid-enabled Web Services framework
named Clarens. The paper also presents a performance analysis of the developed
services to show that they have indeed resulted in a more interactive and
powerful system for user-centric Grid-enabled physics analysis.
Comment: 8 pages, 7 figures. Workshop on Web and Grid Services for Scientific
Data Analysis at the Int Conf on Parallel Processing (ICPP05). Norway June
200
A Multi Interface Grid Discovery System
Discovery Systems (DS) can be considered as entry points for global loosely coupled distributed systems. An efficient Discovery System in essence increases the performance, reliability and decision making capability of distributed systems. With the rapid increase in scale of distributed applications, existing solutions for discovery systems are fast becoming either obsolete or incapable of handling such complexity. They are particularly ineffective when handling service lifetimes and providing up-to-date information, poor at enabling dynamic service access, and they can also impose unwanted restrictions on interfaces to widely available information repositories. In this paper we present the essential design characteristics, an implementation and a performance analysis of a discovery system capable of overcoming these deficiencies in large, globally distributed environments.
DIANA Scheduling Hierarchies for Optimizing Bulk Job Scheduling
The use of meta-schedulers for resource management in large-scale distributed systems often leads to a hierarchy of schedulers. In this paper, we discuss why existing meta-scheduling hierarchies are sometimes not sufficient for Grid systems due to their inability to re-organise jobs already scheduled locally. Such a job re-organisation is required to adapt to evolving loads which are common in heavily used Grid infrastructures. We propose a peer-to-peer scheduling model and evaluate it using case studies and mathematical modelling. We detail the DIANA (Data Intensive and Network Aware) scheduling algorithm and its queue management system for coping with the load distribution and for supporting bulk job scheduling. We demonstrate that such a system is beneficial for dynamic, distributed and self-organizing resource management and can assist in optimizing load or job distribution in complex Grid infrastructures.
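To make the notion of re-organising already-queued jobs among peers concrete, here is a small Python sketch of a generic rebalancing loop. It repeatedly moves a queued job from the most loaded peer to the least loaded one while doing so narrows the gap between them. The function name `rebalance`, the stopping rule, and the use of a single work estimate per job are assumptions for illustration; this is not the DIANA queue management algorithm.

```python
def rebalance(queues):
    """Move queued jobs between peers to even out load.

    queues maps a peer name to a list of positive work estimates, one per
    queued job, and is modified in place. Returns the list of moves made
    as (from_peer, to_peer, work) tuples.
    """
    moves = []
    while True:
        load = {peer: sum(jobs) for peer, jobs in queues.items()}
        busiest = max(load, key=load.get)
        idlest = min(load, key=load.get)
        gap = load[busiest] - load[idlest]
        source = queues[busiest]
        # Prefer the most recently queued job whose move shrinks the gap.
        index = next(
            (i for i in reversed(range(len(source))) if 0 < source[i] < gap),
            None,
        )
        if index is None:
            return moves  # no move from the busiest peer would help
        work = source.pop(index)
        queues[idlest].append(work)
        moves.append((busiest, idlest, work))


queues = {"A": [4, 3, 2], "B": [], "C": [1]}
moves = rebalance(queues)
```

Each move strictly reduces the imbalance between the two peers involved, so the loop terminates. A real system would also weigh data location and network cost before migrating a job.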