34,509 research outputs found
A hyper-heuristic for adaptive scheduling in computational grids
In this paper we present the design and implementation of an hyper-heuristic for efficiently scheduling independent jobs in computational grids. An efficient scheduling of jobs to grid resources depends on many parameters, among others, the characteristics of the resources and jobs (such as computing capacity, consistency of computing, workload, etc.). Moreover, these characteristics change over time due to the dynamic nature of grid environment, therefore the planning of jobs to resources should be adaptively done. Existing ad hoc scheduling methods (batch and immediate mode) have shown their efficacy for certain types of resource and job characteristics. However, as stand alone methods, they are not able to produce the best planning of jobs to resources for different types of Grid resources and job characteristics. In this work we have designed and implemented a hyper-heuristic that uses a set of ad hoc (immediate and batch mode) scheduling methods to provide the scheduling of jobs to Grid resources according to the Grid and job characteristics. The hyper-heuristic is a high level algorithm, which examines the state and characteristics of the Grid system (jobs and resources), and selects and applies the ad hoc method that yields the best planning of jobs. The resulting hyper-heuristic based scheduler can be thus used to develop network-aware applications that need efficient planning of jobs to resources. The hyper-heuristic has been tested and evaluated in a dynamic setting through a prototype of a Grid simulator. The experimental evaluation showed the usefulness of the hyper-heuristic for planning of jobs to resources as compared to planning without knowledge of the resource and job characteristics.Peer ReviewedPostprint (author's final draft
Grid Computing: The Trend Of The Millenium
A grid can be simply defined as a combination of different components which function collectively as a part of one large electrical or electronic circuit. The term “Grid Computing” can similarly be applied to a large number of computers which connect together to collectively solve a problem (which may be of scientific interest in most cases) of very high complexity and magnitude. The fundamental idea behind the making of any computer based grid is to utilize the idle time of processor cycles. Simply stated, a processor during the times it would stay idle would now team up with similar idle processors to tackle various complexities. The role each processor plays is very carefully defined and there is utmost transparency in the working of each processor/computer in a grid. This is called the “Division of Labor” in the smart world of intelligent computing. In lay man terms this is equivalent to a student and his group of friends collectively solving a single assignment which contains more than a single problem. The solution is trivial but the effort is collective. A grid computing environment may take many forms. It could be molded as a cluster based, distributed computing environment based or peer-to-peer system. A cluster based environment would see a central computer often called a “cluster head” distributing or maintaining a job schedule of the other computers in the grid. A distributed environment is seen often in the web environment. For example when a user requests a popular web page from the web server and if the web server is experiencing traffic congestion, then the user is re-routed to the same page on a different web server. The transition takes place so rapidly that the momentary delay due to server bottleneck problems is hardly felt. Peer to peer computing can be best explained through music download engines. If a user has a file he decides to share it via the web, other users needing the same file copy it through their music download engines. With great computing power comes great responsibility. Security is of utmost importance in a grid environment. Since a grid performs large computations, data is assumed to be available at every node in the processing cycle. This increases the risk of data manipulation in various forms. Also we have to keep in mind what happens to the data when a node fails. An ideal grid will have a small time of convergence and a low recovery time in case of a complete grid failure. By convergence, we mean that each and every processor node will have complete information about each and every other processor node in the grid. Recovery time is the time it takes for the grid to start from scratch after a major breakdown. To put things in the right perspective, a good grid based computing environment will have an intelligent grid administrator to monitor user logs and scheduled jobs and a good grid operating system which will be tailor made to suit the application of the grid. We propose a similarity between the OSI model and a Grid Model to elaborate the functions and utilities of a grid. We also try to propose a queuing theory for Grid Computing. There have been numerous previous comparisons and each is knowledgeable in its own right. But a similarity with a network model adds more weight since physically a grid is nothing but an interconnection, and interconnection can be best defined in relation to a computer network interconnection. How do users access networked computers, how are files shared and what are the levels of security are best explained through a networked computer system
A generic method for energy-efficient and energy-cost-effective production at the unit process level
Bulk Scheduling with the DIANA Scheduler
Results from the research and development of a Data Intensive and Network
Aware (DIANA) scheduling engine, to be used primarily for data intensive
sciences such as physics analysis, are described. In Grid analyses, tasks can
involve thousands of computing, data handling, and network resources. The
central problem in the scheduling of these resources is the coordinated
management of computation and data at multiple locations and not just data
replication or movement. However, this can prove to be a rather costly
operation and efficient sing can be a challenge if compute and data resources
are mapped without considering network costs. We have implemented an adaptive
algorithm within the so-called DIANA Scheduler which takes into account data
location and size, network performance and computation capability in order to
enable efficient global scheduling. DIANA is a performance-aware and
economy-guided Meta Scheduler. It iteratively allocates each job to the site
that is most likely to produce the best performance as well as optimizing the
global queue for any remaining jobs. Therefore it is equally suitable whether a
single job is being submitted or bulk scheduling is being performed. Results
indicate that considerable performance improvements can be gained by adopting
the DIANA scheduling approach.Comment: 12 pages, 11 figures. To be published in the IEEE Transactions in
Nuclear Science, IEEE Press. 200
Efficient Resource Matching in Heterogeneous Grid Using Resource Vector
In this paper, a method for efficient scheduling to obtain optimum job
throughput in a distributed campus grid environment is presented; Traditional
job schedulers determine job scheduling using user and job resource attributes.
User attributes are related to current usage, historical usage, user priority
and project access. Job resource attributes mainly comprise of soft
requirements (compilers, libraries) and hard requirements like memory, storage
and interconnect. A job scheduler dispatches jobs to a resource if a job's hard
and soft requirements are met by a resource. In current scenario during
execution of a job, if a resource becomes unavailable, schedulers are presented
with limited options, namely re-queuing job or migrating job to a different
resource. Both options are expensive in terms of data and compute time. These
situations can be avoided, if the often ignored factor, availability time of a
resource in a grid environment is considered. We propose resource rank
approach, in which jobs are dispatched to a resource which has the highest rank
among all resources that match the job's requirement. The results show that our
approach can increase throughput of many serial / monolithic jobs.Comment: 10 page
- …