
    A hyper-heuristic for adaptive scheduling in computational grids

    In this paper we present the design and implementation of a hyper-heuristic for efficiently scheduling independent jobs in computational grids. Efficient scheduling of jobs to grid resources depends on many parameters, among them the characteristics of the resources and jobs (such as computing capacity, consistency of computing, and workload). Moreover, these characteristics change over time due to the dynamic nature of the grid environment, so the planning of jobs to resources should be done adaptively. Existing ad hoc scheduling methods (batch and immediate mode) have shown their efficacy for certain types of resource and job characteristics. However, as stand-alone methods they are not able to produce the best planning of jobs to resources across different types of grid resources and job characteristics. In this work we have designed and implemented a hyper-heuristic that uses a set of ad hoc (immediate and batch mode) scheduling methods to schedule jobs to grid resources according to the grid and job characteristics. The hyper-heuristic is a high-level algorithm that examines the state and characteristics of the grid system (jobs and resources) and selects and applies the ad hoc method that yields the best planning of jobs. The resulting hyper-heuristic-based scheduler can thus be used to develop network-aware applications that need efficient planning of jobs to resources. The hyper-heuristic has been tested and evaluated in a dynamic setting through a prototype of a grid simulator. The experimental evaluation showed the usefulness of the hyper-heuristic for planning jobs to resources as compared to planning without knowledge of the resource and job characteristics.
    Peer reviewed. Postprint (author's final draft).
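    The selection mechanism described in this abstract (a high-level algorithm that examines a snapshot of the grid and applies whichever low-level ad hoc method yields the best planning) can be sketched in a few lines. The sketch below is not the authors' implementation: the data model, the example MCT method, and the use of makespan as the selection criterion are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical job and resource records; the field names are illustrative and
# not taken from the paper.
@dataclass
class Job:
    job_id: int
    workload: float            # estimated work, e.g. millions of instructions

@dataclass
class Resource:
    res_id: int
    capacity: float            # processing speed (work units per second)
    ready_time: float          # time at which the resource becomes free

Schedule = Dict[int, int]      # job_id -> res_id
AdHocMethod = Callable[[List[Job], List[Resource]], Schedule]

def makespan(plan: Schedule, jobs: List[Job], resources: List[Resource]) -> float:
    """Completion time of the busiest resource under a given plan."""
    finish = {r.res_id: r.ready_time for r in resources}
    speed = {r.res_id: r.capacity for r in resources}
    for job in jobs:
        rid = plan[job.job_id]
        finish[rid] += job.workload / speed[rid]
    return max(finish.values())

def mct(jobs: List[Job], resources: List[Resource]) -> Schedule:
    """One immediate-mode method: assign each job to the resource that gives
    the minimum completion time at the moment of assignment."""
    ready = {r.res_id: r.ready_time for r in resources}
    speed = {r.res_id: r.capacity for r in resources}
    plan: Schedule = {}
    for job in jobs:
        rid = min(ready, key=lambda r: ready[r] + job.workload / speed[r])
        ready[rid] += job.workload / speed[rid]
        plan[job.job_id] = rid
    return plan

def hyper_heuristic(jobs: List[Job], resources: List[Resource],
                    methods: Dict[str, AdHocMethod]) -> Schedule:
    """High-level selector: run every low-level method on the current grid
    snapshot and keep the plan with the smallest makespan."""
    best_plan, best_cost = None, float("inf")
    for method in methods.values():
        plan = method(jobs, resources)
        cost = makespan(plan, jobs, resources)
        if cost < best_cost:
            best_plan, best_cost = plan, cost
    return best_plan
```

    In a dynamic setting the selector would be re-run whenever the grid snapshot changes, so the chosen low-level method can differ from one scheduling event to the next.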

    Grid Computing: The Trend Of The Millenium

    A grid can be simply defined as a combination of different components which function collectively as part of one large electrical or electronic circuit. The term "Grid Computing" can similarly be applied to a large number of computers that connect together to collectively solve a problem (in most cases of scientific interest) of very high complexity and magnitude. The fundamental idea behind building any computer-based grid is to utilize the idle time of processor cycles. Simply stated, a processor that would otherwise stay idle teams up with similar idle processors to tackle various complexities. The role each processor plays is very carefully defined, and there is utmost transparency in the working of each processor/computer in a grid. This is called the "division of labor" in the smart world of intelligent computing. In layman's terms, this is equivalent to a student and his group of friends collectively solving a single assignment which contains more than a single problem: the solution is trivial but the effort is collective. A grid computing environment may take many forms; it could be molded as a cluster-based environment, a distributed computing environment, or a peer-to-peer system. In a cluster-based environment, a central computer, often called the "cluster head," distributes or maintains the job schedule for the other computers in the grid. A distributed environment is often seen on the web: for example, when a user requests a popular web page and the web server is experiencing traffic congestion, the user is re-routed to the same page on a different web server. The transition takes place so rapidly that the momentary delay due to server bottleneck problems is hardly felt. Peer-to-peer computing is best explained through music download engines: if a user has a file and decides to share it via the web, other users needing the same file copy it through their music download engines. With great computing power comes great responsibility. Security is of utmost importance in a grid environment. Since a grid performs large computations, data is assumed to be available at every node in the processing cycle, which increases the risk of data manipulation in various forms. We also have to keep in mind what happens to the data when a node fails. An ideal grid will have a small time of convergence and a low recovery time in case of a complete grid failure. By convergence, we mean that each and every processor node will have complete information about every other processor node in the grid. Recovery time is the time it takes for the grid to start from scratch after a major breakdown. To put things in the right perspective, a good grid-based computing environment will have an intelligent grid administrator to monitor user logs and scheduled jobs, and a good grid operating system tailor-made to suit the application of the grid. We propose a similarity between the OSI model and a grid model to elaborate the functions and utilities of a grid, and we also propose a queuing theory for grid computing. There have been numerous previous comparisons, and each is instructive in its own right, but a similarity with a network model adds more weight, since physically a grid is nothing but an interconnection, and an interconnection is best defined in relation to a computer network interconnection: how users access networked computers, how files are shared, and what the levels of security are, are all best explained through a networked computer system.
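    As a rough illustration of the cluster-based form mentioned in the abstract, the following sketch shows a central "cluster head" that keeps a job queue and hands work out to the other machines in the grid. The class and node names are hypothetical, and a real grid would dispatch jobs over the network rather than in-process.

```python
import itertools
from collections import deque
from typing import Iterable, List

class Worker:
    """One compute node in the cluster; here it simply tags the job it 'ran'."""
    def __init__(self, name: str):
        self.name = name

    def run(self, job: str) -> str:
        return f"{job} done on {self.name}"

class ClusterHead:
    """Central node that maintains the job schedule for the other computers."""
    def __init__(self, workers: Iterable[Worker]):
        self.workers = list(workers)
        self.queue = deque()

    def submit(self, job: str) -> None:
        self.queue.append(job)

    def dispatch(self) -> List[str]:
        """Hand out queued jobs to workers in round-robin order."""
        pool = itertools.cycle(self.workers)
        results = []
        while self.queue:
            results.append(next(pool).run(self.queue.popleft()))
        return results

head = ClusterHead([Worker("node-a"), Worker("node-b")])
for j in ["job-1", "job-2", "job-3"]:
    head.submit(j)
print(head.dispatch())  # ['job-1 done on node-a', 'job-2 done on node-b', 'job-3 done on node-a']
```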

    Bulk Scheduling with the DIANA Scheduler

    Results from the research and development of a Data Intensive and Network Aware (DIANA) scheduling engine, to be used primarily for data-intensive sciences such as physics analysis, are described. In Grid analyses, tasks can involve thousands of computing, data handling, and network resources. The central problem in the scheduling of these resources is the coordinated management of computation and data at multiple locations, and not just data replication or movement. However, this can prove to be a rather costly operation, and efficient scheduling can be a challenge if compute and data resources are mapped without considering network costs. We have implemented an adaptive algorithm within the so-called DIANA Scheduler which takes into account data location and size, network performance and computation capability in order to enable efficient global scheduling. DIANA is a performance-aware and economy-guided Meta Scheduler. It iteratively allocates each job to the site that is most likely to produce the best performance, while also optimizing the global queue for any remaining jobs. It is therefore equally suitable whether a single job is being submitted or bulk scheduling is being performed. Results indicate that considerable performance improvements can be gained by adopting the DIANA scheduling approach.
    Comment: 12 pages, 11 figures. To be published in the IEEE Transactions on Nuclear Science, IEEE Press. 200
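    The core idea in this abstract, iteratively allocating each job to the site that is most likely to perform best while accounting for data location and size, network performance and computation capability, can be illustrated with a simple cost model. The field names, weights, and the specific cost formula below are assumptions made for the sketch and are not the published DIANA formulation.

```python
from dataclasses import dataclass
from typing import Dict, List

# Illustrative site and job descriptions; all attributes are hypothetical.
@dataclass
class Site:
    name: str
    compute_power: float      # relative CPU capability (higher is better)
    queue_length: int         # jobs already waiting at the site
    bandwidth_mbps: float     # bandwidth from where the input data currently sits

@dataclass
class Job:
    name: str
    cpu_work: float           # normalized compute demand
    input_size_mb: float      # data that must be staged to the chosen site

def site_cost(job: Job, site: Site,
              w_net: float = 1.0, w_cpu: float = 1.0, w_queue: float = 0.5) -> float:
    """Combined cost: data transfer time + compute time + a queueing penalty."""
    transfer = job.input_size_mb * 8 / site.bandwidth_mbps   # seconds to move the data
    compute = job.cpu_work / site.compute_power
    return w_net * transfer + w_cpu * compute + w_queue * site.queue_length

def allocate(jobs: List[Job], sites: List[Site]) -> Dict[str, str]:
    """Greedy, iterative allocation: each job goes to its cheapest site, and that
    site's queue grows, which influences the decisions for later jobs."""
    plan: Dict[str, str] = {}
    for job in sorted(jobs, key=lambda j: j.input_size_mb, reverse=True):
        best = min(sites, key=lambda s: site_cost(job, s))
        plan[job.name] = best.name
        best.queue_length += 1
    return plan
```

    Because the queue-length term feeds back into the cost of later allocations, the same loop serves both a single job submission and a bulk submission, mirroring the behaviour the abstract claims for DIANA.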

    Efficient Resource Matching in Heterogeneous Grid Using Resource Vector

    In this paper, a method for efficient scheduling to obtain optimum job throughput in a distributed campus grid environment is presented. Traditional job schedulers determine job scheduling using user and job resource attributes. User attributes are related to current usage, historical usage, user priority and project access. Job resource attributes mainly comprise soft requirements (compilers, libraries) and hard requirements such as memory, storage and interconnect. A job scheduler dispatches a job to a resource if the job's hard and soft requirements are met by that resource. Currently, if a resource becomes unavailable during execution of a job, schedulers are left with limited options, namely re-queuing the job or migrating it to a different resource. Both options are expensive in terms of data and compute time. These situations can be avoided if an often-ignored factor, the availability time of a resource in the grid environment, is considered. We propose a resource-rank approach, in which a job is dispatched to the resource that has the highest rank among all resources matching the job's requirements. The results show that our approach can increase the throughput of many serial/monolithic jobs.
    Comment: 10 page
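    A minimal sketch of the dispatch rule described here (send the job to the highest-ranked resource among those that satisfy its hard and soft requirements, with rank driven by the resource's expected availability time) follows. All field names and the rank function are assumptions; the paper's actual resource-vector formulation is not spelled out in the abstract.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

# Hypothetical attribute names; the ranking below only illustrates the idea of
# favouring resources whose expected availability comfortably exceeds the
# job's runtime, so re-queuing or migration becomes unlikely.
@dataclass
class Resource:
    name: str
    memory_gb: float
    software: Set[str]            # installed compilers/libraries (soft requirements)
    availability_hours: float     # expected remaining time in the grid

@dataclass
class Job:
    name: str
    memory_gb: float
    needs: Set[str]
    runtime_hours: float

def matches(job: Job, res: Resource) -> bool:
    """Both hard (memory) and soft (software) requirements must be met."""
    return res.memory_gb >= job.memory_gb and job.needs <= res.software

def rank(job: Job, res: Resource) -> float:
    """Higher rank means more spare availability beyond the job's runtime."""
    return res.availability_hours - job.runtime_hours

def dispatch(job: Job, resources: List[Resource]) -> Optional[str]:
    candidates = [r for r in resources if matches(job, r)]
    if not candidates:
        return None               # no matching resource; the job stays queued
    return max(candidates, key=lambda r: rank(job, r)).name
```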