    Fuzzy C-Mean And Genetic Algorithms Based Scheduling For Independent Jobs In Computational Grid

    Grid computing has become an important research area in high-performance computing. Job scheduling in the Grid is a complicated problem: a scheduler must discover a diverse set of available resources, select appropriate applications, and map jobs to suitable resources. The central challenge is optimal job scheduling, in which Grid nodes must allocate appropriate resources to each job. In this paper, we combine two popular algorithms, Fuzzy C-Means and Genetic Algorithms, for Grid scheduling. Our model classifies jobs using the Fuzzy C-Means algorithm and maps them to appropriate resources using a Genetic Algorithm. In our experiments, we fed historical workload information into our simulator and obtained better results than traditional scheduling policies. Finally, the paper discusses our approach to job classification and the optimization engine in Grid scheduling.
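
    The two-phase idea above can be illustrated with a short sketch: a small Fuzzy C-Means routine that classifies jobs by their features, and a simple genetic algorithm that evolves job-to-resource assignments to minimise makespan. This is a minimal illustration under stated assumptions, not the paper's implementation; the job features, resource speeds, GA operators and the makespan fitness are all illustrative choices.

```python
# Minimal sketch, not the paper's implementation: phase 1 classifies jobs
# with Fuzzy C-Means; phase 2 maps jobs to resources with a simple GA.
# Job features, resource speeds and GA settings are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def fuzzy_c_means(X, c=3, m=2.0, iters=50):
    """Return fuzzy membership matrix U (n x c) and cluster centers."""
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)              # each row sums to 1
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-9
        U = 1.0 / d ** (2.0 / (m - 1.0))           # standard FCM update
        U /= U.sum(axis=1, keepdims=True)
    return U, centers

def makespan(assign, lengths, speeds):
    """Completion time of the most loaded resource for a job->resource map."""
    loads = np.zeros(len(speeds))
    for job, res in enumerate(assign):
        loads[res] += lengths[job] / speeds[res]
    return loads.max()

def ga_schedule(lengths, speeds, pop=40, gens=100, mut=0.1):
    """Evolve job->resource assignments that minimise makespan."""
    n, r = len(lengths), len(speeds)
    P = rng.integers(0, r, size=(pop, n))
    for _ in range(gens):
        fit = np.array([makespan(ind, lengths, speeds) for ind in P])
        P = P[np.argsort(fit)]                     # elitism: best half survives
        children = []
        for _ in range(pop // 2):
            a, b = P[rng.integers(0, pop // 2, 2)] # parents from the best half
            cut = rng.integers(1, n)               # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n) < mut             # random reassignment mutation
            child[flip] = rng.integers(0, r, flip.sum())
            children.append(child)
        P[pop // 2:] = children
    fit = np.array([makespan(ind, lengths, speeds) for ind in P])
    return P[fit.argmin()], fit.min()

# Toy workload: 30 jobs described by (CPU length, memory demand) features.
jobs = rng.uniform(1, 100, size=(30, 2))
U, _ = fuzzy_c_means(jobs)          # phase 1: fuzzy job classes (in the full
                                    # model these would steer the mapping)
speeds = np.array([1.0, 2.0, 4.0])  # three resources with relative speeds
best, span = ga_schedule(jobs[:, 0], speeds)       # phase 2: GA mapping
print("best makespan:", round(float(span), 2))
```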

    Predicting Workflow Task Execution Time in the Cloud using A Two-Stage Machine Learning Approach

    Many techniques such as scheduling and resource provisioning rely on performance prediction of workflow tasks for varying input data. However, such estimates are difficult to generate in the cloud. This paper introduces a novel two-stage machine learning approach for predicting workflow task execution times in the cloud under varying input data. To achieve highly accurate predictions, our approach relies on parameters reflecting runtime information and on two stages of prediction. Empirical results for four real-world workflow applications and several commercial cloud providers demonstrate that our approach outperforms existing prediction methods. In our experiments, our approach achieves best-case and worst-case estimation errors of 1.6% and 12.2%, respectively, while existing methods exceed 20% error (in some cases even 50%) for more than 75% of the evaluated workflow tasks. In addition, we show that the models our approach builds for a specific cloud can be ported to new clouds with low effort and low error, requiring only a small number of executions.
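
    As a rough illustration of the two-stage idea, the sketch below first learns a runtime parameter (a hypothetical "bytes processed" metric) from input features, then feeds that estimate, together with the input features, into a second model that predicts execution time. The features, models and synthetic data are assumptions for illustration, not the paper's actual design.

```python
# Illustrative two-stage chain: stage 1 estimates a runtime parameter from
# input features; stage 2 predicts execution time from the inputs plus that
# estimate. All data here is synthetic and the feature names are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)

# Synthetic history: each task execution is (input_size_MB, vm_cores).
X = np.column_stack([rng.uniform(10, 1000, 500), rng.integers(1, 9, 500)])
bytes_processed = 3.2 * X[:, 0] + rng.normal(0, 20, 500)    # runtime metric
exec_time = bytes_processed / (40 * X[:, 1]) + rng.normal(0, 0.5, 500)

# Stage 1: input features -> runtime parameter.
stage1 = RandomForestRegressor(n_estimators=100, random_state=0)
stage1.fit(X, bytes_processed)

# Stage 2: input features + predicted runtime parameter -> execution time.
X2 = np.column_stack([X, stage1.predict(X)])
stage2 = RandomForestRegressor(n_estimators=100, random_state=0)
stage2.fit(X2, exec_time)

# Predict for an unseen task: 300 MB of input on a 4-core VM.
query = np.array([[300.0, 4.0]])
query2 = np.column_stack([query, stage1.predict(query)])
print("predicted execution time (s):", stage2.predict(query2)[0])
```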

    Seer: a lightweight online failure prediction approach

    Online failure prediction aims to predict the manifestation of failures at runtime, before the failures actually occur. Existing online failure prediction approaches typically operate on data that is either directly reported by the system under test or directly observable from outside the system's executions. These approaches generally refrain from collecting internal execution data that could further improve prediction quality, in part because of the runtime overhead incurred by the measurement instruments required to collect such data. In this work we conjecture that large reductions in the cost of collecting internal execution data for online failure prediction can come from reducing the cost of the measurement instruments, while still supporting acceptable levels of prediction quality. To evaluate this conjecture, we present a lightweight online failure prediction approach called Seer. Seer uses fast hardware performance counters to perform most of the data collection work. This data is augmented with data collected by a minimal amount of software instrumentation added to the system's software; we refer to data collected in this manner as hybrid spectra. We applied the proposed approach to three widely used open-source subject applications and evaluated it by comparing and contrasting three types of hybrid spectra and two types of traditional software spectra. At the lowest level of runtime overhead attained in the experiments, the hybrid spectra predicted failures about halfway through the executions with an average F-measure of 0.77 and an average runtime overhead of 1.98%. Comparing hybrid spectra to software spectra, we observed that, for comparable runtime overhead levels, the hybrid spectra provided significantly better prediction accuracy and earlier warnings of failures; for comparable accuracy levels, the hybrid spectra incurred significantly less runtime overhead and provided earlier warnings.
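
    The classification step on hybrid spectra can be sketched as follows: each execution is summarised as a vector of hardware-counter readings plus a software-instrumentation flag, and a classifier is trained to flag likely failures, scored by F-measure. The counter values below are synthetic stand-ins; a real deployment would sample counters via an interface such as perf_event_open or PAPI, and Seer's actual spectra and learner may differ.

```python
# Sketch of failure classification over hybrid spectra. Counter values are
# synthetic stand-ins for real hardware-counter samples; the feature names
# and the learning algorithm are assumptions, not Seer's actual design.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)

n = 2000
# Hypothetical features: [branch_misses, cache_misses, instructions, sw_flag].
X = rng.normal(size=(n, 4))
# Failing runs skew the counter mix (a purely illustrative signal).
y = (0.8 * X[:, 0] + 0.6 * X[:, 1] + rng.normal(0, 0.7, n) > 1.0).astype(int)

Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(Xtr, ytr)

# F-measure is the quality metric the study reports.
print("F-measure:", round(f1_score(yte, clf.predict(Xte)), 2))
```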

    Large-scale analysis of neuroimaging data on commercial clouds with content-aware resource allocation strategies

    The combined use of mice carrying genetic mutations that model human pathology (transgenic mouse models) and advanced neuroimaging methods (such as magnetic resonance imaging) has the potential to radically change how we approach disease understanding, diagnosis and treatment. Morphological changes occurring in the brain of transgenic animals as a result of the interaction between environment and genotype can be assessed using advanced image analysis methods, an effort described as 'mouse brain phenotyping'. However, the computational methods involved in the analysis of high-resolution brain images are demanding. While running such analyses on local clusters is possible, not all users have access to such infrastructure, and even those who do can benefit from additional computational capacity (e.g. to meet sudden high-throughput demands). In this paper we use a commercial cloud platform for brain neuroimaging and analysis. We complete a registration-based multi-atlas, multi-template anatomical segmentation, normally a time-consuming effort, within a few hours. Naturally, performing such analyses on the cloud entails a monetary cost, so it is worthwhile to identify strategies that allocate resources intelligently. In our context a critical aspect is estimating how long each job will take. We propose a method that estimates the complexity of an image-processing task, a registration, using statistical moments and shape descriptors of the image content, and uses this information to learn and predict the completion time of a registration. The proposed approach is easy to deploy and could serve as an alternative for laboratories that require instant access to large high-performance computing infrastructures. To facilitate adoption by the community, we publicly release the source code.
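
    A minimal sketch of the content-aware estimator: summarise each image with cheap statistical moments and regress registration completion time on those descriptors. The volumes, timings and choice of regressor below are illustrative assumptions; the paper's method also uses shape descriptors, which are omitted here.

```python
# Sketch: statistical moments of the image content feed a regressor that
# predicts registration completion time. Data and timings are synthetic.
import numpy as np
from scipy.stats import kurtosis, skew
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)

def moment_features(img):
    """Mean, std, skewness and kurtosis of voxel intensities."""
    v = img.ravel()
    return np.array([v.mean(), v.std(), skew(v), kurtosis(v)])

# Synthetic 'brain volumes' with completion times that grow with complexity.
imgs = [rng.normal(rng.uniform(0, 5), rng.uniform(1, 3), size=(32, 32, 32))
        for _ in range(200)]
X = np.array([moment_features(im) for im in imgs])
t = 60 + 15 * X[:, 1] + 5 * np.abs(X[:, 2]) + rng.normal(0, 3, 200)

model = Ridge().fit(X, t)           # learn completion time from descriptors
new_img = rng.normal(2.0, 2.5, size=(32, 32, 32))
pred = model.predict(moment_features(new_img)[None, :])[0]
print("predicted registration time (s):", round(float(pred), 1))
```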

    An Effort Prediction Framework for Software Defect Correction

    Developers apply changes and updates to software systems to adapt to emerging environments and address new requirements. In turn, these changes introduce additional software defects, usually caused by our inability to comprehend the full scope of the modified code. As a result, software practitioners have developed tools to aid in the detection and prediction of imminent software defects, in addition to the effort required to correct them. Although software development effort prediction has been in use for many years, research into defect-correction effort prediction is relatively new. The increasing complexity, integration and ubiquitous nature of current software systems has sparked renewed interest in this field. Effort prediction now plays a critical role in the planning activities of managers. Accurate predictions help corporations budget, plan and distribute available resources effectively and efficiently. In particular, early defect-correction effort predictions could be used by testers to set schedules, and by managers to plan costs and provide earlier feedback to customers about future releases. In this work, we address the problem of predicting the effort needed to resolve a software defect. More specifically, our study is concerned with defects or issues that are reported in an issue tracking system or any other defect repository. Current approaches use a single prediction method or technique to produce effort predictions. This approach usually suffers from the weaknesses of the chosen prediction method, and consequently the accuracy of the predictions is affected. To address this problem, we present a composite prediction framework. Rather than using one prediction approach for all defects, we propose the use of multiple integrated methods that complement one another's weaknesses. Our framework is divided into two sub-categories, Similarity-Score Dependent and Similarity-Score Independent. The Similarity-Score Dependent method utilizes the power of Case-Based Reasoning, also known as Instance-Based Reasoning, to compute predictions. It relies on matching target issues to similar historical cases, then combines their known effort to form an informed estimate. The Similarity-Score Independent method, on the other hand, makes use of other defect-related information, with some statistical manipulation, to produce the required estimate. To measure similarity between defects, some method of distance calculation must be used. In some cases, this method might produce misleading results due to observed inconsistencies in the history and the fact that current similarity-scoring techniques cannot account for all the variability in the data. In such cases, the Similarity-Score Independent method can be used to estimate the effort, reducing the effect of these inconsistencies. We performed a number of experimental studies on the proposed framework to assess the effectiveness of the presented techniques. We extracted the data sets from an operational issue tracking system in order to test the validity of the model on real project data. These studies involved the development of multiple tools, in both the Java programming language and PHP, each for a certain stage of data analysis and manipulation. The results show that our proposed approach produces significant improvements when compared to current methods.
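
    The Similarity-Score Dependent path can be sketched as nearest-neighbour Case-Based Reasoning: retrieve the k historical issues most similar to a new report and combine their known efforts, weighted by similarity. The issue texts, the TF-IDF similarity measure and the weighting below are illustrative assumptions rather than the framework's actual scoring.

```python
# Nearest-neighbour CBR sketch: the k most similar historical issues (TF-IDF
# over the report text) vote on the effort, weighted by similarity. Issue
# texts, efforts and the similarity measure are illustrative assumptions.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

history = [                                      # (description, effort hours)
    ("null pointer crash when saving report", 5.0),
    ("crash on save with empty filename", 4.0),
    ("UI label overlaps button on resize", 1.5),
    ("memory leak in report exporter", 8.0),
]
texts, efforts = zip(*history)

vec = TfidfVectorizer()
M = vec.fit_transform(texts)                     # index the historical cases

def predict_effort(description, k=2):
    """Similarity-weighted mean effort of the k nearest historical issues."""
    sims = cosine_similarity(vec.transform([description]), M).ravel()
    top = np.argsort(sims)[::-1][:k]
    w = sims[top] + 1e-9                         # guard against zero weights
    return float(np.dot(w, np.array(efforts)[top]) / w.sum())

print(predict_effort("application crashes when saving a report"))
```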

    Exploring Scheduling for On-demand File Systems and Data Management within HPC Environments
