
    Fuzzy C-Mean And Genetic Algorithms Based Scheduling For Independent Jobs In Computational Grid

    Grid computing has become an important research area in high performance computing. Job scheduling on a Grid is complicated: the scheduler must discover a diverse set of available resources, select the appropriate applications, and map them to suitable resources. The central problem is optimal job scheduling, in which Grid nodes must allocate the appropriate resources to each job. In this paper, we combine two popular algorithms, Fuzzy C-Means and Genetic Algorithms, for Grid scheduling. Our model classifies jobs using the Fuzzy C-Means algorithm and maps jobs to the appropriate resources using a Genetic Algorithm. In our experiments, we fed historical workload information into our simulator and obtained better results than traditional scheduling policies. Finally, the paper discusses our approach to job classification and the optimization engine in Grid scheduling.
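
    As a concrete illustration of the two-stage pipeline this abstract describes, here is a minimal Python sketch: Fuzzy C-Means groups synthetic jobs by resource demand, and a toy selection-plus-mutation genetic algorithm maps them to grid nodes while minimizing makespan. The job features, node speeds, and GA parameters are all invented for illustration; this is not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)
jobs = rng.random((40, 2))                 # 40 jobs: (cpu demand, mem demand)
node_speed = np.array([1.0, 2.0, 4.0])     # 3 grid nodes, relative speeds

# Stage 1: Fuzzy C-Means (c = 3 clusters, fuzzifier m = 2)
c, m = 3, 2.0
U = rng.random((40, c))
U /= U.sum(axis=1, keepdims=True)          # membership rows sum to 1
for _ in range(100):
    centers = (U.T ** m @ jobs) / (U.T ** m).sum(axis=1, keepdims=True)
    d = np.linalg.norm(jobs[:, None, :] - centers[None, :, :], axis=2) + 1e-9
    inv = d ** (-2.0 / (m - 1.0))          # standard FCM membership update
    U = inv / inv.sum(axis=1, keepdims=True)
cluster = U.argmax(axis=1)                 # hard class label per job
print("job class sizes:", np.bincount(cluster))

# Stage 2: a toy GA maps jobs to nodes, minimizing makespan
def makespan(assign):
    load = np.zeros(len(node_speed))
    np.add.at(load, assign, jobs[:, 0])    # sum cpu demand per node
    return (load / node_speed).max()       # slowest node finishes last

pop = rng.integers(0, 3, size=(30, 40))    # 30 candidate job->node mappings
for _ in range(200):
    fit = np.array([makespan(p) for p in pop])
    parents = pop[fit.argsort()[:10]]      # keep the 10 best mappings
    pop = parents[rng.integers(0, 10, 30)].copy()
    mutate = rng.random(pop.shape) < 0.05  # point mutation
    pop[mutate] = rng.integers(0, 3, mutate.sum())
print("best makespan:", min(makespan(p) for p in pop))
```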

    Workload characterization of the shared/buy-in computing cluster at Boston University

    Computing clusters provide a complete environment for computational research, including bioinformatics, machine learning, and image processing. The Shared Computing Cluster (SCC) at Boston University is based on a shared/buy-in architecture that combines shared computers, which are free to be used by all users, and buy-in computers, which are purchased by individual users for semi-exclusive use. Although there exists significant work on characterizing the performance of computing clusters, little is known about shared/buy-in architectures. Using data traces, we statistically analyze the performance of the SCC. Our results show that the average waiting time of a buy-in job is 16.1% shorter than that of a shared job. Furthermore, we identify parameters that have a major impact on the performance experienced by shared and buy-in jobs, including the type of parallel environment and the run time limit (i.e., the maximum time during which a job can use a resource). Finally, we show that the semi-exclusive paradigm, which allows any SCC user to use idle buy-in resources for a limited time, increases the utilization of buy-in resources by 17.4%, thus significantly improving the performance of the system as a whole.
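
    The waiting-time comparison above reduces to a simple group-by over the job trace. A hypothetical sketch in Python/pandas follows; the column names (job_class, submit_ts, start_ts) and the toy numbers are assumptions, since the abstract does not specify the SCC trace format.

```python
import pandas as pd

# Toy stand-in for a scheduler trace: one row per job
trace = pd.DataFrame({
    "job_class": ["shared", "buyin", "shared", "buyin"],
    "submit_ts": [0.0, 10.0, 20.0, 30.0],   # submission time (s)
    "start_ts":  [5.0, 12.0, 40.0, 33.0],   # start of execution (s)
})

trace["wait"] = trace["start_ts"] - trace["submit_ts"]
mean_wait = trace.groupby("job_class")["wait"].mean()
print(mean_wait)

# Relative reduction of buy-in waiting time versus shared
print("buy-in vs shared:", 1 - mean_wait["buyin"] / mean_wait["shared"])
```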

    Characterizing and Subsetting Big Data Workloads

    Big data benchmark suites must include a diversity of data and workloads to be useful in fairly evaluating big data systems and architectures. However, using truly comprehensive benchmarks poses great challenges for the architecture community. First, we need to thoroughly understand the behaviors of a variety of workloads. Second, our usual simulation-based research methods become prohibitively expensive for big data. As big data is an emerging field, more and more software stacks are being proposed to facilitate the development of big data applications, which aggravates these challenges. In this paper, we first use Principal Component Analysis (PCA) to identify the most important of 45 metrics for characterizing big data workloads from BigDataBench, a comprehensive big data benchmark suite. Second, we apply a clustering technique to the principal components obtained from the PCA to investigate the similarity among big data workloads, and we verify the importance of including different software stacks for big data benchmarking. Third, we select seven representative big data workloads by removing redundant ones and release the BigDataBench simulation version, which is publicly available from http://prof.ict.ac.cn/BigDataBench/simulatorversion/.
    Comment: 11 pages, 6 figures, 2014 IEEE International Symposium on Workload Characterization
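
    The PCA-then-cluster methodology is standard enough to sketch with scikit-learn. The snippet below runs on synthetic data standing in for the 45 metrics; the metric values, the choice of K-means, and the nearest-to-centroid rule for picking representatives are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
metrics = rng.random((25, 45))          # 25 workloads x 45 metrics (synthetic)

X = StandardScaler().fit_transform(metrics)    # normalize each metric
pcs = PCA(n_components=0.9).fit_transform(X)   # keep 90% of the variance

# Cluster workloads in PC space, then take the workload nearest each
# centroid as that cluster's representative
km = KMeans(n_clusters=7, n_init=10, random_state=0).fit(pcs)
reps = [int(np.linalg.norm(pcs - c, axis=1).argmin())
        for c in km.cluster_centers_]
print("representative workloads:", sorted(set(reps)))
```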

    A study on performance measures for auto-scaling CPU-intensive containerized applications

    Autoscaling of containers can leverage performance measures from different layers of the computational stack. This paper investigates the problem of selecting the most appropriate performance measure for triggering auto-scaling actions that aim to guarantee QoS constraints. First, we analyze, under different workload scenarios, the correlation between absolute and relative usage measures and how each can influence resource allocation decisions. Absolute and relative measures can assume quite different values: the former account for the actual utilization of resources in the host system, while the latter account for the share that each container has of the resources it uses. Then, the performance of a variant of Kubernetes' auto-scaling algorithm, which transparently uses absolute usage measures to scale containers in and out, is evaluated through a wide set of experiments. Finally, a detailed analysis of the state of the art is presented.
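
    To make the absolute-versus-relative distinction concrete, here is a small sketch of a Kubernetes-HPA-style proportional scaling rule fed each kind of measure in turn. The utilization numbers and the 0.7 target are invented, and the paper's variant is described only at the level of "uses absolute measures", so this is an illustration rather than its algorithm.

```python
import math

def desired_replicas(current: int, observed: float, target: float) -> int:
    # Standard HPA proportionality: scale replicas by observed/target
    return max(1, math.ceil(current * observed / target))

cpu_used_cores, cpu_limit_cores, host_cores = 1.8, 2.0, 16.0
relative = cpu_used_cores / cpu_limit_cores   # 0.90 of the container limit
absolute = cpu_used_cores / host_cores        # 0.11 of the host capacity

# The same workload triggers opposite decisions depending on the measure:
print(desired_replicas(3, relative, target=0.7))  # relative -> scale out to 4
print(desired_replicas(3, absolute, target=0.7))  # absolute -> scale in to 1
```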