PGGA: A predictable and grouped genetic algorithm for job scheduling
This paper presents a predictable and grouped genetic algorithm (PGGA) for job scheduling. The novelty of the PGGA is twofold: (1) a job workload estimation algorithm is designed to estimate a job's workload from its historical execution records, and (2) the divisible load theory (DLT) is employed to predict an optimal fitness value, which the PGGA uses to speed up convergence when searching a large scheduling space. Comparison with traditional scheduling methods such as first-come-first-serve (FCFS) and random scheduling, and with heuristics such as a typical genetic algorithm, Min-Min, and Max-Min, indicates that the PGGA is more effective and efficient in finding optimal scheduling solutions.
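For orientation, the Min-Min baseline named in the abstract above can be sketched as follows. This is a minimal illustration of the textbook heuristic, not code from the paper; the function name `min_min_schedule` and the `etc` matrix (estimated time to compute each job on each machine) are assumptions made for the example.

```python
def min_min_schedule(etc):
    """Textbook Min-Min heuristic (illustrative sketch, not the paper's code).

    etc[j][m] is the assumed estimated time to compute job j on machine m.
    Returns (assignment, makespan), where assignment maps job -> machine.
    """
    n_machines = len(etc[0])
    ready = [0.0] * n_machines          # time at which each machine becomes free
    unscheduled = set(range(len(etc)))
    assignment = {}

    while unscheduled:
        best = None                     # (completion_time, job, machine)
        for j in sorted(unscheduled):
            # Machine that gives job j its earliest completion time.
            m = min(range(n_machines), key=lambda k: ready[k] + etc[j][k])
            ct = ready[m] + etc[j][m]
            # Keep the job whose earliest completion time is the smallest.
            if best is None or ct < best[0]:
                best = (ct, j, m)
        ct, j, m = best
        assignment[j] = m
        ready[m] = ct
        unscheduled.remove(j)

    return assignment, max(ready)
```

Max-Min differs only in the selection step: among the per-job earliest completion times, it schedules the job with the largest one first.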
ALOHA: A Unified Platform-Aware Evaluation Method for CNNs Execution on Heterogeneous Systems at the Edge
CNN design and deployment on embedded edge-processing systems is an error-prone and effort-hungry process that poses the need for accurate and effective automated assisting tools. In such tools, pre-evaluating the platform-aware CNN metrics such as latency, energy cost, and throughput is a key requirement for successfully reaching the implementation goals imposed by use-case constraints. Especially when more complex parallel and heterogeneous computing platforms are considered, currently utilized estimation methods are inaccurate or require a lot of characterization experiments and efforts. In this paper, we propose an alternative method, designed to be flexible, easy to use, and accurate at the same time. Considering a modular platform and execution model that adequately describes the details of the platform and the scheduling of different CNN operators on different platform processing elements, our method captures precisely operations and data transfers and their deployment on computing and communication resources, significantly improving the evaluation accuracy. We have tested our method on more than 2000 CNN layers, targeting an FPGA-based accelerator and a GPU platform as reference example architectures. Results have shown that our evaluation method increases the estimation precision by up to 5× for execution time, and by 2× for energy, compared to other widely used analytical methods. Moreover, we assessed the impact of the improved platform-awareness on a set of neural architecture search experiments, targeting both hardware platforms, and enforcing 2 sets of latency constraints, performing 5 trials on each search space, for a total number of 20 experiments. The predictability is improved by 4×, reaching, with respect to alternatives, selection results clearly more similar to those obtained with on-hardware measurements.
A Taxonomy of Workflow Management Systems for Grid Computing
With the advent of Grid and application technologies, scientists and
engineers are building more and more complex applications to manage and process
large data sets, and execute scientific experiments on distributed resources.
Such application scenarios require means for composing and executing complex
workflows. Therefore, many efforts have been made towards the development of
workflow management systems for Grid computing. In this paper, we propose a
taxonomy that characterizes and classifies various approaches for building and
executing workflows on Grids. We also survey several representative Grid
workflow systems developed by various projects world-wide to demonstrate the
comprehensiveness of the taxonomy. The taxonomy not only highlights the design
and engineering similarities and differences of state-of-the-art in Grid
workflow systems, but also identifies the areas that need further research. Comment: 29 pages, 15 figures
D-SPACE4Cloud: A Design Tool for Big Data Applications
The last years have seen a steep rise in data generation worldwide, with the
development and widespread adoption of several software projects targeting the
Big Data paradigm. Many companies currently engage in Big Data analytics as
part of their core business activities; nonetheless, there are no tools and
techniques to support the design of the underlying hardware configuration
backing such systems. In particular, the focus in this report is set on Cloud
deployed clusters, which represent a cost-effective alternative to on-premises
installations. We propose a novel tool implementing a battery of optimization
and prediction techniques integrated so as to efficiently assess several
alternative resource configurations, in order to determine the minimum cost
cluster deployment satisfying QoS constraints. Further, the experimental
campaign conducted on real systems shows the validity and relevance of the
proposed method.
Dynamic Energy Management for Chip Multi-processors under Performance Constraints
We introduce a novel algorithm for dynamic energy management (DEM) under performance constraints in chip multi-processors (CMPs). Using the novel concept of delayed instructions count, performance loss estimations are calculated at the end of each control period for each core. In addition, a Kalman-filtering-based approach is employed to predict the workload in the next control period, for which voltage-frequency pairs must be selected. This selection is done with a novel dynamic voltage and frequency scaling (DVFS) algorithm whose objective is to reduce energy consumption without degrading performance beyond the user-set threshold. Using our customized Sniper-based CMP system simulation framework, we demonstrate the effectiveness of the proposed algorithm for a variety of benchmarks for 16-core and 64-core network-on-chip based CMP architectures. Simulation results show consistent energy savings across the board. We present our work as an investigation of the tradeoff between the achievable energy reduction via DVFS when predictions are done using the effective Kalman filter for different performance penalty thresholds.
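The prediction step described in the abstract above can be illustrated with a scalar Kalman filter. The sketch below is not the authors' algorithm: it assumes a random-walk workload model, and the names `ScalarKalman` and `pick_frequency`, the noise variances `q` and `r`, and the rule of choosing the lowest frequency that covers the predicted cycles are all hypothetical choices made for the example. The paper's delayed-instructions-count metric is not modeled.

```python
class ScalarKalman:
    """One-dimensional Kalman filter with an assumed random-walk workload model."""

    def __init__(self, x0, p0, q, r):
        self.x = x0   # current workload estimate (e.g. cycles per control period)
        self.p = p0   # variance of the estimate
        self.q = q    # assumed process-noise variance
        self.r = r    # assumed measurement-noise variance

    def step(self, z):
        """Fold in measurement z and return the prediction for the next period."""
        # Predict: a random walk keeps the mean and inflates the variance.
        x_pred = self.x
        p_pred = self.p + self.q
        # Update: blend prediction and measurement using the Kalman gain.
        k = p_pred / (p_pred + self.r)
        self.x = x_pred + k * (z - x_pred)
        self.p = (1.0 - k) * p_pred
        # Under the random-walk model, the next-period forecast is the estimate.
        return self.x


def pick_frequency(predicted_cycles, period_s, freqs_hz):
    """Hypothetical DVFS rule: lowest frequency that covers the predicted cycles."""
    for f in sorted(freqs_hz):
        if f * period_s >= predicted_cycles:
            return f
    return max(freqs_hz)  # fall back to the fastest setting if none suffices
```

In use, `step` would be called once per control period with the measured workload of each core, and its return value passed to `pick_frequency` together with the period length and the available frequency levels.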