53 research outputs found

    Dynamic energy-aware scheduling for parallel task-based application in cloud computing

    Green computing is a recent trend in computer science that seeks to reduce the energy consumption and carbon footprint of computers on distributed platforms such as clusters, grids, and clouds. Traditional scheduling solutions attempt to minimize processing times without taking the energy cost into account. One way to reduce energy consumption is to provide scheduling policies that allocate tasks to specific resources, a choice that affects both processing times and energy consumption. In this paper, we propose a real-time dynamic scheduling system that executes task-based applications efficiently on distributed computing platforms while minimizing energy consumption. Because scheduling tasks on multiprocessors is a well-known NP-hard problem for which computing optimal solutions is not feasible, we present a polynomial-time algorithm that combines a set of heuristic rules with a resource allocation technique to obtain good solutions on an affordable time scale. The proposed algorithm minimizes a multi-objective function that combines energy consumption and execution time according to an energy-performance importance factor provided by the resource provider or user. It also takes into account sequence-dependent setup times between tasks, setup times and down times for virtual machines (VMs), and energy profiles for different architectures. A prototype implementation of the scheduler has been tested with different kinds of randomly generated DAGs as well as with real task-based COMPSs applications. We have tested the system with instances of different sizes and with different importance factors, and we have evaluated which combination provides a better solution and greater energy savings. Moreover, we have evaluated the overhead introduced by measuring the time needed to obtain scheduling solutions for different numbers of tasks, kinds of DAG, and resources, concluding that our method is suitable for run-time scheduling.
    This work has been supported by the Spanish Government (contracts TIN2015-65316-P, TIN2012-34557, CSD2007-00050, CAC2007-00052 and SEV-2011-00067), by Generalitat de Catalunya (contract 2014-SGR-1051), by the European Commission (Euroserver project, contract 610456) and by Consejo Nacional de Ciencia y Tecnología of Mexico (special program for postdoctoral position BSC-CNS-CONACYT contract 290790, grant number 265937). Peer reviewed. Award-winning. Postprint (published version).

    Cilk : efficient multithreaded computing

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998. Includes bibliographical references (p. 170-179). By Keith H. Randall.

    Ant Colony Heuristic for Mapping and Scheduling Tasks and Communications on Heterogeneous Embedded Systems

    To exploit the power of modern heterogeneous multiprocessor embedded platforms on partitioned applications, the designer usually needs to map and schedule all the tasks and communications of the application efficiently, respecting the constraints imposed by the target architecture. Since the problem is heavily constrained, common methods used to explore such a design space usually fail, obtaining low-quality solutions. In this paper, we propose an ant colony optimization (ACO) heuristic that, given a model of the target architecture and the application, efficiently performs both scheduling and mapping to optimize the application performance. We compare our approach with several other heuristics, including simulated annealing, tabu search, and genetic algorithms, in terms of their ability to reach the optimum value and their potential to explore the design space. We show that our approach obtains better results than the other heuristics by at least 16% on average, despite an overhead in execution time. Finally, we validate the approach by scheduling and mapping a JPEG encoder on a realistic target architecture.

    Analysis and Approximation of Optimal Co-Scheduling on CMP

    In recent years, increasing design complexity and the problems of power and heat dissipation have caused a shift in processor technology in favor of Chip Multiprocessors (CMP). In a CMP architecture, it is common for multiple cores to share some on-chip cache. This sharing may cause cache thrashing and contention among co-running jobs. Job co-scheduling tackles the problem by assigning jobs to cores so that the contention and the consequent performance degradation are minimized. This dissertation addresses two of the most prominent challenges in job co-scheduling. The first challenge is the computational complexity of determining optimal job co-schedules. This dissertation presents one of the first systematic analyses of the complexity of job co-scheduling. Besides proving the NP-completeness of job co-scheduling, it introduces a set of algorithms, based on graph theory and Integer/Linear Programming, for computing optimal co-schedules or their lower bounds in scenarios with or without job migrations. For complex cases, it proposes several heuristics-based algorithms and empirically demonstrates that they can approximate the optimal schedules effectively. These discoveries facilitate the assessment of job co-schedulers by providing the necessary baselines, and offer insights into the development of practical co-scheduling systems. The second challenge is predicting the performance of processes co-running on a shared cache. This dissertation explores how co-runners, program inputs, and cache configurations influence co-run performance prediction. Through a sequence of formal analyses, we derive an analytical co-run locality model, uncovering the inherent statistical connections between the data references of programs' single runs and their co-run locality. The model offers theoretical insights into co-run locality analysis and leads to a lightweight approach for fast prediction of shared cache performance. We demonstrate the effectiveness of the model in enabling proactive job co-scheduling. Together, these two sets of findings open up many new opportunities for cache management on modern CMP by laying the foundation for job co-scheduling and significantly enhancing the understanding of data locality and cache sharing.

    Working Notes from the 1992 AAAI Spring Symposium on Practical Approaches to Scheduling and Planning

    The symposium presented issues involved in the development of scheduling systems that can deal with resource and time limitations. To qualify, a system must be implemented and tested to some degree on non-trivial problems (ideally, on real-world problems). However, a system need not be fully deployed to qualify. Systems that schedule actions in terms of metric time constraints typically represent and reason about an external numeric clock or calendar and can be contrasted with those systems that represent time purely symbolically. The following topics are discussed: integrating planning and scheduling; integrating symbolic goals and numerical utilities; managing uncertainty; incremental rescheduling; managing limited computation time; anytime scheduling and planning algorithms, systems; dependency analysis and schedule reuse; management of schedule and plan execution; and incorporation of discrete event techniques

    Worst-case delay analysis of real-time switched Ethernet networks with flow local synchronization

    Full-duplex switched Ethernet is a promising candidate for interconnecting real-time industrial applications. But because an IEEE 802.1d switch is non-deterministic, the worst-case delay analysis of critical flows carried by such a network is still an open problem. Several methods have been proposed for upper-bounding communication delays on a real-time switched Ethernet network, assuming that the incoming traffic can be upper bounded. The main remaining problem is the pessimism, i.e. the lack of tightness, introduced by the method that computes this upper bound on the communication delay. These methods consider that all flows transmitted over the network are independent. This is true for flows emitted by different source nodes since, in general, there is no global clock synchronizing them. But flows emitted by the same source node can be assumed to be locally synchronized. Such an assumption helps to build a more precise flow model that eliminates some impossible communication scenarios, which would otherwise lead to pessimistic delay upper bounds. The core of this thesis is to study how local periodic flows synchronized with offsets can be handled when computing delay upper bounds on a real-time switched Ethernet network. In a first step, the impact of these offsets on the computation of end-to-end delay upper bounds is illustrated. Then, the integration of offsets into the Network Calculus and Trajectory approaches is introduced. A modified Network Calculus approach and a modified Trajectory approach are thus developed, and their performance is compared on an industrial Avionics Full-DupleX switched Ethernet (AFDX) configuration with about one thousand flows. It is shown that, in the context of this AFDX configuration, the Trajectory approach leads to slightly tighter end-to-end delay upper bounds than the Network Calculus approach. But offsets of local flows have to be chosen. Different offset assignment algorithms are therefore investigated on the AFDX industrial configuration, and a near-optimal assignment can be exhibited. Next, a pessimism analysis of the computed upper bounds is proposed. This analysis is based on the Trajectory approach (made optimistic), which computes an under-estimation of the worst-case delay. The difference between the upper bound (computed by a given method) and the under-estimation of the worst-case delay gives an upper bound on the pessimism of the method. This analysis gives interesting comparative results on the pessimism of the Network Calculus and Trajectory approaches. The last part of the thesis deals with a real-time heterogeneous network architecture in which CAN buses are interconnected through a switched Ethernet backbone using dedicated bridges. Two approaches, the component-based approach and the Trajectory approach, are developed to conduct a worst-case delay analysis for such a network. The ability to compute end-to-end delay upper bounds in the context of a heterogeneous network architecture is promising for industrial domains.

    Parallel Natural Language Parsing: From Analysis to Speedup

    Electrical Engineering, Mathematics and Computer Science

    Debugging multithreaded programs that incorporate user-level locking

    Thesis (S.B. and M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998. Includes bibliographical references (p. 119-124). By Andrew F. Stark.

    High-level synthesis of VLSI circuits
