
    A Domain Specific Approach to High Performance Heterogeneous Computing

    Users of heterogeneous computing systems face two problems: first, understanding the trade-offs between the observable characteristics of their applications, such as latency and quality of the result; and second, exploiting knowledge of these characteristics to allocate work to distributed computing platforms efficiently. A domain specific approach addresses both of these problems. By considering a subset of operations or functions, models of the observable characteristics, or domain metrics, may be formulated in advance and populated at run-time for task instances. These metric models can then be used to express the allocation of work as a constrained integer program, which can be solved using heuristics, machine learning or Mixed Integer Linear Programming (MILP) frameworks. These claims are illustrated using the example domain of derivatives pricing in computational finance, with the domain metrics of workload latency (makespan) and pricing accuracy. For a large, varied workload of 128 Black-Scholes and Heston model-based option pricing tasks, running upon a diverse array of 16 multicore CPU, GPU and FPGA platforms, predictions made by models of both the makespan and accuracy are generally within 10% of the run-time performance. When these models are used as inputs to machine learning and MILP-based workload allocation approaches, latency improvements of up to 24 and 270 times, respectively, over the heuristic approach are seen.
    Comment: 14 pages, preprint draft, minor revisio
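The allocation step in this abstract, assigning tasks to platforms so that the makespan is minimized, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the latency table is invented, the greedy rule is a generic heuristic rather than the paper's, and exhaustive search stands in for an exact MILP solver on a toy instance.

```python
from itertools import product


def makespan(assignment, latency):
    """Return the finish time of the busiest platform for a given assignment."""
    n_platforms = len(latency[0])
    load = [0.0] * n_platforms
    for task, platform in enumerate(assignment):
        load[platform] += latency[task][platform]
    return max(load)


def greedy_allocate(latency):
    """Generic heuristic: place each task, in order, where it would finish earliest."""
    n_platforms = len(latency[0])
    load = [0.0] * n_platforms
    assignment = []
    for row in latency:
        best = min(range(n_platforms), key=lambda p: load[p] + row[p])
        load[best] += row[best]
        assignment.append(best)
    return tuple(assignment)


def exact_allocate(latency):
    """Exhaustive search over all assignments; only viable for tiny instances."""
    n_tasks, n_platforms = len(latency), len(latency[0])
    return min(
        product(range(n_platforms), repeat=n_tasks),
        key=lambda a: makespan(a, latency),
    )


# Hypothetical predicted latencies in seconds: rows are tasks, columns are platforms.
LATENCY = [
    [1.0, 1.5],
    [1.5, 1.0],
    [2.0, 2.0],
]

if __name__ == "__main__":
    for name, solver in (("greedy", greedy_allocate), ("exact", exact_allocate)):
        plan = solver(LATENCY)
        print(name, plan, makespan(plan, LATENCY))
```

On this toy table the greedy rule commits the two short tasks to different platforms before it sees the long one, so the exhaustive search finds a shorter makespan. The gap here is small and says nothing about the improvement factors reported in the abstract.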

    Hedonic Coalition Formation for Distributed Task Allocation among Wireless Agents

    Autonomous wireless agents such as unmanned aerial vehicles or mobile base stations show great potential for deployment in next-generation wireless networks. While current literature has mainly focused on the use of agents within robotics or software applications, we propose a novel usage model for self-organizing agents suited to wireless networks. In the proposed model, a number of agents are required to collect data from several arbitrarily located tasks. Each task represents a queue of packets that require collection and subsequent wireless transmission by the agents to a central receiver. The problem is modeled as a hedonic coalition formation game between the agents and the tasks, which interact in order to form disjoint coalitions. Each formed coalition is modeled as a polling system consisting of a number of agents that move between the different tasks present in the coalition, collecting and transmitting the packets. Within each coalition, some agents can also take the role of a relay to improve the packet success rate of the transmission. The proposed algorithm allows the tasks and the agents to make distributed decisions to join or leave a coalition, based on the achieved benefit in terms of effective throughput and the cost in terms of delay. As a result of these decisions, the agents and tasks structure themselves into independent disjoint coalitions which constitute a Nash-stable network partition. Moreover, the proposed algorithm allows the agents and tasks to adapt the topology to environmental changes such as the arrival or removal of tasks or the mobility of the tasks. Simulation results show that the proposed algorithm improves performance, in terms of average player (agent or task) payoff, by at least 30.26% (for a network of 5 agents with up to 25 tasks) relative to a scheme that allocates nearby tasks equally among agents.
    Comment: to appear, IEEE Transactions on Mobile Computin
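The idea of players switching coalitions until a Nash-stable partition is reached can be sketched on a heavily simplified toy. The assumptions here are ours and not the paper's model: each coalition contains exactly one task, an agent's payoff is the task's value shared equally among the agents serving it, and delay costs, relays and task-side decisions are ignored. The names `payoff`, `form_coalitions` and `is_nash_stable` are hypothetical.

```python
def payoff(task, size, value):
    """Assumed utility: a task's value shared equally among the agents serving it."""
    return value[task] / size


def form_coalitions(n_agents, value, max_rounds=100):
    """Let agents switch tasks one at a time until nobody strictly gains by moving."""
    choice = [0] * n_agents  # every agent starts in the coalition of task 0
    for _ in range(max_rounds):
        changed = False
        for agent in range(n_agents):
            counts = [choice.count(t) for t in range(len(value))]
            best_task = choice[agent]
            best = payoff(best_task, counts[best_task], value)
            for t in range(len(value)):
                if t == choice[agent]:
                    continue
                # Joining task t's coalition adds one member to it.
                candidate = payoff(t, counts[t] + 1, value)
                if candidate > best:
                    best_task, best = t, candidate
            if best_task != choice[agent]:
                choice[agent] = best_task
                changed = True
        if not changed:
            return choice
    raise RuntimeError("no stable partition found within max_rounds")


def is_nash_stable(choice, value):
    """Check that no single agent strictly prefers another coalition."""
    counts = [choice.count(t) for t in range(len(value))]
    for current in choice:
        here = payoff(current, counts[current], value)
        for t in range(len(value)):
            if t != current and payoff(t, counts[t] + 1, value) > here:
                return False
    return True


# Hypothetical task values (e.g. achievable throughput) and four agents.
VALUE = [6.0, 3.0, 2.0]

if __name__ == "__main__":
    partition = form_coalitions(4, VALUE)
    print(partition, is_nash_stable(partition, VALUE))
```

With this equal-sharing payoff the game is a simple congestion game, so repeated strict-improvement switches terminate. That property belongs to the toy utility and should not be read as a statement about the paper's throughput and delay model.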

    Decentralized dynamic task allocation for UAVs with limited communication range

    We present the Limited-range Online Routing Problem (LORP), which involves a team of Unmanned Aerial Vehicles (UAVs) with limited communication range that must autonomously coordinate to service task requests. We first show a general approach to cast this dynamic problem as a sequence of decentralized task allocation problems. Then we present two solutions, both based on modeling the allocation task as a Markov Random Field and then assessing decisions by means of the decentralized Max-Sum algorithm. Our first solution assumes independence between requests, whereas our second solution also considers the UAVs' workloads. A thorough empirical evaluation shows that our workload-based solution consistently outperforms current state-of-the-art methods in a wide range of scenarios, lowering the average service time by up to 16%. In the best-case scenario there is no gap between our decentralized solution and centralized techniques. In the worst-case scenario we reduce the gap between current decentralized and centralized techniques by 25%. Thus, our solution becomes the method of choice for our problem.

    Exact and heuristic allocation of multi-kernel applications to multi-FPGA platforms

    FPGA-based accelerators have demonstrated high energy efficiency compared to GPUs and CPUs. However, single-FPGA designs may not achieve sufficient task parallelism. In this work, we optimize the mapping of high-performance multi-kernel applications, like Convolutional Neural Networks, to multi-FPGA platforms. First, we formulate the system-level optimization problem, choosing within a huge design space the parallelism and number of compute units for each kernel in the pipeline. Then we solve it using a combination of Geometric Programming, which produces the optimum-performance solution given resource and DRAM bandwidth constraints, and a heuristic allocator of the compute units on the FPGA cluster.
    Peer Reviewed. Postprint (author's final draft)