1,404 research outputs found

Combinatorial Approximation Algorithms: Guaranteed Versus Experimental Performance

Author: Vredeveld Tjark

Cooperation and Competition when Bidding for Complex Projects: Centralized and Decentralized Perspectives

Authors: Datta Anwitaman; Rzadca Krzysztof; Skowron Piotr
Publication date: 15/09/2016

To successfully complete a complex project, be it the construction of an airport or of a backbone IT system, agents (companies or individuals) must form a team with the required competences and resources. A team can be formed either by the project issuer, based on individual agents' offers (centralized formation), or by the agents themselves (decentralized formation), who bid for the project as a consortium; in that case, many feasible teams compete for the contract. We investigate rational strategies of the agents (what salary should they ask for? with whom should they team up?). We propose concepts to characterize the stability of the winning teams and study their computational complexity.

arXiv.org e-Print Archive

Distributed Virtual System (DIVIRS) Project

Authors: Neuman B. Clifford; Schorr Herbert

As outlined in our continuation proposal 92-ISI-50R (revised) on contract NCC 2-539, we are (1) developing software, including a system manager and a job manager, that will manage available resources and that will enable programmers to program parallel applications in terms of a virtual configuration of processors, hiding the mapping to physical nodes; (2) developing communications routines that support the abstractions implemented in item 1; (3) continuing the development of file and information systems based on the virtual system model; and (4) incorporating appropriate security measures to allow the mechanisms developed in items 1 through 3 to be used on an open network. The goal throughout our work is to provide a uniform model that can be applied to both parallel and distributed systems. We believe that multiprocessor systems should exist in the context of distributed systems, allowing them to be more easily shared by those who need them. Our work provides the mechanisms through which nodes on multiprocessors are allocated to jobs running within the distributed system and the mechanisms through which files needed by those jobs can be located and accessed.

NASA Technical Reports Server

Ocean-Atmosphere Application Scheduling within DIET

Authors: Caniou Yves; Caron Eddy; Charrier Ghislain; Desprez Frédéric; Maisonnave Eric; Pichon Vincent
Publication venue: HAL CCSD
Publication date: 10/12/2008

In this report, we tackle the problem of scheduling an Ocean-Atmosphere application in a heterogeneous environment. The application is used for long-term climate forecasting. In this context, we analyzed the execution of an experiment. An experiment is composed of several identical simulations, each composed of parallel tasks. On homogeneous platforms, we propose a heuristic and its optimizations, all based on the same idea: we divide the processors into disjoint sets, each group executing parallel tasks. On heterogeneous platforms, the algorithm presented is applied to subsets of simulations. The subsets are computed greedily, with the aim of minimizing the execution time by sending each subset to a cluster. We performed experiments on the French research grid Grid'5000, which exhibited some technical difficulties. We also present some modifications made to the heuristics to minimize the impact of these technical difficulties. Our simulations are then validated by experiments.

HAL-ENS-LYON

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

Oskar Bordeaux

HAL-Rennes 1

Evaluation of a Simple, Scalable, Parallel Best-First Search Strategy

Authors: Adi Botea; Akihiro Kishimoto; Alex Fukunaga
Publication venue: Elsevier BV
Publication date: 24/10/2012

Large-scale, parallel clusters composed of commodity processors are increasingly available, enabling the use of vast processing capabilities and distributed RAM to solve hard search problems. We investigate Hash-Distributed A* (HDA*), a simple approach to parallel best-first search that asynchronously distributes and schedules work among processors based on a hash function of the search state. We use this approach to parallelize the A* algorithm in an optimal sequential version of the Fast Downward planner, as well as a 24-puzzle solver. The scaling behavior of HDA* is evaluated experimentally on a shared-memory multicore machine with 8 cores, a cluster of commodity machines using up to 64 cores, and large-scale high-performance clusters using up to 2400 processors. We show that this approach scales well, allowing the effective utilization of large amounts of distributed memory to optimally solve problems which require terabytes of RAM. We also compare HDA* to Transposition-table Driven Scheduling (TDS), a hash-based parallelization of IDA*, and show that, in planning, HDA* significantly outperforms TDS. A simple hybrid which combines HDA* and TDS to exploit the strengths of both algorithms is proposed and evaluated. Comment: in press, to appear in Artificial Intelligence.
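
The hash-based distribution described in this abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the authors' implementation: the names `owner`, `hda_star_simulated`, `grid_neighbors`, and `manhattan` are hypothetical, real HDA* runs the workers asynchronously with message passing, and the round-robin loop and the simple "run until every queue is empty" termination rule are simplifications chosen for clarity. The heuristic is assumed to be admissible.

```python
import heapq
import zlib


def owner(state, num_workers):
    """Map a state to the worker that owns it, using a stable hash (CRC32 of repr)."""
    return zlib.crc32(repr(state).encode()) % num_workers


def hda_star_simulated(start, goal, neighbors, heuristic, num_workers=4):
    """Round-robin, single-process simulation of hash-distributed A*.

    Each simulated worker keeps its own open list and its own table of best
    known path costs; every generated state is routed to the worker selected
    by owner(). Returns the optimal cost, or None if the goal is unreachable.
    """
    open_lists = [[] for _ in range(num_workers)]  # one priority queue per worker
    best_g = [{} for _ in range(num_workers)]      # one duplicate table per worker
    heapq.heappush(open_lists[owner(start, num_workers)],
                   (heuristic(start), 0, start))
    incumbent = None  # cost of the best goal path found so far

    while any(open_lists):
        for w in range(num_workers):
            if not open_lists[w]:
                continue
            f, g, state = heapq.heappop(open_lists[w])
            if incumbent is not None and f >= incumbent:
                continue  # cannot improve on the incumbent solution
            if g >= best_g[w].get(state, float("inf")):
                continue  # duplicate, detected locally by the owning worker
            best_g[w][state] = g
            if state == goal:
                incumbent = g if incumbent is None else min(incumbent, g)
                continue
            for nxt, cost in neighbors(state):
                g2 = g + cost
                # "Send" the successor to whichever worker owns it.
                heapq.heappush(open_lists[owner(nxt, num_workers)],
                               (g2 + heuristic(nxt), g2, nxt))
    return incumbent


# Hypothetical toy problem: a 4x4 grid with unit-cost moves.
GOAL = (3, 3)


def grid_neighbors(state):
    x, y = state
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < 4 and 0 <= ny < 4:
            yield (nx, ny), 1


def manhattan(state):
    return abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1])


if __name__ == "__main__":
    print(hda_star_simulated((0, 0), GOAL, grid_neighbors, manhattan))
```

Because each state always hashes to the same worker, duplicate detection needs only that worker's local table, which is the property that lets the approach use distributed memory without a shared closed list.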

arXiv.org e-Print Archive

Elsevier - Publisher Connector

Crossref

Macroservers: An Execution Model for DRAM Processor-In-Memory Arrays

Authors: Sterling Thomas L.; Zima Hans P.
Publication venue: California Institute of Technology Library
Publication date: 01/01/2000

The emergence of semiconductor fabrication technology allowing a tight coupling between high-density DRAM and CMOS logic on the same chip has led to the important new class of Processor-In-Memory (PIM) architectures. Newer developments provide powerful parallel processing capabilities on the chip, exploiting the facility to load wide words in single memory accesses and supporting complex address manipulations in the memory. Furthermore, large arrays of PIMs can be arranged into a massively parallel architecture. In this report, we describe an object-based programming model based on the notion of a macroserver. Macroservers encapsulate a set of variables and methods; threads, spawned by the activation of methods, operate asynchronously on the variables' state space. Data distributions provide a mechanism for mapping large data structures across the memory region of a macroserver, while work distributions allow explicit control of bindings between threads and data. Both data and work distributions are first-class objects of the model, supporting the dynamic management of data and threads in memory. This offers the flexibility required for fully exploiting the processing power and memory bandwidth of a PIM array, in particular for irregular and adaptive applications. Thread synchronization is based on atomic methods, condition variables, and futures. A special type of lightweight macroserver allows the formulation of flexible scheduling strategies for the access to resources, using a monitor-like mechanism.

CiteSeerX

Caltech Authors

Analyses and optimizations of timing-constrained embedded systems considering resource synchronization and machine learning approaches

Author: Shi Junjie
Publication date: 01/01/2023

Nowadays, embedded systems have become ubiquitous, powering a vast array of applications from consumer electronics to industrial automation. Concurrently, statistical and machine learning algorithms are being increasingly adopted across various application domains, such as medical diagnosis, autonomous driving, and environmental analysis, offering sophisticated data analysis and decision-making capabilities. As the demand for intelligent and time-sensitive applications continues to surge, accompanied by growing concerns regarding data privacy, the deployment of machine learning models on embedded devices has emerged as an indispensable requirement. However, this integration introduces both significant opportunities for performance enhancement and complex challenges in deployment optimization. On the one hand, deploying machine learning models on embedded systems with limited computational capacity, power budgets, and stringent timing requirements necessitates additional adjustments to ensure optimal performance and meet the imposed timing constraints. On the other hand, the inherent capabilities of machine learning, such as self-adaptation during runtime, prove invaluable in addressing challenges encountered in embedded systems, aiding in optimization and decision-making processes. This dissertation introduces two primary modifications for the analyses and optimizations of timing-constrained embedded systems. For one thing, it addresses the relatively long access times required for shared resources of machine learning tasks. For another, it considers the limited communication resources and data privacy concerns in distributed embedded systems when deploying machine learning models. Additionally, this work provides a use case that employs a machine learning method to tackle challenges specific to embedded systems. 
By addressing these key aspects, this dissertation contributes to the analysis and optimization of timing-constrained embedded systems, considering resource synchronization and machine learning models to enable improved performance and efficiency in real-time applications with stringent constraints.

Eldorado - Ressourcen aus und für Lehre, Studium und Forschung