
    An MDP decomposition approach for traffic control at isolated signalized intersections

    This article presents a novel approach for the dynamic control of a signalized intersection. At the intersection, there are a number of arrival flows of cars, each having a single queue (lane). The set of all flows is partitioned into disjoint combinations of nonconflicting flows that will receive green together. The dynamic control of the traffic lights is based on the numbers of cars waiting in the queues. The problem of when to switch (and which combination to serve next) is modeled as a Markovian decision process in discrete time. For large intersections (i.e., intersections with a large number of flows), the number of states becomes tremendously large, prohibiting straightforward optimization using value iteration or policy iteration. Starting from an optimal (or nearly optimal) fixed-cycle strategy, a one-step policy improvement is proposed that is easy to compute and is shown to give a close-to-optimal strategy for the dynamic problem.
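
    The idea of a one-step policy improvement over a fixed-cycle strategy can be illustrated on a toy problem. The sketch below is not the article's decomposition method: it assumes a hypothetical two-queue intersection with capped queues, Bernoulli arrivals, a discounted cost equal to the number of waiting cars, and no switching loss. All constants and function names are illustrative assumptions.

```python
from itertools import product

CAP = 5              # hypothetical queue cap, keeps the state space finite
P_ARR = (0.3, 0.4)   # hypothetical per-slot arrival probabilities
GAMMA = 0.9          # discount factor (an assumption of this sketch)

# State: (cars in queue 0, cars in queue 1, queue that had green last slot).
STATES = [(q0, q1, g)
          for q0 in range(CAP + 1)
          for q1 in range(CAP + 1)
          for g in (0, 1)]


def transitions(state, action):
    """Yield (probability, next_state) when queue `action` receives green."""
    queues = [state[0], state[1]]
    if queues[action] > 0:
        queues[action] -= 1          # one car departs from the served queue
    for a0, a1 in product((0, 1), repeat=2):
        prob = ((P_ARR[0] if a0 else 1 - P_ARR[0])
                * (P_ARR[1] if a1 else 1 - P_ARR[1]))
        yield prob, (min(queues[0] + a0, CAP), min(queues[1] + a1, CAP), action)


def q_value(state, action, values):
    """Cost now (cars waiting) plus discounted expected cost-to-go."""
    cost = state[0] + state[1]
    return cost + GAMMA * sum(p * values[nxt] for p, nxt in transitions(state, action))


def evaluate(policy, sweeps=400):
    """Iterative policy evaluation for a fixed policy."""
    values = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        values = {s: q_value(s, policy[s], values) for s in STATES}
    return values


# Base policy: a fixed cycle that simply alternates the green signal.
fixed_cycle = {s: 1 - s[2] for s in STATES}
v_fixed = evaluate(fixed_cycle)

# One-step improvement: act greedily with respect to the base policy's values.
improved = {s: min((0, 1), key=lambda a: q_value(s, a, v_fixed)) for s in STATES}
v_improved = evaluate(improved)
```

    By the policy improvement theorem, the improved policy is never worse than the fixed cycle in any state of this toy model. The toy state space is small enough to evaluate exhaustively, which is exactly what becomes infeasible for large intersections.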

    Optimization as a design strategy. Considerations based on building simulation-assisted experiments about problem decomposition

    In this article, the most fundamental decomposition-based optimization method, block coordinate search (based on the sequential decomposition of problems into subproblems), and building performance simulation programs are used to reason about a building design process at micro-urban scale, and strategies are defined to make the search more efficient. Cyclic overlapping block coordinate search is here considered in its double nature of optimization method and surrogate model (and metaphor) of a sequential design process. Heuristic indicators apt to support the design of search structures suited to that method are developed from building-simulation-assisted computational experiments, aimed at choosing the form and position of a small building in a plot. Those indicators link the sharing of structure between subspaces ("commonality") to recursive recombination, measured as freshness of the search wake and novelty of the search moves. The aim of these indicators is to measure the relative effectiveness of decomposition-based design moves and create efficient block searches. Implications of a possible use of these indicators in genetic algorithms are also highlighted. Comment: 48 pages, 12 figures, 3 tables.
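
    A minimal sketch of cyclic overlapping block coordinate search follows, assuming discrete design variables and an exhaustive search inside each block. The cost function `toy_cost` is a hypothetical stand-in for a building performance simulation, and the function and variable names are illustrative rather than taken from the article.

```python
from itertools import product


def block_coordinate_search(objective, start, domains, blocks, max_cycles=20):
    """Cycle through blocks of variables, searching each block exhaustively
    while every variable outside the block stays fixed."""
    best = list(start)
    best_cost = objective(best)
    for _ in range(max_cycles):
        improved = False
        for block in blocks:
            for combo in product(*(domains[i] for i in block)):
                trial = list(best)
                for index, value in zip(block, combo):
                    trial[index] = value
                cost = objective(trial)
                if cost < best_cost:
                    best, best_cost, improved = trial, cost, True
        if not improved:
            break                    # a full cycle brought no improvement
    return best, best_cost


def toy_cost(x):
    """Hypothetical cost with coupling between neighbouring variables."""
    return ((x[0] - 3) ** 2 + (x[1] - 1) ** 2 + (x[2] - 4) ** 2
            + (x[0] - x[1] - 2) ** 2 + (x[1] - x[2] + 3) ** 2)


domains = [range(6), range(6), range(6)]
blocks = [(0, 1), (1, 2)]            # the blocks overlap on variable 1
best, cost = block_coordinate_search(toy_cost, [0, 0, 0], domains, blocks)
print(best, cost)                    # prints [3, 1, 4] 0
```

    Here the two blocks share variable 1, so information found while searching one block carries into the next. The search needs a second cycle to reach the optimum, which shows why the blocks are revisited cyclically.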

    Solution of partial differential equations on vector and parallel computers

    The present status of numerical methods for partial differential equations on vector and parallel computers is reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed.

    A bibliography on parallel and vector numerical algorithms

    This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming languages, and other topics of interest to scientific computing. Certain conference proceedings and anthologies that have been published in book form are also listed.

    Transformations of High-Level Synthesis Codes for High-Performance Computing

    Specialized hardware architectures promise a major step in performance and energy efficiency over the traditional load/store devices currently employed in large scale computing systems. The adoption of high-level synthesis (HLS) from languages such as C/C++ and OpenCL has greatly increased programmer productivity when designing for such platforms. While this has enabled a wider audience to target specialized hardware, the optimization principles known from traditional software design are no longer sufficient to implement high-performance codes. Fast and efficient codes for reconfigurable platforms are thus still challenging to design. To alleviate this, we present a set of optimizing transformations for HLS, targeting scalable and efficient architectures for high-performance computing (HPC) applications. Our work provides a toolbox for developers, where we systematically identify classes of transformations, the characteristics of their effect on the HLS code and the resulting hardware (e.g., increases data reuse or resource consumption), and the objectives that each transformation can target (e.g., resolve interface contention, or increase parallelism). We show how these can be used to efficiently exploit pipelining, on-chip distributed fast memory, and on-chip streaming dataflow, allowing for massively parallel architectures. To quantify the effect of our transformations, we use them to optimize a set of throughput-oriented FPGA kernels, demonstrating that our enhancements are sufficient to scale up parallelism within the hardware constraints. With the transformations covered, we hope to establish a common framework for performance engineers, compiler developers, and hardware developers, to tap into the performance potential offered by specialized hardware architectures using HLS.

    A survey of parallel execution strategies for transitive closure and logic programs

    An important feature of database technology of the nineties is the use of parallelism for speeding up the execution of complex queries. This technology is being tested in several experimental database architectures and a few commercial systems for conventional select-project-join queries. In particular, hash-based fragmentation is used to distribute data to disks under the control of different processors in order to perform selections and joins in parallel. With the development of new query languages, and in particular with the definition of transitive closure queries and of more general logic programming queries, the new dimension of recursion has been added to query processing. Recursive queries are complex; at the same time, their regular structure is particularly suited for parallel execution, and parallelism may give a high efficiency gain. We survey the approaches to parallel execution of recursive queries that have been presented in the recent literature. We observe that research on parallel execution of recursive queries is separated into two distinct subareas, one focused on the transitive closure of Relational Algebra expressions, the other one focused on optimization of more general Datalog queries. Though the subareas seem radically different because of the approach and formalism used, they have many common features. This is not surprising, because most typical Datalog queries can be solved by means of the transitive closure of simple algebraic expressions. We first analyze the relationship between the transitive closure of expressions in Relational Algebra and Datalog programs. We then review sequential methods for evaluating transitive closure, distinguishing iterative and direct methods. We address the parallelization of these methods, by discussing various forms of parallelization. Data fragmentation plays an important role in obtaining parallel execution; we describe hash-based and semantic fragmentation. Finally, we consider Datalog queries, and present general methods for parallel rule execution; we recognize the similarities between these methods and the methods reviewed previously, when the former are applied to linear Datalog queries. We also provide a quantitative analysis that shows the impact of the initial data distribution on the performance of methods.
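
    As a rough illustration of an iterative transitive closure method combined with hash-based fragmentation, the sketch below hash-partitions the edge relation by source node and uses semi-naive iteration, joining only newly derived tuples in each round. The fragments are processed sequentially here; treating each one as the share of a separate processor is an assumption of this sketch, and none of the names come from the survey.

```python
from collections import defaultdict


def build_fragments(edges, n_workers):
    """Hash-partition edges by source node; each fragment maps src -> {dst}."""
    fragments = [defaultdict(set) for _ in range(n_workers)]
    for src, dst in edges:
        fragments[hash(src) % n_workers][src].add(dst)
    return fragments


def transitive_closure(edges, n_workers=2):
    """Semi-naive iteration: only tuples derived in the previous round are joined."""
    fragments = build_fragments(edges, n_workers)
    closure = set(edges)
    delta = set(edges)
    while delta:
        derived = set()
        for x, y in delta:
            # Route (x, y) to the fragment that owns the edges leaving y.
            local = fragments[hash(y) % n_workers]
            for z in local.get(y, ()):
                derived.add((x, z))
        delta = derived - closure    # keep only genuinely new tuples
        closure |= delta
    return closure


chain = {(1, 2), (2, 3), (3, 4)}
print(sorted(transitive_closure(chain)))
# prints [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```

    Because every join partner for a tuple ending in `y` lives in a single fragment, each round's joins need no cross-fragment lookups. Only the newly derived tuples have to be redistributed between rounds.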