7 research outputs found

    The Polyhedral Model Beyond Loops Recursion Optimization and Parallelization Through Polyhedral Modeling

    Get PDF
    International audienceThere may be a huge gap between the statements outlined by programmers in a program source code and instructions that are actually performed by a given processor architecture when running the executable code. This gap is due to the way the input code has been interpreted, translated and transformed by the compiler and the final processor hardware. Thus, there is an opportunity for efficient optimization strategies, that are dedicated to specific control structures and memory access patterns, to apply as soon as the actual runtime behavior has been discovered, even if they could not have been applied on the original source code. In this paper, we develop this idea by identifying code extracts that behave as polyhedral-compliant loops at runtime, while not having been outlined at all as loops in the original source code. In particular, we are interested in recursive functions whose runtime behavior can be modeled as polyhedral loops. Therefore, the scope of this study exclusively includes recursive functions whose control flow and memory accesses exhibit an affine behavior, which means that there exists a semantically equivalent affine loop nest, candidate for poly-hedral optimizations. Accordingly, our approach is based on analyzing early executions of a recursive program using a Nested Loop Recognition (NLR) algorithm, performing the affine loop modeling of the original program runtime behavior , which is then used to generate an equivalent iterative program, finally optimized using the polyhedral compiler Polly. We present some preliminary results showing that this approach brings recursion optimization techniques into a higher level in addition to widening the scope of the polyhe-dral model to include originally non-loop programs

    Networks on Chips: Structure and Design Methodologies

    Get PDF

    Adaptive streaming applications : analysis and implementation models

    Get PDF
    This thesis presents a highly automated design framework, called DaedalusRT, and several novel techniques. As the foundation of the DaedalusRT design framework, two types of dataflow Models-of-Computation (MoC) are used, one as timing analysis model and another one as the implementation model. The timing analysis model is used to formally reason about timing behavior of an application. In the context of DaedalusRT, the Mode-Aware Data Flow (MADF) MoC has been developed as the timing analysis model for adaptive streaming applications using different static modes. A novel mode transition protocol is devised to allow efficient reasoning of timing behavior during mode transitions. Based on the transition protocol, a hard real-time scheduling approach is proposed. On the other hand, the implementation model is used for efficient code generation of parallel computation, communication, and synchronization. In this thesis, the Parameterized Polyhedral Process Network (P3N) MoC has been developed to model adaptive streaming applications with parameter reconfiguration. An approach to verify the functional property of the P3N MoC has been devised. Finally, implementation of the P3N MoC on a MPSoC platform has shown that run-time performance penalty due to parameter reconfiguration is negligible.Technology Foundation STWComputer Systems, Imagery and Medi

    Worst-case temporal analysis of real-time dynamic streaming applications

    Get PDF

    Self-adaptivity of applications on network on chip multiprocessors: the case of fault-tolerant Kahn process networks

    Get PDF
    Technology scaling accompanied with higher operating frequencies and the ability to integrate more functionality in the same chip has been the driving force behind delivering higher performance computing systems at lower costs. Embedded computing systems, which have been riding the same wave of success, have evolved into complex architectures encompassing a high number of cores interconnected by an on-chip network (usually identified as Multiprocessor System-on-Chip). However these trends are hindered by issues that arise as technology scaling continues towards deep submicron scales. Firstly, growing complexity of these systems and the variability introduced by process technologies make it ever harder to perform a thorough optimization of the system at design time. Secondly, designers are faced with a reliability wall that emerges as age-related degradation reduces the lifetime of transistors, and as the probability of defects escaping post-manufacturing testing is increased. In this thesis, we take on these challenges within the context of streaming applications running in network-on-chip based parallel (not necessarily homogeneous) systems-on-chip that adopt the no-remote memory access model. In particular, this thesis tackles two main problems: (1) fault-aware online task remapping, (2) application-level self-adaptation for quality management. For the former, by viewing fault tolerance as a self-adaptation aspect, we adopt a cross-layer approach that aims at graceful performance degradation by addressing permanent faults in processing elements mostly at system-level, in particular by exploiting redundancy available in multi-core platforms. We propose an optimal solution based on an integer linear programming formulation (suitable for design time adoption) as well as heuristic-based solutions to be used at run-time. We assess the impact of our approach on the lifetime reliability. We propose two recovery schemes based on a checkpoint-and-rollback and a rollforward technique. For the latter, we propose two variants of a monitor-controller- adapter loop that adapts application-level parameters to meet performance goals. We demonstrate not only that fault tolerance and self-adaptivity can be achieved in embedded platforms, but also that it can be done without incurring large overheads. In addressing these problems, we present techniques which have been realized (depending on their characteristics) in the form of a design tool, a run-time library or a hardware core to be added to the basic architecture
    corecore