
    Pixie: A heterogeneous Virtual Coarse-Grained Reconfigurable Array for high performance image processing applications

    Coarse-Grained Reconfigurable Arrays (CGRAs) are easy to program and have low development costs, which makes them attractive for reconfigurable computing. Their low compilation cost and reduced reconfiguration overhead make them appealing platforms for accelerating high-performance computing applications such as image processing. However, CGRAs are ASICs and are therefore expensive to produce, while Field Programmable Gate Arrays (FPGAs) are comparatively cheap for low-volume products but not as easily programmable. We combine the best of both worlds by implementing a Virtual Coarse-Grained Reconfigurable Array (VCGRA) on an FPGA; VCGRAs are a trade-off between FPGAs, with their large routing overheads, and ASICs. In this vein we present a novel heterogeneous VCGRA called "Pixie" that is suitable for implementing high-performance image processing applications. The proposed VCGRA contains generic processing elements and virtual channels, both described in the hardware description language VHDL. Both elements have been optimized using the parameterized configuration tool flow, resulting in resource reductions of 24% per processing element and 82% per virtual channel, respectively.
    Comment: Presented at the 3rd International Workshop on Overlay Architectures for FPGAs (OLAF 2017), arXiv:1704.0880
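
    To give a rough feel for what a configurable processing element computes, here is a minimal C sketch of a software model of such a PE; the opcode set and all names are invented for illustration, and Pixie's actual elements are VHDL hardware, not C:

        /* Toy functional model of a VCGRA processing element (illustrative
           only; the opcode acts as the "virtual configuration" loaded at
           run time, while Pixie's real PEs are described in VHDL). */
        typedef enum { PE_ADD, PE_SUB, PE_MUL, PE_MAX } pe_op;

        static inline int pe_execute(pe_op cfg, int a, int b) {
            switch (cfg) {
                case PE_ADD: return a + b;          /* e.g., image blending      */
                case PE_SUB: return a - b;          /* e.g., frame differencing  */
                case PE_MUL: return a * b;          /* e.g., convolution taps    */
                case PE_MAX: return a > b ? a : b;  /* e.g., dilation            */
            }
            return 0;
        }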

    CAN Fieldbus Communication in the CSP-based CT Library

    In closed-loop control systems, several real-world entities are communicated with simultaneously through a multitude of spatially distributed sensors and actuators. This intrinsic parallelism and complexity motivates implementing control software as concurrent processes deployed on distributed hardware architectures. A CSP-based, occam-like architecture seems the most convenient for this purpose. Many, often conflicting, requirements make the design and implementation of distributed real-time control systems an extremely difficult task. The scope of this paper is limited to achieving safe and real-time communication over a CAN fieldbus for an existing CSP-based framework.
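
    To make the communication layer concrete, the C sketch below maps one CSP-style channel write onto a single CAN frame using Linux SocketCAN; the paper's framework uses the CT library's own link drivers, so this API choice is purely an assumed stand-in:

        #include <linux/can.h>
        #include <string.h>
        #include <unistd.h>

        /* Sketch: a channel write carried in one classic CAN frame. The CAN
           identifier doubles as the channel identifier; payloads are limited
           to the frame's 8 data bytes. */
        int chan_write(int can_sock, canid_t chan_id, const void *buf, size_t len) {
            struct can_frame f;
            memset(&f, 0, sizeof f);
            f.can_id  = chan_id;
            f.can_dlc = (len > 8) ? 8 : (unsigned char)len;
            memcpy(f.data, buf, f.can_dlc);
            /* write() blocks until the frame is queued, giving the
               sender-side half of a rendezvous-like discipline */
            return (write(can_sock, &f, sizeof f) == sizeof f) ? 0 : -1;
        }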

    Toward Sequentializing Overparallelized Protocol Code

    In our ongoing work, we use constraint automata to compile protocol specifications expressed as Reo connectors into efficient executable code, e.g., in C. We have by now studied this automata-based compilation approach rather well and have devised effective solutions to some of its problems. Because our approach is based on constraint automata, the approach, its problems, and our solutions are useful and relevant well beyond the specific case of compiling Reo. In this short paper, we identify and analyze two such rather unexpected problems.
    Comment: In Proceedings ICE 2014, arXiv:1410.701
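
    For readers unfamiliar with the approach, the sketch below shows the kind of C code a constraint-automaton-based compiler could emit for the simplest Reo connector, a Sync channel: a one-state automaton with a single transition whose synchronization constraint is {A, B} and whose data constraint is d_A = d_B. All names are invented; this is not output of the authors' compiler:

        /* Hand-written analogue of compiled constraint-automaton code for a
           Reo Sync channel (one state, one transition). Invented names. */
        typedef struct {
            int a_ready, b_ready;   /* are ports A and B offering I/O? */
            int a_data;             /* datum pending on port A         */
        } sync_ports;

        /* Attempt one automaton step; returns 1 if the transition fired. */
        int sync_step(sync_ports *p, int *b_out) {
            if (p->a_ready && p->b_ready) {   /* sync constraint {A,B}     */
                *b_out = p->a_data;           /* data constraint d_A = d_B */
                p->a_ready = p->b_ready = 0;  /* both ports consumed       */
                return 1;
            }
            return 0;                         /* no enabled transition     */
        }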

    CapablePtrs: Securely Compiling Partial Programs using the Pointers-as-Capabilities Principle

    Capability machines such as CHERI provide memory capabilities that compilers can use to provide security benefits for compiled code (e.g., memory safety). The C-to-CHERI compiler, for example, achieves memory safety by following a principle called "pointers as capabilities" (PAC). Informally, PAC says that a compiler should represent a source-language pointer as a machine-code capability. But the security properties of PAC compilers are not yet well understood. We show that memory safety is only one aspect, and that PAC compilers can provide significant additional security guarantees for partial programs: the compiler can provide guarantees for a compilation unit even if that compilation unit is later linked against attacker-controlled machine code. This paper is the first to formally study the security of PAC compilers for partial programs. We prove, for a model of such a compiler, that it is fully abstract. The proof uses a novel proof technique (dubbed TrICL, read "trickle") that is of broad interest because it reuses and extends the compiler correctness relation in a natural way, as we demonstrate. We implement our compiler on top of the CHERI platform and show that it can compile legacy C code with minimal code changes. We provide performance benchmarks showing that the performance overhead is proportional to the number of cross-compilation-unit function calls.
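
    The toy C model below illustrates the pointers-as-capabilities idea itself, not CHERI's actual ABI or the paper's compiler: a source pointer is replaced by a record carrying bounds and permissions, and every dereference is checked against them:

        #include <stddef.h>
        #include <stdint.h>

        #define CAP_LOAD  1u
        #define CAP_STORE 2u

        /* Toy capability: an authorized region plus permissions. */
        typedef struct {
            uint8_t *base;     /* start of authorized region  */
            size_t   length;   /* extent of authorized region */
            unsigned perms;    /* CAP_LOAD and/or CAP_STORE   */
        } cap_t;

        /* Checked load: fails (returns -1) instead of reading out of
           bounds or without the load permission. */
        int cap_load(cap_t c, size_t off, uint8_t *out) {
            if (off >= c.length || !(c.perms & CAP_LOAD)) return -1;
            *out = c.base[off];
            return 0;
        }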

    Streamlined islands and the English Channel megaflood hypothesis

    Recognising ice-age catastrophic megafloods is important because they had significant impacts on large-scale drainage evolution and on patterns of water and sediment movement to the oceans, and they likely induced very rapid, short-term effects on climate. It has previously been proposed that a drainage system on the floor of the English Channel was initiated by catastrophic flooding in the Pleistocene, but this suggestion has remained controversial. Here we examine this hypothesis through an analysis of key landform features. We use a new compilation of multi- and single-beam bathymetry together with sub-bottom profiler data to establish the internal structure, planform geometry and hence origin of a set of 36 mid-channel islands. Whilst there is evidence of modern-day surficial sediment processes, the majority of the islands can clearly be demonstrated to be formed of bedrock, and are hence erosional remnants rather than depositional features. The islands display classic lemniscate or tear-drop outlines, with elongated tips pointing downstream, typical of streamlined islands formed during high-magnitude water flow. The length-to-width ratio for the entire island population is 3.4 ± 1.3 and the degree of elongation, or k-value, is 3.7 ± 1.4. These values are comparable to streamlined islands in other proven Pleistocene catastrophic flood terrains and are distinctly different from values found in modern-day rivers. The island geometries show a correlation with bedrock type, with those carved from Upper Cretaceous chalk having larger length-to-width ratios (3.2 ± 1.3) than those carved into more mixed Paleogene terrigenous sandstones, siltstones and mudstones (3.0 ± 1.5). We attribute these differences to the former rock unit having a lower skin friction, which allowed longer island growth to achieve minimum drag. The Paleogene islands, although less numerous than the Chalk islands, also assume more perfect lemniscate shapes; these lithologies therefore reached an equilibrium island shape more quickly but were also susceptible to total erosion. Our observations support the hypothesis that the islands were initially carved by high-volume water flows during a unique catastrophic drainage of a pro-glacial lake in the southern North Sea at the Dover Strait, rather than by fluvial erosion throughout the Pleistocene.
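
    For reference, the degree of elongation quoted above is conventionally obtained by fitting each island outline to a lemniscate loop (a sketch of the standard derivation, following Komar-style analyses of streamlined forms; the authors' exact convention may differ):

        A lemniscate loop $r(\theta) = L\cos(k\theta)$ of length $L$ encloses an area
        \[
          A = \int_{-\pi/(2k)}^{\pi/(2k)} \tfrac{1}{2}\,L^2\cos^2(k\theta)\,d\theta
            = \frac{\pi L^2}{4k},
          \qquad\text{so}\qquad
          k = \frac{\pi L^2}{4A},
        \]
        with larger $k$ denoting a more elongated (lower-drag) island.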

    ret2spec: Speculative Execution Using Return Stack Buffers

    Speculative execution is an optimization technique that has been part of CPUs for over a decade. It predicts the outcome and target of branch instructions to avoid stalling the execution pipeline. However, until recently, the security implications of speculative code execution had not been studied. In this paper, we investigate a special type of branch predictor that is responsible for predicting return addresses. To the best of our knowledge, we are the first to study return address predictors and their consequences for the security of modern software. In our work, we show how return stack buffers (RSBs), the core unit of return address predictors, can be used to trigger misspeculation. Based on this knowledge, we propose two new attack variants using RSBs that give attackers capabilities similar to those of the documented Spectre attacks. We show how local attackers can gain arbitrary speculative code execution across processes, e.g., to leak passwords another user enters on a shared system. Our evaluation shows that the recent Spectre countermeasures deployed in operating systems can also cover such RSB-based cross-process attacks. Yet we then demonstrate that attackers can trigger misspeculation in JIT environments in order to leak arbitrary memory content of browser processes: reading outside the sandboxed memory region with JIT-compiled code is still possible with 80% accuracy on average.
    Comment: Updating to the camera-ready version and adding a reference to the original paper
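
    The read-out step common to such attacks is a cache-timing probe; the sketch below shows a typical Flush+Reload-style timing measurement in C (an illustrative reconstruction, not code from the paper, and the threshold is machine-dependent):

        #include <stdint.h>
        #include <x86intrin.h>

        /* Decide whether *addr was cached by timing a single access.
           A misspeculated secret-dependent access leaves exactly this
           kind of cache footprint behind. */
        static inline int probe_is_cached(const volatile uint8_t *addr) {
            unsigned aux;
            uint64_t t0 = __rdtscp(&aux);   /* serialize + read timestamp */
            (void)*addr;                    /* the timed access           */
            uint64_t t1 = __rdtscp(&aux);
            return (t1 - t0) < 100;         /* machine-dependent threshold */
        }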

    The Structure of Rapidly Rotating Late-Type Spiral Galaxies: I. Photometry, HI and Optical Kinematics

    We present I-band photometry, long-slit optical spectroscopy, and new aperture-synthesis HI observations for eight late-type spirals with rotation velocities in the range 243 km/s < V_{rot} < 308 km/s. The sample will be used to study the structure and angular momentum of disks at the high-mass end of the spiral galaxy population; here we discuss the basic properties of these "fast rotators" and derive hybrid optical/HI rotation curves for each. Despite the presence of HI warps and low-mass companions in many systems, their kinematics are regular, and there is excellent agreement between optical and HI tracers near the optical radius r_{opt}. At the high inclinations at which projection effects are negligible, the sample galaxies exhibit flat, featureless rotation curves out to their last measured points at 1.7 r_{opt} to 3.5 r_{opt}. The intermediate-inclination systems are also consistent with a constant rotation amplitude for r > 0.5 r_{opt}. We therefore find no evidence for declining rotation curves at the high-mass end of the late-type spiral galaxy population. Combining our data with the compilation of spirals with reliable outer HI kinematics from the work of Casertano & van Gorkom, we find no convincing trends between logarithmic outer rotation curve slopes and rotation amplitudes or surface brightnesses for galaxies with V_{rot} > 220 km/s. Correlations between these slopes and morphological types or disk scale lengths are also marginal in this regime.
    Comment: v2: minor changes to match proofs. 23 pages, 15 figures, AJ in press. For a version with high-resolution figures, see http://www.physics.rutgers.edu/~spekkens/papers/fast1.pd
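
    The "logarithmic outer rotation curve slope" used in that comparison is, schematically, the power-law slope of the outer curve (a standard definition; the paper's exact radial baseline follows Casertano & van Gorkom):

        \[
          S \;=\; \frac{\Delta \log V}{\Delta \log R},
        \]
        evaluated over the outer disk, so that $S = 0$ for a flat rotation curve,
        $S < 0$ for a declining one, and $S = -0.5$ for a Keplerian falloff
        ($V \propto R^{-1/2}$).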

    Fully differential NNLO computations with MATRIX

    We present the computational framework MATRIX, which allows us to evaluate fully differential cross sections for a wide class of processes at hadron colliders at next-to-next-to-leading order (NNLO) in QCD. The processes we consider are 2 → 1 and 2 → 2 hadronic reactions involving Higgs and vector bosons in the final state. All possible leptonic decay channels of the vector bosons are included for the first time in the calculations, by consistently accounting for all resonant and non-resonant diagrams, off-shell effects and spin correlations. We briefly introduce the theoretical framework MATRIX is based on, discuss its relevant features, and provide a detailed description of how to use MATRIX to obtain NNLO-accurate results for the various processes. We report reference predictions for inclusive and fiducial cross sections of all the physics processes considered here and discuss their corresponding uncertainties. MATRIX features an automatic extrapolation procedure that allows us, for the first time, to control the systematic uncertainties inherent to the applied NNLO subtraction procedure down to the few-permille level (or better).
    Comment: 76 pages, 2 figures, 11 tables
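
    MATRIX builds on the q_T-subtraction formalism, so the NNLO master formula behind the quoted extrapolation is, schematically (our paraphrase of the standard formalism, not a quotation from the paper):

        \[
          d\sigma^{F}_{\mathrm{NNLO}}
            \;=\; \mathcal{H}^{F}_{\mathrm{NNLO}} \otimes d\sigma^{F}_{\mathrm{LO}}
            \;+\; \Big[\, d\sigma^{F+\mathrm{jet}}_{\mathrm{NLO}}
                       - d\sigma^{\mathrm{CT}}_{\mathrm{NNLO}} \Big]_{q_T/M > r_{\mathrm{cut}}},
        \]
        where the bracket is evaluated above a slicing cut $r_{\mathrm{cut}}$ and the
        quoted systematic uncertainty comes from extrapolating $r_{\mathrm{cut}} \to 0$.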

    Resource Control for Synchronous Cooperative Threads

    We develop new methods to statically bound the resources needed for the execution of systems of concurrent, interactive threads. Our study is concerned with a synchronous model of interaction based on cooperative threads whose execution proceeds in synchronous rounds called instants. Our contribution is a system of compositional static analyses that guarantees that each instant terminates and bounds the size of the values computed by the system as a function of the size of its parameters at the beginning of the instant. Our method generalises an approach designed for first-order functional languages that relies on a combination of standard termination techniques for term rewriting systems and an analysis of the size of the computed values based on the notion of quasi-interpretation. We show that these two methods can be combined to obtain an explicit polynomial bound on the resources needed for the execution of the system during an instant. As a second contribution, we introduce a virtual machine and a related bytecode, thus producing a precise description of the resources needed for the execution of a system. In this context, we present a suitable control flow analysis that allows the static analyses for resource control to be formulated at the bytecode level.
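
    To give a flavour of the quasi-interpretation technique mentioned above (a textbook-style example, not one from the paper): for list concatenation with the rewrite rules

        \[
          \mathit{app}(\mathit{nil}, y) \to y, \qquad
          \mathit{app}(\mathit{cons}(x, l), y) \to \mathit{cons}(x, \mathit{app}(l, y)),
        \]
        the assignment $q_{\mathit{nil}} = 0$, $q_{\mathit{cons}}(x, l) = x + l + 1$,
        $q_{\mathit{app}}(l, y) = l + y$ satisfies $q_{\mathrm{lhs}} \ge q_{\mathrm{rhs}}$
        for every rule (e.g., for the second rule, $(x + l + 1) + y \ge x + (l + y) + 1$),
        so the size of any value computed by $\mathit{app}$ is bounded by the sum of the
        sizes of its arguments: a polynomial bound of the kind the static analysis extracts.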