Pixie: A heterogeneous Virtual Coarse-Grained Reconfigurable Array for high performance image processing applications
Coarse-Grained Reconfigurable Arrays (CGRAs) are easy to program and have low
development costs, which makes them particularly well suited to reconfigurable
computing applications. Their low compilation cost and reduced reconfiguration
overhead make them attractive platforms for accelerating high-performance
computing applications such as image processing. However, CGRAs are ASICs and
therefore expensive to produce. Field Programmable Gate Arrays (FPGAs), in
contrast, are relatively cheap for low-volume products, but they are not as
easily programmable. We combine the best of both worlds by implementing a
Virtual Coarse-Grained Reconfigurable Array (VCGRA) on an FPGA. VCGRAs are a
trade-off between FPGAs, with their large routing overheads, and ASICs. In this
perspective we present a novel heterogeneous Virtual Coarse-Grained
Reconfigurable Array (VCGRA) called "Pixie" which is suitable for implementing
high-performance image processing applications. The proposed VCGRA contains
generic processing elements and virtual channels that are described in the
hardware description language VHDL. Both elements have been optimized using the
parameterized configuration tool flow, resulting in a resource reduction of 24%
for each processing element and 82% for each virtual channel.
Comment: Presented at 3rd International Workshop on Overlay Architectures for
FPGAs (OLAF 2017) arXiv:1704.0880
CAN Fieldbus Communication in the CSP-based CT Library
In closed-loop control systems, several real-world entities are communicated with simultaneously through a multitude of spatially distributed sensors and actuators. This intrinsic parallelism and complexity motivates implementing control software as concurrent processes deployed on distributed hardware architectures. A CSP-based, occam-like architecture seems most convenient for this purpose. Many, often conflicting, requirements make the design and implementation of distributed real-time control systems an extremely difficult task. The scope of this paper is limited to achieving safe and real-time communication over a CAN fieldbus for an existing CSP-based framework.
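The core abstraction behind such a framework is the CSP channel: processes interact only through synchronous (rendezvous) communication, which can then be mapped onto a fieldbus such as CAN. The following is a minimal Python sketch of that abstraction, not the actual occam-like CT library; all names here are illustrative.

```python
# Sketch of a CSP-style rendezvous channel: a write blocks until a matching
# read completes, so sender and receiver synchronize on every communication.
# (Illustrative only; the paper's framework is a CSP-based CT library.)
import threading
import queue

class Channel:
    """Rendezvous channel: write() returns only after read() has taken the value."""
    def __init__(self):
        self._data = queue.Queue(maxsize=1)
        self._ack = queue.Queue(maxsize=1)

    def write(self, value):
        self._data.put(value)
        self._ack.get()        # block until the reader has consumed the value

    def read(self):
        value = self._data.get()
        self._ack.put(None)    # release the writer: rendezvous complete
        return value

sensor_to_controller = Channel()
readings = []

def sensor():
    for sample in (20.1, 20.4, 20.2):
        sensor_to_controller.write(sample)   # blocks until controller reads

def controller():
    for _ in range(3):
        readings.append(sensor_to_controller.read())

threads = [threading.Thread(target=sensor), threading.Thread(target=controller)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert readings == [20.1, 20.4, 20.2]
```

Mapping this onto CAN then amounts to implementing `write`/`read` over CAN frames while preserving the blocking rendezvous semantics.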
Toward Sequentializing Overparallelized Protocol Code
In our ongoing work, we use constraint automata to compile protocol
specifications expressed as Reo connectors into efficient executable code,
e.g., in C. We have by now studied this automata-based compilation approach
rather well and have devised effective solutions to some of its problems.
Because our approach is based on constraint automata, the approach, its
problems, and our solutions are in fact useful and relevant well beyond the
specific case of compiling Reo. In this short paper, we identify and analyze
two such rather unexpected problems.
Comment: In Proceedings ICE 2014, arXiv:1410.701
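A constraint automaton labels each transition with a set of ports that fire together and a data constraint over the values they carry. The sketch below is a hypothetical minimal encoding of that idea, not the authors' compiler; the `sync_channel` example models a Reo "sync" channel.

```python
# Minimal sketch of a constraint automaton: states, plus transitions labelled
# with a synchronization set of ports and a data constraint over those ports.
# (Hypothetical encoding for illustration, not the paper's compiler.)

class ConstraintAutomaton:
    def __init__(self, states, initial, transitions):
        # transitions: list of (source, frozenset_of_ports, constraint, target),
        # where constraint is a predicate over a {port: value} dict
        self.states = states
        self.current = initial
        self.transitions = transitions

    def step(self, observation):
        """Fire one transition consistent with the observed port/value map."""
        ports = frozenset(observation)
        for src, sync, constraint, tgt in self.transitions:
            if src == self.current and sync == ports and constraint(observation):
                self.current = tgt
                return True
        return False  # no enabled transition: the observation is rejected

# A Reo "sync" channel from port A to port B: both ports fire together,
# and they must carry the same datum.
sync_channel = ConstraintAutomaton(
    states={"q"},
    initial="q",
    transitions=[("q", frozenset({"A", "B"}), lambda d: d["A"] == d["B"], "q")],
)

assert sync_channel.step({"A": 1, "B": 1})      # accepted
assert not sync_channel.step({"A": 1, "B": 2})  # data constraint violated
```

Compilation then turns such an automaton into straight-line code that dispatches on the enabled transitions, which is where the sequentialization questions studied in the paper arise.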
CapablePtrs: Securely Compiling Partial Programs using the Pointers-as-Capabilities Principle
Capability machines such as CHERI provide memory capabilities that can be
used by compilers to provide security benefits for compiled code (e.g., memory
safety). The C to CHERI compiler, for example, achieves memory safety by
following a principle called "pointers as capabilities" (PAC). Informally, PAC
says that a compiler should represent a source language pointer as a machine
code capability. But the security properties of PAC compilers are not yet well
understood. We show that memory safety is only one aspect, and that PAC
compilers can provide significant additional security guarantees for partial
programs: the compiler can provide guarantees for a compilation unit, even if
that compilation unit is later linked to attacker-controlled machine code. This
paper is the first to study the security of PAC compilers for partial programs
formally. We prove for a model of such a compiler that it is fully abstract.
The proof uses a novel proof technique (dubbed TrICL, read trickle), which is
of broad interest because it reuses and extends the compiler correctness
relation in a natural way, as we demonstrate. We implement our compiler on top
of the CHERI platform and show that it can compile legacy C code with minimal
code changes. We provide performance benchmarks that show how performance
overhead is proportional to the number of cross-compilation-unit function
calls.
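The PAC principle can be illustrated with a toy software model: a pointer is not a raw integer address but a capability carrying bounds that are checked on every dereference. The Python below is purely illustrative and far simpler than CHERI's actual capability encoding, which also carries permissions and is enforced in hardware.

```python
# Toy model of "pointers as capabilities" (PAC): every dereference is checked
# against the capability's bounds, so out-of-bounds accesses trap instead of
# silently reading or corrupting adjacent memory. (Illustrative only.)

class Capability:
    def __init__(self, memory, base, length):
        self.memory = memory      # backing store (a shared bytearray)
        self.base = base          # lowest address this capability may touch
        self.length = length      # extent of the authorized region

    def load(self, offset):
        if not 0 <= offset < self.length:
            raise MemoryError("capability bounds violation")
        return self.memory[self.base + offset]

    def store(self, offset, value):
        if not 0 <= offset < self.length:
            raise MemoryError("capability bounds violation")
        self.memory[self.base + offset] = value

mem = bytearray(64)
buf = Capability(mem, base=16, length=8)   # like a char buf[8]
buf.store(0, 0x41)
assert buf.load(0) == 0x41
try:
    buf.load(8)                            # one past the end: trapped
except MemoryError:
    pass
```

The partial-program guarantees studied in the paper go further: even linked attacker code only ever receives capabilities, so it cannot forge access to the compilation unit's memory.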
Streamlined islands and the English Channel megaflood hypothesis
Recognising ice-age catastrophic megafloods is important because they had significant impact on large-scale drainage evolution and patterns of water and sediment movement to the oceans, and likely induced very rapid, short-term effects on climate. It has been previously proposed that a drainage system on the floor of the English Channel was initiated by catastrophic flooding in the Pleistocene but this suggestion has remained controversial. Here we examine this hypothesis through an analysis of key landform features. We use a new compilation of multi- and single-beam bathymetry together with sub-bottom profiler data to establish the internal structure, planform geometry and hence origin of a set of 36 mid-channel islands. Whilst there is evidence of modern-day surficial sediment processes, the majority of the islands can be clearly demonstrated to be formed of bedrock, and are hence erosional remnants rather than depositional features. The islands display classic lemniscate or tear-drop outlines, with elongated tips pointing downstream, typical of streamlined islands formed during high-magnitude water flow. The length-to-width ratio for the entire island population is 3.4 ± 1.3 and the degree-of-elongation or k-value is 3.7 ± 1.4. These values are comparable to streamlined islands in other proven Pleistocene catastrophic flood terrains and are distinctly different to values found in modern-day rivers. The island geometries show a correlation with bedrock type: with those carved from Upper Cretaceous chalk having larger length-to-width ratios (3.2 ± 1.3) than those carved into more mixed Paleogene terrigenous sandstones, siltstones and mudstones (3.0 ± 1.5). We attribute these differences to the former rock unit having a lower skin friction which allowed longer island growth to achieve minimum drag. The Paleogene islands, although less numerous than the Chalk islands, also assume more perfect lemniscate shapes. 
These lithologies therefore reached island equilibrium shape more quickly but were also susceptible to total erosion. Our observations support the hypothesis that the islands were initially carved by high-water-volume flows via a unique catastrophic drainage of a pro-glacial lake in the southern North Sea at the Dover Strait, rather than by fluvial erosion throughout the Pleistocene.
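The quoted length-to-width ratios are mean ± standard deviation over the island population. A short sketch of that calculation, using made-up measurements rather than the paper's data for its 36 islands:

```python
# Mean +/- standard deviation of island length-to-width ratios, as quoted in
# the text. The (length, width) pairs below are invented for illustration;
# the paper derives its values from 36 mid-channel islands.
import statistics

islands = [(3400, 1000), (5200, 1300), (2100, 700), (4800, 1600)]

ratios = [length / width for length, width in islands]
mean = statistics.mean(ratios)
sd = statistics.stdev(ratios)
print(f"length-to-width ratio: {mean:.1f} +/- {sd:.1f}")
```

Comparing such population statistics against values from proven catastrophic flood terrains and modern rivers is what lets the paper discriminate between the two formation hypotheses.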
ret2spec: Speculative Execution Using Return Stack Buffers
Speculative execution is an optimization technique that has been part of CPUs
for over a decade. It predicts the outcome and target of branch instructions to
avoid stalling the execution pipeline. However, until recently, the security
implications of speculative code execution have not been studied.
In this paper, we investigate a special type of branch predictor that is
responsible for predicting return addresses. To the best of our knowledge, we
are the first to study return address predictors and their consequences for the
security of modern software. In our work, we show how return stack buffers
(RSBs), the core unit of return address predictors, can be used to trigger
misspeculations. Based on this knowledge, we propose two new attack variants
using RSBs that give attackers similar capabilities as the documented Spectre
attacks. We show how local attackers can gain arbitrary speculative code
execution across processes, e.g., to leak passwords another user enters on a
shared system. Our evaluation showed that the recent Spectre countermeasures
deployed in operating systems can also cover such RSB-based cross-process
attacks. Yet we then demonstrate that attackers can trigger misspeculation in
JIT environments in order to leak arbitrary memory content of browser
processes. Reading outside the sandboxed memory region with JIT-compiled code
is still possible with 80% accuracy on average.
Comment: Updated to the camera-ready version and added a reference to the
original paper
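The behavior exploited above can be modeled in software: an RSB is a small circular stack of predicted return addresses, pushed on `call` and popped on `ret`. The sketch below is a conceptual model of that structure only, not attack code; the mispredictions arise whenever calls and returns are unbalanced.

```python
# Software model of a return stack buffer (RSB): a small circular stack of
# predicted return addresses. The CPU pushes on `call` and pops on `ret`;
# when calls and returns are unbalanced (e.g. a deliberately mismatched ret,
# or overflow of the finite buffer), the popped prediction no longer matches
# the architectural return address and the CPU speculates down the wrong path.
from collections import deque

class ReturnStackBuffer:
    def __init__(self, entries=16):
        self.stack = deque(maxlen=entries)  # circular: oldest entries fall off

    def on_call(self, return_address):
        self.stack.append(return_address)

    def on_ret(self):
        # Real CPUs fall back to another predictor on underflow; we return None.
        return self.stack.pop() if self.stack else None

rsb = ReturnStackBuffer()
rsb.on_call(0x400123)            # caller pushes its return address
rsb.on_call(0x400456)            # nested call
assert rsb.on_ret() == 0x400456  # balanced: prediction is correct

# Mismatch: an extra `ret` without a matching call consumes a stale entry,
# so the next prediction belongs to a different call site.
stale = rsb.on_ret()
assert stale == 0x400123
assert rsb.on_ret() is None      # underflow: misprediction guaranteed
```

The attacks in the paper steer exactly this mismatch so that the stale prediction points at attacker-chosen speculation gadgets.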
The Structure of Rapidly Rotating Late-Type Spiral Galaxies: I. Photometry, HI and Optical Kinematics
We present I-band photometry, long-slit optical spectroscopy, and new
aperture synthesis HI observations for eight late-type spirals with rotation
velocities in the range 243 km/s < V_{rot} < 308 km/s. The sample will be used
to study the structure and angular momentum of disks at the high-mass end of
the spiral galaxy population; here we discuss the basic properties of these
"fast rotators", and derive hybrid optical/HI rotation curves for each.
Despite the presence of HI warps and low-mass companions in many systems, their
kinematics are regular and there is excellent agreement between optical and HI
tracers near the optical radius r_{opt}. At high inclinations at which
projection effects are negligible, the sample galaxies exhibit flat,
featureless rotation curves out to their last measured points at
1.7 r_{opt}--3.5 r_{opt}. The intermediate inclination systems are also
consistent with a constant rotation amplitude for r > 0.5 r_{opt}. We therefore
find no evidence for declining rotation curves at the high-mass end of the
late-type spiral galaxy population. Combining our data with the compilation of
spirals with reliable outer HI kinematics from the work of Casertano & van
Gorkom, we find no convincing trends between logarithmic outer rotation curve
slopes and rotation amplitudes or surface brightnesses for galaxies with
V_{rot} > 220 km/s. Correlations between these slopes and morphological types
or disk scale lengths are also marginal in this regime.Comment: v2: minor changes to match proofs. 23 pages, 15 figures, AJ in press.
For version with high resolution figures, see
http://www.physics.rutgers.edu/~spekkens/papers/fast1.pd
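The "logarithmic outer rotation curve slope" used in the analysis is d log V / d log r: zero for a flat curve, negative for a declining one. A sketch of estimating it from two outer-disk points, with illustrative numbers that are not from the paper:

```python
# The logarithmic outer rotation-curve slope is d log V / d log r, estimated
# here from two points in the outer disk. A flat rotation curve gives a slope
# of 0; a declining curve gives a negative slope.
import math

def log_slope(r1, v1, r2, v2):
    """Slope of log V vs log r between radii r1 < r2 (velocities in km/s)."""
    return (math.log10(v2) - math.log10(v1)) / (math.log10(r2) - math.log10(r1))

# Illustrative values only: a fast rotator whose velocity stays near
# 280 km/s from 1.7 to 3.5 optical radii.
slope = log_slope(r1=1.7, v1=280.0, r2=3.5, v2=279.0)
print(f"d log V / d log r = {slope:+.3f}")  # close to zero: flat curve
```

Plotting such slopes against rotation amplitude or surface brightness is how the paper tests for trends at the high-mass end of the population.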
Fully differential NNLO computations with MATRIX
We present the computational framework MATRIX which allows us to evaluate
fully differential cross sections for a wide class of processes at hadron
colliders in next-to-next-to-leading order (NNLO) QCD. The processes we
consider are hadronic reactions involving Higgs and
vector bosons in the final state. All possible leptonic decay channels of the
vector bosons are included for the first time in the calculations, by
consistently accounting for all resonant and non-resonant diagrams, off-shell
effects and spin correlations. We briefly introduce the theoretical framework
MATRIX is based on, discuss its relevant features and provide a detailed
description of how to use MATRIX to obtain NNLO accurate results for the
various processes. We report reference predictions for inclusive and fiducial
cross sections of all the physics processes considered here and discuss their
corresponding uncertainties. MATRIX features an automatic extrapolation
procedure that allows us, for the first time, to control the systematic
uncertainties inherent to the applied NNLO subtraction procedure down to the
few permille level (or better).
Comment: 76 pages, 2 figures, 11 tables
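The automatic extrapolation mentioned above addresses the fact that the NNLO subtraction introduces a technical cut parameter; results computed at several cut values are extrapolated to a vanishing cut, with the fit quality feeding the systematic uncertainty. The sketch below shows only the generic idea with a plain least-squares fit on made-up numbers; MATRIX's actual procedure is more sophisticated.

```python
# Generic sketch of extrapolating a cut-dependent NNLO cross section to a
# vanishing technical cut r_cut via a linear least-squares fit. All numbers
# are invented for illustration; this is not MATRIX's actual algorithm.

def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

# Hypothetical cross sections (pb) computed at decreasing r_cut values
r_cut = [0.15, 0.30, 0.50, 1.00]
sigma = [35.02, 35.05, 35.09, 35.18]

a, b = linear_fit(r_cut, sigma)
print(f"sigma(r_cut -> 0) = {a:.2f} pb")  # extrapolated cross section
```

Repeating the fit over cut ranges and fit models gives the spread that is quoted as the subtraction systematic.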
Resource Control for Synchronous Cooperative Threads
We develop new methods to statically bound the resources needed for the
execution of systems of concurrent, interactive threads. Our study is concerned
with a \emph{synchronous} model of interaction based on cooperative threads
whose execution proceeds in synchronous rounds called instants. Our
contribution is a system of compositional static analyses to guarantee that
each instant terminates and to bound the size of the values computed by the
system as a function of the size of its parameters at the beginning of the
instant. Our method generalises an approach designed for first-order functional
languages that relies on a combination of standard termination techniques for
term rewriting systems and an analysis of the size of the computed values based
on the notion of quasi-interpretation. We show that these two methods can be
combined to obtain an explicit polynomial bound on the resources needed for the
execution of the system during an instant. As a second contribution, we
introduce a virtual machine and a related bytecode thus producing a precise
description of the resources needed for the execution of a system. In this
context, we present a suitable control flow analysis that allows us to
formulate the static analyses for resource control at the bytecode level.
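The quasi-interpretation technique assigns each function symbol a monotone polynomial that bounds the size of its result in terms of the sizes of its arguments; if every program rule respects its bound, computed values stay polynomially bounded. A minimal sketch of checking such a candidate bound, using a hypothetical `append` example that is not from the paper:

```python
# Sketch of the quasi-interpretation idea: each function symbol gets a
# monotone polynomial bounding the size of its result by the sizes of its
# arguments. Example (hypothetical): append(xs, ys) with candidate bound
# q(n, m) = n + m, spot-checked on sample inputs.

def size(value):
    """Size of a value: one per list constructor plus one per leaf."""
    if isinstance(value, list):
        return 1 + sum(size(v) for v in value)
    return 1

def append(xs, ys):
    return xs + ys

def q_append(n, m):
    return n + m  # candidate quasi-interpretation for `append`

# Check the bound: size(append(xs, ys)) <= q_append(size(xs), size(ys))
for xs, ys in [([], []), ([1], [2, 3]), ([1, 2, 3], [4])]:
    assert size(append(xs, ys)) <= q_append(size(xs), size(ys))
```

In the paper this size analysis is combined with a termination analysis per instant, yielding an explicit polynomial resource bound for the whole synchronous system.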