38,391 research outputs found
Probabilistic Timed Automata with Clock-Dependent Probabilities
Probabilistic timed automata are classical timed automata extended with
discrete probability distributions over edges. We introduce clock-dependent
probabilistic timed automata, a variant of probabilistic timed automata in
which transition probabilities can depend linearly on clock values.
Clock-dependent probabilistic timed automata allow the modelling of a
continuous relationship between time passage and the likelihood of system
events. We show that the problem of deciding whether the maximum probability of
reaching a certain location is above a threshold is undecidable for
clock-dependent probabilistic timed automata. On the other hand, we show that
the maximum and minimum probability of reaching a certain location in
clock-dependent probabilistic timed automata can be approximated using a
region-graph-based approach.Comment: Full version of a paper published at RP 201
High-level synthesis optimization for blocked floating-point matrix multiplication
In the last decade floating-point matrix multiplication on FPGAs has been studied extensively and efficient architectures as well as detailed performance models have been developed. By design these IP cores take a fixed footprint which not necessarily optimizes the use of all available resources. Moreover, the low-level architectures are not easily amenable to a parameterized synthesis. In this paper high-level synthesis is used to fine-tune the configuration parameters in order to achieve the highest performance with maximal resource utilization. An\ exploration strategy is presented to optimize the use of critical resources (DSPs, memory) for any given FPGA. To account for the limited memory size on the FPGA, a block-oriented matrix multiplication is organized such that the block summation is done on the CPU while the block multiplication occurs on the logic fabric simultaneously. The communication overhead between the CPU and the FPGA is minimized by streaming the blocks in a Gray code ordering scheme which maximizes the data reuse for consecutive block matrix product calculations. Using high-level synthesis optimization, the programmable logic operates at 93% of the theoretical peak performance and the combined CPU-FPGA design achieves 76% of the available hardware processing speed for the floating-point multiplication of 2K by 2K matrices
From FPGA to ASIC: A RISC-V processor experience
This work document a correct design flow using these tools in the Lagarto RISC- V Processor and the RTL design considerations that must be taken into account, to move from a design for FPGA to design for ASIC
A 64-point Fourier transform chip for high-speed wireless LAN application using OFDM
In this article, we present a novel fixed-point 16-bit word-width 64-point FFT/IFFT processor developed primarily for the application in the OFDM based IEEE 802.11a Wireless LAN (WLAN) baseband processor. The 64-point FFT is realized by decomposing it into a 2-D structure of 8-point FFTs. This approach reduces the number of required complex multiplications compared to the conventional radix-2 64-point FFT algorithm. The complex multiplication operations are realized using shift-and-add operations. Thus, the processor does not use any 2-input digital multiplier. It also does not need any RAM or ROM for internal storage of coefficients. The proposed 64-point FFT/IFFT processor has been fabricated and tested successfully using our in-house 0.25 ?m BiCMOS technology. The core area of this chip is 6.8 mm2. The average dynamic power consumption is 41 mW @ 20 MHz operating frequency and 1.8 V supply voltage. The processor completes one parallel-to-parallel (i. e., when all input data are available in parallel and all output data are generated in parallel) 64-point FFT computation in 23 cycles. These features show that though it has been developed primarily for application in the IEEE 802.11a standard, it can be used for any application that requires fast operation as well as low power consumption
Validating foundry technologies for extended mission profiles
This paper presents a process qualification and characterization strategy that can extend the foundry process reliability potential to meet specific automotive mission profile requirements. In this case study, data and analyses are provided that lead to sufficient confidence for pushing the allowed mission profile envelope of a process towards more aggressive (automotive) applications.\ud
\u
Digital IP Protection Using Threshold Voltage Control
This paper proposes a method to completely hide the functionality of a
digital standard cell. This is accomplished by a differential threshold logic
gate (TLG). A TLG with inputs implements a subset of Boolean functions of
variables that are linear threshold functions. The output of such a gate is
one if and only if an integer weighted linear arithmetic sum of the inputs
equals or exceeds a given integer threshold. We present a novel architecture of
a TLG that not only allows a single TLG to implement a large number of complex
logic functions, which would require multiple levels of logic when implemented
using conventional logic primitives, but also allows the selection of that
subset of functions by assignment of the transistor threshold voltages to the
input transistors. To obfuscate the functionality of the TLG, weights of some
inputs are set to zero by setting their device threshold to be a high .
The threshold voltage of the remaining transistors is set to low to
increase their transconductance. The function of a TLG is not determined by the
cell itself but rather the signals that are connected to its inputs. This makes
it possible to hide the support set of the function by essentially removing
some variable from the support set of the function by selective assignment of
high and low to the input transistors. We describe how a standard cell
library of TLGs can be mixed with conventional standard cells to realize
complex logic circuits, whose function can never be discovered by reverse
engineering. A 32-bit Wallace tree multiplier and a 28-bit 4-tap filter were
synthesized on an ST 65nm process, placed and routed, then simulated including
extracted parastics with and without obfuscation. Both obfuscated designs had
much lower area (25%) and much lower dynamic power (30%) than their
nonobfuscated CMOS counterparts, operating at the same frequency
LEGaTO: first steps towards energy-efficient toolset for heterogeneous computing
LEGaTO is a three-year EU H2020 project which started in December 2017. The LEGaTO project will leverage task-based programming models to provide a software ecosystem for Made-in-Europe heterogeneous hardware composed of CPUs, GPUs, FPGAs and dataflow engines. The aim is to attain one order of magnitude energy savings from the edge to the converged cloud/HPC.Peer ReviewedPostprint (author's final draft
Belle II Technical Design Report
The Belle detector at the KEKB electron-positron collider has collected
almost 1 billion Y(4S) events in its decade of operation. Super-KEKB, an
upgrade of KEKB is under construction, to increase the luminosity by two orders
of magnitude during a three-year shutdown, with an ultimate goal of 8E35 /cm^2
/s luminosity. To exploit the increased luminosity, an upgrade of the Belle
detector has been proposed. A new international collaboration Belle-II, is
being formed. The Technical Design Report presents physics motivation, basic
methods of the accelerator upgrade, as well as key improvements of the
detector.Comment: Edited by: Z. Dole\v{z}al and S. Un
Towards a Scalable Hardware/Software Co-Design Platform for Real-time Pedestrian Tracking Based on a ZYNQ-7000 Device
Currently, most designers face a daunting task to
research different design flows and learn the intricacies of
specific software from various manufacturers in
hardware/software co-design. An urgent need of creating a
scalable hardware/software co-design platform has become a key
strategic element for developing hardware/software integrated
systems. In this paper, we propose a new design flow for building
a scalable co-design platform on FPGA-based system-on-chip.
We employ an integrated approach to implement a histogram
oriented gradients (HOG) and a support vector machine (SVM)
classification on a programmable device for pedestrian tracking.
Not only was hardware resource analysis reported, but the
precision and success rates of pedestrian tracking on nine open
access image data sets are also analysed. Finally, our proposed
design flow can be used for any real-time image processingrelated
products on programmable ZYNQ-based embedded
systems, which benefits from a reduced design time and provide a
scalable solution for embedded image processing products
- …