6,215 research outputs found
VLSI Implementation of Deep Neural Network Using Integral Stochastic Computing
The hardware implementation of deep neural networks (DNNs) has recently
received tremendous attention: many applications in fact require high-speed
operations that suit a hardware implementation. However, numerous elements and
complex interconnections are usually required, leading to a large area
occupation and copious power consumption. Stochastic computing has shown
promising results for low-power area-efficient hardware implementations, even
though existing stochastic algorithms require long streams that cause long
latencies. In this paper, we propose an integer form of stochastic computation
and introduce some elementary circuits. We then propose an efficient
implementation of a DNN based on integral stochastic computing. The proposed
architecture has been implemented on a Virtex7 FPGA, resulting in 45% and 62%
average reductions in area and latency compared to the best reported
architecture in literature. We also synthesize the circuits in a 65 nm CMOS
technology and we show that the proposed integral stochastic architecture
results in up to 21% reduction in energy consumption compared to the binary
radix implementation at the same misclassification rate. Due to fault-tolerant
nature of stochastic architectures, we also consider a quasi-synchronous
implementation which yields 33% reduction in energy consumption w.r.t. the
binary radix implementation without any compromise on performance.Comment: 11 pages, 12 figure
Relative multiplexing for minimizing switching in linear-optical quantum computing
Many existing schemes for linear-optical quantum computing (LOQC) depend on
multiplexing (MUX), which uses dynamic routing to enable near-deterministic
gates and sources to be constructed using heralded, probabilistic primitives.
MUXing accounts for the overwhelming majority of active switching demands in
current LOQC architectures. In this manuscript, we introduce relative
multiplexing (RMUX), a general-purpose optimization which can dramatically
reduce the active switching requirements for MUX in LOQC, and thereby reduce
hardware complexity and energy consumption, as well as relaxing demands on
performance for various photonic components. We discuss the application of RMUX
to the generation of entangled states from probabilistic single-photon sources,
and argue that an order of magnitude improvement in the rate of generation of
Bell states can be achieved. In addition, we apply RMUX to the proposal for
percolation of a 3D cluster state in [PRL 115, 020502 (2015)], and we find that
RMUX allows a 2.4x increase in loss tolerance for this architecture.Comment: Published version, New Journal of Physics, Volume 19, June 201
ハードウェアリソースの高稼働率化に基づく細粒度多値リコンフィギャラブルVLSIアーキテクチャ
Tohoku University亀山充隆課
Hardware-based Security for Virtual Trusted Platform Modules
Virtual Trusted Platform modules (TPMs) were proposed as a software-based
alternative to the hardware-based TPMs to allow the use of their cryptographic
functionalities in scenarios where multiple TPMs are required in a single
platform, such as in virtualized environments. However, virtualizing TPMs,
especially virutalizing the Platform Configuration Registers (PCRs), strikes
against one of the core principles of Trusted Computing, namely the need for a
hardware-based root of trust. In this paper we show how strength of
hardware-based security can be gained in virtual PCRs by binding them to their
corresponding hardware PCRs. We propose two approaches for such a binding. For
this purpose, the first variant uses binary hash trees, whereas the other
variant uses incremental hashing. In addition, we present an FPGA-based
implementation of both variants and evaluate their performance
Access and metro network convergence for flexible end-to-end network design
This paper reports on the architectural, protocol, physical layer, and integrated testbed demonstrations carried out by the DISCUS FP7 consortium in the area of access - metro network convergence. Our architecture modeling results show the vast potential for cost and power savings that node consolidation can bring. The architecture, however, also recognizes the limits of long-reach transmission for low-latency 5G services and proposes ways to address such shortcomings in future projects. The testbed results, which have been conducted end-to-end, across access - metro and core, and have targeted all the layers of the network from the application down to the physical layer, show the practical feasibility of the concepts proposed in the project
- …