The Case for Asymmetric Systolic Array Floorplanning
The widespread proliferation of deep learning applications has triggered the
need to accelerate them directly in hardware. General Matrix Multiplication
(GEMM) kernels are elemental deep-learning constructs and they inherently map
onto Systolic Arrays (SAs). SAs are regular structures that are well-suited for
accelerating matrix multiplications. Typical SAs use a pipelined array of
Processing Elements (PEs), which communicate with local connections and
pre-orchestrated data movements. In this work, we show that the physical layout
of SAs should be asymmetric to minimize wirelength and improve energy
efficiency. The floorplan of the SA adjusts better to the asymmetric widths of
the horizontal and vertical data buses and their switching activity profiles.
It is demonstrated that such physically asymmetric SAs reduce interconnect
power by 9.1% when executing state-of-the-art Convolutional Neural Network
(CNN) layers, as compared to SAs of the same size but with a square (i.e.,
symmetric) layout. The savings in interconnect power translate, in turn, to
2.1% overall power savings. Comment: CNNA 202
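The two percentages quoted above can be related by a short back-of-the-envelope calculation. The sketch below assumes (this is not stated in the abstract) that every non-interconnect power component is unchanged by the new floorplan, so the overall saving equals the interconnect share of total power times the interconnect saving. The function name is hypothetical.

```python
def implied_interconnect_share(interconnect_saving, overall_saving):
    """Fraction of total power attributed to interconnect.

    Assumption: only interconnect power changes, so
    overall_saving = share * interconnect_saving.
    """
    return overall_saving / interconnect_saving


# Figures quoted in the abstract: 9.1% interconnect saving, 2.1% overall saving.
share = implied_interconnect_share(0.091, 0.021)
print(f"Implied interconnect share of total power: {share:.1%}")
```

Under that assumption, interconnect would account for a bit under a quarter of total power in the evaluated design; the actual breakdown would have to be taken from the paper itself.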
Low-Power Data Streaming in Systolic Arrays with Bus-Invert Coding and Zero-Value Clock Gating
Systolic Array (SA) architectures are well suited for accelerating matrix
multiplications through the use of a pipelined array of Processing Elements
(PEs) communicating with local connections and pre-orchestrated data movements.
Even though most of the dynamic power consumption in SAs is due to
multiplications and additions, pipelined data movement within the SA
constitutes an additional important contributor. The goal of this work is to
reduce the dynamic power consumption associated with the feeding of data to the
SA, by synergistically applying bus-invert coding and zero-value clock gating.
By exploiting salient attributes of state-of-the-art CNNs, such as the value
distribution of the weights, the proposed SA applies appropriate encoding only
to the data that exhibits high switching activity. Similarly, when one of the
inputs is zero, unnecessary operations are entirely skipped. This selectively
targeted, application-aware encoding approach is demonstrated to reduce the
dynamic power consumption of data streaming in CNN applications using Bfloat16
arithmetic by 1%-19%. This translates to an overall dynamic power reduction of
6.2%-9.4%. Comment: International Conference on Modern Circuits and Systems Technologies
(MOCAST
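The two techniques named in this abstract can be illustrated with a minimal sketch. It shows textbook bus-invert coding (send a word inverted, plus a one-bit flag, whenever that toggles fewer data wires) and a simple count of multiply-accumulate operations that could be skipped because one operand is zero. It is not the authors' hardware design: the 16-bit width, the all-zero initial bus state, and all function names are assumptions made for illustration, and the selective, application-aware application of the encoding described above is not modeled.

```python
def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def bus_invert_encode(words, width=16):
    """Return (bus_value, invert_flag) pairs for a stream of words."""
    mask = (1 << width) - 1
    prev = 0  # assumed initial state of the data wires
    encoded = []
    for w in words:
        w &= mask
        if hamming(prev, w) > width // 2:
            bus, inv = (~w) & mask, 1  # inverting toggles fewer wires
        else:
            bus, inv = w, 0
        encoded.append((bus, inv))
        prev = bus
    return encoded


def bus_invert_decode(encoded, width=16):
    """Recover the original words from (bus_value, invert_flag) pairs."""
    mask = (1 << width) - 1
    return [(bus ^ mask) if inv else bus for bus, inv in encoded]


def data_toggles(values, start=0):
    """Total number of data-wire transitions when values are sent in order."""
    total, prev = 0, start
    for v in values:
        total += hamming(prev, v)
        prev = v
    return total


def gated_macs(activations, weights):
    """Count multiply-accumulates that can be skipped (either operand is zero)."""
    return sum(1 for a, w in zip(activations, weights) if a == 0 or w == 0)


stream = [0x0000, 0xFFFF, 0x0000, 0xFFFF]  # worst case for an uncoded bus
encoded = bus_invert_encode(stream)
print(data_toggles(stream), data_toggles([bus for bus, _ in encoded]))
```

The toggle count here covers only the data wires; a fuller model would also charge the transitions of the extra invert line.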
ArrayFlex: A Systolic Array Architecture with Configurable Transparent Pipelining
Convolutional Neural Networks (CNNs) are the state-of-the-art solution for
many deep learning applications. For maximum scalability, their computation
should combine high performance and energy efficiency. In practice, the
convolutions of each CNN layer are mapped to a matrix multiplication that
includes all input features and kernels of each layer and is computed using a
systolic array. In this work, we focus on the design of a systolic array with a
configurable pipeline, with the goal of selecting an optimal pipeline configuration
for each CNN layer. The proposed systolic array, called ArrayFlex, can operate
in normal or shallow pipeline mode, thus balancing the execution time in
cycles and the operating clock frequency. By selecting the appropriate pipeline
configuration per CNN layer, ArrayFlex reduces the inference latency of
state-of-the-art CNNs by 11%, on average, as compared to a traditional
fixed-pipeline systolic array. Most importantly, this result is achieved while
using 13%-23% less power, for the same applications, thus offering a combined
energy-delay-product efficiency between 1.4x and 1.8x. Comment: DATE 202
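The per-layer trade-off described in this abstract, fewer cycles in one mode against a higher clock frequency in the other, can be sketched as a simple selection rule: estimate latency as cycles divided by clock frequency for each mode and keep the faster one. The cycle counts, frequencies, layer names, and the function name below are all hypothetical placeholders, not values or interfaces from the paper.

```python
def pick_mode(cycles_by_mode, freq_hz_by_mode):
    """Return (mode, latency_seconds) with the lowest latency for one layer."""
    latency = {m: cycles_by_mode[m] / freq_hz_by_mode[m] for m in cycles_by_mode}
    best = min(latency, key=latency.get)
    return best, latency[best]


# Hypothetical clock frequencies: the shallow pipeline is assumed to run slower.
freq = {"normal": 1.0e9, "shallow": 0.7e9}

# Hypothetical cycle counts per layer and mode.
layers = {
    "layer_a": {"normal": 10_000, "shallow": 6_000},
    "layer_b": {"normal": 10_000, "shallow": 9_000},
}

choice = {name: pick_mode(cycles, freq)[0] for name, cycles in layers.items()}
print(choice)
```

With these made-up numbers the shallow mode wins only where it saves enough cycles to offset its slower clock, which is the balance the abstract refers to.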
Ambipolar charge injection and transport in a single pentacene monolayer island
Electrons and holes are locally injected in a single pentacene monolayer
island. The two-dimensional distribution and concentration of the injected
carriers are measured by electrical force microscopy. In crystalline monolayer
islands, both carriers are delocalized over the whole island. On disordered
monolayers, carriers stay localized at their injection point. These results
provide insight into the electronic properties, at the nanometer scale, of
organic monolayers that govern the performance of organic transistors and molecular
devices. Comment: To be published in Nano Letter
Exsolution-enhanced reverse water-gas shift chemical looping activity of Sr2FeMo0.6Ni0.4O6-δ double perovskite
This study investigates the structural evolution and redox characteristics of the double perovskite Sr2FeMo0.6Ni0.4O6-δ (SFMN) during hydrogen (H2) and carbon dioxide (CO2) redox cycles and explores the material performance in the Reverse Water-Gas Shift Chemical Looping (RWGS-CL) reaction. In-situ and ex-situ X-Ray Diffraction (XRD) and High-Resolution Transmission Electron Microscopy (HRTEM) studies reveal that H2 reduction at temperatures above 800 °C leads to the exsolution of bimetallic Ni-Fe alloy particles and the formation of a Ruddlesden-Popper (RP) phase. A core–shell structure with a Ni-Fe core and a perovskite oxide shell is formed with subsequent redox cycles, and the resulting material exhibits better performance and high stability in the RWGS-CL process. Thermogravimetric (TGA) and Temperature Programmed Reduction (TPR) and Oxidation (TPO) analyses show that the optimal reduction and oxidation temperatures for maximizing the CO yield are around 850 °C and 750 °C, respectively, and that the cycled material is able to work steadily under isothermal conditions at 850 °C.
Exciton bimolecular annihilation dynamics in supramolecular nanostructures of conjugated oligomers
We present femtosecond transient absorption measurements on π-conjugated
supramolecular assemblies in a high pump fluence regime.
Oligo(p-phenylenevinylene) monofunctionalized with
ureido-s-triazine (MOPV) self-assembles into chiral stacks in dodecane
solution below 75 °C at a concentration of M. We
observe exciton bimolecular annihilation in MOPV stacks at high excitation
fluence, indicated by the fluence-dependent decay of B-exciton
spectral signatures, and by the sub-linear fluence dependence of time- and
wavelength-integrated photoluminescence (PL) intensity. These two
characteristics are much less pronounced in MOPV solution where the phase
equilibrium is shifted significantly away from supramolecular assembly,
slightly below the transition temperature. A mesoscopic rate-equation model is
applied to extract the bimolecular annihilation rate constant from the
excitation fluence dependence of transient absorption and PL signals. The
results demonstrate that the bimolecular annihilation rate is very high with a
square-root dependence in time. The exciton annihilation results from a
combination of fast exciton diffusion and resonance energy transfer. The
supramolecular nanostructures studied here have electronic properties that are
intermediate between molecular aggregates and polymeric semiconductors.
On the Munn-Silbey approach to polaron transport with off-diagonal coupling
Improved results using a method similar to the Munn-Silbey approach have been
obtained on the temperature dependence of transport properties of an extended
Holstein model incorporating simultaneous diagonal and off-diagonal
exciton-phonon coupling. The Hamiltonian is partially diagonalized by a
canonical transformation, and optimal transformation coefficients are
determined in a self-consistent manner. Calculated transport properties exhibit
substantial corrections to those obtained previously by Munn and Silbey for a
wide range of temperatures thanks to a numerically exact evaluation and an
added momentum-dependence of the transformation matrix. Results on the
diffusion coefficient in the moderate and weak coupling regime show distinct
band-like and hopping-like transport features as a function of temperature. Comment: 12 pages, 6 figures, accepted in Journal of Physical Chemistry B:
Shaul Mukamel Festschrift (2011
Belowground DNA-based techniques: untangling the network of plant root interactions
Contains fulltext: 91591.pdf (publisher's version) (Closed access). 7 p
Nonexponential relaxation of photoinduced conductance in organic field effect transistor
We report detailed studies of the slow relaxation of the photoinduced excess
charge carriers in organic metal-insulator-semiconductor field effect
transistors consisting of poly(3-hexylthiophene) as the active layer. The
relaxation process cannot be physically explained by processes that lead to
simple or stretched-exponential decay behavior. Models based on serial
relaxation dynamics due to a hierarchy of systems with increasing spatial
separation of the photo-generated negative and positive charges are used to
explain the results. In order to explain the observed trend, the model is
further modified by introducing a gate voltage dependent coulombic distribution
manifested by the trapped negative charge carriers. Comment: 17 pages, 3 Figure