129 research outputs found
IOPS: An Unified SpMM Accelerator Based on Inner-Outer-Hybrid Product
Sparse matrix multiplication (SpMM) is widely applied to numerous domains,
such as graph processing, machine learning, and data analytics. However,
inner-product-based SpMM performs redundant zero-element computation for
mismatched nonzero operands, while the outer-product-based approach lacks input
reuse across Processing Elements (PEs) and suffers from poor output locality
when accumulating partial-sum (psum) matrices. Besides, current works focus on
either sparse-sparse matrix multiplication (SSMM) or sparse-dense matrix
multiplication (SDMM), and rarely perform efficiently for both. To address
these problems, this paper proposes a unified SpMM accelerator, called IOPS,
that hybridizes inner and outer products. It reuses the input matrix among PEs
with an inner-product dataflow, and removes zero-element calculations with an
outer-product approach in each PE, so it can efficiently process both SSMM and
SDMM. Moreover, an address mapping method is designed to accumulate the
irregular sparse psum matrices, reducing the latency and DRAM access of psum
accumulation. Furthermore, an adaptive partition strategy is proposed to tile
the input matrices based on their sparsity ratios, effectively utilizing the
on-chip storage and reducing DRAM access.
Compared with the SSMM accelerator, SpArch, we achieve 1.7x~6.3x energy
efficiency and 1.2x~4.4x resource efficiency, with 1.4x~2.1x DRAM access
saving
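The contrast between the two dataflows described in the abstract above can be illustrated with a minimal software sketch. This is not the IOPS hardware design; the function names `inner_product` and `outer_product` and the counters they return are hypothetical, chosen only to show where mismatched operands arise in the inner product and where partial-sum matrices arise in the outer product.

```python
def inner_product(A, B):
    """Inner product: each C[i][j] is a dot product of row i of A and column j of B.

    Returns C and the number of operand pairs in which exactly one operand
    is nonzero (work that a dense inner product would spend on zeros).
    """
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    mismatched = 0
    for i in range(n):
        for j in range(p):
            for k in range(m):
                a, b = A[i][k], B[k][j]
                if a != 0 and b != 0:
                    C[i][j] += a * b
                elif a != 0 or b != 0:
                    mismatched += 1  # nonzero paired with a zero
    return C, mismatched


def outer_product(A, B):
    """Outer product: column k of A times row k of B yields one partial-sum matrix.

    Only nonzero pairs are multiplied, but every k with nonzeros on both
    sides produces a psum matrix that must be accumulated into C.
    """
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    psum_matrices = 0
    for k in range(m):
        col = [(i, A[i][k]) for i in range(n) if A[i][k] != 0]
        row = [(j, B[k][j]) for j in range(p) if B[k][j] != 0]
        if col and row:
            psum_matrices += 1
        for i, a in col:
            for j, b in row:
                C[i][j] += a * b  # scattered accumulation into the output
    return C, psum_matrices


A = [[1, 0], [0, 2]]
B = [[0, 3], [4, 0]]
C_inner, mismatched = inner_product(A, B)
C_outer, psum_matrices = outer_product(A, B)
```

Both functions compute the same product; they differ only in which overhead they expose, which is the trade-off a hybrid dataflow tries to balance.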
IDENTIFICATION AND INVESTIGATION OF PROTEINS INTERACTING AND COOPERATING WITH THE VON HIPPEL-LINDAU TUMOR SUPPRESSOR PROTEIN
Ph.DPHD IN CANCER BIOLOG
Sense: Model Hardware Co-design for Accelerating Sparse CNN on Systolic Array
Sparsity is an intrinsic property of convolutional neural networks (CNNs) and
is worth exploiting in CNN accelerators, but the extra processing comes with
hardware overhead, so many architectures gain only a minor benefit.
Meanwhile, the systolic array has become increasingly competitive for CNN
acceleration thanks to its high spatiotemporal locality and low hardware
overhead. However, the irregularity of sparsity induces imbalanced workloads
under the rigid systolic dataflow, causing performance degradation. Thus, this
paper proposes a systolic-array-based architecture, called Sense, for sparse
CNN acceleration by model-hardware co-design, achieving a large performance
improvement. To balance input feature map (IFM) and weight loads across the
Processing Element (PE) array, we apply channel clustering to gather IFMs with
similar sparsity for array computation, and co-design a load-balancing
weight pruning method that keeps the sparsity ratio of each kernel at a certain
value with little accuracy loss, improving PE utilization and overall
performance. Additionally, Adaptive Dataflow Configuration is applied to
determine the computing strategy based on the storage ratio of IFMs and
weights, lowering DRAM access by 1.17x-1.8x compared with Swallow and further
reducing system energy consumption. The whole design is implemented on a
Zynq ZCU102 at 200 MHz and achieves 471, 34, 53 and 191 images/s for
AlexNet, VGG-16, ResNet-50 and GoogleNet, respectively. Compared against the
sparse systolic-array-based accelerators Swallow, FESA and SPOTS, Sense achieves
1x-2.25x, 1.95x-2.5x and 1.17x-2.37x performance improvement on these CNNs,
respectively, with reasonable overhead.
Comment: 14 pages, 29 figures, 6 tables, IEEE TRANSACTIONS ON VERY LARGE SCALE
INTEGRATION (VLSI) SYSTEM
Deriving intensity–duration–frequency (IDF) curves using downscaled in situ rainfall assimilated with remote sensing data
Rainfall intensity–duration–frequency (IDF) curves play an important role in water resources engineering and management. Their applications range from assessing rainfall events and classifying climatic regimes to deriving design storms and assisting in the design of urban drainage systems. Deriving IDF curves, however, requires long-term historical rainfall observations, and a lack of fine-timescale (e.g. sub-daily) rainfall records often results in less reliable IDF curves. This paper presents the use of remote sensing sub-daily rainfall, i.e. Global Satellite Mapping of Precipitation (GSMaP), integrated with the Bartlett-Lewis rectangular pulses (BLRP) model, to disaggregate the daily in situ rainfall, which is then used to derive more reliable IDF curves. Application of the proposed method in Singapore indicates that the disaggregated hourly rainfall, which preserves both the hourly and daily statistical characteristics, produces IDF curves with significantly improved accuracy; on average, RMSE is reduced by over 70% compared to the IDF curves derived from daily rainfall observations. © 2019, The Author(s)
BandMap: Application Mapping with Bandwidth Allocation for Coarse-Grained Reconfigurable Array
This paper proposes an application mapping algorithm, BandMap, for
coarse-grained reconfigurable arrays (CGRAs), which allocates the bandwidth in
the PE array according to the transfer demands of data, especially data with
high spatial reuse, to reduce the number of routing PEs. To cover bandwidth
allocation, BandMap maps the data flow graphs (DFGs), abstracted from
applications, by solving the maximum independent set (MIS) problem on a
mixed tuple and quadruple resource occupation conflict graph. Compared to the
state-of-the-art BusMap work, BandMap uses fewer routing PEs with the same or
even smaller initiation interval (II)
A digital twin to quantitatively understand aging mechanisms coupled effects of NMC battery using dynamic aging profiles
Traditional lithium-ion battery modeling does not provide sufficient information to accurately verify battery performance under real-time dynamic operating conditions, particularly when considering various aging modes and mechanisms. To improve on current methods, this paper proposes a lithium-ion battery digital twin that can capture real-time data and integrate the strong coupling between SEI layer growth, anode crack propagation, and lithium plating. It can be used to estimate aging behavior from the macroscopic full-cell level to the microscopic particle level, including voltage-current profiles under dynamic aging conditions, to predict the degradation behavior of Nickel-Manganese-Cobalt-Oxide (NMC) based lithium-ion batteries, and to assist in electrochemical analysis. This model can improve the root cause analysis of cell aging, enabling a quantitative understanding of the coupled effects of aging mechanisms. Three charging protocols with dynamic discharging profiles are developed to simulate real vehicle operation scenarios and used to validate the digital twin, combining operando impedance measurements, post-mortem analysis, and SEM to further support the conclusions. The digital twin can accurately predict battery capacity fade within 0.4% MAE. The results indicate that SEI layer growth is the primary contributor to capacity degradation and resistance increase. Based on the analysis of the model, it is concluded that one of the proposed multi-step charging protocols, in comparison to a standard continuous charging protocol, can reduce the degradation of NMC-based lithium-ion batteries. This paper provides a firm physical foundation for future physics-informed machine learning development
Elastic scattering and total reaction cross sections of $^{6}$Li studied with a microscopic continuum discretized coupled channels model
We present a systematic study of $^{6}$Li elastic scattering and total
reaction cross sections at incident energies around the Coulomb barrier within
the continuum discretized coupled-channels (CDCC) framework, where $^{6}$Li is
treated in an $\alpha$ + $d$ two-body model. Collisions with Al,
Zn, Ba and Pa are analyzed. The microscopic optical
potentials (MOP) based on the Skyrme nucleon-nucleon interaction for $\alpha$
and $d$ are adopted in CDCC calculations and satisfactory agreement with the
experimental data is obtained without any adjustment of the MOPs. For
comparison, the $\alpha$ and $d$ global phenomenological optical potentials
(GOP) are also used in CDCC analysis and a reduction of no less than 50% in the
surface imaginary part of the deuteron GOP is required to describe the data. In
all cases, the $^{6}$Li breakup effect is significant and provides a repulsive
correction to the folding model potential. The reduction of the deuteron GOP
reveals a strong suppression of the reaction probability of the deuteron as a
component of $^{6}$Li as compared with that of a free deuteron. A further
investigation is made by taking the $d$ breakup process into account
equivalently within the dynamic polarization potential approach and it shows
that $d$ behaves like a tightly bound nucleus in $^{6}$Li induced reactions. We
also compare the CDCC results with those calculated with a $^{6}$Li GOP and it
shows that CDCC calculations provide a better reproduction of the elastic
scattering angular distributions in the sub-barrier energy region and the total
reaction cross sections at energies around the Coulomb barrier.
Comment: 10 pages, 12 figure
NDVI With Artificial Neural Networks For SRTM Elevation Model Improvement – Hydrological Model Application
A digital elevation model (DEM) plays a substantial role in hydrological studies, from understanding catchment characteristics and setting up a hydrological model to mapping the flood risk in a region. Depending on the nature of the study and its objectives, a high-resolution and reliable DEM is often desired to set up a sound hydrological model. However, such a DEM is not always available and is generally expensive. Obtained through radar-based remote sensing, the Shuttle Radar Topography Mission (SRTM) provides a publicly available DEM with a resolution of 92 m outside the US. It is a valuable source where no surveyed DEM is available. However, apart from the coarse resolution, SRTM suffers from inaccuracy, especially in areas with dense vegetation coverage, because radar signals do not penetrate the canopy. This leads to improper setup of the model as well as erroneous mapping of flood risk. This paper attempts to improve the SRTM dataset using the Normalised Difference Vegetation Index (NDVI), derived from the Visible Red and Near Infra-Red bands obtained from Landsat at a resolution of 30 m, and Artificial Neural Networks (ANN). The assessment of the improvement and the applicability of this method in hydrology are highlighted and discussed
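For readers unfamiliar with the index named in the abstract above, NDVI is conventionally computed per pixel as (NIR - Red) / (NIR + Red). The sketch below illustrates only that standard formula; the function name `ndvi` and the zero-denominator convention are assumptions for illustration and are not taken from the paper, which does not describe its implementation here.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index for one pixel.

    nir and red are reflectance values for the Near Infra-Red and
    Visible Red bands. Returns a value in [-1, 1] for non-negative inputs.
    """
    denom = nir + red
    if denom == 0:
        # Assumed convention: treat a pixel with no signal as NDVI 0.
        return 0.0
    return (nir - red) / denom


# Dense vegetation reflects strongly in NIR and weakly in red,
# so its NDVI is high; bare surfaces sit closer to zero.
vegetated = ndvi(0.5, 0.1)
bare = ndvi(0.3, 0.3)
```

High NDVI values flag the densely vegetated areas where, according to the abstract, SRTM elevations are least reliable.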
The HD-GYP Domain Protein RpfG of Xanthomonas oryzae pv. oryzicola Regulates Synthesis of Extracellular Polysaccharides that Contribute to Biofilm Formation and Virulence on Rice
Bacterial leaf streak caused by Xanthomonas oryzae pv. oryzicola (Xoc) is one of the most important diseases in rice. However, little is known about the pathogenicity mechanisms of Xoc. Here we have investigated the function of three HD-GYP domain regulatory proteins in biofilm formation, the synthesis of virulence factors and virulence of Xoc. Deletion of rpfG resulted in altered production of extracellular polysaccharides (EPS), abolished virulence on rice and enhanced biofilm formation, but had little effect on the secretion of proteases and motility. In contrast, mutational analysis showed that the other two HD-GYP domain proteins had no effect on virulence factor synthesis and tested phenotypes. Mutation of rpfG led to up-regulation of the type III secretion system and altered expression of three putative glycosyltransferase genes gumD, pgaC and xagB, which are part of operons directing the synthesis of different extracellular polysaccharides. The pgaABCD and xagABCD operons were greatly up-regulated in the Xoc ΔrpfG mutant, whereas the expression of the gum genes was unaltered or slightly enhanced. The elevated biofilm formation of the Xoc ΔrpfG mutant was dramatically reduced upon deletion of gumD, xagA and xagB, but not when pgaA and pgaC were deleted. Interestingly, only the ΔgumD mutant, among these single gene mutants, exhibited multiple phenotype alterations including reduced biofilm and EPS production and attenuated virulence on rice. These data indicate that RpfG is a global regulator that controls biofilm formation, EPS production and bacterial virulence in Xoc and that both gumD- and xagB-dependent EPS contribute to biofilm formation under different conditions
A KDM6 inhibitor potently induces ATF4 and its target gene expression through HRI activation and by UTX inhibition
UTX/KDM6A encodes a major histone H3 lysine 27 (H3K27) demethylase, and is frequently mutated in various types of human cancers. Although UTX appears to play a crucial role in oncogenesis, the mechanisms involved are still largely unknown. Here we show that a specific pharmacological inhibitor of H3K27 demethylases, GSK-J4, induces the expression of transcription activating factor 4 (ATF4) protein as well as the ATF4 target genes (e.g. PCK2, CHOP, REDD1, CHAC1 and TRIB3). ATF4 induction by GSK-J4 was due to neither transcriptional nor post-translational regulation. In support of this view, the ATF4 induction was almost exclusively dependent on the heme-regulated eIF2α kinase (HRI) in mouse embryonic fibroblasts (MEFs). Gene expression profiles with UTX disruption by CRISPR-Cas9 editing and the following stable re-expression of UTX showed that UTX specifically suppresses the expression of the ATF4 target genes, suggesting that UTX inhibition is at least partially responsible for the ATF4 induction. Apoptosis induction by GSK-J4 was partially and cell-type specifically correlated with the activation of ATF4-CHOP. These findings highlight that the anti-cancer drug candidate GSK-J4 strongly induces ATF4 and its target genes via HRI activation and raise a possibility that UTX might modulate cancer formation by regulating the HRI-ATF4 axis