33 research outputs found

    Sense: Model Hardware Co-design for Accelerating Sparse CNN on Systolic Array

    Sparsity is an intrinsic property of convolutional neural networks (CNNs) and is worth exploiting in CNN accelerators, but the extra processing it requires adds hardware overhead, so many architectures gain only a minor benefit. Meanwhile, systolic arrays have become increasingly competitive for CNN acceleration because of their high spatiotemporal locality and low hardware overhead. However, irregular sparsity induces an imbalanced workload under the rigid systolic dataflow, degrading performance. This paper therefore proposes Sense, a systolic-array-based architecture for sparse CNN acceleration through model-hardware co-design, achieving a large performance improvement. To balance input feature map (IFM) and weight loads across the processing element (PE) array, we apply channel clustering to gather IFMs with similar sparsity for array computation, and co-design a load-balancing weight pruning method that keeps the sparsity ratio of each kernel at a certain value with little accuracy loss, improving PE utilization and overall performance. Additionally, Adaptive Dataflow Configuration determines the computing strategy based on the storage ratio of IFMs and weights, reducing DRAM access by 1.17x-1.8x compared with Swallow and further reducing system energy consumption. The whole design is implemented on a Zynq ZCU102 at 200 MHz and achieves 471, 34, 53 and 191 images/s for AlexNet, VGG-16, ResNet-50 and GoogleNet, respectively. Compared with the sparse systolic-array-based accelerators Swallow, FESA and SPOTS, Sense achieves 1x-2.25x, 1.95x-2.5x and 1.17x-2.37x performance improvement on these CNNs, respectively, with reasonable overhead. Comment: 14 pages, 29 figures, 6 tables, IEEE Transactions on Very Large Scale Integration (VLSI) Systems
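
The load-balancing pruning idea in this abstract (every kernel kept at the same sparsity ratio) can be illustrated with a minimal sketch. The abstract does not say how Sense chooses which weights to drop, so the magnitude criterion, the function name `prune_per_kernel`, and the `(out_channels, in_channels, kh, kw)` weight layout are all assumptions made for illustration.

```python
import numpy as np

def prune_per_kernel(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero the smallest-magnitude weights of each kernel (assumed criterion)
    so that every kernel ends up with the same number of zeros."""
    out_channels = weights.shape[0]
    # One row per kernel; copy so the caller's array is left untouched.
    flat = weights.reshape(out_channels, -1).copy()
    n_prune = int(round(sparsity * flat.shape[1]))
    if n_prune > 0:
        # Column indices of the n_prune smallest |w| in each row.
        idx = np.argpartition(np.abs(flat), n_prune - 1, axis=1)[:, :n_prune]
        np.put_along_axis(flat, idx, 0.0, axis=1)
    return flat.reshape(weights.shape)

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 2, 3, 3))          # 4 kernels, 18 weights each
pruned = prune_per_kernel(w, sparsity=0.5)
zeros_per_kernel = (pruned.reshape(4, -1) == 0).sum(axis=1)
```

Because each kernel carries the same number of nonzero weights, PEs that are assigned different kernels receive equal amounts of work, which is the balance property the abstract describes.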

    SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning

    In fisheye images, rich and distinct distortion patterns are regularly distributed in the image plane. These distortion patterns are independent of the visual content and provide informative cues for rectification. To make the best of such rectification cues, we introduce SimFIR, a simple framework for fisheye image rectification based on self-supervised representation learning. Technically, we first split a fisheye image into multiple patches and extract their representations with a Vision Transformer (ViT). To learn fine-grained distortion representations, we then associate different image patches with their specific distortion patterns based on the fisheye model, and design a unified distortion-aware pretext task for learning them. The transfer performance on the downstream rectification task is remarkably boosted, which verifies the effectiveness of the learned representations. Extensive experiments are conducted, and the quantitative and qualitative results demonstrate the superiority of our method over state-of-the-art algorithms as well as its strong generalization ability on real-world fisheye images. Comment: Accepted to ICCV 202

    Challenges in QCD matter physics - The Compressed Baryonic Matter experiment at FAIR

    Substantial experimental and theoretical efforts worldwide are devoted to exploring the phase diagram of strongly interacting matter. At LHC and top RHIC energies, QCD matter is studied at very high temperatures and nearly vanishing net-baryon densities. There is evidence that a Quark-Gluon-Plasma (QGP) was created in experiments at RHIC and LHC. The transition from the QGP back to the hadron gas is found to be a smooth crossover. For larger net-baryon densities and lower temperatures, it is expected that the QCD phase diagram exhibits a rich structure, such as a first-order phase transition between hadronic and partonic matter which terminates in a critical point, or exotic phases like quarkyonic matter. The discovery of these landmarks would be a breakthrough in our understanding of the strong interaction and is therefore the focus of various high-energy heavy-ion research programs. The Compressed Baryonic Matter (CBM) experiment at FAIR will play a unique role in the exploration of the QCD phase diagram in the region of high net-baryon densities, because it is designed to run at unprecedented interaction rates. High-rate operation is the key prerequisite for high-precision measurements of multi-differential observables and of rare diagnostic probes which are sensitive to the dense phase of the nuclear fireball. The goal of the CBM experiment at SIS100 (sqrt(s_NN) = 2.7 - 4.9 GeV) is to discover fundamental properties of QCD matter: the phase structure at large baryon-chemical potentials (mu_B > 500 MeV), effects of chiral symmetry, and the equation-of-state at high density as it is expected to occur in the core of neutron stars. In this article, we review the motivation for and the physics programme of CBM, including activities before the start of data taking in 2022, in the context of the worldwide efforts to explore high-density QCD matter. Comment: 15 pages, 11 figures. Published in European Physical Journal

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

    A systematic analysis of the role of GGDEF-EAL domain proteins in virulence and motility in Xanthomonas oryzae pv. oryzicola

    The second messenger c-di-GMP is implicated in regulation of various aspects of the lifestyles and virulence of Gram-negative bacteria. Cyclic di-GMP is formed by diguanylate cyclases with a GGDEF domain and degraded by phosphodiesterases with either an EAL or HD-GYP domain. Proteins with tandem GGDEF-EAL domains occur in many bacteria, where they may be involved in c-di-GMP turnover or act as enzymatically inactive c-di-GMP effectors. Here, we report a systematic study of the regulatory action of the eleven GGDEF-EAL proteins in Xanthomonas oryzae pv. oryzicola, an important rice pathogen causing bacterial leaf streak. Mutational analysis revealed that XOC_2335 and XOC_2393 positively regulate bacterial swimming motility, while XOC_2102, XOC_2393 and XOC_4190 negatively control sliding motility. The ΔXOC_2335/XOC_2393 mutant, which had a higher intracellular c-di-GMP level than the wild type, and the ΔXOC_4190 mutant both exhibited reduced virulence to rice after pressure inoculation. In vitro, purified XOC_4190 and XOC_2102 have little or no diguanylate cyclase or phosphodiesterase activity, which is consistent with the unaltered c-di-GMP concentration in ΔXOC_4190. Nevertheless, both proteins can bind to c-di-GMP with high affinity, indicating a potential role as c-di-GMP effectors. Overall, our findings advance understanding of c-di-GMP signaling and its links to virulence in an important rice pathogen.

    A Multivariate Temporal Convolutional Attention Network for Time-Series Forecasting

    Multivariate time-series forecasting is one of the crucial and persistent challenges in time-series forecasting tasks. As a kind of data with multivariate correlation and volatility, multivariate time series impose highly nonlinear time characteristics on the forecasting model. In this paper, a new multivariate time-series forecasting model based on a self-attention mechanism, the multivariate temporal convolutional attention network (MTCAN), is proposed. MTCAN is based on the Convolutional Neural Network (CNN) model and uses 1D dilated convolution as the basic unit to construct asymmetric blocks; feature extraction is then performed by the self-attention mechanism to obtain the prediction results. The input and output lengths of this network can be determined flexibly. The method is validated on three different multivariate time-series datasets. The reliability and accuracy of the prediction results are compared with those of Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Convolutional Long Short-Term Memory (ConvLSTM), and Temporal Convolutional Network (TCN) models. The prediction results show that the proposed model has significantly improved prediction accuracy and generalization.
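
The basic unit named in this abstract, 1D dilated convolution, can be sketched in a few lines. This is a generic single-channel illustration and not MTCAN's actual layer: the causal zero padding on the left (as commonly used in TCN-style models), the function name `dilated_conv1d`, and the example kernel are assumptions.

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation=1):
    """Compute y[t] = sum_k kernel[k] * x[t - k * dilation],
    treating samples before the start of x as zero (assumed causal padding)."""
    x = np.asarray(x, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    pad = (len(kernel) - 1) * dilation
    padded = np.concatenate([np.zeros(pad), x])   # padded[pad + t] == x[t]
    y = np.zeros(len(x))
    for k, w in enumerate(kernel):
        start = pad - k * dilation                # aligns x[t - k * dilation] with y[t]
        y += w * padded[start:start + len(x)]
    return y

# With kernel [1, 1] and dilation 2, each output is x[t] + x[t - 2].
y = dilated_conv1d([1, 2, 3, 4, 5], kernel=[1, 1], dilation=2)
```

Increasing the dilation widens the span of past samples each output sees without adding kernel weights, which is why stacks of dilated convolutions are popular for long time series.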

    Balanced-energy sleep scheduling scheme for high density cluster-based sensor networks

    In order to conserve battery power in very dense sensor networks, some sensor nodes may be put into the sleep state while other sensor nodes remain active for the sensing and communication tasks. However, determining which of the sensor nodes should be put into the sleep state is non-trivial. As the goal of allowing nodes to sleep is to extend network lifetime, we propose and analyze a Balanced-energy Scheduling (BS) scheme in the context of cluster-based sensor networks. The BS scheme aims to evenly distribute the energy load of the sensing and communication tasks among all the nodes in the cluster, thereby extending the time until the cluster can no longer provide adequate sensing coverage. Two related sleep scheduling schemes, the Distance-based Scheduling (DS) scheme and the Randomized Scheduling (RS) scheme, are also studied in terms of the coefficient of variation of their energy consumption. Analytical and simulation results are presented to evaluate the proposed BS scheme. It is shown that the BS scheme extends the cluster's overall network lifetime significantly while maintaining a sensing coverage similar to that of the DS and RS schemes for sensor clusters.
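
The metric this abstract uses to compare schemes, the coefficient of variation of per-node energy consumption, is the standard deviation divided by the mean. A minimal sketch follows; the function name, the use of the population standard deviation, and the sample values are illustrative assumptions and are not taken from the paper.

```python
from statistics import mean, pstdev

def coefficient_of_variation(energies):
    """Population standard deviation divided by the mean.
    A value of 0 means every node consumed exactly the same energy."""
    avg = mean(energies)
    if avg == 0:
        raise ValueError("mean energy is zero; coefficient of variation is undefined")
    return pstdev(energies) / avg

balanced = coefficient_of_variation([2.0, 2.0, 2.0])   # perfectly even load
uneven = coefficient_of_variation([1.0, 3.0])          # same mean, uneven load
```

A lower value indicates a more even energy load across the cluster, which is the property a balanced-energy scheme targets.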

    Scheduling sleeping nodes in high density cluster-based sensor networks

    In order to conserve battery power in very dense sensor networks, some sensor nodes may be put into the sleep state while other sensor nodes remain active for the sensing and communication tasks. In this paper, we study the node sleep scheduling problem in the context of clustered sensor networks. We propose and analyze the Linear Distance-based Scheduling (LDS) technique for sleeping in each cluster. The LDS scheme selects a sensor node to sleep with higher probability when it is farther away from the cluster head. We analyze the energy consumption, the sensing coverage property, and the network lifetime of the proposed LDS scheme. The performance of the LDS scheme is compared with that of the conventional Randomized Scheduling (RS) scheme. It is shown that the LDS scheme yields more energy savings while maintaining a sensing coverage similar to that of the RS scheme for sensor clusters. Therefore, the LDS scheme results in a longer network lifetime than the RS scheme.
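
The selection rule described in this abstract (a node sleeps with higher probability the farther it is from the cluster head) can be sketched as follows. The abstract gives no formula, so the linear form `max_prob * distance / cluster_radius`, the parameter names, and the helper `choose_sleepers` are hypothetical.

```python
import random

def sleep_probability(distance, cluster_radius, max_prob=1.0):
    """Assumed linear rule: probability grows from 0 at the cluster head
    to max_prob at the cluster boundary."""
    if cluster_radius <= 0:
        raise ValueError("cluster_radius must be positive")
    ratio = min(max(distance / cluster_radius, 0.0), 1.0)  # clamp to [0, 1]
    return max_prob * ratio

def choose_sleepers(distances, cluster_radius, max_prob=1.0, rng=None):
    """Return one boolean per node: True if the node is put to sleep."""
    rng = rng or random.Random()
    return [rng.random() < sleep_probability(d, cluster_radius, max_prob)
            for d in distances]

# A node at the head never sleeps; a node at the boundary always does
# when max_prob is 1.0.
decisions = choose_sleepers([0.0, 10.0], cluster_radius=10.0, rng=random.Random(0))
```

Putting distant nodes to sleep more often saves the most energy per sleeping node, since those nodes would otherwise spend the most on transmitting to the cluster head.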