
    GraphR: Accelerating Graph Processing Using ReRAM

    This paper presents GRAPHR, the first ReRAM-based graph processing accelerator. GRAPHR follows the principle of near-data processing and explores the opportunity of performing massively parallel analog operations with low hardware and energy cost. The analog computation is suitable for graph processing because: 1) The algorithms are iterative and could inherently tolerate the imprecision; 2) Both probability calculation (e.g., PageRank and Collaborative Filtering) and typical graph algorithms involving integers (e.g., BFS/SSSP) are resilient to errors. The key insight of GRAPHR is that if a vertex program of a graph algorithm can be expressed as sparse matrix-vector multiplication (SpMV), it can be efficiently performed by a ReRAM crossbar. We show that this assumption is generally true for a large set of graph algorithms. GRAPHR is a novel accelerator architecture consisting of two components: memory ReRAM and graph engine (GE). The core graph computations are performed in sparse matrix format in GEs (ReRAM crossbars). The vector/matrix-based graph computation is not new, but ReRAM offers the unique opportunity to realize the massive parallelism with unprecedented energy efficiency and low hardware cost. With small subgraphs processed by GEs, the gain of performing parallel operations overshadows the waste due to sparsity. The experimental results show that GRAPHR achieves a 16.01x (up to 132.67x) speedup and a 33.82x energy saving on geometric mean compared to a CPU baseline system. Compared to GPU, GRAPHR achieves 1.69x to 2.19x speedup and consumes 4.77x to 8.91x less energy. GRAPHR gains a speedup of 1.16x to 4.12x, and is 3.67x to 10.96x more energy efficient compared to a PIM-based architecture. Comment: Accepted to HPCA 201
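To make the SpMV formulation in the abstract above concrete, here is a minimal software sketch of PageRank written as repeated sparse matrix-vector products. It is only an illustration of the idea, not GRAPHR's hardware mapping: the function names `spmv` and `pagerank`, the coordinate (COO) edge-list layout, the damping factor of 0.85, and the assumption that every vertex has at least one outgoing edge are all choices made for this example.

```python
def spmv(n, triples, x):
    """Return y = A x, where A is an n x n sparse matrix given as
    (row, col, value) triples (COO layout, assumed for this sketch)."""
    y = [0.0] * n
    for row, col, value in triples:
        y[row] += value * x[col]
    return y


def pagerank(n, links, damping=0.85, iters=50):
    """PageRank over directed links (src, dst), one SpMV per iteration.

    Assumes every vertex has at least one outgoing link; dangling
    vertices are not handled in this sketch.
    """
    out_degree = [0] * n
    for src, _ in links:
        out_degree[src] += 1

    # Column-stochastic transition matrix: entry (dst, src) = 1 / outdeg(src).
    transition = [(dst, src, 1.0 / out_degree[src]) for src, dst in links]

    rank = [1.0 / n] * n
    for _ in range(iters):
        spread = spmv(n, transition, rank)  # the per-iteration vertex program
        rank = [(1.0 - damping) / n + damping * s for s in spread]
    return rank


if __name__ == "__main__":
    # A directed 3-cycle: 0 -> 1 -> 2 -> 0.
    print(pagerank(3, [(0, 1), (1, 2), (2, 0)]))
```

Each iteration touches every nonzero exactly once, which is the operation the abstract says a crossbar can perform in parallel on small subgraphs.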

    Attentional Encoder Network for Targeted Sentiment Classification

    Targeted sentiment classification aims at determining the sentimental tendency towards specific targets. Most of the previous approaches model context and target words with RNNs and attention. However, RNNs are difficult to parallelize, and truncated backpropagation through time makes it difficult to remember long-term patterns. To address this issue, this paper proposes an Attentional Encoder Network (AEN), which eschews recurrence and employs attention-based encoders for the modeling between context and target. We raise the label unreliability issue and introduce label smoothing regularization. We also apply pre-trained BERT to this task and obtain new state-of-the-art results. Experiments and analysis demonstrate the effectiveness and lightweight nature of our model. Comment: 7 pages
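The label smoothing regularization mentioned in the abstract above can be illustrated with a short sketch. The abstract does not give the exact formulation, so this assumes the common variant that mixes the one-hot target with a uniform distribution over the classes; the function names and the smoothing weight `epsilon=0.1` are hypothetical choices for the example.

```python
import math


def smooth_labels(true_class, num_classes, epsilon=0.1):
    """Blend a one-hot target with a uniform prior (assumed variant):
    q(k) = (1 - epsilon) * onehot(k) + epsilon / num_classes."""
    uniform = epsilon / num_classes
    return [
        (1.0 - epsilon) + uniform if k == true_class else uniform
        for k in range(num_classes)
    ]


def cross_entropy(target, predicted):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0.0)


if __name__ == "__main__":
    # Three sentiment classes (e.g. negative, neutral, positive); true class is 2.
    target = smooth_labels(true_class=2, num_classes=3)
    predicted = [0.1, 0.2, 0.7]
    print(target, cross_entropy(target, predicted))
```

With a smoothed target, an overconfident prediction for a possibly mislabeled class is penalized less sharply than with a hard one-hot target, which is the usual motivation for using it when labels are unreliable.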

    Safety of Mesenchymal Stem Cells for Clinical Application

    Mesenchymal stem cells (MSCs) hold great promise as therapeutic agents in regenerative medicine and autoimmune diseases, based on their differentiation abilities and immunosuppressive properties. However, the therapeutic applications raise a series of questions about the safety of culture-expanded MSCs for human use. This paper summarizes recent findings about safety issues of MSCs, in particular their genetic stability in long-term in vitro expansion, their cryopreservation, banking, and the role of serum in the preparation of MSCs.

    HyPar: Towards Hybrid Parallelism for Deep Learning Accelerator Array

    With the rise of artificial intelligence in recent years, Deep Neural Networks (DNNs) have been widely used in many domains. To achieve high performance and energy efficiency, hardware acceleration (especially inference) of DNNs is intensively studied both in academia and industry. However, we still face two challenges: large DNN models and datasets, which incur frequent off-chip memory accesses; and the training of DNNs, which is not well explored in recent accelerator designs. To truly provide high-throughput and energy-efficient acceleration for the training of deep and large models, we inevitably need to use multiple accelerators to explore the coarse-grain parallelism, compared to the fine-grain parallelism inside a layer considered in most of the existing architectures. This poses the key research question of finding the best organization of computation and dataflow among accelerators. In this paper, we propose a solution, HyPar, to determine layer-wise parallelism for deep neural network training with an array of DNN accelerators. HyPar partitions the feature map tensors (input and output), the kernel tensors, the gradient tensors, and the error tensors for the DNN accelerators. A partition constitutes the choice of parallelism for weighted layers. The optimization target is to search for a partition that minimizes the total communication during the training of a complete DNN. To solve this problem, we propose a communication model to explain the source and amount of communications. Then, we use a hierarchical layer-wise dynamic programming method to search for the partition for each layer. Comment: To appear in the 2019 25th International Symposium on High-Performance Computer Architecture (HPCA 2019)
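The layer-wise dynamic programming search described in the abstract above can be sketched for a single level of partitioning. This is not HyPar's communication model or its hierarchical procedure: the function `best_partition`, the two choice labels `"dp"` and `"mp"`, and every cost number below are made up for illustration. The sketch only assumes that each layer picks one parallelism choice, that a choice has a communication cost inside the layer, and that adjacent layers pay an extra cost depending on their pair of choices.

```python
def best_partition(intra, inter):
    """Pick one choice per layer so that total communication is minimal.

    intra[i][c]  : hypothetical communication cost inside layer i under choice c
    inter[p][c]  : hypothetical cost between adjacent layers using choices p, c
    Returns (total_cost, list_of_choices_per_layer).
    """
    choices = list(intra[0].keys())
    cost = {c: intra[0][c] for c in choices}  # best cost ending in choice c
    back = []                                 # back-pointers, one dict per layer

    for layer in intra[1:]:
        new_cost, pointers = {}, {}
        for c in choices:
            prev = min(choices, key=lambda p: cost[p] + inter[p][c])
            new_cost[c] = cost[prev] + inter[prev][c] + layer[c]
            pointers[c] = prev
        cost = new_cost
        back.append(pointers)

    # Walk the back-pointers from the cheapest final choice.
    last = min(choices, key=lambda c: cost[c])
    plan = [last]
    for pointers in reversed(back):
        plan.append(pointers[plan[-1]])
    plan.reverse()
    return cost[last], plan


if __name__ == "__main__":
    # Made-up costs for a three-layer network.
    intra = [{"dp": 1, "mp": 5}, {"dp": 6, "mp": 1}, {"dp": 1, "mp": 6}]
    inter = {"dp": {"dp": 0, "mp": 2}, "mp": {"dp": 2, "mp": 0}}
    print(best_partition(intra, inter))  # (7, ['dp', 'mp', 'dp'])
```

The search visits each layer once and compares every pair of adjacent choices, so its cost grows linearly with the number of layers rather than exponentially as it would for exhaustive enumeration.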

    Genetic analysis of phytoene synthase 1 (Psy1) gene function and regulation in common wheat

    Transcriptome details for three transgenic lines with the most significantly reduced YPC and non-transformed controls. (DOCX 18 kb)

    Numerical Well Testing Interpretation Model and Applications in Crossflow Double-Layer Reservoirs by Polymer Flooding

    This work presents a numerical well testing interpretation model and analysis techniques to evaluate the formation by using pressure transient data acquired with logging tools in crossflow double-layer reservoirs under polymer flooding. A well testing model is established based on rheology experiments and by considering shear, diffusion, convection, inaccessible pore volume (IPV), permeability reduction, wellbore storage effect, and skin factors. The type curves are then developed based on this model, and parameter sensitivity is analyzed. Our research shows that the type curves have five segments with different flow status: (I) wellbore storage section, (II) intermediate flow section (transient section), (III) mid-radial flow section, (IV) crossflow section (from low permeability layer to high permeability layer), and (V) systematic radial flow section. The polymer flooding field tests prove that our model can accurately determine formation parameters in crossflow double-layer reservoirs under polymer flooding. Moreover, formation damage caused by polymer flooding can also be evaluated by comparing the interpreted permeability with the initial layered permeability before polymer flooding. Comparison of the numerical solution based on the flow mechanism with observed polymer flooding field test data highlights the potential for the application of this interpretation method in formation evaluation and enhanced oil recovery (EOR).