
    Deep in-memory computing

    There is much interest in embedding data analytics into sensor-rich platforms such as wearables, biomedical devices, autonomous vehicles, robots, and the Internet of Things to provide them with decision-making capabilities. Such platforms often need to implement machine learning (ML) algorithms under the stringent energy constraints of battery-powered electronics. In particular, energy consumption in the memory subsystem dominates a system's overall energy efficiency, and memory access latency is a major bottleneck for overall system throughput. To address these issues in memory-intensive inference applications, this dissertation proposes the deep in-memory accelerator (DIMA), which deeply embeds computation into the memory array by employing two key principles: (1) accessing and processing multiple rows of the memory array at a time, and (2) embedding pitch-matched, low-swing analog processing at the periphery of the bitcell array. The signal-to-noise ratio (SNR) is budgeted by employing low-swing operations in both the memory read and the processing stages, exploiting application-level error immunity for aggressive energy efficiency. This dissertation first describes the system rationale underlying DIMA's processing stages by identifying the functional flow common to a diverse set of inference algorithms. Based on this analysis, it presents a multi-functional DIMA supporting four algorithms: support vector machine (SVM), template matching (TM), k-nearest neighbor (k-NN), and matched filtering. Circuit- and architecture-level design techniques and guidelines are provided to address the challenges of achieving multi-functionality. A prototype integrated circuit (IC) of the multi-functional DIMA was fabricated with a 16 KB SRAM array in a 65 nm CMOS process.
Measurement results show energy and delay reductions of up to 5.6X and 5.8X, respectively, yielding a 31X energy-delay product (EDP) reduction with negligible (<1%) accuracy degradation compared to a conventional 8-b fixed-point digital implementation optimally designed for each algorithm. DIMA has also been applied to more complex algorithms: (1) the convolutional neural network (CNN), (2) sparse distributed memory (SDM), and (3) the random forest (RF). System-level simulations of a CNN using circuit behavioral models in a 45 nm SOI CMOS process demonstrate that a high probability (>0.99) of correct handwritten digit recognition can be achieved on the MNIST database, along with a 24.5X reduced EDP, a 5.0X reduced energy, and a 4.9X higher throughput compared to a conventional system. The DIMA-based SDM architecture achieves up to 25X and 12X delay and energy reductions, respectively, over a conventional SDM with negligible accuracy degradation (within 0.4%) for 16×16 binary-pixel image classification. A DIMA-based RF was realized as a prototype IC with a 16 KB SRAM array in a 65 nm process; to the best of our knowledge, this is the first IC realization of an RF algorithm. Measurement results show that the prototype achieves a 6.8X lower EDP than a conventional design at the same accuracy (94%) on an eight-class traffic sign recognition problem. The multi-functional DIMA and its extension to other algorithms naturally motivated a programmable DIMA instruction set architecture (ISA), namely MATI. This dissertation explores a synergistic combination of instruction set, architecture, and circuit design to achieve programmability without losing DIMA's energy and throughput benefits. Employing silicon-validated energy, delay, and behavioral models of the deep in-memory components, we demonstrate that MATI can realize nine ML benchmarks while incurring negligible overheads in energy (<0.1%), throughput, and area (4.5%) over a fixed four-function DIMA. In the process, MATI simultaneously achieves gains in both energy (2.5X to 5.5X) and throughput (1.4X to 3.4X), for an overall EDP improvement of up to 12.6X over fixed-function digital architectures.
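The EDP figures above follow directly from the definition EDP = energy × delay, where delay is the reciprocal of throughput, so independent per-metric gains multiply. A minimal arithmetic sketch using the CNN simulation numbers quoted above:

```python
# EDP (energy-delay product) combines energy and delay multiplicatively,
# so independent reductions in each metric compound. Numbers reuse the
# CNN simulation results above: 5.0X lower energy, 4.9X higher throughput.
energy_reduction = 5.0   # conventional_energy / dima_energy
throughput_gain = 4.9    # dima_throughput / conventional_throughput

# Delay is the reciprocal of throughput, so a 4.9X throughput gain is
# equivalently a 4.9X delay reduction.
delay_reduction = throughput_gain

# EDP reduction is the product of the energy and delay reductions.
edp_reduction = energy_reduction * delay_reduction
# edp_reduction is 24.5, matching the 24.5X EDP reduction reported above
```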

    A miR-200c/141-BMI1 autoregulatory loop regulates oncogenic activity of BMI1 in cancer cells.

    MicroRNAs (miRNAs) are known to function as oncomiRs or tumor suppressors and are important noncoding RNA regulators of oncogenesis. The miR-200c/141 locus on chromosome 12 encodes miR-200c and miR-141, two members of the miR-200 family, which have been shown to function as tumor-suppressive miRNAs by targeting multiple oncogenic factors such as the polycomb group protein BMI1. Here, we show that BMI1 reciprocally functions as a transcriptional repressor of the miR-200c/141 cluster and that BMI1 inhibitors upregulate expression of miR-200c and miR-141. Our data suggest that BMI1 binds to the miR-200c/141 promoter and acts through the transcription factor binding motifs E-box 2 and Z-box 1 to repress expression of the miR-200c/141 cluster. We also show that PTC-209, a small-molecule inhibitor of BMI1 gene expression, induces cellular senescence and transcriptionally upregulates expression of the miR-200c/141 cluster in breast cancer cells. Furthermore, inhibition of miR-200c or miR-141 expression overcomes the tumor-suppressive effects of PTC-209, including induction of cellular senescence and downregulation of the breast cancer stem cell phenotype. Therefore, our studies suggest a reciprocal regulation between BMI1 and the miR-200c/141 cluster, and that BMI1 inhibitory drugs can further amplify their inhibitory effects on BMI1 via multiple mechanisms, including posttranscriptional regulation by upregulating BMI1-targeting miRNAs.

    Benchmarking Self-Supervised Learning on Diverse Pathology Datasets

    Computational pathology can help save human lives, but models are annotation-hungry and pathology images are notoriously expensive to annotate. Self-supervised learning (SSL) has been shown to be an effective method for utilizing unlabeled data, and its application to pathology could greatly benefit downstream tasks. Yet there are no principled studies that compare SSL methods and discuss how to adapt them for pathology. To address this need, we execute the largest-scale study of SSL pre-training on pathology image data to date. Our study is conducted using four representative SSL methods on diverse downstream tasks. We establish that large-scale, domain-aligned pre-training in pathology consistently outperforms ImageNet pre-training in standard SSL settings, such as linear and fine-tuning evaluations, as well as in low-label regimes. Moreover, we propose a set of domain-specific techniques that we experimentally show lead to a performance boost. Lastly, for the first time, we apply SSL to the challenging task of nuclei instance segmentation and show large and consistent performance improvements under diverse settings.
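The linear evaluation mentioned above freezes the pretrained encoder and fits only a linear classifier on its embeddings. A minimal sketch of that protocol, using synthetic stand-in features (the study itself would use embeddings from an SSL-pretrained pathology backbone, which is not reproduced here):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-in for frozen encoder outputs: 200 samples of 128-d embeddings,
# with the two classes separated by a small per-dimension mean shift.
n, d = 200, 128
labels = rng.integers(0, 2, size=n)
features = rng.normal(size=(n, d)) + labels[:, None] * 0.5

# Linear evaluation: the encoder stays frozen, so only this linear
# classifier is trained on the labeled split.
clf = LogisticRegression(max_iter=1000).fit(features[:150], labels[:150])

# Probe quality is reported as accuracy on a held-out split.
acc = clf.score(features[150:], labels[150:])
```

Fine-tuning evaluation differs only in that the encoder weights are also updated; the linear probe isolates representation quality from adaptation capacity.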

    Aligning organizational control practices with competitive outsourcing performance

    The aim of this article is to present a research model that defines how different outsourcing strategies influence the organizational control mechanisms that shape outsourcing outcomes. The research consists of five case studies, each focusing on a foreign multinational corporation (MNC) with outsourcing experience in China. The results of these case studies examine the relationships among outsourcing strategies, organizational control, and outsourcing performance outcomes. In addition, the findings explain how trust, competence, and in-house knowledge of outsourced tasks moderate the relationship between outsourcing strategies and process control. This article provides practical insight into the ways that business executives exercise organizational control in order to achieve effective outsourcing outcomes within China's evolving economic context.