
    An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration

    We empirically evaluate an undervolting technique, i.e., underscaling the circuit supply voltage below the nominal level, to improve the power-efficiency of Convolutional Neural Network (CNN) accelerators mapped to Field Programmable Gate Arrays (FPGAs). Undervolting below a safe voltage level can lead to timing faults due to excessive circuit latency increase. We evaluate the reliability-power trade-off for such accelerators. Specifically, we experimentally study the reduced-voltage operation of multiple components of real FPGAs, characterize the corresponding reliability behavior of CNN accelerators, propose techniques to minimize the drawbacks of reduced-voltage operation, and combine undervolting with architectural CNN optimization techniques, i.e., quantization and pruning. We investigate the effect of environmental temperature on the reliability-power trade-off of such accelerators. We perform experiments on three identical samples of modern Xilinx ZCU102 FPGA platforms with five state-of-the-art image classification CNN benchmarks. This approach allows us to study the effects of our undervolting technique for both software and hardware variability. We achieve more than 3X power-efficiency (GOPs/W) gain via undervolting. 2.6X of this gain is the result of eliminating the voltage guardband region, i.e., the safe voltage region below the nominal level that is set by the FPGA vendor to ensure correct functionality in worst-case environmental and circuit conditions. 43% of the power-efficiency gain is due to further undervolting below the guardband, which comes at the cost of accuracy loss in the CNN accelerator. We evaluate an effective frequency underscaling technique that prevents this accuracy loss, and find that it reduces the power-efficiency gain from 43% to 25%. Comment: To appear at the DSN 2020 conference.

    An experimental study of reduced-voltage operation in modern FPGAs for neural network acceleration

    We empirically evaluate an undervolting technique, i.e., underscaling the circuit supply voltage below the nominal level, to improve the power-efficiency of Convolutional Neural Network (CNN) accelerators mapped to Field Programmable Gate Arrays (FPGAs). Undervolting below a safe voltage level can lead to timing faults due to excessive circuit latency increase. We evaluate the reliability-power trade-off for such accelerators. Specifically, we experimentally study the reduced-voltage operation of multiple components of real FPGAs, characterize the corresponding reliability behavior of CNN accelerators, propose techniques to minimize the drawbacks of reduced-voltage operation, and combine undervolting with architectural CNN optimization techniques, i.e., quantization and pruning. We investigate the effect of environmental temperature on the reliability-power trade-off of such accelerators. We perform experiments on three identical samples of modern Xilinx ZCU102 FPGA platforms with five state-of-the-art image classification CNN benchmarks. This approach allows us to study the effects of our undervolting technique for both software and hardware variability. We achieve more than 3X power-efficiency (GOPs/W) gain via undervolting. 2.6X of this gain is the result of eliminating the voltage guardband region, i.e., the safe voltage region below the nominal level that is set by the FPGA vendor to ensure correct functionality in worst-case environmental and circuit conditions. 43% of the power-efficiency gain is due to further undervolting below the guardband, which comes at the cost of accuracy loss in the CNN accelerator. We evaluate an effective frequency underscaling technique that prevents this accuracy loss, and find that it reduces the power-efficiency gain from 43% to 25%. The work done for this paper was partially supported by a HiPEAC Collaboration Grant funded by the H2020 HiPEAC Project under grant agreement No. 779656. The research leading to these results has received funding from the European Union’s Horizon 2020 Programme under the LEGaTO Project (www.legato-project.eu), grant agreement No. 780681. Peer Reviewed. Postprint (author's final draft).

    Hemoglobin is inversely related to flow-mediated dilatation in chronic kidney disease

    The microcirculation is regulated by oxygen gradients and by endothelial release of nitric oxide, which can react with hemoglobin to form S-nitroso derivatives. Here we induced flow-mediated dilatation of the brachial artery in response to ischemia in 141 non-diabetic patients with stage 3–4 chronic kidney disease who had no history of smoking, cardiovascular events or use of erythropoietin-based agents. Patients with hemoglobin concentrations above the cohort median of 11.6 g/dl were found to have significant reductions in flow-mediated dilatation compared to those below the median. This inverse relationship remained significant after adjustment for potential confounders, including insulin sensitivity, glomerular filtration rate, proteinuria, body mass index, serum urate, etiology of underlying renal disease, treatment with anti-hypertensive drugs, and traditional Framingham risk factors. Given that hemoglobin can act as an important nitric oxide carrier and buffer, our studies suggest that the mechanism by which hemoglobin influences the endothelium-dependent microcirculation requires its nitrosylation; however, more direct studies need to be performed.

    ChargeCache: Reducing DRAM Latency by Exploiting Row Access Locality

    22nd IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2016, Barcelona, Spain. DRAM latency continues to be a critical bottleneck for system performance. In this work, we develop a low-cost mechanism, called ChargeCache, that enables faster access to recently-accessed rows in DRAM, with no modifications to DRAM chips. Our mechanism is based on the key observation that a recently-accessed row has more charge and thus the following access to the same row can be performed faster. To exploit this observation, we propose to track the addresses of recently-accessed rows in a table in the memory controller. If a later DRAM request hits in that table, the memory controller uses lower timing parameters, leading to reduced DRAM latency. Row addresses are removed from the table after a specified duration to ensure rows that have leaked too much charge are not accessed with lower latency. We evaluate ChargeCache on a wide variety of workloads and show that it provides significant performance and energy benefits for both single-core and multi-core systems.
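
The table-based mechanism described in the ChargeCache abstract can be illustrated with a minimal sketch. This is a simplified software model, not the authors' hardware design: the class name `RecentRowTable`, the parameters `capacity` and `lifetime_cycles`, the oldest-first eviction policy, and the choice to record a row at access time are all assumptions made for illustration. The abstract states only that recently-accessed row addresses are tracked in a table, that a hit allows lower timing parameters, and that entries are removed after a specified duration.

```python
from collections import OrderedDict


class RecentRowTable:
    """Hypothetical model of a memory-controller table of recently-accessed rows."""

    def __init__(self, capacity, lifetime_cycles):
        self.capacity = capacity          # assumed maximum number of tracked rows
        self.lifetime = lifetime_cycles   # assumed duration before an entry expires
        self.rows = OrderedDict()         # row address -> cycle when it was recorded

    def _expire(self, now):
        # Entries are kept oldest-first, so expired rows are always at the front.
        while self.rows:
            _, recorded_at = next(iter(self.rows.items()))
            if now - recorded_at < self.lifetime:
                break
            self.rows.popitem(last=False)

    def access(self, row_addr, now):
        """Return True if this access hits the table (row still highly charged)."""
        self._expire(now)
        hit = row_addr in self.rows
        # Record or refresh the row: an access restores the row's charge.
        self.rows.pop(row_addr, None)
        self.rows[row_addr] = now
        if len(self.rows) > self.capacity:
            self.rows.popitem(last=False)  # assumed policy: evict the oldest entry
        return hit


def choose_timing(hit):
    # Placeholder labels; real controllers would select concrete timing values.
    return "reduced" if hit else "default"
```

In this sketch, `now` is assumed to be a monotonically increasing cycle count. A call such as `choose_timing(table.access(row, cycle))` yields `"reduced"` only when the row was recorded within the last `lifetime_cycles` cycles and has not been evicted.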

    Evaluation of the relationship between the 30-15 Intermittent Fitness Test and anaerobic performance in elite Turkish female handball players

    The aim of this study is to investigate the relationship between performance on the field-based 30-15 Intermittent Fitness Test (IFT), which is used to determine the endurance performance of elite Turkish female handball players, and anaerobic performance, namely 30-second Wingate anaerobic power and capacity and agility T-test results. Thirty female handball players competing in the Turkish Super League volunteered for the study; four of them could not complete it. A cross-sectional descriptive correlational design was used to test the research hypothesis. The 30-15 IFT, agility T-test, and 30-second Wingate anaerobic power and capacity tests were administered on three separate days, 72 hours apart. To test the research hypothesis, correlation coefficients between 30-15 IFT performance and anaerobic performance and agility were first calculated, and multiple linear regression analysis was then used to determine how well the correlated variables predicted 30-15 IFT performance. The participants' VO2max capacities were obtained from the 30-15 endurance test, and the relationship between the players' VO2max levels and anaerobic power and capacity was examined; the resulting regression analysis found no significant relationship between these two parameters (R2 = 0.110, p > 0.05). The study also examined the relationship between VO2max and agility performance. Regression analysis revealed no significant relationship between these two parameters (R2 = 0.134, p > 0.05). In summary, the 30-15 IFT was found to have no relationship with anaerobic performance in elite female handball players.