111 research outputs found

    FLECSim-SoC: A Flexible End-to-End Co-Design Simulation Framework for System on Chips

    Hardware accelerators for deep neural networks (DNNs) have established themselves over the past decade. Most developments have worked towards higher efficiency with an individual application in mind. This highlights how important it is to co-design the accelerator together with the requirements of the application. A structured design flow, however, currently lacks a tool to evaluate a DNN accelerator embedded in a System on Chip (SoC) platform. To address this gap in the state of the art, we introduce FLECSim, a tool framework that enables an end-to-end simulation of an SoC with dedicated accelerators, CPUs and memories. FLECSim offers flexible configuration of the system and straightforward integration of new accelerator models in both SystemC and RTL, which allows for early design verification. During the simulation, FLECSim provides metrics of the SoC, which can be used to explore the design space. Finally, we present the capabilities of FLECSim, perform an exemplary evaluation with a systolic array-based accelerator and explore the design parameters in terms of accelerator size, power and performance.

    Data Movement Reduction for DNN Accelerators: Enabling Dynamic Quantization Through an eFPGA

    Computational requirements for deep neural networks (DNNs) have been on a rising trend for years. Moreover, network dataflows and topologies are becoming more sophisticated to address more challenging applications. DNN accelerators cannot adapt quickly to the constantly changing DNNs. In this paper, we describe our approach to make a static accelerator more versatile by adding an embedded FPGA (eFPGA). The eFPGA is tightly coupled to the on-chip network, which allows us to pass data through the eFPGA before and after it is processed by the DNN accelerator. Hence, the proposed solution is able to quickly address changing requirements. To show the benefits of this approach, we propose an eFPGA application that enables dynamic quantization of data. We can fit four number converters on a 1.5 mm^2 eFPGA, which can process 400M data elements per second. We will practically validate our work in the near future, with an SoC tape-out in the ongoing EPI project.
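The abstract does not say how its number converters work. As a rough illustration of what dynamic quantization means in general, the following minimal sketch derives the scale from the data at run time and maps values to signed 8-bit integers. The symmetric per-tensor scheme and the names `quantize_dynamic` and `dequantize` are assumptions made for this example, not the design from the paper.

```python
def quantize_dynamic(values, bits=8):
    """Symmetric quantization whose scale is derived from the data itself.

    Assumption for illustration: one scale per list of values (per-tensor).
    """
    qmax = 2 ** (bits - 1) - 1                 # 127 for 8 bits
    max_abs = max(abs(v) for v in values)      # range observed at run time
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map integer codes back to approximate real values."""
    return [q * scale for q in quantized]


data = [0.5, -1.0, 0.25, 0.0]                  # made-up example values
codes, scale = quantize_dynamic(data)
restored = dequantize(codes, scale)
```

Because the scale follows the largest magnitude in each batch of data, the error per restored value stays within half a quantization step.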

    An Analytical Model of Configurable Systolic Arrays to find the Best-Fitting Accelerator for a given DNN Workload

    Since their breakthrough, the complexity of Deep Neural Networks (DNNs) has been rising steadily. As a result, accelerators for DNNs are now used in many domains. However, designing and configuring an accelerator that meets the requirements of a given application perfectly is a challenging task. In this paper, we therefore present our approach to support the accelerator design process. With an analytical model of a systolic array we can estimate performance, energy consumption and area for each design option. These metrics are usually determined with a cycle-accurate simulation, which is a time-consuming task. Hence, the design space has to be restricted heavily. Analytical modelling, however, allows for fast evaluation of a design using a mathematical abstraction of the accelerator. For DNNs, this works especially well since the dataflow and memory accesses have high regularity. To show the correctness of our model, we perform an exemplary realization with the state-of-the-art systolic array generator Gemmini and compare it with a cycle-accurate simulation and state-of-the-art modelling tools, showing less than 1% deviation. We also conducted a design space exploration, showing the analytical model's capabilities to support an accelerator design. In a case study on ResNet-34, we demonstrate that our model and DSE tool reduce the time to find the best-fitting solution by four or two orders of magnitude compared to a cycle-accurate simulation or state-of-the-art modelling tools, respectively.
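To give a feel for why a closed-form model is so much faster than cycle-accurate simulation, here is a deliberately simplified sketch for a single matrix multiplication on a square systolic array. The tiling scheme, the fill/drain term `rows + cols - 2`, and the name `estimate_matmul` are assumptions made for this illustration. They are not the model from the paper and ignore memory stalls, energy and area.

```python
from math import ceil


def estimate_matmul(m, k, n, rows, cols):
    """Estimate cycles and utilization for an (m x k) @ (k x n) product.

    Simplifying assumptions: the k x n operand is cut into tiles of
    rows x cols that are processed one after another, and each tile
    streams m input rows plus a pipeline fill/drain of rows + cols - 2.
    """
    tiles = ceil(k / rows) * ceil(n / cols)
    cycles_per_tile = m + rows + cols - 2
    cycles = tiles * cycles_per_tile
    # Fraction of multiply-accumulate slots that perform useful work.
    utilization = (m * k * n) / (cycles * rows * cols)
    return cycles, utilization


# Evaluate a few hypothetical array sizes for one made-up layer shape.
results = {size: estimate_matmul(64, 32, 32, size, size) for size in (8, 16, 32)}
for size, (cycles, utilization) in results.items():
    print(f"{size}x{size}: {cycles} cycles, utilization {utilization:.2f}")
```

Each design point costs only a few arithmetic operations, so a sweep over many configurations finishes almost instantly. Larger arrays need fewer cycles here, but a larger share of their processing elements sits idle.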

    CNNParted: An open source framework for efficient Convolutional Neural Network inference partitioning in embedded systems

    Applications such as autonomous driving or assistive robotics heavily rely on the usage of Deep Neural Networks. In particular, Convolutional Neural Networks (CNNs) provide precise and reliable results in image processing tasks like camera-based object detection or semantic segmentation. However, to achieve even better results, CNNs are becoming more and more complex. Deploying these networks in distributed embedded systems thereby imposes new challenges, due to additional constraints regarding performance and energy consumption in the near-sensor compute platforms, i.e. the sensor nodes. Processing all data in the central node, however, is disadvantageous since raw camera data consumes a large amount of bandwidth and running CNN inference for multiple tasks requires a certain level of performance. Moreover, sending raw data over the interconnect is not advisable for privacy reasons. Hence, offloading CNN workload to the sensor nodes in the system can lead to reduced traffic on the link and a higher level of data security. However, due to the limited hardware resources on the sensor nodes, partitioning CNNs has to be done carefully to meet overall latency requirements and energy constraints. Therefore, we present CNNParted, an open-source framework for efficient, hardware-aware CNN inference partitioning targeting embedded AI applications. It automatically searches for potential partitioning points in the CNN to find a beneficial workload distribution between sensor nodes and a central edge node. Thereby, CNNParted not only analyzes the CNN architecture but also takes hardware components, such as dedicated hardware accelerators and memories, into consideration to evaluate inference partitioning regarding latency and energy consumption. As an example, we apply CNNParted to three feed-forward CNNs commonly used in embedded systems. Thereby, the framework first searches for several potential partitioning points and then evaluates the latter regarding inference latency and energy consumption. Based on the results, beneficial partitioning points can be identified depending on the system constraints. Using the framework, we are able to find and evaluate 10 potential partitioning points for FCN ResNet-50, 13 partitioning points for GoogLeNet, and 8 partitioning points for SqueezeNet V1.1 within 520 s, 330 s, and 140 s, respectively, on an AMD EPYC 7702P running 8 concurrent threads. For GoogLeNet, we determine two partitioning points that provide a good trade-off between required bandwidth, latency and energy consumption. We also provide insights into further interesting findings that can be derived from the evaluation results.
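The trade-off described in this abstract can be illustrated with a small sketch that compares candidate partitioning points by end-to-end latency under an energy budget for the sensor node. This is not CNNParted's API. The `PartitionPoint` type, the cost model (sensor compute plus link transfer plus edge compute) and all numbers are hypothetical and serve only to show the idea.

```python
from dataclasses import dataclass


@dataclass
class PartitionPoint:
    layer: str                 # layer after which the network is split
    sensor_latency_ms: float   # time spent on the sensor node
    edge_latency_ms: float     # time spent on the central edge node
    output_bytes: int          # size of the data sent over the link
    sensor_energy_mj: float    # energy consumed on the sensor node


def end_to_end_latency_ms(point, link_bytes_per_ms):
    """Sensor compute + link transfer + edge compute (assumed cost model)."""
    transfer_ms = point.output_bytes / link_bytes_per_ms
    return point.sensor_latency_ms + transfer_ms + point.edge_latency_ms


def best_partition(points, link_bytes_per_ms, energy_budget_mj):
    """Return the lowest-latency point that respects the sensor energy budget."""
    feasible = [p for p in points if p.sensor_energy_mj <= energy_budget_mj]
    if not feasible:
        return None
    return min(feasible, key=lambda p: end_to_end_latency_ms(p, link_bytes_per_ms))


# Made-up candidates: splitting later shrinks the transferred data
# but costs more time and energy on the sensor node.
candidates = [
    PartitionPoint("input", 0.0, 20.0, 600_000, 0.0),
    PartitionPoint("conv3", 4.0, 15.0, 200_000, 3.0),
    PartitionPoint("conv5", 9.0, 8.0, 50_000, 8.0),
]
choice = best_partition(candidates, link_bytes_per_ms=100_000, energy_budget_mj=5.0)
```

With these invented numbers the latest split would be fastest overall, but it exceeds the sensor's energy budget, so the middle split is selected instead.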

    Towards reconfigurable accelerators in HPC: Designing a multipurpose eFPGA tile for heterogeneous SoCs

    The goal of modern high performance computing platforms is to combine low power consumption and high throughput. Within the European Processor Initiative (EPI), such an SoC platform is being built and investigated to meet the novel exascale requirements. As part of this project, we introduce an embedded Field Programmable Gate Array (eFPGA), adding flexibility to accelerate various workloads. In this article, we show our approach to design the eFPGA tile that supports the EPI SoC. While eFPGAs are inherently reconfigurable, their initial design has to be determined for tape-out. The design space of the eFPGA is explored and evaluated with different configurations of two HPC workloads, covering control and dataflow heavy applications. As a result, we present a well-balanced eFPGA design that can host several use cases and potential future ones by allocating 1% of the total EPI SoC area. Finally, our simulation results of the architectures on the eFPGA show great performance improvements over their software counterparts. The European Processor Initiative (EPI) project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 826647, from the Spanish Government (PID2019-107255GB-C21/AEI/10.13039/501100011033), and from the Generalitat de Catalunya (contracts 2017-SGR-1414 and 2017-SGR-1328). M. Moreto is partially supported by the Spanish Ministry of Economy, Industry and Competitiveness under Ramon y Cajal fellowship No. RYC-2016-21104. Peer reviewed. Postprint (author's final draft).

    Penilaian Kinerja Keuangan Koperasi di Kabupaten Pelalawan

    This paper describes the development and financial performance of cooperatives in Pelalawan District during 2007-2008. The study covers primary and secondary cooperatives in 12 sub-districts. The method used in this study measures cooperative performance in terms of productivity, efficiency, growth, liquidity, and solvency. The productivity of cooperatives in Pelalawan was high, but efficiency was still low. Profit and income were high, liquidity was very high, and solvency was good.

    Juxtaposing BTE and ATE – on the role of the European insurance industry in funding civil litigation

    One of the ways in which legal services are financed, and indeed shaped, is through private insurance arrangements. Two contrasting types of legal expenses insurance (LEI) contracts seem to dominate in Europe: before the event (BTE) and after the event (ATE) legal expenses insurance. Notwithstanding institutional differences between legal systems, BTE and ATE insurance arrangements may be instrumental if government policy is geared towards strengthening a market-oriented system of financing access to justice for individuals and businesses. At the same time, emphasizing the role of a private industry as a keeper of the gates to justice raises issues of accountability and transparency, not readily reconcilable with demands of competition. Moreover, multiple actors (clients, lawyers, courts, insurers) are involved, causing behavioural dynamics which are not easily predicted or influenced. Against this background, this paper looks into BTE and ATE arrangements by analysing the particularities of those currently available in some European jurisdictions and by painting a picture of their respective markets and legal contexts. This allows for some reflection on the performance of BTE and ATE providers as both financiers and keepers. Two issues emerge from the analysis that are worthy of some further reflection. Firstly, there is the problematic long-term sustainability of some ATE products. Secondly, there are the challenges faced by policymakers who would like to nudge consumers into voluntarily taking out BTE LEI.

    Measurement of associated W plus charm production in pp collisions at √s=7 TeV

    Peer reviewed

    Vapor phase preparation and characterization of the carbon micro-coils

    Effect of angiotensin-converting enzyme inhibitor and angiotensin receptor blocker initiation on organ support-free days in patients hospitalized with COVID-19

    IMPORTANCE Overactivation of the renin-angiotensin system (RAS) may contribute to poor clinical outcomes in patients with COVID-19. OBJECTIVE To determine whether angiotensin-converting enzyme (ACE) inhibitor or angiotensin receptor blocker (ARB) initiation improves outcomes in patients hospitalized for COVID-19. DESIGN, SETTING, AND PARTICIPANTS In an ongoing, adaptive platform randomized clinical trial, 721 critically ill and 58 non–critically ill hospitalized adults were randomized to receive an RAS inhibitor or control between March 16, 2021, and February 25, 2022, at 69 sites in 7 countries (final follow-up on June 1, 2022). INTERVENTIONS Patients were randomized to receive open-label initiation of an ACE inhibitor (n = 257), ARB (n = 248), ARB in combination with DMX-200 (a chemokine receptor-2 inhibitor; n = 10), or no RAS inhibitor (control; n = 264) for up to 10 days. MAIN OUTCOMES AND MEASURES The primary outcome was organ support–free days, a composite of hospital survival and days alive without cardiovascular or respiratory organ support through 21 days. The primary analysis was a bayesian cumulative logistic model. Odds ratios (ORs) greater than 1 represent improved outcomes. RESULTS On February 25, 2022, enrollment was discontinued due to safety concerns. Among 679 critically ill patients with available primary outcome data, the median age was 56 years and 239 participants (35.2%) were women. Median (IQR) organ support–free days among critically ill patients was 10 (–1 to 16) in the ACE inhibitor group (n = 231), 8 (–1 to 17) in the ARB group (n = 217), and 12 (0 to 17) in the control group (n = 231) (median adjusted odds ratios of 0.77 [95% bayesian credible interval, 0.58-1.06] for improvement for ACE inhibitor and 0.76 [95% credible interval, 0.56-1.05] for ARB compared with control). The posterior probabilities that ACE inhibitors and ARBs worsened organ support–free days compared with control were 94.9% and 95.4%, respectively. 
Hospital survival occurred in 166 of 231 critically ill participants (71.9%) in the ACE inhibitor group, 152 of 217 (70.0%) in the ARB group, and 182 of 231 (78.8%) in the control group (posterior probabilities that ACE inhibitor and ARB worsened hospital survival compared with control were 95.3% and 98.1%, respectively). CONCLUSIONS AND RELEVANCE In this trial, among critically ill adults with COVID-19, initiation of an ACE inhibitor or ARB did not improve, and likely worsened, clinical outcomes. TRIAL REGISTRATION ClinicalTrials.gov Identifier: NCT0273570
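The hospital survival percentages reported in this abstract can be recomputed from the stated counts. The short sketch below does only that arithmetic. The helper name `percent` is introduced for this example, and the unadjusted proportions say nothing about the bayesian model behind the odds ratios.

```python
# Hospital survival among critically ill participants, as reported:
# group -> (survivors, group size).
survival = {
    "ACE inhibitor": (166, 231),
    "ARB": (152, 217),
    "control": (182, 231),
}


def percent(numerator, denominator):
    """Return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


rates = {group: percent(s, n) for group, (s, n) in survival.items()}
total_patients = sum(n for _, n in survival.values())

for group, rate in rates.items():
    print(f"{group}: {rate}%")
print(f"critically ill patients with outcome data: {total_patients}")
```

The recomputed rates agree with the reported 71.9%, 70.0% and 78.8%, and the three group sizes add up to the 679 critically ill patients with available primary outcome data.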