17,585 research outputs found

    OpenPARF: An Open-Source Placement and Routing Framework for Large-Scale Heterogeneous FPGAs with Deep Learning Toolkit

    This paper proposes OpenPARF, an open-source placement and routing framework for large-scale FPGA designs. OpenPARF is implemented with the deep learning toolkit PyTorch and supports massive parallelization on GPU. The framework proposes a novel asymmetric multi-electrostatic field system to solve FPGA placement. It considers fine-grained routing resources inside configurable logic blocks (CLBs) for FPGA routing and supports large-scale irregular routing resource graphs. Experimental results on ISPD 2016 and ISPD 2017 FPGA contest benchmarks and industrial benchmarks demonstrate that OpenPARF can achieve 0.4-12.7% improvement in routed wirelength and more than 2× speedup in placement. We believe that OpenPARF can pave the road for developing FPGA physical design engines and stimulate further research on related topics.

    Design and Evaluation of Wearable Multimodal RF Sensing System for Vascular Dementia Detection

    Meso-scale FDM material layout design strategies under manufacturability constraints and fracture conditions

    In the manufacturability-driven design (MDD) perspective, manufacturability of the product or system is the most important of the design requirements. In addition to being able to ensure that complex designs (e.g., topology optimization) are manufacturable with a given process or process family, MDD also helps mechanical designers to take advantage of unique process-material effects generated during manufacturing. One of the most recognizable examples of this comes from the scanning-type family of additive manufacturing (AM) processes; the most notable and familiar member of this family is the fused deposition modeling (FDM) or fused filament fabrication (FFF) process. This process works by selectively depositing uniform, approximately isotropic beads or elements of molten thermoplastic material (typically structural engineering plastics) in a series of pre-specified traces to build each layer of the part. There are many interesting 2-D and 3-D mechanical design problems that can be explored by designing the layout of these elements. The resulting structured, hierarchical material (which is both manufacturable and customized layer-by-layer within the limits of the process and material) can be defined as a manufacturing process-driven structured material (MPDSM). This dissertation explores several practical methods for designing these element layouts for 2-D and 3-D meso-scale mechanical problems, focusing ultimately on design-for-fracture. Three different fracture conditions are explored: (1) cases where a crack must be prevented or stopped, (2) cases where the crack must be encouraged or accelerated, and (3) cases where cracks must grow in a simple pre-determined pattern. 
Several new design tools were developed and refined to support the design of MPDSMs under fracture conditions: a mapping method for the FDM manufacturability constraints; three major literature reviews; the collection, organization, and analysis of several large (qualitative and quantitative) multi-scale datasets on the fracture behavior of FDM-processed materials; some new experimental equipment; and a fast and simple g-code generator based on commercially available software. The refined design method and rules were experimentally validated at the end of the dissertation using a series of case studies involving both design and physical testing of the designs. Finally, a simple design guide for practicing engineers who are not experts in advanced solid mechanics or process-tailored materials was developed from the results of this project.

    Self-Supervised Learning to Prove Equivalence Between Straight-Line Programs via Rewrite Rules

    We target the problem of automatically synthesizing proofs of semantic equivalence between two programs made of sequences of statements. We represent programs using abstract syntax trees (AST), where a given set of semantics-preserving rewrite rules can be applied on a specific AST pattern to generate a transformed and semantically equivalent program. In our system, two programs are equivalent if there exists a sequence of applications of these rewrite rules that rewrites one program into the other. We propose a neural network architecture based on a transformer model to generate proofs of equivalence between program pairs. The system outputs a sequence of rewrites, and the validity of the sequence is checked simply by verifying that it can be applied. If no valid sequence is produced by the neural network, the system reports the programs as non-equivalent, ensuring by design that no programs may be incorrectly reported as equivalent. Our system is fully implemented for a given grammar which can represent straight-line programs with function calls and multiple types. To efficiently train the system to generate such sequences, we develop an original incremental training technique, named self-supervised sample selection. We extensively study the effectiveness of this novel training approach on proofs of increasing complexity and length. Our system, S4Eq, achieves 97% proof success on a curated dataset of 10,000 pairs of equivalent programs.
    Comment: 30 pages including appendi
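The checking step described in this abstract (a proposed rewrite sequence is accepted only if every step applies and the result matches the target) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the tuple-based AST encoding, the two rules `Commute` and `AddZero`, and the function names are hypothetical and are not taken from S4Eq, which operates on straight-line programs with a richer grammar.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple, Union

# Hypothetical AST encoding: a leaf is a string, an inner node is (op, child, child).
Expr = Union[str, tuple]
Path = Tuple[int, ...]  # child indices from the root down to the rewritten node


def commute(e: Expr) -> Optional[Expr]:
    """(op, a, b) -> (op, b, a) for the commutative operators + and *."""
    if isinstance(e, tuple) and len(e) == 3 and e[0] in ("+", "*"):
        return (e[0], e[2], e[1])
    return None  # pattern does not match


def add_zero(e: Expr) -> Optional[Expr]:
    """("+", a, "0") -> a."""
    if isinstance(e, tuple) and len(e) == 3 and e[0] == "+" and e[2] == "0":
        return e[1]
    return None  # pattern does not match


# Illustrative rule set; the real system uses its own rule names and patterns.
RULES: Dict[str, Callable[[Expr], Optional[Expr]]] = {
    "Commute": commute,
    "AddZero": add_zero,
}


def apply_at(e: Expr, path: Path, rule: Callable[[Expr], Optional[Expr]]) -> Optional[Expr]:
    """Apply `rule` to the node reached by `path`; return None if it cannot be applied."""
    if not path:
        return rule(e)
    i = path[0]
    if not isinstance(e, tuple) or not (1 <= i < len(e)):
        return None  # path leaves the tree
    child = apply_at(e[i], path[1:], rule)
    if child is None:
        return None
    return e[:i] + (child,) + e[i + 1:]


def check_proof(src: Expr, dst: Expr, steps: Sequence[Tuple[str, Path]]) -> bool:
    """Accept the proof only if every step applies and the final tree equals dst."""
    cur = src
    for name, path in steps:
        rule = RULES.get(name)
        if rule is None:
            return False
        nxt = apply_at(cur, path, rule)
        if nxt is None:
            return False
        cur = nxt
    return cur == dst
```

Because a failed or missing proof is simply reported as "not proven", a checker of this shape can never certify a non-equivalent pair, provided each rule is itself semantics-preserving. That is the soundness-by-design property the abstract describes.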

    BRAMAC: Compute-in-BRAM Architectures for Multiply-Accumulate on FPGAs

    Deep neural network (DNN) inference using reduced integer precision has been shown to achieve significant improvements in memory utilization and compute throughput with little or no accuracy loss compared to full-precision floating-point. Modern FPGA-based DNN inference relies heavily on the on-chip block RAM (BRAM) for model storage and the digital signal processing (DSP) unit for implementing the multiply-accumulate (MAC) operation, a fundamental DNN primitive. In this paper, we enhance the existing BRAM to also compute MAC by proposing BRAMAC (Compute-in-BRAM Architectures for Multiply-Accumulate). BRAMAC supports 2's complement 2- to 8-bit MAC in a small dummy BRAM array using a hybrid bit-serial & bit-parallel data flow. Unlike previous compute-in-BRAM architectures, BRAMAC allows read/write access to the main BRAM array while computing in the dummy BRAM array, enabling both persistent and tiling-based DNN inference. We explore two BRAMAC variants: BRAMAC-2SA (with 2 synchronous dummy arrays) and BRAMAC-1DA (with 1 double-pumped dummy array). BRAMAC-2SA/BRAMAC-1DA can boost the peak MAC throughput of a large Arria-10 FPGA by 2.6×/2.1×, 2.3×/2.0×, and 1.9×/1.7× for 2-bit, 4-bit, and 8-bit precisions, respectively, at the cost of a 6.8%/3.4% increase in the FPGA core area. By adding BRAMAC-2SA/BRAMAC-1DA to a state-of-the-art tiling-based DNN accelerator, an average speedup of 2.05×/1.7× and 1.33×/1.52× can be achieved for AlexNet and ResNet-34, respectively, across different model precisions.
    Comment: 11 pages, 13 figures, 3 tables, FCCM conference 202
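As background for the bit-serial part of the data flow mentioned in this abstract, here is a minimal software sketch of a generic bit-serial 2's complement multiply-accumulate. It is an assumption-laden illustration only: the function names are hypothetical, and the sketch does not model BRAMAC's hybrid bit-serial and bit-parallel scheme, its dummy array, or any hardware timing.

```python
from typing import Sequence


def bit_serial_mul(w: int, x: int, bits: int) -> int:
    """Multiply x by a `bits`-wide 2's complement weight w, one weight bit per step."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= w <= hi:
        raise ValueError(f"weight {w} does not fit in {bits} bits")
    encoded = w & ((1 << bits) - 1)  # 2's complement bit pattern of w
    acc = 0
    for i in range(bits):
        if (encoded >> i) & 1:
            partial = x << i  # shifted copy of the input operand
            # The most significant bit carries weight -2^(bits-1) in 2's complement.
            acc += -partial if i == bits - 1 else partial
    return acc


def bit_serial_mac(weights: Sequence[int], inputs: Sequence[int], bits: int) -> int:
    """Accumulate the bit-serial products of paired weights and inputs."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return sum(bit_serial_mul(w, x, bits) for w, x in zip(weights, inputs))
```

The loop makes the cost model visible: the number of steps grows with the weight precision, which is consistent with the abstract reporting larger throughput gains at 2-bit than at 8-bit precision.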

    Cardiovascular diseases prediction by machine learning incorporation with deep learning

    It is not yet known what causes cardiovascular disease (CVD), but it is associated with a high risk of death as well as severe morbidity and disability. There is an urgent need for AI-based technologies that can promptly and reliably predict the future outcomes of individuals who have cardiovascular disease. The Internet of Things (IoT) is a driving force behind the development of CVD prediction, and machine learning (ML) is used to analyse the data that IoT devices receive and to make predictions from it. Traditional machine learning algorithms are unable to take differences in the data into account and have a low level of accuracy in their model predictions. This research presents a collection of machine learning models that can be used to address this problem. These models take into account the data observation mechanisms and training procedures of a number of different algorithms. To verify the efficacy of our strategy, we combined the Heart Dataset with other classification models. The proposed method achieves an accuracy of nearly 96 percent, higher than other existing methods, and a complete analysis over several metrics is provided. Research in the field of deep learning will benefit from additional data from a large number of medical institutions, which may be used for the development of artificial neural network structures.

    StringENT test suite: ENT battery revisited for efficient P value computation

    Random numbers play a key role in a wide variety of applications, ranging from mathematical simulation to cryptography. Generating random or pseudo-random numbers is not an easy task, especially when hardware, time and energy constraints are considered. Several statistical test batteries exist to assess whether generators behave in a random fashion. ENT is one of the simplest and most popular, at least in part due to its efficacy and speed. Nonetheless, only one of the tests of this suite provides a p value, which is the most useful and standard way to determine whether the randomness hypothesis holds for a certain significance level. As a consequence, rather arbitrary and at times misleading bounds are set in order to decide which intervals are acceptable for its results. This paper introduces an extension of the battery, named StringENT, which retains the speed that makes ENT popular and useful while providing p values with which sound decisions can be made about the randomness of a sequence. It also highlights a flagrant randomness flaw that the classical ENT battery is not capable of detecting but the new StringENT notices, and introduces two additional tests.
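To make concrete what attaching a p value to an ENT-style statistic can look like, the sketch below computes a two-sided p value for the arithmetic mean of a byte sequence. This is an assumed textbook construction (normal approximation under the hypothesis of independent, uniformly distributed bytes) and is not taken from the StringENT paper; the function name is hypothetical.

```python
import math


def mean_test_p_value(data: bytes) -> float:
    """Two-sided p value for the byte mean under H0: bytes are i.i.d. uniform on 0..255.

    Assumption: the sample mean is approximately normal (central limit theorem),
    so the result is only meaningful for reasonably long sequences.
    """
    n = len(data)
    if n == 0:
        raise ValueError("need at least one byte")
    expected_mean = 127.5                  # mean of a uniform byte
    byte_variance = (256 ** 2 - 1) / 12.0  # variance of a discrete uniform on 0..255
    sample_mean = sum(data) / n
    z = (sample_mean - expected_mean) / math.sqrt(byte_variance / n)
    return math.erfc(abs(z) / math.sqrt(2.0))  # P(|Z| >= |z|) for standard normal Z
```

A sequence would be rejected at significance level alpha when the returned value falls below alpha, which replaces an ad hoc "acceptable interval" around 127.5 with a decision rule tied to a stated error rate. Passing this single test is of course not evidence of randomness: a simple counter cycling through 0..255 has a perfect mean.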

    An Improved Latch for SerDes Interface: Design and Analysis under PVT and AC Noise

    Digital subsystems favor CMOS processes, but managing the trade-off between speed and average power (Pavg) becomes difficult in each technology generation as the power supply voltage (Vdd) scales. Current mode logic (CML) has emerged as an alternative for designing the fundamental block of a SerDes, namely the latch. However, available CML circuits consume significant Pavg and suffer under rapid input slewing. Typically, fast-switching inputs enable current flow to the effective supply voltage VP and overcharge the output. In fact, VP differs from the externally applied Vdd and oscillates over time whenever an abrupt current is drawn. This affects the delay td and introduces jitter. This work presents a new latch for the SerDes interface that uses a new current steering circuit coupled to a power delivery network (PDN). The key result is an almost constant td, in comparison to conventional designs, as Vdd changes. The post-layout results at 0.09-μm CMOS and 1.1 V Vdd indicate that Pavg and td are 339.5 µW and 61.9 ps, respectively, at 27 °C. Surprisingly, the td variation is minimal, and the jitter induced by power supply noise is around 1.5 ns when VP close to the circuit varies due to sudden current.

    Optimisation of Triboelectric Nanogenerator performance in vertical contact-separation mode

    The triboelectric nanogenerator (TENG) is one of the most promising energy harvesters, a technology that uses repeated or reciprocating contact of suitably chosen materials to generate charge via the triboelectric effect (TE) and delivers it as usable voltage and current. TENGs are attractive because they can continuously generate charge over a wide range of operating conditions and have several valuable advantages such as light weight, simple structure, low cost and high efficiency. TENGs have therefore been explored in a wide range of applications, including self-powered wearable electronics, powering electronics and even harvesting ocean wave/wind energy. One of the major limitations of TENGs is their low power output (usually <500 W/m²). This thesis focuses on a few specific approaches to optimising TENG output performance. It begins by presenting a solution to this challenge: optimizing a low permittivity substrate beneath the tribo-contact layer. The open circuit voltage is found to increase by a factor of 1.3 in moving from PET to the lower permittivity PTFE. TENG performance is also believed to depend on contact force, but the origin of the dependence had not previously been explored. Herein, we show that this behaviour results from a contact-force-dependent real contact area Ar, as governed by surface roughness. The open circuit voltage Voc, short circuit current Isc and Ar for a TENG were found to increase with contact force/pressure. Critically, Voc and Isc saturate at the same contact pressure as Ar, suggesting that electrical output follows the same evolution as Ar. Assuming that tribo charges can only transfer across the interface at areas of real contact, it follows that an increasing Ar with contact pressure should produce a corresponding increase in the electrical output.
These results underline the importance of accounting for real contact area in TENG design, as well as the distinction between real and nominal contact area in tribo-charge density definition. In this thesis, high-performance ferroelectric-assisted TENGs (Fe-TENGs) are developed using electrospun fibrous surfaces based on P(VDF-TrFE) with dispersed BaTiO3 (BTO) nanofillers in either cubic (CBTO) or tetragonal (TBTO) form. TENGs with three types of tribo-negative surface were investigated and output increased progressively. Critically, P(VDF-TrFE)/TBTO produced higher output than P(VDF-TrFE)/CBTO even though permittivity is nearly identical. Thus, it is shown that BTO fillers boost output, not just by increasing permittivity, but also by enhancing the crystallinity and amount of the β-phase (as TBTO produced a more crystalline β-phase present in greater amounts).