36 research outputs found

    Autonomously Reconfigurable Artificial Neural Network on a Chip

    The artificial neural network (ANN), an established bio-inspired computing paradigm, has proved very effective in a variety of real-world problems and particularly useful for emerging biomedical applications that use specialized ANN hardware. Unfortunately, unrelenting advances in CMOS technology scaling leave these ANN-based systems increasingly vulnerable to both transient and permanent faults, which can sometimes be catastrophic. Their considerable resource and energy consumption and their lack of dynamic adaptability make conventional fault-tolerant techniques unsuitable for future portable medical solutions. Inspired by the self-healing and self-recovery mechanisms of the human nervous system, this research seeks to address reliability issues of ANN-based hardware by proposing an Autonomously Reconfigurable Artificial Neural Network (ARANN) architectural framework. Leveraging the homogeneous structural characteristics of neural networks, ARANN is capable of adapting its structures and operations, both algorithmically and microarchitecturally, to react to unexpected neuron failures. Specifically, we propose three key techniques --- Distributed ANN, Decoupled Virtual-to-Physical Neuron Mapping, and Dual-Layer Synchronization --- to achieve cost-effective structural adaptation and ensure accurate system recovery. Moreover, an ARANN-enabled self-optimizing workflow is presented to adaptively explore a "Pareto-optimal" neural network structure for a given application, on the fly. Implemented and demonstrated on a Virtex-5 FPGA, ARANN can cover and adapt 93% of the chip area (neurons) with less than 1% chip overhead and O(n) reconfiguration latency. A detailed performance analysis has been completed based on various recovery scenarios.
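The idea of decoupling virtual neurons from physical ones can be illustrated with a small lookup table. The sketch below is only an assumption about how such a mapping might behave in software. The class `NeuronMap`, its method `report_failure`, and the spare-pool policy are hypothetical names and choices, not the ARANN hardware design.

```python
class NeuronMap:
    """Hypothetical virtual-to-physical neuron table (not the ARANN design)."""

    def __init__(self, n_virtual, n_physical):
        if n_physical < n_virtual:
            raise ValueError("need at least one physical neuron per virtual neuron")
        # Identity mapping at start-up; the remaining physical neurons are spares.
        self.v2p = list(range(n_virtual))
        self.spares = list(range(n_virtual, n_physical))

    def report_failure(self, physical_id):
        """Remap the virtual neuron served by a failed physical neuron.

        Returns the replacement physical id, or None when nothing was remapped
        (the failed unit was an idle spare, is unknown, or no spare is left).
        """
        if physical_id in self.spares:
            self.spares.remove(physical_id)  # a failed spare is simply retired
            return None
        if physical_id not in self.v2p:
            return None  # unknown or already retired unit
        if not self.spares:
            return None  # no spare left; the caller would have to shrink the network
        virtual_id = self.v2p.index(physical_id)
        replacement = self.spares.pop(0)
        self.v2p[virtual_id] = replacement
        return replacement


nm = NeuronMap(n_virtual=4, n_physical=6)
nm.report_failure(2)
print(nm.v2p)  # [0, 1, 4, 3]
```

Because the network logic only ever refers to virtual ids, a failure changes one table entry and leaves the rest of the structure untouched.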

    Advanced Applications of Rapid Prototyping Technology in Modern Engineering

    Rapid prototyping (RP) technology is widely known and appreciated for its flexible and customized manufacturing capabilities. The widely studied RP techniques include stereolithography apparatus (SLA), selective laser sintering (SLS), three-dimensional printing (3DP), fused deposition modeling (FDM), 3D plotting, solid ground curing (SGC), multiphase jet solidification (MJS), and laminated object manufacturing (LOM). Different techniques are associated with different materials and/or processing principles and are thus devoted to specific applications. RP technology is no longer used only for prototype building but has been extended to real industrial manufacturing solutions. Today, RP technology contributes to almost all engineering areas, including mechanical, materials, industrial, aerospace, electrical and, most recently, biomedical engineering. This book aims to present the advanced development of RP technologies in various engineering areas as solutions to real-world engineering problems.

    Investigation of Line-Scan Dispersive Interferometry for In-Line Surface Metrology

    Advanced manufacturing techniques enable ultra-precision surfaces to be fabricated with various complicated and large-area structures. For instance, the cost-effectiveness of Roll-to-Roll (R2R) manufacturing technology has been widely demonstrated in industries making high-volume as well as large-area foil products and flexible electronics. Evaluation of these fine surfaces by an expensive trial-and-error approach is inadvisable due to the high scrap rate. Therefore, quality control using in-line metrology of the functional surface plays an important role in the success of employing R2R technology by enabling a high product yield whilst guaranteeing high performance and a long lifespan of these multi-layer products. This thesis presents an environmentally robust line-scan dispersive interferometry (LSDI) technique that is suitable for applications in in-line surface inspection. Obtaining a surface profile in a single shot allows this interferometer to minimise the effect of external perturbations and environmental noise. Additionally, it eliminates mechanical scanning and has an extended axial measurement range without the 2π phase ambiguity problem by dispersing the output of the spectrometer onto the camera. Benefiting from a high-speed camera, a general-purpose graphics processing unit and multi-core processor computing technology, the LSDI can achieve high dynamic measurement with a high signal-to-noise ratio and is effective for use on the shop floor. Two proof-of-concept prototypes aimed at different applications are implemented. The first prototype, based on a cylindrical lens, has a large lateral range of up to 6 mm and can be used for characterisation of additively manufactured surface texture, surface form and surface blemishes. The second prototype, which uses a 4X microscope objective with a diffraction-limited lateral resolution (~ 4 µm), is aimed at characterisation of surface roughness, micro-scale defects, and other imperfections of ultra-precision surfaces. 
System design, implementation, fringe analysis algorithms and system calibrations are presented in detail in this thesis. The performance of both prototypes is evaluated experimentally by measuring several standard step heights as well as Al2O3 coated polyethylene naphthalate (PEN) films. The measurement results acquired using both prototypes and a commercially available instrument (Talysurf CCI 3000) agree acceptably. This shows that the developed metrology sensors may potentially be applied to production lines such as R2R surface inspection, where only defects present on the surface are of concern for quality assurance. Implementation of these prototypes offers an attractive solution to improve manufacturing processing and reliability for products in ultra-high-precision engineering.
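The abstract does not say which fringe analysis algorithm the thesis uses. As a rough illustration of why a dispersed (spectral) interferogram avoids the 2π ambiguity, the sketch below uses a generic textbook approach: for an ideal two-beam signal the fringe phase is 2kh, so the fringe frequency along the wavenumber axis gives the height h directly. The function names, the idealised signal model, and all parameter values (`k0`, `dk`, 256 samples) are assumptions for illustration only.

```python
import cmath
import math


def simulate_spectrum(height_um, k0, dk, n):
    """Ideal two-beam spectral interferogram at n evenly spaced wavenumbers (rad/um)."""
    return [1.0 + math.cos(2.0 * (k0 + i * dk) * height_um) for i in range(n)]


def estimate_height(spectrum, dk):
    """Estimate height from the dominant fringe frequency along the wavenumber axis."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    ac = [s - mean for s in spectrum]  # remove the DC background
    best_bin, best_mag = 1, 0.0
    for m in range(1, n // 2):  # positive-frequency bins only
        coeff = sum(a * cmath.exp(-2j * math.pi * m * i / n) for i, a in enumerate(ac))
        if abs(coeff) > best_mag:
            best_bin, best_mag = m, abs(coeff)
    # The fringe phase is 2*k*h, so DFT bin m corresponds to h = pi * m / (n * dk).
    return math.pi * best_bin / (n * dk)


# Assumed example values: 256 spectral samples starting at k0 = 7.0 rad/um.
K0, DK, N = 7.0, 0.005, 256
spectrum = simulate_spectrum(50.0, K0, DK, N)
estimate = estimate_height(spectrum, DK)
resolution = math.pi / (N * DK)  # height step between neighbouring DFT bins
print(f"estimated height: {estimate:.2f} um (bin spacing {resolution:.2f} um)")
```

A plain DFT peak search only resolves height to one bin spacing. Practical fringe analysis would usually refine the estimate, for example with the phase of the peak, which this sketch leaves out.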

    Aspectos de interconectividade dos moduladores de polímero

    Advisor: Hugo Enrique Hernández-Figueroa. Thesis (doctorate), Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação. Abstract: Electrical and optical interconnects are of great interest for the packaging of hybrid photonic integrated circuits. Low loss and wide bandwidth are essential for the development of new technologies in this area. In this thesis, we present the following original contributions: a methodology for modeling electrical ceramic interconnects in the packaging of electro-optic polymer modulators, and a compact, low-loss, broadband optical interconnect between the silicon-on-insulator platform and the thin-film polymer on silicon platform. Doctorate. Telecommunications and Telematics. Doctor of Electrical Engineering. 07/2014-36 CAPE

    Rapport annuel 2010-2011


    A multiple-SIMD architecture for image and tracking analysis

    The computational requirements of real-time image-based applications warrant the use of a parallel architecture. Commonly used parallel architectures conform to the classifications of Single Instruction Multiple Data (SIMD) or Multiple Instruction Multiple Data (MIMD). Each class of architecture has its advantages and disadvantages. For example, SIMD architectures suit data-parallel problems, such as the processing of an image, whereas MIMD architectures are more flexible and better suited to general-purpose computing. Both types of processing are typically required for the analysis of the contents of an image. This thesis describes a novel massively parallel heterogeneous architecture, implemented as the Warwick Pyramid Machine. Both SIMD and MIMD processor types are combined within this architecture. Furthermore, the SIMD array is partitioned into smaller SIMD sub-arrays, forming a Multiple-SIMD array. Thus, local data-parallel, global data-parallel, and control-parallel processing are supported. After describing the present options available in the design of massively parallel machines and the nature of the image analysis problem, the architecture of the Warwick Pyramid Machine is described in some detail. The performance of this architecture is then analysed, both in terms of peak available computational power and in terms of representative applications in image analysis and numerical computation. Two tracking applications are also analysed to show the performance of this architecture. In addition, they illustrate the possible partitioning of applications between the SIMD and MIMD processor arrays. Load-balancing techniques are then described which have the potential to increase the utilisation of the Warwick Pyramid Machine at run-time. These include mapping techniques for image regions across the Multiple-SIMD arrays, and for the compression of sparse data. 
It is envisaged that these techniques may be found useful in other parallel systems.

    The 1991 3rd NASA Symposium on VLSI Design

    Papers from the symposium are presented from the following sessions: (1) featured presentations 1; (2) very large scale integration (VLSI) circuit design; (3) VLSI architecture 1; (4) featured presentations 2; (5) neural networks; (6) VLSI architectures 2; (7) featured presentations 3; (8) verification 1; (9) analog design; (10) verification 2; (11) design innovations 1; (12) asynchronous design; and (13) design innovations 2

    CIRCUITS AND ARCHITECTURE FOR BIO-INSPIRED AI ACCELERATORS

    Technological advances in microelectronics envisioned through Moore’s law have led to powerful processors that can handle complex and computationally intensive tasks. Nonetheless, these advancements through technology scaling have come at an unfavorable cost of significantly larger power consumption, which has posed challenges for data processing centers and computers at scale. Moreover, with the emergence of mobile computing platforms constrained by power and bandwidth for distributed computing, the necessity for more energy-efficient scalable local processing has become more significant. Unconventional Compute-in-Memory architectures such as the analog winner-takes-all associative-memory and the Charge-Injection Device processor have been proposed as alternatives. Unconventional charge-based computation has been employed for neural network accelerators in the past, where impressive energy efficiency per operation has been attained in 1-bit vector-vector multiplications, and in recent work, multi-bit vector-vector multiplications. In the latter, computation was carried out by counting quanta of charge at the thermal noise limit, using packets of about 1000 electrons. These systems are neither analog nor digital in the traditional sense but employ mixed-signal circuits to count the packets of charge and hence we call them Quasi-Digital. By amortizing the energy costs of the mixed-signal encoding/decoding over compute-vectors with many elements, high energy efficiencies can be achieved. In this dissertation, I present a design framework for AI accelerators using scalable compute-in-memory architectures. On the device level, two primitive elements are designed and characterized as target computational technologies: (i) a multilevel non-volatile cell and (ii) a pseudo Dynamic Random-Access Memory (pseudo-DRAM) bit-cell. 
At the level of circuit description, compute-in-memory crossbars and mixed-signal circuits were designed, allowing seamless connectivity to digital controllers. At the level of data representation, both binary and stochastic-unary coding are used to compute Vector-Vector Multiplications (VMMs) at the array level. Finally, on the architectural level, two AI accelerators for data-center processing and edge computing are discussed. Both designs are scalable multi-core Systems-on-Chip (SoCs), where vector-processor arrays are tiled on a 2-layer Network-on-Chip (NoC), enabling neighbor communication and a flexible compute vs. memory trade-off. General-purpose Arm/RISC-V co-processors provide adequate bootstrapping and system housekeeping, and a high-speed interface fabric facilitates Input/Output to main memory.
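The abstract does not define its stochastic-unary coding, so the following is only a sketch of the textbook stochastic-computing idea it may resemble. A value in [0, 1] is encoded as a random bitstream whose fraction of ones equals the value, two independent streams are multiplied with a bitwise AND, and counting ones decodes the result. The function names, the stream length, and the seed are assumptions, and the dissertation's actual encoding may differ.

```python
import random


def encode(p, n_bits, rng):
    """Encode a value p in [0, 1] as a bitstream whose fraction of ones is about p."""
    return [1 if rng.random() < p else 0 for _ in range(n_bits)]


def stochastic_dot(xs, ws, n_bits=4096, seed=0):
    """Approximate sum(x * w) for values in [0, 1] using AND gates and counters."""
    rng = random.Random(seed)
    total = 0.0
    for x, w in zip(xs, ws):
        sx = encode(x, n_bits, rng)
        sw = encode(w, n_bits, rng)  # independent stream for the second operand
        ones = sum(a & b for a, b in zip(sx, sw))  # AND multiplies, counting decodes
        total += ones / n_bits
    return total


xs = [0.5, 0.25, 0.8]
ws = [0.5, 1.0, 0.1]
exact = sum(x * w for x, w in zip(xs, ws))
print(f"exact {exact:.3f}, stochastic estimate {stochastic_dot(xs, ws):.3f}")
```

The estimate is noisy, and its error shrinks roughly with the square root of the stream length. That is one reason amortizing encoding and decoding costs over long vectors matters for schemes of this kind.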