
    Low power CMOS vision sensor for foreground segmentation

    This thesis focuses on mapping a top-ranked background subtraction algorithm, the Pixel-Based Adaptive Segmenter (PBAS), onto a CMOS vision sensor with focal-plane processing. The redesign of PBAS into a hardware-oriented version, HO-PBAS, reduces the number of memories per pixel and simplifies the overall model, at the cost of an acceptable loss of accuracy with respect to its CPU counterpart. The thesis presents two CMOS vision sensors. The first, HOPBAS1K, lays out a 24 x 56 pixel array on a miniASIC chip in a standard 180 nm CMOS technology. The second, HOPBAS10K, features a 98 x 98 pixel array in the same standard 180 nm CMOS technology; it fixes issues found in the first chip and achieves good hardware and background-subtraction performance metrics.
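    The HO-PBAS algorithm itself is not given in the abstract; as background, the snippet below is a minimal sketch of the general pixel-wise adaptive background subtraction idea (a small set of background samples per pixel, a foreground decision when too few samples match, and conservative random updates). The sample count N, threshold R, match count, and update probability are illustrative assumptions, not the thesis's parameters or its hardware mapping.

```python
import numpy as np

# Minimal pixel-wise adaptive background segmenter (illustrative sketch only,
# not the HO-PBAS algorithm or its focal-plane implementation).
N = 4                  # background samples stored per pixel (assumed)
R = 20.0               # intensity-difference threshold for a sample match (assumed)
MIN_MATCHES = 2        # matching samples required to call a pixel background
UPDATE_P = 1.0 / 16.0  # probability that a background pixel refreshes one sample

def init_model(first_frame):
    """Replicate the first grayscale frame N times as the initial per-pixel model."""
    return np.repeat(first_frame[None, ...].astype(np.float32), N, axis=0)

def segment_and_update(model, frame, rng=np.random.default_rng()):
    """Return a boolean foreground mask and update the model in place."""
    frame = frame.astype(np.float32)
    close = np.abs(model - frame[None, ...]) < R        # per-sample agreement
    background = close.sum(axis=0) >= MIN_MATCHES       # enough samples agree
    refresh = background & (rng.random(frame.shape) < UPDATE_P)
    idx = rng.integers(0, N, size=frame.shape)          # which stored sample to overwrite
    for k in range(N):
        sel = refresh & (idx == k)
        model[k][sel] = frame[sel]
    return ~background
```

    Typical use would be `model = init_model(frame0)` once, then `mask = segment_and_update(model, frame)` for each subsequent frame.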

    Primitives and design of the intelligent pixel multimedia communicator

    Communication systems are an ever more essential component of our modern global society. Mobile communication systems are still in a state of rapid advancement and growth. Technology is constantly evolving at a rapid pace in ever more diverse areas, and the emerging mobile multimedia-based communication systems pose new challenges for both current and future technologies. To realise the full potential of mobile multimedia communication systems, there is a need to explore new options to solve some of the fundamental problems facing the technology. In particular, the complexity of such a system within an infrastructure framework that is inherently limited by its power sources and has very restricted transmission bandwidth demands new methodologies and approaches.

    Micro/Nano Devices for Blood Analysis, Volume II

    The development of micro- and nanodevices for blood analysis continues to be a growing interdisciplinary subject that demands the careful integration of different research fields. Following the success of the book “Micro/Nano Devices for Blood Analysis”, we invited more authors from the scientific community to participate in and submit their research for a second volume. Researchers from different areas and backgrounds cooperated actively and submitted high-quality research, focusing on the latest advances and challenges in micro- and nanodevices for diagnostics and blood analysis; micro- and nanofluidics; technologies for flow visualization and diagnosis; biochips, organ-on-a-chip and lab-on-a-chip devices; and their applications to research and industry.

    Simulation and implementation of novel deep learning hardware architectures for resource constrained devices

    Corey Lammie designed mixed-signal memristive-complementary metal–oxide–semiconductor (CMOS) and field-programmable gate array (FPGA) hardware architectures, which were used to reduce the power and resource requirements of Deep Learning (DL) systems during both inference and training. Disruptive design methodologies, such as those explored in this thesis, can be used to facilitate the design of next-generation DL systems.

    Approximate computing: An integrated cross-layer framework

    A new design approach, called approximate computing (AxC), leverages the flexibility provided by intrinsic application resilience to realize hardware or software implementations that are more efficient in energy or performance. Approximate computing techniques forsake exact (numerical or Boolean) equivalence in the execution of some of the application’s computations, while ensuring that the output quality is acceptable. While early efforts in approximate computing have demonstrated great potential, they consist of ad hoc techniques applied to a very narrow set of applications, leaving in question the applicability of approximate computing in a broader context. The primary objective of this thesis is to develop an integrated cross-layer approach to approximate computing, and thereby establish its applicability to a broader range of applications. The proposed framework comprises three key components: (i) at the circuit level, systematic approaches to design approximate circuits, i.e., circuits that realize a slightly modified function with improved efficiency; (ii) at the architecture level, methods to utilize approximate circuits to build programmable approximate processors; and (iii) at the software level, methods to apply approximate computing to machine learning classifiers, an important class of applications used across the computing spectrum. Towards this end, the thesis extends the state of the art in approximate computing in the following directions.
Synthesis of Approximate Circuits: First, the thesis proposes a rigorous framework for the automatic synthesis of approximate circuits, which are the hardware building blocks of approximate computing platforms. Designing approximate circuits involves making judicious changes to the function implemented by the circuit such that its hardware complexity is lowered without violating the specified quality constraint. Inspired by classical approaches to Boolean optimization in logic synthesis, the thesis proposes two synthesis tools, SALSA and SASIMI, that are general, i.e., applicable to any given circuit and quality specification. The framework is further extended to automatically design quality-configurable circuits, which are approximate circuits with the capability to reconfigure their quality at runtime. Over a wide range of arithmetic circuits, complex modules and complete datapaths, the circuits synthesized using the proposed framework demonstrate significant benefits in area and energy.
Programmable AxC Processors: Next, the thesis extends approximate computing to the realm of programmable processors by introducing the concept of quality-programmable processors (QPPs). A key principle of QPPs is that the notion of quality is explicitly codified in their HW/SW interface, i.e., the instruction set. Instructions in the ISA are extended with quality fields, enabling software to specify the accuracy level that must be met during their execution. The micro-architecture is designed with hardware mechanisms to interpret these quality specifications and translate them into energy savings. As a first embodiment of QPPs, the thesis presents a quality-programmable 1D/2D vector processor, QP-Vec, which contains a 3-tiered hierarchy of processing elements. Based on an implementation of QP-Vec with 289 processing elements, energy benefits of up to 2.5x are demonstrated across a wide range of applications.
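As a rough illustration of what "quality-configurable" means in this setting (and not a circuit produced by SALSA or SASIMI), the sketch below models an adder whose runtime parameter plays the role of a quality field: the low `approx_bits` bits are combined with a carry-free bitwise OR, trading accuracy for what, in hardware, would be a shorter carry chain. The function name, bit width, and the lower-part-OR scheme are assumptions chosen for illustration.

```python
# Quality-configurable approximate adder (illustrative sketch, not a SALSA/SASIMI result).
def approx_add(a: int, b: int, approx_bits: int, width: int = 16) -> int:
    mask_lo = (1 << approx_bits) - 1
    mask_hi = ((1 << width) - 1) & ~mask_lo
    exact_hi = ((a & mask_hi) + (b & mask_hi)) & ((1 << width) - 1)  # accurate upper part
    approx_lo = (a | b) & mask_lo  # lower part approximated without carries
    return exact_hi | approx_lo

if __name__ == "__main__":
    exact = (12345 + 23456) & 0xFFFF
    for q in (0, 2, 4, 8):  # q = 0 reproduces the exact sum
        s = approx_add(12345, 23456, q)
        print(f"approx_bits={q}: result={s}, error={s - exact}")
```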
Software and Algorithms for AxC: Finally, the thesis addresses the problem of applying approximate computing to an important class of applications, viz. machine learning classifiers such as deep learning networks. To this end, the thesis proposes two approaches: AxNN and scalable-effort classifiers. Both approaches leverage domain-specific insights to transform a given application into an energy-efficient approximate version that meets a specified application output quality. In the context of deep learning networks, AxNN adapts backpropagation to identify neurons that contribute less significantly to the network’s accuracy, approximates these neurons (e.g., by using lower precision), and incrementally re-trains the network to mitigate the impact of the approximations on output quality. Scalable-effort classifiers, on the other hand, leverage the heterogeneity in the inherent classification difficulty of inputs to dynamically modulate the effort expended by machine learning classifiers. This is achieved by building a chain of classifiers of progressively growing complexity (and accuracy) such that the number of stages used for classification scales with input difficulty. Scalable-effort classifiers yield substantial energy benefits, since a majority of inputs in real-world datasets require very low effort. In summary, the concepts and techniques presented in this thesis broaden the applicability of approximate computing, thus taking a significant step towards bringing approximate computing to the mainstream. (Abstract shortened by ProQuest.)
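    As a rough sketch of the scalable-effort idea (a chain of increasingly complex classifiers, where later stages are consulted only when earlier ones are not confident), the snippet below cascades scikit-learn-style classifiers behind a confidence threshold. The stage models, the 0.9 threshold, and the stopping rule are illustrative assumptions; the thesis's actual classifiers and training procedure are not reproduced here.

```python
import numpy as np

# Illustrative scalable-effort cascade: cheap stages decide easy inputs, and only
# inputs with low top-class confidence are forwarded to the next, costlier stage.
class ScalableEffortCascade:
    def __init__(self, stages, confidence=0.9):
        self.stages = stages          # fitted models exposing predict_proba() and classes_
        self.confidence = confidence  # minimum top-class probability for an early decision

    def predict(self, X):
        X = np.asarray(X)
        labels = np.empty(len(X), dtype=int)   # assumes integer class labels
        pending = np.arange(len(X))            # indices still awaiting a decision
        for i, stage in enumerate(self.stages):
            if len(pending) == 0:
                break
            proba = stage.predict_proba(X[pending])
            top = proba.argmax(axis=1)
            conf = proba.max(axis=1)
            done = (conf >= self.confidence) | (i == len(self.stages) - 1)
            labels[pending[done]] = stage.classes_[top[done]]
            pending = pending[~done]
        return labels
```

    For example, `stages` could hold a small logistic-regression model followed by a larger ensemble, both trained on the same data; most inputs would then be decided by the cheap first stage.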

    Hardware Implementation of Deep Network Accelerators Towards Healthcare and Biomedical Applications

    With the advent of dedicated Deep Learning (DL) accelerators and neuromorphic processors, new opportunities are emerging for applying deep and Spiking Neural Network (SNN) algorithms to healthcare and biomedical applications at the edge. This can facilitate the advancement of medical Internet of Things (IoT) systems and Point of Care (PoC) devices. In this paper, we provide a tutorial describing how various technologies, ranging from emerging memristive devices, to established Field Programmable Gate Arrays (FPGAs), and mature Complementary Metal Oxide Semiconductor (CMOS) technology, can be used to develop efficient DL accelerators to solve a wide variety of diagnostic, pattern recognition, and signal processing problems in healthcare. Furthermore, we explore how spiking neuromorphic processors can complement their DL counterparts for processing biomedical signals. After providing the required background, we unify the sparsely distributed research on neural network and neuromorphic hardware implementations as applied to the healthcare domain. In addition, we benchmark various hardware platforms by performing a biomedical electromyography (EMG) signal processing task and drawing comparisons among them in terms of inference delay and energy. Finally, we provide our analysis of the field and share a perspective on the advantages, disadvantages, challenges, and opportunities that different accelerators and neuromorphic processors introduce to healthcare and biomedical domains. This paper can serve a large audience, ranging from nanoelectronics researchers to biomedical and healthcare practitioners, in grasping the fundamental interplay between hardware, algorithms, and clinical adoption of these tools, as we shed light on the future of deep networks and spiking neuromorphic processing systems as proponents for driving biomedical circuits and systems forward.
    Comment: Submitted to IEEE Transactions on Biomedical Circuits and Systems (21 pages, 10 figures, 5 tables).
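    The paper's benchmark itself is not reproduced in the abstract; as a generic illustration of how an inference-delay measurement of this kind can be set up, the sketch below times a tiny dense network on pre-extracted EMG feature windows. The network size, feature length, and five-class gesture task are placeholder assumptions and say nothing about the accelerators compared in the paper.

```python
import time
import numpy as np

# Generic inference-delay measurement for a placeholder EMG classifier
# (CPU/NumPy reference only; not the hardware platforms benchmarked in the paper).
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((64, 32)), np.zeros(32)  # 64 EMG features -> 32 hidden units
W2, b2 = rng.standard_normal((32, 5)), np.zeros(5)    # 5 gesture classes (assumed)

def infer(x):
    h = np.maximum(x @ W1 + b1, 0.0)  # ReLU hidden layer
    return int((h @ W2 + b2).argmax())

x = rng.standard_normal(64)           # one pre-extracted EMG feature window
n = 1000
t0 = time.perf_counter()
for _ in range(n):
    infer(x)
dt = (time.perf_counter() - t0) / n
print(f"mean inference delay: {dt * 1e6:.1f} us per window")
```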

    Analog Front-End Circuits for Massive Parallel 3-D Neural Microsystems.

    Understanding of the dynamics of the brain has tremendously improved due to the progress in neural recording techniques over the past five decades. The number of simultaneously recorded channels has doubled roughly every 7 years, which implies that a recording system with a few thousand channels should be available in the next two decades. Nonetheless, a leap in the number of simultaneous channels has remained an unmet need due to many limitations, especially in the front-end recording integrated circuits (ICs). This research has focused on increasing the number of simultaneously recorded channels and providing modular design approaches to improve the integration and expansion of 3-D recording microsystems. Three analog front-ends (AFEs) have been developed using extremely low-power and small-area circuit techniques at both the circuit and system levels. The three prototypes investigate critical circuit challenges in power, area, interface, and modularity. The first AFE (16 channels) optimizes energy efficiency using techniques such as moderate inversion, a minimized asynchronous interface for data acquisition, power-scalable sampling operation, and a wide configuration range of gain and bandwidth. Circuits in this part were designed in a 0.25 μm CMOS process using a 0.9-V single supply and feature a power consumption of 4 μW/channel and an energy-area efficiency of 7.51x10^15 J^-1 Vrms^-1 mm^-2. The second AFE (128 channels) provides the next level of scaling using dc-coupled analog compression techniques to reject the electrode offset and further reduce the implementation area. Signal processing techniques were also explored to transfer some computational power outside the brain. Circuits in this part were designed in a 180 nm CMOS process using a 0.5-V single supply and feature a power consumption of 2.5 μW/channel and an energy-area efficiency of 30.2x10^15 J^-1 Vrms^-1 mm^-2. The last AFE (128 channels) shows another leap in neural recording using monolithic integration of recording circuits on the shanks of neural probes. Monolithic integration may be the most effective approach to allow simultaneous recording of more than 1,024 channels. The probe and circuits in this part were designed in a 150 nm SOI CMOS process using a 0.5-V single supply and feature a power consumption of only 1.4 μW/channel and an energy-area efficiency of 36.4x10^15 J^-1 Vrms^-1 mm^-2.
    PhD, Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/98070/1/ashmouny_1.pd

    Integrated Circuit Design in US High-Energy Physics

    This whitepaper summarizes the status, plans, and challenges in the area of integrated circuit design in the United States for future High Energy Physics (HEP) experiments. It has been submitted to CPAD (Coordinating Panel for Advanced Detectors) and the HEP Community Summer Study 2013 (Snowmass on the Mississippi), held in Minnesota July 29 to August 6, 2013. A workshop titled "US Workshop on IC Design for High Energy Physics" (HEPIC2013) was held May 30 to June 1, 2013 at Lawrence Berkeley National Laboratory (LBNL). A draft of the whitepaper was distributed to the attendees before the workshop, its content was discussed at the meeting, and this document is the resulting final product. The scope of the whitepaper includes the following topics: needs for IC technologies to enable future experiments in the three HEP frontiers (Energy, Cosmic, and Intensity); challenges in the different technology and circuit design areas and the related R&D needs; motivation for using different fabrication technologies; an outlook on future technologies, including 2.5D and 3D integration; a survey of ICs used in current experiments and ICs targeted for approved or proposed experiments; and IC design at US institutes, with recommendations for future collaboration.

    Algorithmic Analysis Techniques for Molecular Imaging

    This study addresses image processing techniques for two medical imaging modalities, Positron Emission Tomography (PET) and Magnetic Resonance Imaging (MRI), which can be used to study human body function and anatomy in a non-invasive manner. In PET, the so-called Partial Volume Effect (PVE) is caused by the low spatial resolution of the modality. The efficiency of a set of PVE-correction methods is evaluated in the present study. These methods use information about tissue borders acquired with the MRI technique. In addition, a novel method is proposed for MRI brain image segmentation. A standard approach in brain MRI segmentation is to use spatial prior information. While this works for adults and healthy neonates, the large variations in premature infants preclude its direct application. The proposed technique can be applied to both healthy and non-healthy premature infant brain MR images. Diffusion Weighted Imaging (DWI) is an MRI-based technique that can be used to create images for measuring physiological properties of cells at the structural level. We optimise the scanning parameters of DWI so that the required acquisition time can be reduced while still maintaining good image quality. In the present work, PVE-correction methods and physiological DWI models are evaluated in terms of the repeatability of their results, which gives information on the reliability of the measures given by the methods. The evaluations are done using physical phantom objects, correlation measurements against expert segmentations, computer simulations with realistic noise modelling, and repeated measurements conducted on real patients. In PET, the applicability and selection of a suitable partial volume correction method was found to depend on the target application. For MRI, the data-driven segmentation offers an alternative when using a spatial prior is not feasible. For DWI, the distribution of b-values turns out to be a central factor affecting the time-quality ratio of the DWI acquisition, and an optimal b-value distribution was determined. This helps to shorten the imaging time without hampering the diagnostic accuracy.
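    The abstract refers to physiological DWI models and the choice of b-values; as background, the standard mono-exponential model S(b) = S0 * exp(-b * ADC) lets the apparent diffusion coefficient be estimated from signals acquired at two b-values. The snippet below illustrates only this textbook relation with assumed numbers, not the b-value optimisation developed in the thesis.

```python
import numpy as np

# Textbook mono-exponential DWI model: S(b) = S0 * exp(-b * ADC).
def adc_from_two_bvalues(s1, s2, b1, b2):
    """Apparent diffusion coefficient (mm^2/s) from signals s1, s2 at b-values b1, b2 (s/mm^2)."""
    return np.log(s1 / s2) / (b2 - b1)

# Synthetic example with assumed, typical numbers: b = 0 and 1000 s/mm^2, ADC = 0.8e-3 mm^2/s.
s0, adc_true = 1000.0, 0.8e-3
b1, b2 = 0.0, 1000.0
s1, s2 = s0 * np.exp(-b1 * adc_true), s0 * np.exp(-b2 * adc_true)
print(f"recovered ADC = {adc_from_two_bvalues(s1, s2, b1, b2):.2e} mm^2/s")
```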

    Insect-vision inspired collision warning vision processor for automobiles

    Vision is expected to play an important role in car safety enhancement. Imaging systems can be used to enlarge the driver's field of vision, for instance by capturing and displaying views of hidden areas around the car which the driver can analyze for safer decision-making. Vision systems go a step further: they can autonomously analyze the visual information, identify dangerous situations and prompt the delivery of warning signals, for instance in case of road-lane departure, if an overtaking car is in the blind spot, or if an object is approaching on a collision course. Processing capabilities are also needed for applications viewing the car interior, such as "intelligent airbag systems" that base deployment decisions on passenger features. On-line processing of visual information for car safety involves multiple sensors and views, a huge amount of data per view and large frame rates. The associated computational load may be prohibitive for conventional processing architectures, so dedicated systems with embedded local processing capabilities may be needed to confront these challenges. This paper describes a dedicated sensory-processing architecture for collision warning which is inspired by insect vision. In particular, the paper relies on exploiting knowledge about the behavior of Locusta migratoria to develop dedicated chips and systems which are integrated into model cars as well as into a commercial car (Volvo XC90) and tested to deliver collision warnings in real traffic scenarios.
    Gobierno de España TEC2006-15722; European Community IST:2001-3809
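    The paper's focal-plane architecture is not described in enough detail in the abstract to reproduce; as a generic illustration of the looming cue underlying many insect-inspired collision-warning schemes, the sketch below accumulates frame-difference energy with a leak and raises a warning when an object expands quickly in the image. The decay constant, threshold, frame size, and synthetic stimulus are all illustrative assumptions.

```python
import numpy as np

# Much-simplified looming cue for collision warning (illustrative only; not the
# insect-vision chip described in the paper).
DECAY = 0.7        # leak applied to the accumulated excitation (assumed)
WARN_LEVEL = 30.0  # warning threshold on the accumulated cue (assumed)

def update_cue(prev_frame, frame, state):
    """Return (warning, new_state) for one pair of consecutive grayscale frames."""
    diff = np.abs(frame.astype(np.float32) - prev_frame.astype(np.float32))
    state = DECAY * state + diff.mean()   # leaky accumulation of motion energy
    return state > WARN_LEVEL, state

# Synthetic "approaching" stimulus: a bright square that grows every frame.
frames = []
for size in range(4, 40, 4):
    f = np.zeros((64, 64), dtype=np.uint8)
    f[32 - size // 2:32 + size // 2, 32 - size // 2:32 + size // 2] = 255
    frames.append(f)

state = 0.0
for prev, cur in zip(frames, frames[1:]):
    warning, state = update_cue(prev, cur, state)
    print(f"cue={state:6.1f}  warning={warning}")
```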