    A switchable light field camera architecture with Angle Sensitive Pixels and dictionary-based sparse coding

    We propose a flexible light field camera architecture that sits at the convergence of optics, sensor electronics, and applied mathematics. Through the co-design of a sensor comprising tailored Angle Sensitive Pixels and advanced reconstruction algorithms, we show that, unlike today's light field cameras, our system can use the same measurements captured in a single sensor image to recover either a high-resolution 2D image, a low-resolution 4D light field using fast linear processing, or a high-resolution light field using sparsity-constrained optimization.
    Funding: National Science Foundation (U.S.) (NSF Grant IIS-1218411); National Science Foundation (U.S.) (NSF Grant IIS-1116452); MIT Media Lab Consortium; National Science Foundation (U.S.) (NSF Graduate Research Fellowship); Natural Sciences and Engineering Research Council of Canada (NSERC Postdoctoral Fellowship); Alfred P. Sloan Foundation (Research Fellowship); United States. Defense Advanced Research Projects Agency (DARPA Young Faculty Award)
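    As a rough illustration of the sparsity-constrained reconstruction path (a sketch, not the authors' implementation), the 4D light field can be modeled as a sparse combination of dictionary atoms and recovered from a single Angle Sensitive Pixel image by iterative soft-thresholding (ISTA); the matrices, sizes, and regularization weight below are placeholder assumptions:

        # Hypothetical sketch: recover sparse dictionary coefficients c from ASP
        # measurements y = A @ D @ c using ISTA (iterative soft-thresholding).
        import numpy as np

        rng = np.random.default_rng(0)
        n_meas, n_pix, n_atoms = 128, 256, 512      # toy sizes, not the paper's
        A = rng.standard_normal((n_meas, n_pix))    # assumed linear ASP measurement model
        D = rng.standard_normal((n_pix, n_atoms))   # placeholder light field dictionary
        c_true = np.zeros(n_atoms)
        c_true[rng.choice(n_atoms, 10, replace=False)] = 1.0
        y = A @ D @ c_true                          # single captured sensor image (vectorized)

        M = A @ D
        step = 1.0 / np.linalg.norm(M, 2) ** 2      # step size from the Lipschitz constant
        lam = 0.05                                  # sparsity weight (assumed)
        c = np.zeros(n_atoms)
        for _ in range(500):                        # ISTA iterations
            z = c - step * (M.T @ (M @ c - y))      # gradient step on the data term
            c = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold

        light_field = D @ c                         # reconstructed (vectorized) light field
        print("recovered support:", np.flatnonzero(np.abs(c) > 0.1))

    The fast linear path mentioned in the abstract would instead apply a fixed linear reconstruction (for example, a pseudoinverse) to the same measurements.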

    Image Feature Extraction Acceleration

    Image feature extraction is instrumental for most of the best-performing algorithms in computer vision. However, it is also expensive in terms of computational and memory resources for embedded systems, due to the need to deal with individual pixels at the earliest processing levels. In this regard, conventional system architectures do not exploit the potential parallelism and distributed memory available from the very beginning of the processing chain. Raw pixel values provided by the front-end image sensor are squeezed into a high-speed interface with the rest of the system components. Only then, after deserializing this massive dataflow, is parallelism, if any, exploited. This chapter introduces a rather different approach from an architectural point of view. We present two Application-Specific Integrated Circuits (ASICs) where the 2-D array of photo-sensitive devices featured by regular imagers is combined with distributed memory supporting concurrent processing. Custom circuitry is added per pixel in order to accelerate image feature extraction right at the focal plane. Specifically, the proposed sensing-processing chips aim at accelerating two flagship algorithms of the computer vision community: the Viola-Jones face detection algorithm and the Scale Invariant Feature Transform (SIFT). Experimental results prove the feasibility and benefits of this architectural solution.
    Funding: Ministerio de Economía y Competitividad (TEC2012-38921-C02, IPT-2011-1625-430000, IPC-20111009); Junta de Andalucía (TIC 2338-2013); Xunta de Galicia (EM2013/038); Office of Naval Research (USA) (N00014141035)
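    The per-pixel circuitry itself is not reproduced here; as a software-level illustration of the kind of early-vision computation these chips accelerate, a two-rectangle Haar-like feature from the Viola-Jones detector can be evaluated in constant time per window once an integral image has been built:

        # Illustrative sketch (software analogue, not the ASIC circuitry): evaluate a
        # two-rectangle Haar-like feature in O(1) per window via an integral image.
        import numpy as np

        def integral_image(img):
            """Cumulative sums over rows and columns, zero-padded on the top/left."""
            ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
            ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
            return ii

        def rect_sum(ii, r, c, h, w):
            """Sum of the h x w rectangle whose top-left corner is (r, c)."""
            return ii[r + h, c + w] - ii[r, c + w] - ii[r + h, c] + ii[r, c]

        def haar_two_rect(ii, r, c, h, w):
            """Left half minus right half: responds to vertical edges."""
            half = w // 2
            return rect_sum(ii, r, c, h, half) - rect_sum(ii, r, c + half, h, half)

        img = np.random.default_rng(1).integers(0, 256, size=(24, 24))
        print(haar_two_rect(integral_image(img), 0, 0, 24, 24))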

    Fast Objective Coupled Planar Illumination Microscopy

    Among optical imaging techniques, light sheet fluorescence microscopy stands out as one of the most attractive for capturing high-speed biological dynamics unfolding in three dimensions. The technique is potentially millions of times faster than point-scanning techniques such as two-photon microscopy. This potential is especially pertinent for neuroscience applications, because interactions between neurons transpire over mere milliseconds within tissue volumes spanning hundreds of cubic microns. However, current-generation light sheet microscopes are limited by volume scanning rate and/or camera frame rate. We begin by reviewing the optical principles underlying light sheet fluorescence microscopy and the origin of these rate bottlenecks. We present an analysis leading us to the conclusion that Objective Coupled Planar Illumination (OCPI) microscopy is a particularly promising technique for recording the activity of large populations of neurons at high sampling rates. We then present speed-optimized OCPI microscopy, the first fast light sheet technique that avoids compromising image quality or photon efficiency. We pursue two strategies to develop the fast OCPI microscope. First, we devise a set of optimizations that increase the rate of the volume scanning system to 40 Hz for volumes up to 700 microns thick. Second, we introduce Multi-Camera Image Sharing (MCIS), a technique that scales imaging rate by incorporating additional cameras. MCIS can be applied not only to OCPI but to any widefield imaging technique, circumventing the limitations imposed by the camera. Detailed design drawings are included to aid dissemination to other research groups. We also demonstrate fast calcium imaging of the larval zebrafish brain and find a heartbeat-induced motion artifact. We recommend a new preprocessing step that removes the artifact through filtering. This step requires a minimum sampling rate of 15 Hz, and we expect it to become a standard procedure in zebrafish imaging pipelines. In the last chapter, we describe essential computational considerations for controlling a fast OCPI microscope and processing the data that it generates. We introduce a new image processing pipeline developed to maximize computational efficiency when analyzing these multi-terabyte datasets, including a novel calcium imaging deconvolution algorithm. Finally, we provide a demonstration of how combined innovations in microscope hardware and software enable inference of predictive relationships between neurons, a promising complement to more conventional correlation-based analyses.
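    The abstract does not specify the filter, but a plausible reading is that the heartbeat artifact occupies a narrow band around the larval zebrafish heart rate (a few Hz), so it can be suppressed with a band-stop filter applied to each voxel's time series, which in turn requires sampling well above that frequency (hence the 15 Hz minimum). A hedged sketch with assumed numbers:

        # Hedged sketch: suppress a heartbeat-locked artifact in a calcium imaging
        # time series with a band-stop filter. The heart rate, bandwidth, and filter
        # order are assumptions; the thesis's exact preprocessing may differ.
        import numpy as np
        from scipy.signal import butter, filtfilt

        fs = 20.0                  # volumetric sampling rate in Hz (above the 15 Hz minimum)
        heart_hz = 3.5             # assumed larval zebrafish heart rate
        low, high = heart_hz - 0.5, heart_hz + 0.5

        b, a = butter(2, [low / (fs / 2), high / (fs / 2)], btype="bandstop")

        t = np.arange(0, 60, 1 / fs)
        calcium = np.sin(2 * np.pi * 0.2 * t)                # slow calcium-like signal
        artifact = 0.3 * np.sin(2 * np.pi * heart_hz * t)    # heartbeat-induced artifact
        cleaned = filtfilt(b, a, calcium + artifact)         # zero-phase filtering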

    Efficient Implementation of Stochastic Inference on Heterogeneous Clusters and Spiking Neural Networks

    Neuromorphic computing refers to brain-inspired algorithms and architectures. This computing paradigm can solve complex problems that are intractable with traditional computing methods, because such implementations learn to identify the required features and classify them based on their training, akin to how brains function. This task involves performing computation on large quantities of data. With this inspiration, a comprehensive, multi-pronged approach is employed to study and efficiently implement neuromorphic inference models, both on heterogeneous clusters of traditional von Neumann architectures and by developing spiking neural networks (SNNs) for native, ultra-low-power implementation. In this regard, an extendable high-performance computing (HPC) framework and optimizations are proposed for heterogeneous clusters to modularize complex neuromorphic applications in a distributed manner. To achieve the best possible throughput and load balancing for such modularized architectures, a set of algorithms is proposed to suggest the optimal mapping of the different modules, arranged as an asynchronous pipeline, onto the available cluster resources while considering the complex data dependencies between stages. On the other hand, SNNs are more biologically plausible and can achieve ultra-low-power implementation thanks to their sparse, spike-based communication, which is possible with emerging non-von Neumann computing platforms. As significant progress in this direction, spiking neuron models capable of distributed online learning are proposed. A high-performance SNN simulator (SpNSim) is developed for the simulation of large-scale, mixed-neuron-model networks. An accompanying digital hardware neuron RTL is also proposed for efficient real-time implementation of SNNs capable of online learning. Finally, a methodology for mapping probabilistic graphical models onto an off-the-shelf neurosynaptic processor (IBM TrueNorth) as a stochastic SNN with ultra-low power consumption is presented.
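    The mapping algorithms themselves are not given in the abstract; as an illustrative sketch of the underlying load-balancing idea (hypothetical stage costs and resource speeds), pipeline stages can be assigned greedily so that the busiest resource, which bounds the steady-state throughput of an asynchronous pipeline, stays as lightly loaded as possible:

        # Illustrative sketch (not the dissertation's algorithm): greedily map pipeline
        # stages to heterogeneous resources, minimizing the busiest resource's load,
        # since that load bounds the throughput of an asynchronous pipeline.
        stage_work = {"feature": 40.0, "inference": 120.0, "decode": 25.0}   # hypothetical costs
        speed = {"cpu0": 1.0, "cpu1": 1.0, "gpu0": 6.0}                      # relative speeds

        assignment, busy = {}, {r: 0.0 for r in speed}
        for stage, work in sorted(stage_work.items(), key=lambda kv: -kv[1]):
            # Place the stage where it increases the resource's load the least.
            best = min(speed, key=lambda r: busy[r] + work / speed[r])
            assignment[stage] = best
            busy[best] += work / speed[best]

        throughput_bound = 1.0 / max(busy.values())   # items per unit time at steady state
        print(assignment, throughput_bound)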

    RNA polymerase II clusters form in line with surface condensation on regulatory chromatin

    It is essential for cells to control which genes are transcribed into RNA. In eukaryotes, two major control points are recruitment of RNA polymerase II (Pol II) into a paused state, and subsequent pause release toward transcription. Pol II recruitment and pause release occur in association with macromolecular clusters, which were proposed to be formed by a liquid–liquid phase separation mechanism. How such a phase separation mechanism relates to the interaction of Pol II with DNA during recruitment and transcription, however, remains poorly understood. Here, we use live and super-resolution microscopy in zebrafish embryos to reveal Pol II clusters with a large variety of shapes, which can be explained by a theoretical model in which regulatory chromatin regions provide surfaces for liquid-phase condensation at concentrations that are too low for canonical liquid–liquid phase separation. Model simulations and chemical perturbation experiments indicate that recruited Pol II contributes to the formation of these surface-associated condensates, whereas elongating Pol II is excluded from these condensates and thereby drives their unfolding

    Automated Correlative Light and Electron Microscopy using FIB-SEM as a tool to screen for ultrastructural phenotypes

    In Correlative Light and Electron Microscopy (CLEM), two imaging modalities are combined to take advantage of the localization capabilities of light microscopy (LM) to guide the capture of high-resolution details in the electron microscope (EM). However, traditional approaches have proven very laborious, yielding a throughput too low for quantitative or exploratory studies of populations. Recently, in the electron microscopy field, FIB-SEM (Focused Ion Beam Scanning Electron Microscope) tomography has emerged as a flexible method that enables semi-automated 3D volume acquisitions. During my thesis, I developed CLEMSite, a tool that takes advantage of the semi-automation and scanning capabilities of the FIB-SEM to automatically acquire volumes of adherent cultured cells. CLEMSite combines computer vision and machine learning applications with a library for controlling the microscope (a product of a collaboration with Carl Zeiss GmbH and Fibics Inc.). Thanks to this, the microscope was able to automatically track, find, and acquire cell regions previously identified in the light microscope. More specifically, two main modules were implemented. First, a correlation module was designed to detect and record reference points from a grid pattern present on the culture substrate in both modalities (LM and EM). Second, I designed a module that retrieves the regions of interest in the FIB-SEM and drives the acquisition of image stacks between different targets in an unattended fashion. The automated CLEM approach is demonstrated on a project where 3D EM volumes are examined upon multiple siRNA treatments knocking down genes involved in the morphogenesis of the Golgi apparatus. Additionally, the power of CLEM approaches using FIB-SEM is demonstrated with the detailed structural analysis of two events: the breakage of the nuclear envelope in constricted cells and an intriguing catastrophic DNA Damage Response in binucleated cells. Our results demonstrate that high-throughput volume acquisition in electron microscopy is feasible and that EM can provide valuable insights to guide new biological discoveries.
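    The correlation module's internals are not described in the abstract; conceptually, matched grid landmarks recorded in both modalities allow a coordinate transform to be fitted so that a target chosen in the light microscope can be relocated in the FIB-SEM. A minimal sketch using a least-squares affine fit (synthetic coordinates, not CLEMSite's code):

        # Hedged sketch: estimate an affine transform mapping light-microscope (LM)
        # landmark coordinates to FIB-SEM stage coordinates by least squares, then
        # use it to predict where a region of interest chosen in LM lies in the EM.
        import numpy as np

        lm = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])      # LM grid landmarks
        em = np.array([[10.0, 5.0], [12.0, 5.1], [10.1, 7.0], [12.1, 7.1]])  # matched EM landmarks

        # Solve em ~= [lm, 1] @ T for the 3x2 affine matrix T.
        X = np.hstack([lm, np.ones((len(lm), 1))])
        T, *_ = np.linalg.lstsq(X, em, rcond=None)

        roi_lm = np.array([0.5, 0.5])                  # target picked in the light microscope
        roi_em = np.append(roi_lm, 1.0) @ T            # predicted FIB-SEM position
        print(roi_em)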

    HMC-Based Accelerator Design For Compressed Deep Neural Networks

    Deep Neural Networks (DNNs) offer remarkable classification and regression performance on many high-dimensional problems and have been widely utilized in real-world cognitive applications. The high computational cost of DNNs greatly hinders their deployment in resource-constrained applications, real-time systems, and edge computing platforms. Moreover, the energy and performance cost of moving data between the memory hierarchy and the computational units is higher than that of the computation itself. To overcome this memory bottleneck, accelerator designs improve data locality and temporal data reuse. In an attempt to further improve data locality, memory manufacturers have invented 3D-stacked memory, where multiple layers of memory arrays are stacked on top of each other. Inherited from the concept of Process-In-Memory (PIM), some 3D-stacked memory architectures also include a logic layer that can integrate general-purpose computational logic directly within main memory to take advantage of the high internal bandwidth during computation. In this dissertation, we investigate hardware/software co-design for neural network accelerators. Specifically, we introduce a two-phase filter pruning framework for model compression and an accelerator tailored for efficient DNN execution on the Hybrid Memory Cube (HMC), which can dynamically offload primitives and functions to the PIM logic layer through a latency-aware scheduling controller. In our compression framework, we formulate the filter pruning process as an optimization problem and propose a filter selection criterion measured by conditional entropy. The key idea of our approach is to establish a quantitative connection between filters and model accuracy. We define this connection as the conditional entropy over the filters of a convolutional layer, i.e., the distribution of entropy conditioned on the network loss. Based on this definition, the pruning efficiencies of global and layer-wise pruning strategies are compared, and a two-phase pruning method is proposed. The proposed pruning method achieves an 88% reduction in filters and a 46% reduction in inference time on VGG16 within 2% accuracy degradation.
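    The dissertation's exact criterion cannot be reconstructed from the abstract alone; the sketch below shows one plausible reading (the binning, the per-filter statistic, and the pruning ratio are all assumptions): score each filter by the entropy of its activation statistic conditioned on a discretized network loss, then propose the lowest-scoring filters for removal.

        # Rough sketch of an entropy-style pruning criterion; the binning, statistic,
        # and pruning ratio are assumptions, not the dissertation's exact formulation.
        import numpy as np

        def conditional_entropy(act, losses, act_bins=16, loss_bins=4):
            """Estimate H(activation | loss) from histograms over a batch."""
            edges = np.quantile(losses, np.linspace(0, 1, loss_bins + 1)[1:-1])
            loss_ids = np.digitize(losses, edges)
            h = 0.0
            for k in np.unique(loss_ids):
                a = act[loss_ids == k]
                p, _ = np.histogram(a, bins=act_bins)
                p = p[p > 0] / p.sum()
                h += (len(a) / len(act)) * -(p * np.log(p)).sum()
            return h

        rng = np.random.default_rng(0)
        acts = rng.standard_normal((512, 64))     # per-sample mean activation of 64 filters
        losses = rng.random(512)                  # per-sample loss values

        scores = np.array([conditional_entropy(acts[:, f], losses) for f in range(acts.shape[1])])
        pruned = np.argsort(scores)[: acts.shape[1] // 2]   # filters proposed for removal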

    2022 roadmap on neuromorphic computing and engineering

    Modern computation based on von Neumann architecture is now a mature cutting-edge science. In the von Neumann architecture, processing and memory units are implemented as separate blocks interchanging data intensively and continuously. This data transfer is responsible for a large part of the power consumption. The next generation of computer technology is expected to solve problems at the exascale, with 10^18 calculations each second. Even though these future computers will be incredibly powerful, if they are based on von Neumann type architectures, they will consume between 20 and 30 megawatts of power and will not have intrinsic physically built-in capabilities to learn or deal with complex data as our brain does. These needs can be addressed by neuromorphic computing systems, which are inspired by the biological concepts of the human brain. This new generation of computers has the potential to be used for the storage and processing of large amounts of digital information with much lower power consumption than conventional processors. Among their potential future applications, an important niche is moving the control from data centers to edge devices. The aim of this roadmap is to present a snapshot of the present state of neuromorphic technology and provide an opinion on the challenges and opportunities that the future holds in the major areas of neuromorphic technology, namely materials, devices, neuromorphic circuits, neuromorphic algorithms, applications, and ethics. The roadmap is a collection of perspectives where leading researchers in the neuromorphic community provide their own view about the current state and the future challenges for each research area. We hope that this roadmap will be a useful resource, providing a concise yet comprehensive introduction for readers outside this field and for those who are just entering it, as well as future perspectives for those who are well established in the neuromorphic computing community.
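    A quick check of the arithmetic behind the quoted figures: 10^18 operations per second at 20-30 MW corresponds to roughly 20-30 picojoules per operation.

        # Worked arithmetic for the quoted exascale figures (back-of-the-envelope only).
        ops_per_second = 1e18
        for power_watts in (20e6, 30e6):
            print(f"{power_watts / ops_per_second * 1e12:.0f} pJ per operation")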