25 research outputs found

    Large-Scale Optical Neural Networks based on Photoelectric Multiplication

    Full text link
    Recent success in deep neural networks has generated strong interest in hardware accelerators to improve speed and reduce energy consumption. This paper presents a new type of photonic accelerator based on coherent detection that is scalable to large (N ≳ 10^6) networks and can be operated at high (GHz) speeds and very low (sub-aJ) energies per multiply-and-accumulate (MAC), using the massive spatial multiplexing enabled by standard free-space optical components. In contrast to previous approaches, both weights and inputs are optically encoded so that the network can be reprogrammed and trained on the fly. Simulations of the network using models for digit- and image-classification reveal a "standard quantum limit" for optical neural networks, set by photodetector shot noise. This bound, which can be as low as 50 zJ/MAC, suggests performance below the thermodynamic (Landauer) limit for digital irreversible computation is theoretically possible in this device. The proposed accelerator can implement both fully-connected and convolutional networks. We also present a scheme for back-propagation and training that can be performed in the same hardware. This architecture will enable a new class of ultra-low-energy processors for deep learning. Comment: Text: 10 pages, 5 figures, 1 table. Supplementary: 8 pages, 5 figures, 2 tables.
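
    As a rough sanity check on the quoted 50 zJ/MAC bound, the short Python sketch below converts that energy into photons per MAC. The 1550 nm operating wavelength is an assumption for illustration; the abstract does not specify one.

    # Back-of-the-envelope: how many photons does 50 zJ/MAC correspond to?
    # The 1550 nm wavelength is assumed, not stated in the abstract.
    h = 6.626e-34                        # Planck constant, J*s
    c = 2.998e8                          # speed of light, m/s
    wavelength = 1550e-9                 # assumed telecom-band wavelength, m
    photon_energy = h * c / wavelength   # ~1.28e-19 J per photon

    e_mac = 50e-21                       # quoted standard-quantum-limit energy, J/MAC
    photons_per_mac = e_mac / photon_energy
    print(f"photon energy: {photon_energy:.3e} J")
    print(f"photons per MAC at 50 zJ/MAC: {photons_per_mac:.2f}")
    # ~0.4 photons/MAC: a sub-photon cost per multiply is only meaningful
    # because many MACs share each optical pulse in this spatially
    # multiplexed design.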

    Deep Learning with Coherent VCSEL Neural Networks

    Full text link
    Deep neural networks (DNNs) are reshaping the field of information processing. With their exponential growth challenging existing electronic hardware, optical neural networks (ONNs) are emerging to process DNN tasks in the optical domain with high clock rates, parallelism and low-loss data transmission. However, to explore the potential of ONNs, it is necessary to investigate the full-system performance incorporating the major DNN elements, including matrix algebra and nonlinear activation. Existing challenges to ONNs are high energy consumption due to low electro-optic (EO) conversion efficiency, low compute density due to large device footprint and channel crosstalk, and long latency due to the lack of inline nonlinearity. Here we experimentally demonstrate an ONN system that simultaneously overcomes all these challenges. We exploit neuron encoding with volume-manufactured micron-scale vertical-cavity surface-emitting laser (VCSEL) transmitter arrays that exhibit high EO conversion (<5 attojoules/symbol with V_π = 4 mV), high operation bandwidth (up to 25 GS/s), and compact footprint (<0.01 mm^2 per device). Photoelectric multiplication allows low-energy matrix operations at the shot-noise quantum limit. Homodyne detection-based nonlinearity enables nonlinear activation with instantaneous response. The full-system energy efficiency and compute density reach 7 femtojoules per operation (fJ/OP) and 25 TeraOP/(mm^2·s), both representing a >100-fold improvement over state-of-the-art digital computers, with several more orders of magnitude of headroom for future improvement. Beyond neural network inference, its feature of rapid weight updating is crucial for training deep learning models. Our technique opens an avenue to large-scale optoelectronic processors to accelerate machine learning tasks from data centers to decentralized edge devices. Comment: 10 pages, 5 figures.
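
    The photoelectric multiplication the abstract builds on can be illustrated with a minimal numpy sketch: interfering an input field x with a weight field w on a 50/50 beamsplitter and subtracting the two photocurrents leaves a term proportional to x·w. The Gaussian shot-noise model below is a simplifying assumption, not the paper's full noise analysis.

    import numpy as np

    rng = np.random.default_rng(0)

    def photoelectric_multiply(x, w, photons=1000.0):
        """Elementwise product of two encoded amplitudes via balanced detection.

        x, w    : arrays of field amplitudes in [-1, 1].
        photons : mean photon-number scale; sets the shot-noise level
                  (Gaussian approximation to Poisson statistics, assumed here).
        """
        # 50/50 beamsplitter outputs (x + w)/sqrt(2) and (x - w)/sqrt(2);
        # detected intensities are the squared magnitudes.
        i_plus = photons * 0.5 * (x + w) ** 2
        i_minus = photons * 0.5 * (x - w) ** 2
        # Shot noise ~ sqrt(mean counts) on each detector.
        i_plus += rng.normal(0.0, np.sqrt(i_plus))
        i_minus += rng.normal(0.0, np.sqrt(i_minus))
        # Balanced subtraction cancels the |x|^2 and |w|^2 terms,
        # leaving 2 * photons * x * w plus noise.
        return (i_plus - i_minus) / (2 * photons)

    x = rng.uniform(-1, 1, 64)
    w = rng.uniform(-1, 1, 64)
    print("optical x.w:", photoelectric_multiply(x, w).sum())
    print("exact   x.w:", float(x @ w))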

    Retrotransposon-Induced Heterochromatin Spreading in the Mouse Revealed by Insertional Polymorphisms

    Get PDF
    The “arms race” relationship between transposable elements (TEs) and their host has promoted a series of epigenetic silencing mechanisms directed against TEs. Retrotransposons, a class of TEs, are often located in repressed regions and are thought to induce heterochromatin formation and spreading. However, direct evidence for TE-induced local heterochromatin in mammals is surprisingly scarce. To examine this phenomenon, we chose two mouse embryonic stem (ES) cell lines that possess insertionally polymorphic retrotransposons (IAP, ETn/MusD, and LINE elements) at specific loci in one cell line but not the other. Employing ChIP-seq data for these cell lines, we show that IAP elements robustly induce H3K9me3 and H4K20me3 marks in flanking genomic DNA. In contrast, such heterochromatin is not induced by LINE copies and only by a minority of polymorphic ETn/MusD copies. DNA methylation is independent of the presence of IAP copies, since it is present in flanking regions of both full and empty sites. Finally, such spreading into genes appears to be rare, since the transcriptional start sites of very few genes are less than 1 kb from an IAP. However, the B3galtl gene is subject to transcriptional silencing via IAP-induced heterochromatin. Hence, although rare, IAP-induced local heterochromatin spreading into nearby genes may influence expression and, in turn, host fitness.
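
    The study's core comparison (repressive-mark signal in DNA flanking a polymorphic insertion in the cell line that carries it versus the line that lacks it) reduces to averaging coverage in flanking windows. The sketch below is illustrative only: the toy coverage arrays and the 1 kb flank size are assumptions, not the paper's pipeline.

    import numpy as np

    def flank_signal(coverage, insertion_pos, flank_bp=1000):
        """Mean ChIP-seq coverage (e.g., H3K9me3) in windows flanking a site.

        coverage      : hypothetical per-base read-depth array for one chromosome.
        insertion_pos : coordinate of the polymorphic element's insertion site.
        flank_bp      : flank size; 1 kb is an illustrative choice.
        """
        left = coverage[max(insertion_pos - flank_bp, 0):insertion_pos]
        right = coverage[insertion_pos:insertion_pos + flank_bp]
        return float(np.concatenate([left, right]).mean())

    # Toy comparison: "full" site (IAP present) vs "empty" site (IAP absent).
    rng = np.random.default_rng(1)
    full_cov = rng.poisson(12, 100_000)    # elevated H3K9me3 around the element
    empty_cov = rng.poisson(3, 100_000)    # background in the other cell line
    print("full site :", flank_signal(full_cov, 50_000))
    print("empty site:", flank_signal(empty_cov, 50_000))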

    Erratum: Global, regional, and national comparative risk assessment of 84 behavioural, environmental and occupational, and metabolic risks or clusters of risks for 195 countries and territories, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017

    Get PDF
    Interpretation: By quantifying levels and trends in exposures to risk factors and the resulting disease burden, this assessment offers insight into where past policy and programme efforts might have been successful and highlights current priorities for public health action. Decreases in behavioural, environmental, and occupational risks have largely offset the effects of population growth and ageing, in relation to trends in absolute burden. Conversely, the combination of increasing metabolic risks and population ageing will probably continue to drive the increasing trends in non-communicable diseases at the global level, which presents both a public health challenge and an opportunity. We see considerable spatiotemporal heterogeneity in levels of risk exposure and risk-attributable burden. Although levels of development underlie some of this heterogeneity, O/E ratios show risks for which countries are overperforming or underperforming relative to their level of development. As such, these ratios provide a benchmarking tool to help to focus local decision making. Our findings reinforce the importance of both risk exposure monitoring and epidemiological research to assess causal connections between risks and health outcomes, and they highlight the usefulness of the GBD study in synthesising data to draw comprehensive and robust conclusions that help to inform good policy and strategic health planning.
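
    The O/E ratios described here are a simple computation once an expected burden is in hand; in the study the expectation comes from the burden predicted by a country's Socio-demographic Index. The values below are hypothetical.

    def oe_ratio(observed_dalys, expected_dalys):
        """Observed-to-expected ratio of risk-attributable burden.

        > 1 means a country is underperforming relative to its level of
        development; < 1 means overperforming. Inputs are hypothetical
        DALYs per 100,000 (expected values would come from the GBD's
        SDI-based regression).
        """
        return observed_dalys / expected_dalys

    print(oe_ratio(4200.0, 3500.0))   # 1.2 -> underperforming
    print(oe_ratio(2800.0, 3500.0))   # 0.8 -> overperforming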

    Many Labs 2: Investigating Variation in Replicability Across Samples and Settings

    Get PDF
    We conducted preregistered replications of 28 classic and contemporary published findings, with protocols that were peer reviewed in advance, to examine variation in effect magnitudes across samples and settings. Each protocol was administered to approximately half of 125 samples that comprised 15,305 participants from 36 countries and territories. Using the conventional criterion of statistical significance (p < .05), we found that 15 (54%) of the replications provided evidence of a statistically significant effect in the same direction as the original finding. With a strict significance criterion (p < .0001), 14 (50%) of the replications still provided such evidence, a reflection of the extremely high-powered design. Seven (25%) of the replications yielded effect sizes larger than the original ones, and 21 (75%) yielded effect sizes smaller than the original ones. The median comparable Cohen’s ds were 0.60 for the original findings and 0.15 for the replications. The effect sizes were small (< 0.20) in 16 of the replications (57%), and 9 effects (32%) were in the direction opposite the direction of the original effect. Across settings, the Q statistic indicated significant heterogeneity in 11 (39%) of the replication effects, and most of those were among the findings with the largest overall effect sizes; only 1 effect that was near zero in the aggregate showed significant heterogeneity according to this measure. Only 1 effect had a tau value greater than .20, an indication of moderate heterogeneity. Eight others had tau values near or slightly above .10, an indication of slight heterogeneity. Moderation tests indicated that very little heterogeneity was attributable to the order in which the tasks were performed or whether the tasks were administered in lab versus online. Exploratory comparisons revealed little heterogeneity between Western, educated, industrialized, rich, and democratic (WEIRD) cultures and less WEIRD cultures (i.e., cultures with relatively high and low WEIRDness scores, respectively). Cumulatively, variability in the observed effect sizes was attributable more to the effect being studied than to the sample or setting in which it was studied.
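
    The Q and tau quantities referenced above come from standard random-effects meta-analysis applied to each replicated effect across samples. A minimal sketch using the DerSimonian-Laird estimator follows; the per-sample effect sizes and variances are made up, not Many Labs 2 data.

    import numpy as np

    def dersimonian_laird(effects, variances):
        """Q statistic and tau (between-sample SD) for one replicated effect.

        effects, variances : per-sample effect sizes (e.g., Cohen's d) and
        their sampling variances.
        """
        effects = np.asarray(effects)
        w = 1.0 / np.asarray(variances)
        mean_fixed = np.sum(w * effects) / np.sum(w)
        q = np.sum(w * (effects - mean_fixed) ** 2)
        c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
        tau2 = max(0.0, (q - (len(effects) - 1)) / c)
        return q, np.sqrt(tau2)

    rng = np.random.default_rng(2)
    d = rng.normal(0.15, 0.05, 60)    # illustrative effects across ~60 samples
    v = np.full(60, 0.01)             # assumed sampling variances
    q, tau = dersimonian_laird(d, v)
    print(f"Q = {q:.1f}, tau = {tau:.3f}")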

    Ultrahigh-resolution, deep-penetration spectral-domain OCT

    No full text
    Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2018. Cataloged from PDF version of thesis. Includes bibliographical references (pages 73-77). Optical coherence tomography (OCT) is a label-free optical imaging modality that allows non-invasive in-depth visualization of microscopic structures in samples. With a typical resolution of 10-15 µm and a penetration of up to a few mm, OCT is widely used for medical diagnoses in fields such as ophthalmology and cardiology. However, the more common diagnostic tool in the microscopic regime of medical imaging is histology, an invasive technique requiring tissue biopsy. Its resolution can be as small as 0.2 µm, allowing the visualization of subcellular structures. To help bridge this gap between OCT and histology, ultrahigh-resolution OCT systems have been developed, with resolutions on the order of 1 µm. Yet their application remains limited, since they employ shorter-wavelength sources, reducing penetration in tissue. We have designed and built a spectral-domain ultrahigh-resolution, deep-penetration OCT system centered at 1290 nm with axial and lateral resolutions of 2 and 5 µm, respectively. To our knowledge, this is the best axial resolution obtained for a high-speed OCT system centered this deeply in the infrared. We demonstrate imaging of the cardiac conduction system, which could eventually be used for intraoperative identification of conducting tissue. In addition, we show images of the corneo-scleral angle, which could help properly diagnose primary angle-closure glaucoma. Other potential applications are also discussed. by Liane Bernstein. S.M.
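
    The quoted 2 µm axial resolution at a 1290 nm center wavelength implies a very broad source. For a Gaussian spectrum, axial resolution is Δz = (2 ln 2 / π) · λ₀² / Δλ; the sketch below inverts this to the required bandwidth. The Gaussian-spectrum assumption is the standard one but is not stated in the abstract.

    import math

    def required_bandwidth_nm(center_nm, axial_res_um):
        """FWHM source bandwidth needed for a target OCT axial resolution in air.

        Uses the Gaussian-spectrum relation dz = (2 ln 2 / pi) * lambda0^2 / dl.
        In tissue of index n, the resolution improves by roughly a factor of n.
        """
        lam = center_nm * 1e-9
        dz = axial_res_um * 1e-6
        dl = (2 * math.log(2) / math.pi) * lam ** 2 / dz
        return dl * 1e9

    print(f"{required_bandwidth_nm(1290, 2.0):.0f} nm FWHM")   # ~370 nm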

    Large-Scale Optical Neural Networks Based on Photoelectric Multiplication

    No full text
    © 2019 authors. Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/). Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI. Recent success in deep neural networks has generated strong interest in hardware accelerators to improve speed and reduce energy consumption. This paper presents a new type of photonic accelerator based on coherent detection that is scalable to large (N ≳ 10^6) networks and can be operated at high (gigahertz) speeds and very low (subattojoule) energies per multiply and accumulate (MAC), using the massive spatial multiplexing enabled by standard free-space optical components. In contrast to previous approaches, both weights and inputs are optically encoded so that the network can be reprogrammed and trained on the fly. Simulations of the network using models for digit and image classification reveal a "standard quantum limit" for optical neural networks, set by photodetector shot noise. This bound, which can be as low as 50 zJ/MAC, suggests that performance below the thermodynamic (Landauer) limit for digital irreversible computation is theoretically possible in this device. The proposed accelerator can implement both fully connected and convolutional networks. We also present a scheme for backpropagation and training that can be performed in the same hardware. This architecture will enable a new class of ultralow-energy processors for deep learning.
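
    The "below the Landauer limit" comparison is per bit operation, not per MAC: kT ln 2 is the minimum energy to erase one bit, and a digital multiply-and-accumulate comprises many irreversible bit operations. A rough comparison at room temperature, with an illustrative (assumed) count of bit operations per 8-bit MAC:

    import math

    k_B = 1.380649e-23                         # Boltzmann constant, J/K
    T = 300.0                                  # room temperature, K
    landauer_per_bit = k_B * T * math.log(2)   # ~2.9 zJ per erased bit

    # An 8-bit digital MAC plausibly involves ~10^2-10^3 irreversible bit
    # operations; 1000 is an illustrative assumption, not a measured count.
    bit_ops_per_mac = 1000
    digital_floor = landauer_per_bit * bit_ops_per_mac   # ~2.9 aJ/MAC

    optical_sql = 50e-21                       # quoted shot-noise bound, J/MAC
    print(f"Landauer floor per bit : {landauer_per_bit:.2e} J")
    print(f"digital 8-bit MAC floor: {digital_floor:.2e} J")
    print(f"optical SQL            : {optical_sql:.2e} J "
          f"({digital_floor / optical_sql:.0f}x below the digital floor)")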

    A scalable optical neural network architecture using coherent detection

    No full text
    © COPYRIGHT SPIE. Downloading of the abstract is permitted for personal use only. Storing, processing, and learning from data is a central task in both industrial practice and modern science. Recent advances in modern statistical learning, particularly Deep Neural Networks (DNNs), have given record-breaking performance on tasks in game playing,1, 2 natural language processing,3 computer vision,4 computational biology,5, 6 and many others. The rapid growth of the field has been driven by an increase in the amount of public datasets,7 improvements to algorithms,8 and a substantial growth in computing power.9 In order to perform well on these tasks, networks have had to grow in size, learning more complicated statistical features. The training and deployment of these large neural networks has spurred the creation of many neural network accelerators to aid in the computation of these networks.10-12 Existing general-purpose computing devices such as CPUs and GPUs are limited both by thermal dissipation per unit area and by the yield associated with large chips.13, 14 The design of Application-Specific Integrated Circuits (ASICs) has decreased the energy consumption per workload substantially by limiting the operations supported on chip. An example of this is the first-generation tensor processing unit (TPU),15 which is able to perform inference of large convolutional neural networks in a datacenter in <10 ms with an idle power of 28 W and a workload power of 40 W. It may seem counterintuitive, then, that the limiting factor for the implementation of DNNs is not computation, but rather the energy and bandwidth associated with reading and writing data from memory, as well as the energy cost of moving data inside the ASIC.15, 16 Several emerging technologies, such as in-memory computing17 and memristive crossbar arrays,18 promise increased performance, but these emerging architectures suffer from calibration issues and limited accuracy.19 Photonics as a field has had tremendous success in improving the energy efficiency of data interconnects.20 This has motivated the creation of optical neural networks (ONNs) based on 3D-printed diffractive elements,21 spiking neural networks utilizing ring resonators,22 reservoir computing,23 and nanophotonic circuits.24 However, these architectures have several issues. 3D-printed diffractive networks and schemes requiring spatial light modulators are non-programmable, meaning that they are unable to perform the task of training. Nanophotonic circuits allow an O(N^2) array of interferometers to be programmed, providing passive matrix-vector multiplication. However, the large (1 mm^2) size of on-chip electro-optic interferometers means that scaling to a 100x100 array would require 10,000 mm^2 of silicon, demonstrating the limitations of scaling this architecture. To date, no architecture has demonstrated high-speed (GHz) computation with more than N ≥ 10,000 neurons. Here we present an architecture that is scalable to N ≥ 10^6 neurons. The key mechanism of this architecture is balanced homodyne detection. By scaling the architecture to such a large size, we show that we can drive down the energy cost per operation associated with the optical component of this architecture, reaching a bound set by shot noise on the receiving photodetectors, which leads to classification error. We call this bound a standard quantum limit (SQL); it reaches 100 zJ/MAC on problems such as MNIST. We also analyze the energy consumption using existing technologies and show that sub-fJ/MAC energy consumption should be possible. This paper is organized as follows: In Section 1 we discuss the function of this architecture as a matrix-matrix processor. In Section 2 we analyze the energy consumption of the architecture. In Section 3 we discuss methods for training and extending the accelerator to a broader scope of problems, namely convolutional neural networks (CNNs).
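
    The scaling argument against on-chip interferometer meshes is simple arithmetic, reproduced below with the abstract's own figures (roughly 1 mm^2 per electro-optic interferometer and O(N^2) interferometers for an N x N mesh).

    def mesh_area_mm2(n, area_per_interferometer_mm2=1.0):
        """Silicon area for an N x N programmable interferometer mesh,
        using the abstract's ~1 mm^2 per on-chip electro-optic device."""
        return n ** 2 * area_per_interferometer_mm2

    for n in (100, 1000):
        print(f"N = {n:>4}: {mesh_area_mm2(n):>12,.0f} mm^2 of silicon")
    # N = 100 already needs ~10,000 mm^2, far beyond practical die sizes,
    # which motivates the free-space, balanced-homodyne architecture instead.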