Data compression techniques applied to high resolution high frame rate video technology
An investigation is presented of video data compression applied to microgravity space experiments using High Resolution High Frame Rate Video Technology (HHVT). An extensive survey was conducted of video data compression methods described in the open literature, focusing on methods employing digital computing. The survey results include a description of each method and an assessment of image degradation and video data parameters. An assessment is also made of present and near-term future technology for implementing video data compression in high-speed imaging systems, and its results are discussed and summarized. The results of a study of a baseline HHVT video system, together with approaches for implementing video data compression, are presented. Case studies of three microgravity experiments are presented, and specific compression techniques and implementations are recommended.
Lossy compression and real-time geovisualization for ultra-low bandwidth telemetry from untethered underwater vehicles
Submitted in partial fulfillment of the requirements for the degree of Master of Science at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, September 2008.
Oceanographic applications of robotics are as varied as the undersea environment itself. As
underwater robotics moves toward the study of dynamic processes with multiple vehicles,
there is an increasing need to distill large volumes of data from underwater vehicles and
deliver it quickly to human operators. While tethered robots are able to communicate data
to surface observers instantly, communicating discoveries is more difficult for untethered
vehicles. The ocean imposes severe limitations on wireless communications; light is quickly
absorbed by seawater, and tradeoffs between frequency, bitrate and environmental effects
result in data rates for acoustic modems that are routinely as low as tens of bits per second.
These data rates usually limit telemetry to state and health information, to the exclusion
of mission-specific science data.
In this thesis, I present a system designed for communicating and presenting science
telemetry from untethered underwater vehicles to surface observers. The system's goals
are threefold: to aid human operators in understanding oceanographic processes, to enable
human operators to play a role in adaptively responding to mission-specific data, and to accelerate mission planning from one vehicle dive to the next. The system uses standard lossy
compression techniques to lower required data rates to those supported by commercially
available acoustic modems (O(10)-O(100) bits per second).
As part of the system, a method for compressing time-series science data based upon
the Discrete Wavelet Transform (DWT) is explained, a number of low-bitrate image compression techniques are compared, and a novel user interface for reviewing transmitted
telemetry is presented. Each component is motivated by science data from a variety of
actual Autonomous Underwater Vehicle (AUV) missions performed in the last year.
National Science Foundation Center for Subsurface Sensing and Imaging (CenSSIS ERC)
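The DWT-based time-series compression described above can be illustrated with a minimal sketch: transform the signal, transmit only the largest-magnitude coefficients, and reconstruct on the surface. This uses a single-level Haar wavelet and a keep-fraction threshold purely for illustration; the thesis's actual wavelet choice, decomposition depth, and quantization scheme are not specified here and these parameters are assumptions.

```python
import numpy as np

def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform."""
    evens, odds = signal[0::2], signal[1::2]
    approx = (evens + odds) / np.sqrt(2)   # low-frequency averages
    detail = (evens - odds) / np.sqrt(2)   # high-frequency differences
    return approx, detail

def haar_idwt(approx, detail):
    """Exact inverse of one Haar DWT level."""
    evens = (approx + detail) / np.sqrt(2)
    odds = (approx - detail) / np.sqrt(2)
    out = np.empty(evens.size + odds.size)
    out[0::2], out[1::2] = evens, odds
    return out

def compress(signal, keep_fraction=0.5):
    """Zero all but the largest-magnitude coefficients; only the
    surviving coefficients would need to be transmitted acoustically."""
    approx, detail = haar_dwt(signal)
    coeffs = np.concatenate([approx, detail])
    k = max(1, int(keep_fraction * coeffs.size))
    threshold = np.sort(np.abs(coeffs))[-k]
    coeffs[np.abs(coeffs) < threshold] = 0.0
    n = approx.size
    return coeffs[:n], coeffs[n:]

# Smooth oceanographic time series (e.g., a temperature profile)
# concentrate their energy in few coefficients and compress well:
t = np.linspace(0, 1, 64)
signal = np.sin(2 * np.pi * t)
ca, cd = compress(signal, keep_fraction=0.5)
reconstructed = haar_idwt(ca, cd)
print(np.max(np.abs(reconstructed - signal)))  # small reconstruction error
```

Half the coefficients are discarded, yet the reconstruction stays close to the original because the Haar detail coefficients of a slowly varying signal are small.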
CUTIE: Beyond PetaOp/s/W Ternary DNN Inference Acceleration with Better-than-Binary Energy Efficiency
We present a 3.1 POp/s/W fully digital hardware accelerator for ternary
neural networks. CUTIE, the Completely Unrolled Ternary Inference Engine,
focuses on minimizing non-computational energy and switching activity so that
dynamic power spent on storing (locally or globally) intermediate results is
minimized. This is achieved by 1) a data path architecture completely unrolled
in the feature map and filter dimensions to reduce switching activity by
favoring silencing over iterative computation and maximizing data re-use, 2)
targeting ternary neural networks which, in contrast to binary NNs, allow for
sparse weights which reduce switching activity, and 3) introducing an optimized
training method for higher sparsity of the filter weights, resulting in a
further reduction of the switching activity. Compared with state-of-the-art
accelerators, CUTIE achieves greater or equal accuracy while decreasing the
overall core inference energy cost by a factor of 4.8x-21x.
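The benefit of ternary over binary weights described above can be seen in a small functional model: a zero weight means the corresponding product never needs to be computed, so the associated wires never toggle. This is only a software sketch of the arithmetic, not the CUTIE datapath; the matrix sizes and the 60% sparsity level are illustrative assumptions.

```python
import numpy as np

def ternary_mvm(weights, activations):
    """Matrix-vector product with ternary weights in {-1, 0, +1}.
    Zero weights contribute nothing, so their input lines can stay
    silent -- the switching-activity reduction CUTIE exploits."""
    assert set(np.unique(weights)).issubset({-1, 0, 1})
    out = np.zeros(weights.shape[0], dtype=activations.dtype)
    active = 0
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            if w == 0:
                continue  # skipped: no multiply, no toggling
            out[i] += activations[j] if w == 1 else -activations[j]
            active += 1
    return out, active

rng = np.random.default_rng(0)
# ~60% zeros, mimicking the sparsity the optimized training targets
W = rng.choice([-1, 0, 1], size=(8, 16), p=[0.2, 0.6, 0.2])
x = rng.integers(-4, 4, size=16)
y, ops = ternary_mvm(W, x)
assert np.array_equal(y, W @ x)   # same result as a dense product
print(f"{ops} of {W.size} MACs performed")
```

Binary weights ({-1, +1}) would force every one of the W.size multiply-accumulates; ternary weights let the sparse majority be skipped entirely.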
Holistic Optimization of Embedded Computer Vision Systems
Despite strong interest in embedded computer vision, the computational demands of Convolutional Neural Network (CNN) inference far exceed the resources available in embedded devices. Thankfully, the typical embedded device has a number of desirable properties that can be leveraged to significantly reduce the time and energy required for CNN inference. This thesis presents three independent and synergistic methods for optimizing embedded computer vision: 1) Reducing the time and energy needed to capture and preprocess input images by optimizing the image capture pipeline for the needs of CNNs rather than humans. 2) Exploiting temporal redundancy within incoming video streams to perform computationally cheap motion estimation and compensation in lieu of full CNN inference for the majority of frames. 3) Leveraging the sparsity of CNN activations within the frequency domain to significantly reduce the number of operations needed for inference. Collectively these techniques significantly reduce the time and energy needed for computer vision at the edge, enabling a wide variety of exciting new applications.
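The temporal-redundancy idea in method 2 can be sketched with a deliberately crude stand-in: instead of motion estimation and compensation, simply reuse the cached result whenever a frame barely differs from the last keyframe. The frame-difference test, threshold, and toy "network" below are all illustrative assumptions, not the thesis's actual mechanism.

```python
import numpy as np

def cached_inference(frames, infer, diff_threshold=0.05):
    """Run the expensive `infer` only when a frame differs enough from
    the last inferred keyframe; otherwise reuse the cached result.
    A simplified stand-in for motion estimation/compensation."""
    results, keyframe, cached = [], None, None
    full_runs = 0
    for frame in frames:
        if keyframe is None or np.mean(np.abs(frame - keyframe)) > diff_threshold:
            cached = infer(frame)     # expensive full CNN inference
            keyframe = frame
            full_runs += 1
        results.append(cached)        # cheap path for redundant frames
    return results, full_runs

# Toy video: ten near-identical frames, then a scene change
rng = np.random.default_rng(1)
base = rng.random((4, 4))
frames = [base + 0.001 * rng.random((4, 4)) for _ in range(10)]
frames.append(1.0 - base)
results, full_runs = cached_inference(frames, infer=lambda f: f.mean())
print(full_runs, "full inferences for", len(frames), "frames")
```

On this toy stream only two full inferences run: the first frame and the scene change, while the nine redundant frames take the cheap path.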
Doctor of Philosophy dissertation.
Deep Neural Networks (DNNs) are the state-of-the-art solution in a growing number of tasks including computer vision, speech recognition, and genomics. However, DNNs are computationally expensive, as they are carefully trained to extract and abstract features from raw data using multiple layers of neurons with millions of parameters. In this dissertation, we primarily focus on inference, e.g., using a DNN to classify an input image. This is an operation that will be repeatedly performed on billions of devices in the datacenter, in self-driving cars, in drones, etc. We observe that DNNs spend the vast majority of their runtime performing matrix-by-vector multiplications (MVMs). MVMs have two major bottlenecks: fetching the matrix and performing sum-of-product operations. To address these bottlenecks, we use in-situ computing, where the matrix is stored in programmable resistor arrays, called crossbars, and sum-of-product operations are performed using analog computing. In this dissertation, we propose two hardware units, ISAAC and Newton.
In ISAAC, we show that in-situ computing designs can outperform digital DNN accelerators if they leverage pipelining, smart encodings, and the ability to distribute a computation in time and space, within crossbars and across crossbars. In the ISAAC design, roughly half the chip area/power can be attributed to analog-to-digital conversion (ADC), i.e., it remains the key design challenge in mixed-signal accelerators for deep networks. In spite of the ADC bottleneck, ISAAC is able to outperform the computational efficiency of the state-of-the-art design (DaDianNao) by 8x. In Newton, we take advantage of a number of techniques to address ADC inefficiency. These techniques exploit matrix transformations, heterogeneity, and smart mapping of computation to the analog substrate. We show that Newton can increase the efficiency of in-situ computing by an additional 2x.
Finally, we show that in-situ computing, unfortunately, cannot be easily adapted to handle training of deep networks, i.e., it is only suitable for inference of already-trained networks. By improving the efficiency of DNN inference with ISAAC and Newton, we move closer to low-cost deep learning that in turn will have societal impact through self-driving cars, assistive systems for the disabled, and precision medicine.
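The in-situ MVM idea and its ADC bottleneck can be modeled numerically: the matrix is held as crossbar conductances, column currents sum the products "for free" by Kirchhoff's current law, and the analog result must then pass through a limited-resolution ADC. This is an idealized behavioral model, not the ISAAC or Newton circuit; the array size, the 8-bit ADC, and the full-scale normalization are illustrative assumptions.

```python
import numpy as np

def crossbar_mvm(G, v, adc_bits=8):
    """Idealized resistive crossbar: the matrix is stored as
    conductances G, input voltages v drive the rows, and each column
    current is the analog sum of products. The result is then
    digitized by a b-bit ADC -- the stage ISAAC identifies as roughly
    half the chip area/power."""
    i = G.T @ v                                  # analog sum-of-products
    full_scale = max(np.abs(i).max(), 1e-12)
    levels = 2 ** (adc_bits - 1) - 1
    codes = np.round(i / full_scale * levels)    # ADC quantization
    return codes / levels * full_scale

rng = np.random.default_rng(2)
G = rng.random((16, 4))       # conductances are non-negative
v = rng.random(16)
exact = G.T @ v
approx = crossbar_mvm(G, v, adc_bits=8)
print(np.max(np.abs(approx - exact)))   # error bounded by ADC resolution
```

Lowering `adc_bits` shrinks the (dominant) ADC cost but grows the quantization error, which is the tradeoff Newton's matrix transformations and heterogeneous ADC provisioning are designed to ease.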