
    FPGA-Based Hardware Accelerators for Deep Learning in Mobile Robotics

    The increasing demand for real-time, low-power hardware processing systems capable of running compute-intensive applications has exposed the inadequacy of conventional multicore general-purpose processors. In an effort to meet this demand, edge-computing hardware accelerators have come to the forefront, notably for deep learning and robotic systems. This thesis surveys prominent hardware accelerators and examines the performance, accuracy, and power consumption of a GPU-based and an FPGA-based platform, both designed for edge-computing applications. The experiments used three deep neural network models, namely AlexNet, GoogLeNet, and ResNet-18, trained to perform binary image classification in a known environment. Our results demonstrate that the FPGA-based platform, a Kria KV260 Vision AI starter kit, ran inference up to 9.5 times faster than the GPU-based Jetson Nano developer kit. It also delivered up to five times the Jetson Nano's efficiency in inference speed per watt, at the cost of a 5.4% drop in accuracy from the quantization the FPGA requires. However, the Jetson Nano ran the AlexNet model 1.6 times faster than the KV260, and its deployment process proved less challenging.
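    A throughput-per-watt comparison of this kind can be reproduced with a simple measurement harness. Below is a minimal sketch, assuming a generic `infer(batch)` callable for the device under test and a board power figure measured externally (e.g. with an inline power meter); the function name and structure are illustrative, not the thesis's actual setup.

```python
import time

def benchmark(infer, batches, power_watts):
    """Measure latency, throughput, and throughput per watt.

    `infer` is any callable that runs one batch on the target device;
    `power_watts` is the board power measured externally. Both are
    placeholders for this sketch, not the thesis's actual harness.
    """
    for batch in batches[:3]:
        infer(batch)  # warm-up so one-time setup does not skew timings

    start = time.perf_counter()
    n_images = 0
    for batch in batches:
        infer(batch)
        n_images += len(batch)
    elapsed = time.perf_counter() - start

    fps = n_images / elapsed
    return {
        "latency_ms": 1000.0 * elapsed / len(batches),
        "fps": fps,
        "fps_per_watt": fps / power_watts,  # the efficiency metric above
    }
```

    Dividing the two platforms' `fps_per_watt` figures yields the efficiency ratio quoted above.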

    Telecommunications media for the delivery of educational programming

    The technical characteristics of various telecommunications media are examined for incorporation into educational networks. FM radio, AM radio, and VHF and UHF television are considered, along with computer-aided instruction. The application of iteration networks to library systems is discussed, together with microform technology. The basic principles of communication theory are outlined, and the operation of the PLATO 4 random-access system is described.
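    As an illustration of the communication-theory principles such a survey outlines, the Shannon-Hartley law bounds the error-free bit rate of an analog channel such as a UHF television link. The sketch below uses illustrative figures, not numbers from the report.

```python
from math import log2

def shannon_capacity(bandwidth_hz, snr_linear):
    """Shannon-Hartley channel capacity in bits per second."""
    return bandwidth_hz * log2(1 + snr_linear)

# Illustrative figures: a 6 MHz UHF television channel at 30 dB SNR
# (a linear SNR of 1000) supports at most about 59.8 Mbit/s.
print(shannon_capacity(6e6, 1000))
```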

    Digital image forensics

    Digital image forensics is a relatively new research field that aims to expose the origin and composition of, and the history of processing applied to, digital images. It is therefore expected to be of significant importance to modern society, in which digital media are increasingly pervasive. In this thesis, image tampering detection and the classification of double JPEG compression are the two major subjects studied. Since any manipulation applied to a digital image changes its statistics, identifying statistical artifacts is critically important in image forensics, and a few typical forensic techniques are studied here. Finally, it is foreseen that the ongoing contest between forensics and anti-forensics will deepen our understanding of image statistics and benefit society.
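    One common statistical cue for double JPEG compression is that requantized DCT coefficients develop periodic peaks and gaps in their histograms, detectable as a strong component in the histogram's Fourier spectrum. The sketch below is a generic numpy/scipy illustration of that idea; the helper name and scoring rule are hypothetical, not the specific techniques studied in the thesis.

```python
import numpy as np
from scipy.fftpack import dct

def dct_histogram_periodicity(gray, coeff=(0, 1)):
    """Score periodicity in the histogram of one DCT coefficient.

    `gray` is a 2-D uint8 image array. Double JPEG compression tends to
    leave periodic artifacts in these histograms; a high score suggests
    recompression. A generic illustration, not the thesis's method.
    """
    h, w = gray.shape
    coeffs = []
    for i in range(0, h - 7, 8):            # walk 8x8 blocks, as in JPEG
        for j in range(0, w - 7, 8):
            block = gray[i:i + 8, j:j + 8].astype(float) - 128.0
            d = dct(dct(block.T, norm="ortho").T, norm="ortho")  # 2-D DCT
            coeffs.append(round(d[coeff]))
    hist, _ = np.histogram(coeffs, bins=np.arange(-50, 51))
    spectrum = np.abs(np.fft.rfft(hist - hist.mean()))
    return spectrum[1:].max() / (spectrum[1:].mean() + 1e-9)
```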

    Vector Quantization of True-Color Images

    Vector quantization (VQ) has recently emerged as a powerful and efficient technique for digital speech and image coding. The goal of such a process is data compression: to minimize communication channel capacity or digital storage requirements while maintaining an acceptable fidelity level. A review of various VQ algorithms and their respective design considerations, as applied to color images, is given. Fidelity measurements and signal-to-noise ratio calculations are discussed. A modified mean-residual vector quantizer using the LBG design algorithm with color signal preprocessing is described. The algorithm is developed to yield a bit rate of 0.709 bits per pixel per color, with the goal of easy implementation even on a simple microcomputer. Photographic and numerical results comparing original and compressed-then-reconstructed color images are presented. Several modifications to the described algorithm are tested, with good results.
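    At the core of such a design is the LBG (generalized Lloyd) iteration: split each codeword into a perturbed pair, then alternate nearest-codeword assignment with centroid updates until distortion stops improving. The sketch below is a compact, generic version; the function name, perturbation, and stopping rule are illustrative rather than the paper's exact design.

```python
import numpy as np

def lbg_codebook(vectors, size, eps=1e-3, tol=1e-4):
    """Design a VQ codebook of `size` codewords with the LBG algorithm.

    `vectors`: (N, k) training vectors, e.g. small pixel blocks per
    color channel. `size` should be a power of two.
    """
    codebook = vectors.mean(axis=0, keepdims=True)  # start from global mean
    while len(codebook) < size:
        # Split every codeword into a slightly perturbed pair.
        codebook = np.vstack([codebook * (1 + eps), codebook * (1 - eps)])
        prev = np.inf
        while True:
            # Assign each training vector to its nearest codeword.
            d = ((vectors[:, None, :] - codebook[None]) ** 2).sum(-1)
            nearest = d.argmin(axis=1)
            distortion = d[np.arange(len(vectors)), nearest].mean()
            if distortion >= prev * (1 - tol):  # converged at this size
                break
            prev = distortion
            # Move each codeword to the centroid of its assigned cell.
            for c in range(len(codebook)):
                members = vectors[nearest == c]
                if len(members):
                    codebook[c] = members.mean(axis=0)
    return codebook
```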

    Scout: a hardware-accelerated system for quantitatively driven visualization and analysis

    Quantitative techniques for visualization are critical to the successful analysis of both acquired and simulated scientific data. Many visualization techniques rely on indirect mappings, such as transfer functions, to produce the final imagery. In many situations, it is preferable and more powerful to express these mappings as mathematical expressions, or queries, that can be applied directly to the data. In this paper, we present a hardware-accelerated system that provides such capabilities and exploits current graphics hardware for portions of the computational tasks that would otherwise be executed on the CPU. In our approach, direct programming of the graphics processor using a concise data-parallel language gives scientists the ability to efficiently explore and visualize data sets.
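    The essence of such a query is an expression evaluated element-wise over the whole field, with the result driving the color mapping directly. A rough numpy analogue of the idea (Scout's actual language compiles to the GPU; the field names here are made up) might look like:

```python
import numpy as np

# Hypothetical scalar fields standing in for acquired or simulated data.
temperature = np.random.rand(256, 256)
pressure = np.random.rand(256, 256)

# The "query": a mathematical expression applied directly to the data,
# evaluated element-wise in parallel, instead of an indirect transfer
# function.
interest = np.where(pressure > 0.8, temperature ** 2, 0.0)

# Normalize the derived quantity and map it to a grayscale image.
image = np.uint8(255 * interest / max(interest.max(), 1e-9))
```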

    GLM-130B: An Open Bilingual Pre-trained Model

    We introduce GLM-130B, a bilingual (English and Chinese) pre-trained language model with 130 billion parameters. It is an attempt to open-source a 100B-scale model at least as good as GPT-3 (davinci) and to unveil how models of such a scale can be successfully pre-trained. Over the course of this effort, we faced numerous unexpected technical and engineering challenges, particularly with loss spikes and divergence. In this paper, we introduce the training process of GLM-130B, including its design choices, training strategies for both efficiency and stability, and engineering efforts. The resulting GLM-130B model significantly outperforms GPT-3 175B (davinci) on a wide range of popular English benchmarks, a performance advantage not observed in OPT-175B and BLOOM-176B. It also consistently and significantly outperforms ERNIE TITAN 3.0 260B, the largest Chinese language model, across related benchmarks. Finally, we leverage a unique scaling property of GLM-130B to reach INT4 quantization without post-training, with almost no performance loss, making it the first among 100B-scale models and, more importantly, allowing its effective inference on 4×RTX 3090 (24 GB) or 8×RTX 2080 Ti (11 GB) GPUs, the most affordable GPUs required for using 100B-scale models. The GLM-130B model weights are publicly accessible, and its code, training logs, related toolkit, and lessons learned are open-sourced at https://github.com/THUDM/GLM-130B/. (Accepted to ICLR 2023.)
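    A minimal numpy sketch of the weight-only INT4 idea described above: symmetric round-to-nearest quantization of each weight row to 16 levels, then dequantization by a per-row scale. This generic scheme illustrates the format only, not GLM-130B's exact recipe.

```python
import numpy as np

def quantize_int4(w):
    """Symmetric per-row INT4 quantization of a weight matrix.

    Returns integer codes in [-8, 7] (stored in int8 here) and a per-row
    scale; dequantize with `codes * scale`. A generic illustration, not
    GLM-130B's exact procedure.
    """
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0  # map max |w| to 7
    codes = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return codes, scale

w = np.random.randn(4, 16).astype(np.float32)
codes, scale = quantize_int4(w)
w_hat = codes * scale
print(np.abs(w - w_hat).max())  # per-element quantization error is small
```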