
    Efficient Implementation on Low-Cost SoC-FPGAs of TLSv1.2 Protocol with ECC_AES Support for Secure IoT Coordinators

    Security management for IoT applications is a critical research field, especially given how widely performance varies across IoT devices. In this paper, we present high-performance client/server coordinators on low-cost SoC-FPGA devices for secure IoT data collection. Security is ensured by the Transport Layer Security (TLS) protocol with the TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 cipher suite. The hardware architecture of the proposed coordinators is based on SW/HW co-design: the hardware accelerator core implements Elliptic Curve Scalar Multiplication (ECSM), the core operation of Elliptic Curve Cryptosystems (ECC), while the overall TLS scheme is controlled in software by an ARM Cortex-A9 microprocessor. Building the ECC accelerator core around an ARM microprocessor improves not only ECSM execution but also the performance of the overall cryptosystem. Integrating the ARM processor also makes embedded Linux features available, giving high system flexibility. As a result, the proposed ECC accelerator requires limited area, only 3395 LUTs on the Zynq device, and performs high-speed 233-bit ECSMs in 413 µs with a 50 MHz clock. Moreover, generating a 384-bit TLS handshake secret key between client and server coordinators requires 67.5 ms on a low-cost Zynq 7Z007S device.
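
    To make the ECSM operation named in the abstract concrete, the following is a minimal double-and-add sketch. It is not the paper's accelerator: it assumes a toy textbook curve, y^2 = x^3 + 2x + 2 over the prime field GF(17) with generator (5, 1) of order 19, purely so the arithmetic stays small, and the names `point_add` and `scalar_mult` are illustrative.

```python
# Toy parameters (assumption, for illustration only; not the paper's 233-bit curve).
P_MOD = 17   # prime field modulus
A = 2        # curve coefficient a in y^2 = x^3 + a*x + b (b = 2 is not needed below)
G = (5, 1)   # generator point of order 19

def point_add(p, q):
    """Add two affine points; None represents the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    x1, y1 = p
    x2, y2 = q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # p + (-p) = infinity (also covers doubling when y = 0)
    if p == q:
        # Tangent slope for point doubling.
        s = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        # Chord slope for adding distinct points.
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    y3 = (s * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

def scalar_mult(k, point):
    """Compute k * point by scanning the bits of k from least significant."""
    result = None
    addend = point
    while k > 0:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result

print(scalar_mult(2, G))   # doubling the generator
print(scalar_mult(19, G))  # multiplying by the group order gives infinity (None)
```

    For scale, the reported 413 µs at a 50 MHz clock corresponds to 413e-6 × 50e6 = 20,650 clock cycles per 233-bit ECSM.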

    Register Transfer Level Implementation Of Pooling - Based Feature Extraction For Finger Vein Identification

    Recently, finger vein biometric identification methods have attracted growing attention among researchers because of their advantages: uniqueness to individuals, immunity to aging, and invisibility to the human eye (which makes the pattern hard to duplicate). Many methods have been proposed to increase the speed and accuracy of identification. Feature extraction techniques based on global features, such as Principal Component Analysis (PCA), have been implemented. However, the results were not robust to occlusions and misalignments in the finger vein images. Therefore, local feature extraction techniques were used to overcome these issues. A pooling-based feature extraction technique for finger vein identification was implemented in this research. The proposed algorithm extracted local feature information (patches) from the finger vein pattern and used these patches to improve the robustness of the identification. The algorithm was mainly inspired by spatial pyramid pooling in generic image classification, combined with PCA. With patch size = 4, four pyramid levels = [1x1, 2x2, 3x3, 4x4] and ~38 % dimension reduction on the extracted feature vector (10 PCA coefficients), the identification accuracy was 88.69 %, which was higher than PCA by 10.10 %. The proposed algorithm was implemented in hardware using Verilog-HDL, targeting Field Programmable Gate Array (FPGA) applications. The results showed an outstanding speed improvement over the software implementation: the hardware extracted the features of one image 310 times faster than the software. With these improvements in accuracy and speed, the proposed algorithm contributes to the advancement of finger vein biometric systems.
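
    The four pyramid levels listed in the abstract can be illustrated with a small spatial-pyramid-pooling sketch. The abstract does not say which pooling operator is used, so max pooling is an assumption here, and `spatial_pyramid_pool` is a hypothetical helper name rather than the authors' code.

```python
import math

def spatial_pyramid_pool(fmap, levels=(1, 2, 3, 4)):
    """Max-pool a 2-D map into n x n bins for each level n and concatenate.

    Max pooling is an assumption; the source only names the pyramid levels.
    """
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Row range covered by bin row i at this level.
            r0, r1 = (i * h) // n, math.ceil((i + 1) * h / n)
            for j in range(n):
                # Column range covered by bin column j at this level.
                c0, c1 = (j * w) // n, math.ceil((j + 1) * w / n)
                pooled.append(
                    max(fmap[r][c] for r in range(r0, r1) for c in range(c0, c1))
                )
    return pooled

# A 12 x 12 map whose value grows with position, used only as a demonstration.
demo = [[r * 12 + c for c in range(12)] for r in range(12)]
features = spatial_pyramid_pool(demo)
print(len(features))  # 1 + 4 + 9 + 16 bins
```

    Levels 1x1 through 4x4 always yield 1 + 4 + 9 + 16 = 30 pooled values per map, regardless of the input size, which is what makes the pooled vector a fixed-length input for a later reduction step such as PCA.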

    Concepts for on-board satellite image registration. Volume 3: Impact of VLSI/VHSIC on satellite on-board signal processing

    Anticipated major advances in integrated circuit technology in the near future are described, as well as their impact on satellite onboard signal processing systems. Dramatic improvements in chip density, speed, power consumption, and system reliability are expected from very large scale integration. These improvements enable more intelligence to be placed on remote sensing platforms in space, meeting the goals of NASA's information adaptive system concept, a major component of the NASA End-to-End Data System program. A forecast of VLSI technological advances is presented, including a description of the Defense Department's very high speed integrated circuit program, a seven-year research and development effort.

    CMOS-3D smart imager architectures for feature detection

    This paper reports a multi-layered smart image sensor architecture for feature extraction based on detection of interest points. The architecture is conceived for 3-D integrated circuit technologies consisting of two layers (tiers) plus memory. The top tier includes sensing and processing circuitry that performs Gaussian filtering and generates Gaussian pyramids in a fully concurrent way. The circuitry in this tier operates in the mixed-signal domain. It embeds in-pixel correlated double sampling, a switched-capacitor network for Gaussian pyramid generation, analog memories and a comparator for in-pixel analog-to-digital conversion. This tier can be further split into two for improved resolution: one containing the sensors and another containing a capacitor per sensor plus the mixed-signal processing circuitry. The bottom tier embeds digital circuitry for calculating Harris, Hessian, and difference-of-Gaussian detectors. The overall system can hence be configured by the user to detect interest points with whichever of these three algorithms is best suited to the application. The paper describes the different kinds of algorithms featured and the circuitry employed in the top and bottom tiers. The Gaussian pyramid is implemented with a switched-capacitor network in less than 50 μs, outperforming more conventional solutions. Funding: Xunta de Galicia 10PXIB206037PR; Ministerio de Ciencia e Innovación TEC2009-12686, IPT-2011-1625-430000; Office of Naval Research N00014111031.
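
    As a rough illustration of one of the three detectors mentioned, the sketch below computes the standard Harris corner response R = det(M) - k * trace(M)^2 from the gradients inside a window. It is a software sketch only, not the paper's digital circuitry; the function names and the value k = 0.04 are assumptions chosen for the example.

```python
def structure_tensor(ix, iy):
    """Sum gradient products over a window, given per-pixel gradients."""
    sxx = sum(gx * gx for gx in ix)
    syy = sum(gy * gy for gy in iy)
    sxy = sum(gx * gy for gx, gy in zip(ix, iy))
    return sxx, syy, sxy

def harris_response(sxx, syy, sxy, k=0.04):
    """Harris measure: positive for corners, negative for edges, near zero when flat."""
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace

# Corner-like window: gradients point in both x and y directions.
corner = harris_response(*structure_tensor([1, 0, 1, 0], [0, 1, 0, 1]))
# Edge-like window: gradients point in x only.
edge = harris_response(*structure_tensor([1, 1, 1, 1], [0, 0, 0, 0]))
print(corner > 0, edge < 0)
```

    A detector would threshold this response per pixel and keep local maxima as interest points.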

    GAMER: a GPU-Accelerated Adaptive Mesh Refinement Code for Astrophysics

    We present the newly developed code, GAMER (GPU-accelerated Adaptive MEsh Refinement code), which adopts a novel approach to improve the performance of adaptive mesh refinement (AMR) astrophysical simulations by a large factor using the graphics processing unit (GPU). The AMR implementation is based on a hierarchy of grid patches with an oct-tree data structure. We adopt a three-dimensional relaxing TVD scheme for the hydrodynamic solver, and a multi-level relaxation scheme for the Poisson solver. Both solvers have been implemented on the GPU, so that hundreds of patches can be advanced in parallel. The computational overhead associated with data transfer between CPU and GPU is carefully reduced by using the GPU's asynchronous memory copies, and the time spent computing the ghost-zone values for each patch is reduced by overlapping it with the GPU computations. We demonstrate the accuracy of the code by performing several standard test problems in astrophysics. GAMER is a parallel code that can be run on a multi-GPU cluster system. We measure the performance of the code by performing purely baryonic cosmological simulations on different hardware configurations, in which detailed timing analyses compare the computations with and without GPU acceleration. Maximum speed-up factors of 12.19 and 10.47 are demonstrated using 1 GPU with 4096^3 effective resolution and 16 GPUs with 8192^3 effective resolution, respectively. Comment: 60 pages, 22 figures, 3 tables. More accuracy tests are included. Accepted for publication in ApJ.
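
    The oct-tree hierarchy of grid patches described above can be sketched as follows. This is a minimal illustration, not GAMER's actual data layout: the `Patch` class, its fields, and the `effective_resolution` helper are hypothetical, and a refinement factor of 2 per level is assumed because each oct-tree node splits into eight children.

```python
class Patch:
    """One grid patch in an oct-tree; refining it creates eight children."""

    def __init__(self, level, origin, size):
        self.level = level      # refinement level, 0 for the root
        self.origin = origin    # (x, y, z) of the patch's lower corner
        self.size = size        # edge length of the patch
        self.children = []

    def refine(self):
        """Split this patch into eight half-size children, one per octant."""
        half = self.size / 2
        x0, y0, z0 = self.origin
        self.children = [
            Patch(self.level + 1,
                  (x0 + dx * half, y0 + dy * half, z0 + dz * half),
                  half)
            for dx in (0, 1) for dy in (0, 1) for dz in (0, 1)
        ]
        return self.children

    def leaves(self):
        """Return all unrefined patches below (and including) this one."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]

def effective_resolution(base_cells, max_level):
    """Cells per dimension if the finest level covered the whole domain."""
    return base_cells * 2 ** max_level

root = Patch(0, (0.0, 0.0, 0.0), 1.0)
octants = root.refine()
octants[0].refine()          # refine only one octant further
print(len(root.leaves()))    # leaf patches that would be advanced
```

    Only the leaves carry the finest data in each region, which is why many independent patches are available to advance in parallel.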

    Evaluation of the PlayStation 2 as a cluster computing node

    Cluster computing is currently a popular, cost-effective solution to the increasing computational demands of many applications in scientific computing and image processing. A cluster computer consists of several networked computers known as nodes. Since the goal of cluster computing is to provide a cost-effective means of processing computationally demanding applications, nodes that can be obtained at a low price with minimal performance tradeoff are always attractive. Presently, the most common cluster computers are networks of workstations constructed from commodity components. Recent trends have shown that computers developed and deployed for purposes other than traditional personal computing or workstation use have become new candidates for cluster computing nodes. These candidates may provide a competitive and even less expensive alternative to the cluster computing nodes in use today. Video game consoles, whose prices are kept extremely low by intense marketplace competition, are a prime example. The Sony PlayStation 2, in particular, provides the user with low-level hardware devices that are often found in more expensive machines. This work presents an evaluation of the PlayStation 2 video game console as a cluster computing node for scientific and image processing applications. From this evaluation, a determination is made as to whether the PlayStation 2 is a viable alternative to the cluster computing nodes in use today.