19 research outputs found

    Optimización en GPU de algoritmos para la mejora del realce y segmentación en imágenes hepáticas (GPU optimization of algorithms for improved enhancement and segmentation of liver images)

    This doctoral thesis investigates GPU acceleration of liver image enhancement and segmentation and is presented as a compendium of articles. The work is structured in three scientific contributions: the first addresses contrast enhancement and tumor segmentation, the second vessel segmentation, and the third liver segmentation. All three are implemented on GPU with significant speedups, which underpin the scientific impact and relevance of the thesis. The first work proposes cross-modality based contrast enhancement for tumor segmentation on GPU. It takes a target image and a guidance image as input and enhances the low-quality target image using a two-dimensional histogram approach. The enhanced image is observed to yield more accurate tumor segmentation with GPU-based dynamic seeded region growing. The second contribution is a fast parallel gradient-based seeded region growing method, for which a static approach is proposed and implemented on GPU for accurate vessel segmentation. The third contribution describes GPU acceleration of the Chan-Vese model and cross-modality based contrast enhancement for liver segmentation.
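
    As a rough illustration of the seeded region growing idea mentioned in this abstract, the sketch below grows a region from a seed pixel on the CPU. It is a minimal sequential version under stated assumptions: the function name `region_grow`, the fixed intensity threshold relative to the seed, and 4-connectivity are all illustrative choices, and the thesis's GPU-based dynamic and gradient-based static variants are not reproduced here.

```python
from collections import deque


def region_grow(image, seed, threshold):
    """Grow a region from `seed`, adding 4-connected neighbours whose
    intensity differs from the seed intensity by at most `threshold`.

    Illustrative CPU sketch only; names and the similarity rule are assumptions.
    """
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        # Visit the four direct neighbours of the current pixel.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= threshold:
                    region.add((nr, nc))
                    frontier.append((nr, nc))
    return region


# Toy image: a dark area on the left, a bright area on the right,
# and one isolated dark pixel in the bottom-right corner.
image = [
    [10, 11, 50, 52],
    [12, 10, 51, 50],
    [11, 12, 49, 10],
]
dark = region_grow(image, seed=(0, 0), threshold=5)
print(sorted(dark))
```

    The isolated dark pixel at (2, 3) is not included because it is not connected to the seed through similar pixels, which is the property that distinguishes region growing from plain thresholding.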

    Polynomial Time Cryptanalytic Extraction of Neural Network Models

    Billions of dollars and countless GPU hours are currently spent on training Deep Neural Networks (DNNs) for a variety of tasks. Thus, it is essential to determine the difficulty of extracting all the parameters of such neural networks when given access to their black-box implementations. Many versions of this problem have been studied over the last 30 years, and the best current attack on ReLU-based deep neural networks was presented at Crypto 2020 by Carlini, Jagielski, and Mironov. It resembles a differential chosen-plaintext attack on a cryptosystem with a secret key embedded in its black-box implementation, and it requires a polynomial number of queries but an exponential amount of time (as a function of the number of neurons). In this paper, we improve this attack by developing several new techniques that enable us to extract with arbitrarily high precision all the real-valued parameters of a ReLU-based DNN using a polynomial number of queries and a polynomial amount of time. We demonstrate its practical efficiency by applying it to a full-sized neural network for classifying the CIFAR10 dataset, which has 3072 inputs, 8 hidden layers with 256 neurons each, and over a million neuronal parameters. An attack following the approach by Carlini et al. requires an exhaustive search over 2^256 possibilities. Our attack replaces this with our new techniques, which require only 30 minutes on a 256-core computer.
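
    The parameter count quoted in this abstract can be sanity-checked with a short calculation. The sketch below counts weights and biases of a fully connected network with 3072 inputs and 8 hidden layers of 256 neurons; the 10-unit output layer (one per CIFAR10 class) and the helper name `count_parameters` are assumptions, since the abstract does not state the output size.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected network with the given layer widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# 3072 inputs (32 x 32 pixels x 3 channels) and 8 hidden layers of 256 neurons.
# The final 10-unit output layer is an assumption, not stated in the abstract.
layer_sizes = [3072] + [256] * 8 + [10]
total = count_parameters(layer_sizes)
print(total)  # 1249802
```

    Under these assumptions the total comes to roughly 1.25 million, in line with the abstract's figure of over a million parameters.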

    Workload-Balanced Pruning for Sparse Spiking Neural Networks

    Pruning for Spiking Neural Networks (SNNs) has emerged as a fundamental methodology for deploying deep SNNs on resource-constrained edge devices. Though existing pruning methods can provide extremely high weight sparsity for deep SNNs, this high sparsity brings a workload imbalance problem. Specifically, workload imbalance happens when different numbers of non-zero weights are assigned to hardware units running in parallel, which results in low hardware utilization and thus imposes longer latency and higher energy costs. In preliminary experiments, we show that sparse SNNs (~98% weight sparsity) can suffer utilization as low as ~59%. To alleviate the workload imbalance problem, we propose u-Ticket, where we monitor and adjust the weight connections of the SNN during Lottery Ticket Hypothesis (LTH) based pruning, thus guaranteeing that the final ticket gets optimal utilization when deployed onto the hardware. Experiments indicate that u-Ticket can guarantee up to 100% hardware utilization, thus reducing latency by up to 76.9% and energy cost by up to 63.8% compared to the non-utilization-aware LTH method.
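
    To make the workload imbalance concrete, the sketch below computes a simple utilization figure from the number of non-zero weights assigned to each parallel hardware unit. The definition used here (total work divided by the number of units times the busiest unit's work) and the name `utilization` are assumptions for illustration; the paper's exact metric may differ.

```python
def utilization(nonzeros_per_unit):
    """Fraction of unit-time spent on useful work when all units run in
    lockstep and wait for the busiest one.

    Assumed definition for illustration: total / (units * max).
    """
    busiest = max(nonzeros_per_unit)
    if busiest == 0:
        return 0.0  # no work assigned to any unit
    return sum(nonzeros_per_unit) / (len(nonzeros_per_unit) * busiest)


balanced = utilization([10, 10, 10, 10])   # every unit has equal work
imbalanced = utilization([20, 5, 5, 10])   # one unit dominates the runtime
print(balanced, imbalanced)  # 1.0 0.5
```

    With this definition, equalizing the non-zero weights per unit is what drives utilization toward 100%, which is the effect the abstract attributes to utilization-aware pruning.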

    ACE-HoT: Accelerating an extreme amount of symmetric Cipher Evaluations for High-Order avalanche Tests

    In this work, we tackle the problem of efficiently estimating the security of iterated symmetric ciphers, using tests that do not require a deep analysis of the internal structure of the cipher. This is particularly useful during the design phase of these ciphers, especially for quickly testing many combinations of the parameters that define cipher design variants. We consider a popular statistical test that determines the probability of flipping each cipher output bit, given a small variation in the input of the cipher. From these probabilities, one can compute three metrics related to the well-known full diffusion, avalanche and strict avalanche criteria. This highly parallelizable testing process scales linearly with the number of samples (i.e., cipher inputs) to be evaluated and with the number of design variants to be tested. However, the number of design variants might grow exponentially with respect to some parameters. The high cost of CPUs makes them a bad candidate for this kind of parallelization. As a main contribution, we propose a framework, ACE-HoT, to parallelize the testing process using multiple GPUs. Our implementation does not perform any intermediate CPU-GPU data transfers. The diffusion and avalanche criteria can be seen as an application of discrete first-order derivatives. As a secondary contribution, we generalize these criteria to their high-order versions. Our generalization requires an exponentially larger number of samples in order to compute sufficiently accurate probabilities. As a case study, we apply ACE-HoT to most of the finalists of the NIST lightweight standardization process, with a special focus on the winner, ASCON.
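
    The statistical test described in this abstract can be sketched on the CPU as follows: for each input bit, flip it, and count how often each output bit changes. This is a minimal single-threaded illustration, not the ACE-HoT framework; `flip_probabilities` is a hypothetical helper name and `toy_cipher` is a made-up 16-bit mixing function, not a real cipher.

```python
import random


def flip_probabilities(cipher, n_bits, n_samples, seed=0):
    """Estimate p[i][j]: probability that flipping input bit i flips output bit j."""
    rng = random.Random(seed)
    counts = [[0] * n_bits for _ in range(n_bits)]
    for _ in range(n_samples):
        x = rng.getrandbits(n_bits)
        y = cipher(x)
        for i in range(n_bits):
            # XOR of the two outputs marks exactly the output bits that changed.
            diff = y ^ cipher(x ^ (1 << i))
            for j in range(n_bits):
                counts[i][j] += (diff >> j) & 1
    return [[c / n_samples for c in row] for row in counts]


def toy_cipher(x):
    """Hypothetical 16-bit mixing function, used only to exercise the test."""
    for _ in range(4):
        x = (x * 0x9E37 + 0x79B9) & 0xFFFF
        x ^= x >> 7
    return x


probs = flip_probabilities(toy_cipher, n_bits=16, n_samples=2000)
identity = flip_probabilities(lambda x: x, n_bits=8, n_samples=100)
print(min(min(row) for row in probs), max(max(row) for row in probs))
```

    Informally, full diffusion asks that every entry be non-zero, and the strict avalanche criterion asks that every entry be close to 0.5. The identity function is the degenerate case: each input bit flips only its own output bit, with probability 1.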

    Polynomial Time Cryptanalytic Extraction of Neural Network Models

    Billions of dollars and countless GPU hours are currently spent on training Deep Neural Networks (DNNs) for a variety of tasks. Thus, it is essential to determine the difficulty of extracting all the parameters of such neural networks when given access to their black-box implementations. Many versions of this problem have been studied over the last 30 years, and the best current attack on ReLU-based deep neural networks was presented at Crypto’20 by Carlini, Jagielski, and Mironov. It resembles a differential chosen-plaintext attack on a cryptosystem with a secret key embedded in its black-box implementation, and it requires a polynomial number of queries but an exponential amount of time (as a function of the number of neurons). In this paper, we improve this attack by developing several new techniques that enable us to extract with arbitrarily high precision all the real-valued parameters of a ReLU-based DNN using a polynomial number of queries and a polynomial amount of time. We demonstrate its practical efficiency by applying it to a full-sized neural network for classifying the CIFAR10 dataset, which has 3072 inputs, 8 hidden layers with 256 neurons each, and about 1.2 million neuronal parameters. An attack following the approach by Carlini et al. requires an exhaustive search over 2^256 possibilities. Our attack replaces this with our new techniques, which require only 30 minutes on a 256-core computer.

    Investigation of mechanical motion amplification for vibration energy harvesting

    Vibration energy harvesting is being investigated for autonomous sensors and actuators that mainly utilize ambient and machine-induced vibrations. Recently, mechanical motion amplification has been incorporated to improve the power-to-weight ratio of vibration harvesters. The present study investigates the characteristics of mechanical motion amplification in different configurations. The parameters investigated are the motion amplification ratio, force transmissibility characteristics, weight of the electrical generator, effective damping coefficient achieved, and linearity of the damping. Numerical analysis is performed to compare the important characteristics of a device operating without amplification against devices with amplification in different configurations. The study concludes with comments on selecting a suitable type of amplification mechanism depending on weight/space constraints and the desired effective damping coefficient.

    Energy harvesting shock absorber with linear generator and mechanical motion amplification

    Energy harvesting shock absorbers can generate about 15-20 W of electric power at normal suspension velocities. However, higher weight, fail-safe characteristics and space limitations have restricted the development of regenerative shock absorbers to research prototypes. The power-to-weight ratio of regenerative shock absorbers can be improved by incorporating motion amplification. This work presents an innovative energy harvesting shock absorber design that uses motion amplification to improve harvesting efficiency. Apart from increasing electric power, the proposed solution is fail-safe and can be easily incorporated into existing vehicles with only marginal changes to the suspension layout. The study includes detailed numerical analysis of vibration transmissibility to investigate comfort and safety. Further, a prototype has been fabricated and experiments have been performed to determine the electric power generated and the comfort achieved. Simulations of a real-size model that utilizes the harvested electric power indicate an overall harvesting efficiency of about 19%.

    A Flexible Mechanism Based Vibration Isolator for Machine Tool Application

    The paper presents a novel vibration absorber design with innovative features, including a flexible-link-based mechanism at the interface of the tool holder and the cutting tool. The mechanism modifies the dynamic force interaction at the damping element, resulting in lower force transmissibility. It also amplifies the relative velocity at the damping element, which significantly reduces the damping element mass needed for energy dissipation. The presented absorber offers passive and economical operation compared with active and semi-active solutions. Further, the proposed solution reduces force transmissibility by up to 53%. A real-size design is presented for a frequency range of 0-1100 Hz and a maximum force amplitude of 700 N. Numerical simulations have been performed taking flexible joint and structural element dynamics into account. Simulation results from the FEA and PRBM approaches are compared, with detailed analysis of the important design parameters.