170 research outputs found

    Developing Efficient Discrete Simulations on Multicore and GPU Architectures

    In this paper we show how to efficiently implement parallel discrete simulations on multicore and GPU architectures through a real example of an application: a cellular automata model of laser dynamics. We describe the techniques employed to build and optimize the implementations using the OpenMP and CUDA frameworks. We evaluated performance on two hardware platforms that represent different target market segments: a high-end platform for scientific computing, using an Intel Xeon Platinum 8259CL server with 48 cores and an NVIDIA Tesla V100 GPU, both running on the Amazon Web Services (AWS) cloud; and a consumer-oriented platform, using an Intel Core i9-9900K CPU and an NVIDIA GeForce GTX 1050 Ti GPU. Performance results were compared and analyzed in detail. We show that excellent performance and scalability can be obtained on both platforms, and we identify some important issues that degrade performance on them. We also found that current multicore CPUs with large core counts can deliver performance very close to that of GPUs, and even identical in some cases. Ministerio de Economía, Industria y Competitividad, Gobierno de España (MINECO), and the Agencia Estatal de Investigación (AEI) of Spain, cofinanced by FEDER funds (EU) TIN2017-89842

    Design of secure and trustworthy system-on-chip architectures using hardware-based root-of-trust techniques

    Cyber-security is now a critical concern in a wide range of embedded computing modules, communications systems, and connected devices. These devices are used in medical electronics, automotive systems, power grid systems, robotics, and avionics. The general consensus today is that conventional approaches and software-only schemes are not sufficient to provide the desired security protections and trustworthiness. Comprehensive hardware-software security solutions have so far remained elusive. One major challenge is that in current system-on-chip (SoC) designs, processing elements (PEs) and executable codes with varying levels of trust are all integrated on the same computing platform to share resources. This interdependency of modules creates a fertile attack ground and represents the Achilles’ heel of heterogeneous SoC architectures. The salient research question addressed in this dissertation is “can one design a secure computer system out of non-secure or untrusted computing IP components and cores?”. In response to this question, we establish a generalized, user/designer-centric set of design principles intended to advance the construction of secure heterogeneous multi-core computing systems. We develop algorithms, models of computation, and hardware security primitives to integrate secure and non-secure processing elements into the same chip design while aiming at: (a) maintaining each individual core’s security; (b) preventing data leakage and corruption; (c) promoting data and resource sharing among the cores; and (d) tolerating malicious behaviors from untrusted processing elements and software applications. The key contributions of this thesis are: 1. The introduction of a new architectural model for integrating processing elements with different security and trust levels, i.e., secure and non-secure cores with trusted and untrusted provenances; 2. A generalized process isolation design methodology for the new architecture model that covers both the software and hardware layers to (i) create hardware-assisted virtual logical zones, and (ii) perform both static and runtime security, privilege level and trust authentication checks; 3. A set of secure protocols and hardware root-of-trust (RoT) primitives to support the process isolation design and to provide the following functionalities: (i) hardware immutable identities – using physical unclonable functions, (ii) core hijacking and impersonation resistance – through a blind signature scheme, (iii) threshold-based data access control – with a robust and adaptive secure secret sharing algorithm, (iv) privacy-preserving authorization verification – by proposing a group anonymous authentication algorithm, and (v) denial of resource or denial of service attack avoidance – by developing an interconnect network routing algorithm and a memory access mechanism according to user-defined security policies; 4. An evaluation of the security of the proposed hardware primitives in the post-quantum era, and possible extensions and algorithmic modifications for their post-quantum resistance. In this dissertation, we advance the practicality of secure-by-construction methodologies in SoC architecture design. The methodology allows for the use of unsecured or untrusted processing elements in the construction of these secure architectures and tries to extend their effectiveness into the post-quantum computing era.

    New Enhanced Chaotic Number Generators

    We introduce new families of enhanced chaotic number generators in order to compute long series of pseudorandom numbers very quickly. The key feature of these generators is the use of chaotic numbers themselves to sample a chaotic subsequence of chaotic numbers, in order to hide the generating function. We explore numerically the properties of these new families and underline their very high quality and usefulness as CPRNG when series are computed up to 10 trillion iterations. Comment: 42 pages, 17 figures, to be published in Proceedings of the 8th International Conference of Indian Soc. of Indust. and Appl. Math., Jammu, India, 31st March - 3rd April 2007, Invited conference
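
The sampling idea in this abstract (one chaotic sequence deciding which terms of another chaotic sequence are emitted) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration only: the logistic map, the parameter `r=3.99`, the seeds, and the "emit when the second orbit exceeds a threshold" rule are not taken from the paper, whose actual generating functions are not given in the abstract.

```python
def logistic(x, r=3.99):
    """One step of the logistic map (an illustrative choice of chaotic map)."""
    return r * x * (1.0 - x)


def chaotic_sampled_sequence(n, x0=0.123456789, y0=0.654321,
                             threshold=0.5, max_steps=1_000_000):
    """Emit n terms of the x-orbit, keeping a term only when the y-orbit
    (a second chaotic sequence) is above `threshold`.

    An observer of the output sees an irregularly subsampled orbit, which
    makes the underlying map harder to read off from consecutive outputs.
    All parameters are hypothetical.
    """
    x, y = x0, y0
    out = []
    for _ in range(max_steps):      # guard against an endless loop
        if len(out) == n:
            break
        x = logistic(x)
        y = logistic(y)
        if y > threshold:           # the chaotic "sampler" decides
            out.append(x)
    return out


sample = chaotic_sampled_sequence(10)
print(sample)
```

A sketch like this says nothing about statistical quality; the abstract's claims about suitability as a CPRNG rest on the authors' own numerical tests.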

    Survey of FPGA applications in the period 2000 – 2015 (Technical Report)

    Romoth J, Porrmann M, Rückert U. Survey of FPGA applications in the period 2000 – 2015 (Technical Report); 2017. Since their introduction, FPGAs have appeared in more and more fields of application. Their key advantage is the combination of software-like flexibility with the performance otherwise common to hardware. Nevertheless, every application field places special requirements on the computational architecture used. This paper provides an overview of the different topics FPGAs have been used for in the last 15 years of research and why they have been chosen over other processing units such as CPUs

    Design of Discrete-time Chaos-Based Systems for Hardware Security Applications

    Security of systems has become a major concern with the advent of technology. Researchers are proposing new security solutions every day in order to meet the area, power and performance specifications of the systems. The additional circuitry required for security purposes can consume significant area and power. This work proposes a solution which utilizes discrete-time chaos-based logic gates to build a system which addresses multiple hardware security issues. The nonlinear dynamics of chaotic maps is leveraged to build a system that mitigates IC counterfeiting, IP piracy and overbuilding, disables hardware Trojan insertion and enables authentication of connecting devices (such as IoT and mobile). Chaos-based systems are also used to generate pseudo-random numbers for cryptographic applications. The chaotic map is the building block for the design of a discrete-time chaos-based oscillator. The analog output of the oscillator is converted to a digital value using a comparator in order to build logic gates. The logic gate is reconfigurable since different parameters in the circuit topology can be altered to implement multiple Boolean functions using the same system. The tuning parameters are the control input, the bifurcation parameter, the iteration number and the threshold voltage of the comparator. The proposed system is a hybrid between standard CMOS logic gates and reconfigurable chaos-based logic gates, where original gates are replaced by chaos-based gates. The system works in two modes: logic locking and authentication. In logic locking mode, the goal is to ensure that the system achieves logic obfuscation in order to mitigate IC counterfeiting. The secret key for logic locking is made up of the tuning parameters of the chaotic oscillator. Each gate has a 10-bit key, which makes the key space large and exponentially increases the computational complexity of any attack. In authentication mode, the aim of the system is to provide authentication of devices so that adversaries cannot connect to devices to learn confidential information. A chaos-based computing system is susceptible to process variation, which can be leveraged to build a chaos-based PUF. The proposed system demonstrates near-ideal PUF characteristics, which means systems with a large number of primary outputs can be used for authenticating devices.
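
The reconfigurable gate described in this abstract (a chaotic map iterated from an input-dependent starting point, then compared against a threshold) can be sketched in software as follows. This is only an illustration under stated assumptions: the logistic map, the input encoding `control + delta * (a + b)`, and all numeric values are hypothetical and are not taken from the dissertation, which targets an analog circuit.

```python
def chaos_gate(a, b, control, threshold, r=4.0, iterations=1, delta=0.25):
    """Software model of a two-input chaos-based logic gate.

    a, b       : logic inputs (0 or 1)
    control    : control input that offsets the initial state
    threshold  : comparator threshold applied to the final state
    r          : bifurcation parameter of the logistic map (assumed map)
    iterations : number of map iterations before the comparison
    delta      : hypothetical scaling that encodes the inputs into the state
    """
    x = control + delta * (a + b)       # encode inputs into the initial state
    for _ in range(iterations):
        x = r * x * (1.0 - x)           # logistic map step
    return 1 if x > threshold else 0    # comparator turns analog into digital


def truth_table(**params):
    """Outputs for inputs (0,0), (0,1), (1,0), (1,1), in that order."""
    return [chaos_gate(a, b, **params) for a in (0, 1) for b in (0, 1)]


# The same structure yields different Boolean functions as the key changes.
or_gate = truth_table(control=0.0, threshold=0.5)
and_gate = truth_table(control=0.0, threshold=0.9)
nand_gate = truth_table(control=0.5, threshold=0.5)
xor_gate = truth_table(control=0.0, threshold=0.5, delta=0.5)
print(or_gate, and_gate, nand_gate, xor_gate)
```

In this toy model the tuple of tuning parameters plays the role of the per-gate key: without it, an observer cannot tell which Boolean function the gate implements.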

    A Survey of Spiking Neural Network Accelerator on FPGA

    Because they can implement customized topologies, FPGAs are increasingly used to deploy SNNs in both embedded and high-performance applications. In this paper, we survey state-of-the-art SNN implementations and their applications on FPGA. We collect the recent widely used spiking neuron models, network structures, and signal encoding formats, followed by an enumeration of related hardware design schemes for FPGA-based SNN implementations. Compared with previous surveys, this manuscript enumerates the application instances that applied the above-mentioned technical schemes in recent research. Based on that, we discuss the actual acceleration potential of implementing SNNs on FPGA. Building on this discussion, we outline upcoming trends and give guidelines for further advancement in related subjects.

    Secure quantum communication technologies and systems: From labs to markets

    We provide a broad overview of current quantum communication by analyzing the recent discoveries on the topic and by identifying the potential bottlenecks requiring further investigation. The analysis follows an industrial perspective, first identifying the state of the art in terms of protocols, systems, and devices for quantum communication. Next, we classify the application fields where short- and medium-term impact is expected, emphasizing the potential and challenges of different approaches. The direction and the methodology with which the scientific community is proceeding are discussed. Finally, with reference to the European guidelines within the Quantum Flagship initiative, we suggest a roadmap to coordinate the effort across the community, with the objective of maximizing the impact that quantum communication may have on our society.

    Secure covert communications over streaming media using dynamic steganography

    Streaming technologies such as VoIP are widely embedded into commercial and industrial applications, so it is imperative to address data security issues before the problems become serious. This thesis describes a theoretical and experimental investigation of secure covert communications over streaming media using dynamic steganography. A covert VoIP communications system was developed in C++ to enable the implementation of the work being carried out. A new information-theoretical model of secure covert communications over streaming media was constructed to depict the security scenarios in streaming media-based steganographic systems with passive attacks. The model involves a stochastic process that models an information source for covert VoIP communications and the theory of hypothesis testing that analyses the adversary's detection performance. The potential of hardware-based true random key generation and chaotic interval selection for innovative applications in covert VoIP communications was explored. A scheme using the CPU's read time stamp counter as an entropy source was designed to generate true random numbers as secret keys for streaming media steganography. A novel interval selection algorithm was devised to randomly choose data embedding locations in VoIP streams using random sequences generated from a chaotic process. A steganographic algorithm based on dynamic key updating and transmission, which includes a one-way cryptographic accumulator integrated into dynamic key exchange for covert VoIP communications, was devised to provide secure key exchange for covert communications over streaming media. Analysis based on the discrete logarithm problem and steganalysis using the t-test indicated that the algorithm has the advantage of being the most solid method of key distribution over a public channel. The effectiveness of the new steganographic algorithm for covert communications over streaming media was examined by means of security analysis, steganalysis using non-parametric Mann-Whitney-Wilcoxon statistical testing, and performance and robustness measurements. The algorithm achieved an average data embedding rate of 800 bps, comparable to other related algorithms. The results indicated that the algorithm has little or no impact on real-time VoIP communications in terms of speech quality (< 5% change in PESQ with hidden data), signal distortion (6% change in SNR after steganography) and imperceptibility, and that it is more secure and effective in addressing the security problems than other related algorithms.

    Signal design and processing for noise radar

    An efficient and secure use of the electromagnetic spectrum by different telecommunications and radar systems represents, today, a focal research point, as the coexistence of different radio-frequency sources at the same time and in the same frequency band requires the solution of a non-trivial interference problem. Normally, this is addressed with diversity in frequency, space, time, polarization, or code. In some radar applications, a secure use of the spectrum calls for the design of a set of transmitted waveforms highly resilient to interception and exploitation, i.e., with low probability of intercept/exploitation capability. In this frame, the noise radar technology (NRT) transmits noise-like waveforms and uses correlation processing of radar echoes for their optimal reception. After a review of NRT as developed in the last decades, the aim of this paper is to show that NRT can represent a valid solution to the aforesaid problems.
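
The correlation processing mentioned in this abstract can be illustrated with a toy example: a noise-like waveform is transmitted, the echo is a delayed and attenuated copy buried in receiver noise, and the delay is recovered as the lag that maximizes the cross-correlation with the stored reference. The sketch below is a simplified baseband model under assumed values (Gaussian samples, a single point target, the chosen delay, attenuation and noise level); it is not the signal design studied in the paper.

```python
import random


def simulate_echo(reference, delay, attenuation, noise_std, max_lag, rng):
    """Delayed, attenuated copy of the reference plus receiver noise.
    Requires delay <= max_lag. All values are hypothetical."""
    echo = [rng.gauss(0.0, noise_std) for _ in range(len(reference) + max_lag)]
    for i, sample in enumerate(reference):
        echo[i + delay] += attenuation * sample
    return echo


def cross_correlate(reference, echo, max_lag):
    """Correlation of the echo against the reference for lags 0..max_lag."""
    n = len(reference)
    return [
        sum(reference[i] * echo[i + lag] for i in range(n))
        for lag in range(max_lag + 1)
    ]


def estimate_delay(reference, echo, max_lag):
    """Lag (in samples) at which the correlation peaks."""
    corr = cross_correlate(reference, echo, max_lag)
    return max(range(len(corr)), key=lambda lag: corr[lag])


rng = random.Random(2024)                         # fixed seed for repeatability
tx = [rng.gauss(0.0, 1.0) for _ in range(512)]    # noise-like transmitted waveform
rx = simulate_echo(tx, delay=37, attenuation=0.5,
                   noise_std=0.5, max_lag=100, rng=rng)
estimated = estimate_delay(tx, rx, max_lag=100)
print(estimated)
```

Because the transmitted samples are nearly uncorrelated with shifted copies of themselves, the correlation has one dominant peak at the true delay, which is also why a receiver without the reference waveform gains little from intercepting the signal.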

    Analysis of hybrid parallelization strategies: simulation of Anderson localization and Kalman Filter for LHCb triggers

    This thesis presents two experiences of hybrid programming applied to condensed matter and high energy physics. The two projects differ in various aspects, but both aim to analyse the benefits of using accelerated hardware to speed up the calculations in current scientific research scenarios. The first project enables massive parallelism in a simulation of the Anderson localisation phenomenon in a disordered quantum system. The code represents a Hamiltonian in momentum space, then executes a diagonalization of the corresponding matrix using linear algebra libraries, and finally analyses the energy-level spacing statistics averaged over several realisations of the disorder. The implementation combines different parallelization approaches in a hybrid scheme. The averaging over the ensemble of disorder realisations exploits massive parallelism with a master-slave configuration based on both multi-threading and the message passing interface (MPI). This framework is designed and implemented to interface easily with similar applications commonly adopted in scientific research, for example Monte Carlo simulations. The diagonalization uses multi-core and GPU hardware, interfacing with the MAGMA, PLASMA or MKL libraries. Access to the libraries is modular to guarantee portability, maintainability and extension in the near future. The second project is the development of a Kalman Filter, including porting to GPU architectures and autovectorization, for online LHCb triggers. The developed codes provide information about the viability and advantages of applying GPU technologies in the first triggering step of the Large Hadron Collider beauty experiment (LHCb). The optimisations introduced in both the CPU and GPU codes delivered a relevant speedup of the Kalman Filter. The two GPU versions, in CUDA and OpenCL, have similar performance and are adequate to be considered in the upgrade and in the corresponding implementations of the Gaudi framework. In both projects we implement optimisation techniques in the CPU code. This report presents extensive benchmark analyses of the correctness and performance of both projects.