135 research outputs found

    FPGA based technical solutions for high throughput data processing and encryption for 5G communication: A review

    Get PDF
    Field programmable gate array (FPGA) devices are ideal solutions for high-speed processing applications, given their flexibility, parallel processing capability, and power efficiency. This review first presents an overview of the key applications of FPGA-based platforms in 5G networks/systems that exploit the improved performance offered by such devices. FPGA-based implementations of cloud radio access network (C-RAN) accelerators, network function virtualization (NFV)-based network slicers, cognitive radio systems, and multiple input multiple output (MIMO) channel characterizers are the main applications considered that can benefit from the high processing rate, power efficiency, and flexibility of FPGAs. Furthermore, implementations of encryption/decryption algorithms on the Xilinx Zynq UltraScale+ MPSoC ZCU102 FPGA platform are discussed; we then introduce our high-speed and lightweight implementation of the well-known AES-128 algorithm, developed on the same FPGA platform, and compare it with similar solutions already published in the literature. The comparison results indicate that our AES-128 implementation enables efficient hardware usage for a given data rate (up to 28.16 Gbit/s), resulting in higher efficiency (8.64 Mbps/slice) than the other considered solutions. Finally, applications of the ZCU102 platform for high-speed processing are explored, such as image and signal processing, visual recognition, and hardware resource management.
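    The efficiency figure quoted above is simply throughput divided by occupied slices. A minimal sketch of that bookkeeping follows; the 220 MHz clock and the slice count back-calculated here are illustrative assumptions, only the 28.16 Gbit/s and 8.64 Mbps/slice numbers come from the abstract.

```python
# Hedged sketch: relate FPGA throughput, clock rate, and slice efficiency.
# Only 28.16 Gbit/s and 8.64 Mbps/slice come from the abstract; the
# 128-bit block width per cycle and 220 MHz clock are assumptions.

def throughput_gbps(bits_per_cycle: int, clock_mhz: float) -> float:
    """Throughput in Gbit/s for a pipelined core emitting one block per cycle."""
    return bits_per_cycle * clock_mhz / 1000.0

def efficiency_mbps_per_slice(gbps: float, slices: int) -> float:
    """Throughput-per-area metric used to compare AES implementations."""
    return gbps * 1000.0 / slices

# 128-bit AES blocks at an assumed 220 MHz give the quoted 28.16 Gbit/s.
t = throughput_gbps(128, 220.0)
# 8.64 Mbps/slice then implies roughly 28160 / 8.64 ≈ 3259 occupied slices.
implied_slices = round(t * 1000.0 / 8.64)
print(f"{t:.2f} Gbit/s, ~{implied_slices} slices, "
      f"{efficiency_mbps_per_slice(t, implied_slices):.2f} Mbps/slice")
```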

    Multi-Tenant Cloud FPGA: A Survey on Security

    Full text link
    With the exponentially increasing demand for performance and scalability in cloud applications and systems, data center architectures have evolved to integrate heterogeneous computing fabrics that leverage CPUs, GPUs, and FPGAs. FPGAs differ from traditional processing platforms such as CPUs and GPUs in that they are reconfigurable at run-time, providing increased and customized performance, flexibility, and acceleration. FPGAs can perform large-scale search optimization, acceleration, and signal processing tasks with advantages in power, latency, and processing speed. Many public cloud provider giants, including Amazon, Huawei, Microsoft, and Alibaba, have already started integrating FPGA-based cloud acceleration services. While FPGAs in cloud applications enable customized acceleration with low power consumption, they also introduce new security challenges that still need to be reviewed. Allowing cloud users to reconfigure the hardware design after deployment could open backdoors for malicious attackers, potentially putting the cloud platform at risk. Considering these security risks, public cloud providers still do not offer multi-tenant FPGA services. This paper analyzes the security concerns of multi-tenant cloud FPGAs, gives a thorough description of the security problems associated with them, and discusses upcoming challenges in this field of study.

    Within-Die Delay Variation Measurement And Analysis For Emerging Technologies Using An Embedded Test Structure

    Get PDF
    Both random and systematic within-die process variations (PV) are growing more severe with shrinking geometries and increasing die size. Escalating variations in delay and power with reductions in feature size place higher demands on the accuracy of variation models, whose availability can be used to improve yield and the corresponding profitability and product quality of the fabricated integrated circuits (ICs). Sources of within-die variations include optical source limitations and layout-based systematic effects (pitch, line-width variability, and microscopic etch loading). Unfortunately, accurate models of within-die PVs are becoming more difficult to derive because of their increasing sensitivity to design context. Embedded test structures (ETS) continue to play an important role in the development of PV models and as a mechanism to improve correlations between hardware and models. Variations in path delays are increasing with scaling and are increasingly affected by 'neighborhood' interactions. In order to fully characterize within-die variations, delays must be measured in the context of actual core-logic macros. Doing so requires the use of an embedded test structure, as opposed to traditional scribe-line test structures such as ring oscillators (ROs). Accurate measurements of within-die variations can be used, e.g., to better tune models to actual hardware (model-to-hardware correlations). In this research project, I propose an embedded test structure called REBEL (Regional dELay BEhavior) that is designed to measure path delays in a minimally invasive fashion and whose architecture measures path delays more accurately. Design-for-manufacturability (DFM) analysis is performed on 90 nm ASIC chips and 28 nm Zynq 7000 series FPGA boards. I present ASIC results on within-die path delay variations in a five-stage pipelined floating-point unit (FPU), fabricated in IBM's 90 nm technology and used as a test vehicle in chip experiments carried out at nine different temperature/voltage (TV) corners. Experimental data have also been analyzed for path delay variations in short versus long paths. FPGA results on within-die and die-to-die variations in an Advanced Encryption Standard (AES) implementation with a single pipeline stage are also presented. Other analyses performed on the calibrated path delays include flip-flop propagation delays for both rising and falling edges (tpHL and tpLH), uncertainty analysis, path distribution analysis, short versus long path variations, and mid-length path within-die variation. I also analyze the impact on delay when the chips are subjected to industrial-level temperature and voltage variations. The experimental results establish that the proposed REBEL provides capabilities similar to an off-chip logic analyzer, i.e., it is able to capture the temporal behavior of the signal over time, including any static and dynamic hazards that may occur on the tested path. The ASIC results further show that path delays are correlated with the launch-capture (LC) interval used to time them; therefore, calibration as proposed in this work must be carried out in order to obtain an accurate analysis of within-die variations. Results on ASIC chips show that short paths can vary up to 35% on average, while long paths vary up to 20% at nominal temperature and voltage. A similar trend occurs for within-die variations of mid-length paths, where the magnitudes are reduced to 20% and 5%, respectively. The magnitude of delay variations in both analyses increases as temperature and voltage are changed to increase performance. The high level of within-die delay variation is undesirable from a design perspective, but it represents a rich source of entropy for applications that make use of 'secrets', such as authentication, hardware metering, and encryption. Physical unclonable functions (PUFs) are a class of primitives that leverage within-die variations as a means of generating random bit strings for these types of applications, including hardware security and trust. A die-to-die and within-die variation study on Zynq FPGAs shows an average within-die variation of 5%, while the range of die-to-die variation can reach 3 ns. The die-to-die variations can be explored in much further detail to study their spatial dependence.
    Additionally, I carried out research in the area of data mining for big data, focusing on decision tree classification (DTC) to speed up the classification step through a hardware implementation. For this purpose, I devised a pipelined architecture for axis-parallel binary decision tree classification to meet requirements on execution time and minimal resource usage in terms of area. The motivation for this work is that analyzing larger data sets has created abundant opportunities for algorithmic and architectural developments and data-mining innovations, creating a great demand for faster execution of these algorithms and leading towards improved execution time and resource utilization. Decision trees (DT) have traditionally been implemented in software. Although the software implementation of DTC is highly accurate, the execution times and resource utilization still require improvement to meet the computational demands of the ever-growing industry. On the other hand, hardware implementation of DT has not been thoroughly investigated or reported in detail. Therefore, I propose a hardware-accelerated pipelined architecture that incorporates a parallel approach to acquiring the data, with parallel engines working independently on different partitions of the data. Each engine also processes the data in a pipelined fashion to utilize resources more efficiently and reduce the time for processing all the data records/tuples. Experimental results show that the proposed hardware acceleration of classification algorithms increases throughput by reducing the number of clock cycles required to process the data and generate the results, and it requires minimal resources, hence it is area efficient. This architecture also enables algorithms to scale with increasingly large and complex data sets. We developed the DTC algorithm in detail and explored techniques for successfully adapting it to a hardware implementation. This system is 3.5 times faster than the existing hardware implementation of classification.
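    The axis-parallel binary decision tree classification that the pipelined engines accelerate compares, at each node, a single feature against a threshold. A minimal software sketch of that traversal is given below; the node layout and example tree are illustrative assumptions, not the thesis's hardware encoding.

```python
# Hedged sketch of axis-parallel binary decision tree classification:
# each internal node tests one feature against one threshold, as in the
# DTC engines described above. The node encoding is an assumption.
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class Node:
    feature: int = 0                 # index of the feature tested at this node
    threshold: float = 0.0           # axis-parallel split value
    left: Optional["Node"] = None    # taken when x[feature] <= threshold
    right: Optional["Node"] = None   # taken when x[feature] >  threshold
    label: Optional[int] = None      # set only on leaf nodes

def classify(root: Node, x: Sequence[float]) -> int:
    """Walk the tree for one record/tuple and return its class label."""
    node = root
    while node.label is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label

# Example: a 2-feature tree with three leaves.
tree = Node(feature=0, threshold=1.5,
            left=Node(label=0),
            right=Node(feature=1, threshold=4.0,
                       left=Node(label=1), right=Node(label=2)))
print(classify(tree, [2.0, 5.5]))  # -> 2
```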

    Hardware-Software Co-Design, Acceleration and Prototyping of Control Algorithms on Reconfigurable Platforms

    Full text link
    Differential equations play a significant role in many disciplines of science and engineering. Solving and implementing ordinary differential equations (ODEs) and partial differential equations (PDEs) effectively is essential, as most complex dynamic systems are modeled with these equations. High Performance Computing (HPC) methodologies are required to compute and implement complex, data-intensive applications modeled by differential equations at higher speed. There are, however, challenges and limitations in implementing dynamic systems modeled by non-linear ordinary differential equations on digital hardware. Modeling an integrator involves data approximation, which results in accuracy errors if data values are not chosen properly. Accuracy and precision depend on the data types defined for each block of a system and its subsystems. Also, digital hardware mostly works on fixed-point data, which leads to further approximation. Using a Field Programmable Gate Array (FPGA), it is possible to solve ODEs at high speed; FPGAs also provide scalable, flexible, and reconfigurable features. The goal of this thesis is to explore and compare implementations of control algorithms on reconfigurable logic. The thesis focuses on implementing control algorithms modeled by second- and fourth-order PDEs and ODEs using the Xilinx System Generator (XSG) and LabVIEW FPGA module synthesis tools. Xilinx System Generator for DSP allows integration of legacy HDL code, embedded IP cores, MATLAB functions, and hardware components targeted for Xilinx FPGAs to create complete system models that can be simulated and synthesized within the Simulink environment. The National Instruments (NI) LabVIEW FPGA Module extends LabVIEW graphical development to Field-Programmable Gate Arrays (FPGAs) on NI Reconfigurable I/O hardware. This thesis also focuses on efficient implementation and performance comparison of these implementations. Optimization of area, latency, and power has also been explored during implementation, and comparison results are discussed.
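    To make the integrator and fixed-point discussion concrete, here is a minimal sketch of a forward-Euler ODE step carried out in scaled-integer (fixed-point) arithmetic; the Q16.16 format, step size, and test equation are illustrative assumptions, not the models or tools used in the thesis.

```python
# Hedged sketch: forward-Euler integration of dy/dt = -y in Q16.16
# fixed-point arithmetic, illustrating the approximation error that
# arises when an integrator is mapped to fixed-point digital hardware.
import math

FRAC_BITS = 16
ONE = 1 << FRAC_BITS

def to_fix(x: float) -> int:
    return int(round(x * ONE))

def fix_mul(a: int, b: int) -> int:
    return (a * b) >> FRAC_BITS      # truncation models hardware rounding loss

def euler_fixed(y0: float, dt: float, steps: int) -> float:
    """Integrate dy/dt = -y with forward Euler entirely in fixed point."""
    y, h = to_fix(y0), to_fix(dt)
    for _ in range(steps):
        y = y - fix_mul(h, y)        # y_{k+1} = y_k + h * f(y_k), f(y) = -y
    return y / ONE

approx = euler_fixed(1.0, 0.01, 100)
print(approx, math.exp(-1.0))        # fixed-point Euler vs. exact e^-1
```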

    Development in building fire detection and evacuation system-a comprehensive review

    Get PDF
    Fire is both beneficial to man and his environment and, among all natural disasters, destructive and deadly. Fire accidents occur rarely, but once one breaks out its consequences can be devastating. Early detection of fire helps to avoid further consequences and saves lives. During a fire, it is also important to guide people within the building to exit safely. For these reasons, this paper reviews the literature on recent advancements in building fire detection and emergency evacuation systems. It is intended to provide details about fire simulation tools and their features, suitable hardware, communication methods, and effective user interfaces.

    Software Defined Radio Implementation Of Ds-Cdma In Inter-Satellite Communications For Small Satellites

    Get PDF
    The recent increase in the usage of CubeSats has changed the communication philosophy from long-range point-to-point propagation to a multi-hop network of small orbiting nodes. Separating system tasks into many dispersed satellites can increase system survivability, versatility, configurability, adaptability, and autonomy. Inter-satellite links (ISL) enable the satellites to exchange information and share resources while reducing the traffic load to the ground. Establishment and stability of the ISL are impacted by factors such as satellite orbit and attitude, antenna configuration, constellation topology, mobility, and link range. Software Defined Radio (SDR) is beginning to be heavily used in small-satellite communications for applications such as base stations. A software-defined radio is a software program that performs the functionality of a hardware system. The digital signal processing blocks are incorporated into the software, giving it greater flexibility in modulation. With this, the idea of remote upgrades from the ground, as well as the potential to accommodate new applications and future services without hardware changes, is very promising. Realizing this, my idea is to create an inter-satellite link using software-defined radio. The advantages are higher data rates, modification of operating frequencies, the possibility of reaching higher frequency bands for higher throughput, flexible modulation, demodulation, and encoding schemes, and ground modifications. However, there are several challenges in utilizing software-defined radio to create inter-satellite link communication for small satellites. In this paper, we design and implement a multi-user inter-satellite communication network using SDRs, where the Code Division Multiple Access (CDMA) technique is utilized to manage multiple accesses to the shared communication channel among the satellites. This model can be easily reconfigured to support any encoding/decoding, modulation, and other signal processing schemes.
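    As a rough illustration of the multiple-access scheme, the sketch below spreads two users' bits with orthogonal codes, sums them on a shared channel, and despreads them by correlation; the 4-chip Walsh codes and example bit streams are illustrative assumptions, not the codes used in the paper.

```python
# Hedged sketch of direct-sequence CDMA: each satellite spreads its bits
# with an orthogonal code; the receiver recovers them by correlation.
# The 4-chip Walsh codes and example bits are illustrative assumptions.
import numpy as np

codes = {                            # orthogonal spreading codes (one per user)
    "sat_A": np.array([+1, +1, +1, +1]),
    "sat_B": np.array([+1, -1, +1, -1]),
}

def spread(bits, code):
    """Map bits {0,1} to {-1,+1} and multiply each by the chip sequence."""
    symbols = 2 * np.asarray(bits) - 1
    return np.concatenate([b * code for b in symbols])

def despread(signal, code):
    """Correlate the shared-channel signal with one user's code."""
    chips = signal.reshape(-1, code.size)
    return ((chips @ code) > 0).astype(int)

tx_A, tx_B = [1, 0, 1], [0, 0, 1]
channel = spread(tx_A, codes["sat_A"]) + spread(tx_B, codes["sat_B"])
print(despread(channel, codes["sat_A"]))  # -> [1 0 1]
print(despread(channel, codes["sat_B"]))  # -> [0 0 1]
```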

    An Adaptive Modular Redundancy Technique to Self-regulate Availability, Area, and Energy Consumption in Mission-critical Applications

    Get PDF
    As reconfigurable devices' capacities and the complexity of applications that use them increase, the need for self-reliance of deployed systems becomes increasingly prominent. A Sustainable Modular Adaptive Redundancy Technique (SMART), composed of a dual-layered organic system, is proposed, analyzed, implemented, and experimentally evaluated. SMART relies upon a variety of self-regulating properties to control availability, energy consumption, and area used in dynamically changing environments that require a high degree of adaptation. The hardware layer is implemented on a Xilinx Virtex-4 Field Programmable Gate Array (FPGA) to provide self-repair using a novel approach called a Reconfigurable Adaptive Redundancy System (RARS). The software layer supervises the organic activities within the FPGA and extends the self-healing capabilities through application-independent, intrinsic, evolutionary repair techniques to leverage the benefits of dynamic Partial Reconfiguration (PR). A SMART prototype is evaluated using a Sobel edge detection application and is shown to provide sustainability under stressful occurrences of transient and permanent fault injection while still reducing energy consumption and area requirements. An Organic Genetic Algorithm (OGA) technique is shown to be capable of consistently repairing hard faults while maintaining correct edge detector outputs, by exploiting spatial redundancy in the reconfigurable hardware. A Monte Carlo driven Continuous-Time Markov Chain (CTMC) simulation is conducted to compare SMART's availability to industry-standard Triple Modular Redundancy (TMR) techniques. Based on nine use cases, parameterized with realistic fault and repair rates acquired from publicly available sources, the results indicate that availability is significantly enhanced by the adoption of fast repair techniques targeting aging-related hard faults. Under harsh environments, SMART is shown to improve system availability from 36.02% with lengthy repair techniques to 98.84% with fast ones. This value increases to five nines (99.9998%) under relatively more favorable conditions. Lastly, SMART is compared to twenty-eight standard TMR benchmarks generated by the widely accepted BL-TMR tools. Results show that in seven out of nine use cases, SMART is the recommended technique, with power savings ranging from 22% to 29% and area savings ranging from 17% to 24%, while still maintaining the same level of availability.
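    The availability comparison above rests on the standard steady-state relation between failure and repair rates, which is the quantity the CTMC simulation estimates. A minimal sketch of that calculation follows; the fault and repair rates are illustrative assumptions, not the values used in the nine use cases.

```python
# Hedged sketch: steady-state availability A = mu / (lambda + mu) for a
# repairable module, illustrating why fast repair dominates availability.
# The failure/repair rates below are illustrative assumptions only.

def availability(failure_rate: float, repair_rate: float) -> float:
    """Steady-state availability of a single repairable module (rates per hour)."""
    return repair_rate / (failure_rate + repair_rate)

lam = 1e-4                                # assumed faults per hour
slow = availability(lam, 1 / 48.0)        # repair takes ~48 hours
fast = availability(lam, 3600.0)          # partial-reconfiguration repair, ~1 s
print(f"slow repair: {slow:.4%}  fast repair: {fast:.6%}")
```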

    Digital Design of New Chaotic Ciphers for Ethernet Traffic

    Get PDF
    In recent years there has been great development in the field of cryptography, and many encryption algorithms as well as other cryptographic functions have been proposed. However, despite this development, there is still great interest today in creating new cryptographic primitives or improving existing ones. Some of the reasons are the following:
    • First, due to the development of communication technologies, the amount of information being transmitted is constantly increasing. In this context, numerous applications require encrypting large amounts of data in real time or within a very short time interval; one example is the real-time encryption of high-resolution video. Unfortunately, most of the encryption algorithms used today are not capable of encrypting large amounts of data at high speed while maintaining high security standards.
    • Due to the large increase in computing power, many algorithms that were traditionally considered secure can now be attacked by brute-force methods in a reasonable amount of time. For example, when the DES (Data Encryption Standard) algorithm was first released, its key size was only 56 bits, whereas today the NIST (National Institute of Standards and Technology) recommends that symmetric encryption algorithms use keys of at least 112 bits. In addition, significant advances are being made in quantum computing, and large-scale quantum computers are expected to be developed in the future. If so, it has been shown that some algorithms in current use, such as RSA (Rivest Shamir Adleman), could be successfully attacked.
    • Along with the development of cryptography, there has also been great development in cryptanalysis, so new vulnerabilities are constantly being found and new attacks proposed. Consequently, it is necessary to look for new algorithms that are robust against all known attacks in order to replace algorithms in which vulnerabilities have been found. In this respect, some algorithms such as RSA and ElGamal are based on the assumption that certain problems, such as factoring the product of two primes or computing discrete logarithms, are hard to solve; however, it has not been ruled out that algorithms solving these problems quickly (in polynomial time) could be developed in the future.
    • Ideally, the keys used to encrypt data should be generated randomly so as to be completely unpredictable. Since the sequences produced by pseudo-random number generators (PRNGs) are predictable, they are potentially vulnerable to cryptanalysis, so keys are usually generated using true random number generators (TRNGs). Unfortunately, TRNGs normally generate bits more slowly than PRNGs, and the generated sequences usually have worse statistical properties, which makes a post-processing stage necessary. Using a low-quality TRNG to generate keys can compromise the security of the whole encryption system, as has already happened on some occasions. Therefore, the design of new TRNGs with good statistical properties is a topic of great interest.
    In summary, there are numerous important lines of research in cryptography. Since the field is very broad, this thesis focuses on three of them: the design of new TRNGs, the design of new fast and secure chaotic stream ciphers, and the implementation of new cryptosystems for Gigabit Ethernet optical communications at 1 Gbps and 10 Gbps. These cryptosystems are based on the proposed chaotic algorithms, but have been adapted to perform the encryption at the physical layer while preserving the coding format; in this way, the systems not only encrypt the data but also prevent an attacker from knowing whether a communication is taking place. The main aspects covered in this thesis are the following:
    • A study of the state of the art, including the encryption algorithms currently in use. This part analyzes the main problems of current standard encryption algorithms and the solutions that have been proposed; this study is necessary in order to design new algorithms that solve these problems.
    • A proposal of new TRNGs suitable for key generation. Two possibilities are explored: the noise generated by a MEMS (Microelectromechanical Systems) accelerometer and the noise generated by DNOs (Digital Nonlinear Oscillators). Both cases are analyzed in detail through several statistical analyses of sequences obtained at different sampling frequencies. A simple post-processing algorithm is also proposed and implemented to improve the randomness of the generated sequences. Finally, the possibility of using these TRNGs as key generators is discussed.
    • New encryption algorithms are proposed that are fast, secure, and implementable with a reduced amount of resources. Among all the possibilities, this thesis focuses on chaotic systems since, thanks to intrinsic properties such as ergodicity and random-like behavior, they can be a good alternative to classical encryption systems. To overcome the problems that arise when these systems are digitized, several strategies are proposed and studied: using a multi-encryption scheme, changing the control parameters of the chaotic systems, and perturbing the chaotic orbits.
    • The proposed algorithms are implemented on a Virtex 7 FPGA. The different implementations are analyzed and compared in terms of power consumption, area usage, encryption speed, and the level of security obtained. One of these designs is chosen to be implemented in an ASIC (Application Specific Integrated Circuit) using a 0.18 um technology. In any case, the proposed solutions can also be implemented on other platforms and in other technologies.
    • Finally, the proposed algorithms are adapted and applied to Gigabit Ethernet optical communications. In particular, cryptosystems performing encryption at the physical layer are implemented for 1 Gbps and 10 Gbps. To encrypt at the physical layer, the algorithms proposed in the previous sections are adapted to preserve the coding format: 8b/10b in the case of 1 Gb Ethernet and 64b/66b in the case of 10 Gb Ethernet. In both cases, the cryptosystems are implemented on a Virtex 7 FPGA and an experimental setup is designed, including two SFP (Small Form-factor Pluggable) modules capable of transmitting at up to 10.3125 Gbps over 850 nm multimode fiber. With this setup, it is verified that the encryption systems work correctly and synchronously, that the encryption is good (it passes all the security tests), and that the data traffic pattern is hidden.
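    As a rough illustration of the chaos-based stream-cipher idea mentioned above, the sketch below derives a keystream from a logistic map with a small periodic orbit perturbation and XORs it with the data; the map, its parameter, and the perturbation rule are illustrative assumptions, not the ciphers proposed in the thesis, and the sketch is not cryptographically secure.

```python
# Hedged sketch of a chaos-based stream cipher: a logistic-map orbit is
# quantized into keystream bytes and XORed with the plaintext. The map,
# its parameter, and the perturbation step are illustrative assumptions
# and are NOT cryptographically secure as written.

def keystream(seed: float, n: int, r: float = 3.99):
    x = seed
    out = []
    for i in range(n):
        x = r * x * (1.0 - x)            # logistic map iteration
        if i % 16 == 15:                 # periodic orbit perturbation
            x = (x + 1e-7 * (i & 0xFF)) % 1.0
        out.append(int(x * 256) & 0xFF)  # quantize the orbit to a byte
    return out

def xor_cipher(data: bytes, seed: float) -> bytes:
    """Encryption and decryption are the same XOR operation."""
    return bytes(b ^ k for b, k in zip(data, keystream(seed, len(data))))

msg = b"Gigabit Ethernet frame payload"
ct = xor_cipher(msg, seed=0.613742)
assert xor_cipher(ct, seed=0.613742) == msg
print(ct.hex())
```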

    FLEXIBLE LOW-COST HW/SW ARCHITECTURES FOR TEST, CALIBRATION AND CONDITIONING OF MEMS SENSOR SYSTEMS

    Get PDF
    In recent years, smart sensors based on Micro-Electro-Mechanical Systems (MEMS) have been spreading widely across fields such as automotive, biomedical, optical, and consumer applications, and nowadays they represent the outstanding state of the art. The reason for their diffusion is their capability to measure physical and chemical information using miniaturized components. Developing this kind of architecture requires a very complex design flow because of the heterogeneity of its components: mechanical parts typical of the MEMS sensor and electronic components for interfacing and conditioning. In these kinds of systems, testing activities gain considerable importance and concern various phases of the life cycle of a MEMS-based system. Indeed, from the design phase of the sensor, validating the design by extracting characteristic parameters is important, because these parameters are necessary to design the sensor interface circuit. Moreover, this kind of architecture requires techniques for the calibration and evaluation of the whole system in addition to the traditional methods for testing the control circuitry. The first part of this research work addresses testing optimization by developing different hardware/software architectures for the different testing stages of the development flow of a MEMS-based system. A flexible and low-cost platform for the characterization and prototyping of MEMS sensors has been developed to provide an environment that also supports the design of the sensor interface. To reduce the re-engineering time required during verification testing, a universal client-server architecture has been designed to provide a single framework for testing different kinds of devices, using different development environments and programming languages. Because using ATE during the engineering phase of the calibration algorithm is expensive in terms of ATE occupation time, since it requires interrupting the production process, a flexible and easily adaptable low-cost hardware/software architecture for calibration and performance evaluation has been developed; it allows the calibration algorithm to be developed in a user-friendly environment that also supports small and medium volume production. The second part of the research work deals with a topic that is becoming ever more important in MEMS sensor applications: the capability to combine information extracted from different types of sensors (typically accelerometers, gyroscopes, and magnetometers) to obtain more complex information. In this context, two different sensor fusion algorithms have been analyzed and developed: the first is a fully software algorithm used to estimate how much errors in MEMS sensor data affect the parameters computed by a sensor fusion algorithm; the second is a sensor fusion algorithm based on a simplified Kalman filter. Starting from this algorithm, a bit-true model in MathWorks Simulink(TM) has been created as a system study for the implementation of the algorithm on chip.
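    For a flavor of the sensor-fusion step, the sketch below fuses gyroscope and accelerometer readings with a complementary filter, a common simplified stand-in for the Kalman-filter fusion described above; the blend factor, sample rate, and readings are illustrative assumptions, not the thesis's algorithm or data.

```python
# Hedged sketch of accelerometer/gyroscope fusion with a complementary
# filter, a simplified alternative to the Kalman-filter fusion mentioned
# above. Blend factor, sample rate, and readings are assumptions.
import math

def fuse_pitch(pitch_prev: float, gyro_rate: float, accel_y: float,
               accel_z: float, dt: float, alpha: float = 0.98) -> float:
    """Blend the integrated gyro rate (smooth but drifting) with the
    accelerometer tilt estimate (noisy but drift-free) into one pitch angle."""
    gyro_pitch = pitch_prev + gyro_rate * dt             # integrate deg/s
    accel_pitch = math.degrees(math.atan2(accel_y, accel_z))
    return alpha * gyro_pitch + (1.0 - alpha) * accel_pitch

pitch, dt = 0.0, 0.01                                    # 100 Hz loop (assumed)
for gyro, ay, az in [(5.0, 0.02, 0.99), (5.0, 0.03, 0.99), (4.8, 0.05, 0.98)]:
    pitch = fuse_pitch(pitch, gyro, ay, az, dt)
    print(f"pitch estimate: {pitch:.3f} deg")
```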