
    Reconfigurable Logic Embedded Architecture of Support Vector Machine Linear Kernel

    Support Vector Machine (SVM) is a linear binary classifier that requires a kernel function to handle non-linear problems. Most previous SVM implementations for embedded systems in the literature were built targeting a specific application, and their analyses were done only through comparison with software implementations. The impact of different application datasets on SVM hardware performance was not analyzed. In this work, we propose a parameterizable linear kernel architecture that is fully pipelined. It is prototyped and analyzed on the Altera Cyclone IV platform, and results are verified against an equivalent software model. Further analysis is done to determine the effect of the number of features and support vectors on the performance of the hardware architecture. In our proposed linear kernel implementation, the number of features determines the maximum operating frequency and the amount of logic resource utilization, whereas the number of support vectors determines the amount of on-chip memory usage and the throughput of the system.
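
    As a rough illustration of the "equivalent software model" used for verification, the sketch below evaluates the linear-kernel SVM decision function f(x) = sum_i (alpha_i * y_i) * <s_i, x> + b; the function and variable names and the toy data are hypothetical, and the pipelining itself is a hardware property that software cannot capture.

    ```python
    import numpy as np

    def svm_linear_decision(x, support_vectors, alpha_y, bias):
        """Linear-kernel SVM decision value: f(x) = sum_i (alpha_i*y_i)*<s_i, x> + b.

        support_vectors: (n_sv, n_features) array of support vectors s_i
        alpha_y:         (n_sv,) array of precomputed alpha_i * y_i products
        bias:            scalar offset b
        """
        # Each kernel evaluation is a dot product over n_features terms; in the
        # pipelined hardware, n_features drives frequency and logic usage, while
        # n_sv drives on-chip memory usage and throughput.
        kernels = support_vectors @ x           # one dot product per support vector
        return float(np.dot(alpha_y, kernels) + bias)

    # Toy usage: classify by the sign of the decision value
    x = np.array([0.5, -1.2])
    sv = np.array([[1.0, 0.0], [0.2, -0.7]])
    label = 1 if svm_linear_decision(x, sv, np.array([0.8, -0.3]), 0.1) >= 0 else -1
    ```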

    Optimal load shedding for microgrids with unlimited DGs

    In recent years, the increasing demand for electrical supply has driven the search for new alternatives in supplying electrical power. Studies of microgrid systems with embedded Distributed Generation (DG) units are rapidly increasing. A microgrid system is basically designed to operate either in islanded mode or interconnected with the main grid. In either condition, the system must provide a reliable power supply and operate at low transmission power loss. During an emergency state, such as a power outage due to electrical or mechanical faults in the system, it is important for the system to shed load in order to maintain system stability and security. To reduce the transmission loss, it is very important to calculate the best size of the DGs as well as to find the best positions for locating them. The Analytical Hierarchy Process (AHP) has been applied to find and calculate the load shedding priorities based on the decision alternatives that have been made. The main objective of this project is to optimize load shedding in a microgrid system with unlimited DGs by applying the Gravitational Search Algorithm (GSA) optimization technique. The technique is used to optimize the placement and sizing of the DGs, as well as to optimize the load shedding. Several load shedding schemes have been proposed and studied in this project, such as load shedding with a fixed priority index, without a priority index, and with a dynamic priority index. The proposed technique was tested on the IEEE 69-bus test distribution system.
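
    The abstract does not give the exact GSA formulation used, so the following is a minimal, generic sketch of a single Gravitational Search Algorithm iteration for a minimization objective (e.g., transmission loss over candidate DG placements and sizes); all names and the random-weighting details are assumptions.

    ```python
    import numpy as np

    def gsa_step(X, V, fitness, G, eps=1e-9):
        """One generic GSA iteration for minimization.

        X:       (n_agents, dim) candidate solutions, e.g. DG sizes/locations
        V:       (n_agents, dim) agent velocities
        fitness: (n_agents,) objective values, e.g. transmission loss
        G:       gravitational constant, decayed by the caller over iterations
        """
        best, worst = fitness.min(), fitness.max()
        m = (fitness - worst) / (best - worst + eps)    # best agent gets mass ~1
        M = m / (m.sum() + eps)                         # normalized masses

        A = np.zeros_like(X)                            # accelerations
        for i in range(len(X)):
            for j in range(len(X)):
                if i == j:
                    continue
                R = np.linalg.norm(X[i] - X[j])         # Euclidean distance
                # random weight keeps the attraction stochastic
                A[i] += np.random.rand() * G * M[j] * (X[j] - X[i]) / (R + eps)

        V = np.random.rand(*V.shape) * V + A            # velocity update
        return X + V, V                                 # new positions, velocities
    ```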

    Reconfigurable acceleration of Recurrent Neural Networks

    Recurrent Neural Networks (RNNs) have been successful in a wide range of applications involving temporal sequences, such as natural language processing, speech recognition and video analysis. However, RNNs often require a significant amount of memory and computational resources. In addition, the recurrent nature and data dependencies in RNN computations can lead to system stalls, resulting in low throughput and high latency. This work describes novel parallel hardware architectures for accelerating RNN inference using Field-Programmable Gate Array (FPGA) technology, which consider the data dependencies and high computational costs of RNNs.

    The first contribution of this thesis is a latency-hiding architecture that utilizes column-wise matrix-vector multiplication instead of the conventional row-wise operation to eliminate data dependencies and improve the throughput of RNN inference designs. This architecture is further enhanced by a configurable checkerboard tiling strategy which allows large dimensions of weight matrices, while supporting element-based parallelism and vector-based parallelism. The presented reconfigurable RNN designs show significant speedup over CPU, GPU, and other FPGA designs.

    The second contribution of this thesis is a weight reuse approach for large RNN models with weights stored in off-chip memory, running with a batch size of one. A novel blocking-batching strategy is proposed to optimize the throughput of large RNN designs on FPGAs by reusing the RNN weights. Performance analysis is also introduced to enable FPGA designs to achieve the best trade-off between area, power consumption and performance. Promising power-efficiency improvements have been achieved in addition to speedups over CPU and GPU designs.

    The third contribution of this thesis is a low-latency design for RNNs based on a partially-folded hardware architecture. It also introduces a technique that balances the initiation interval of multi-layer RNN inferences to increase hardware efficiency and throughput while reducing latency. The approach is evaluated on a variety of applications, including gravitational wave detection and Bayesian RNN-based ECG anomaly detection. To facilitate the use of this approach, we open-source an RNN template which enables the generation of low-latency FPGA designs with efficient resource utilization using high-level synthesis tools.
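
    As a software-level illustration of the first contribution's column-wise idea (the actual designs are FPGA hardware; this toy sketch only shows the computation order), the matrix-vector product is formed one input element at a time, updating all outputs in parallel rather than fully reducing one row before starting the next:

    ```python
    import numpy as np

    def mv_columnwise(W, x):
        """Column-wise matrix-vector product y = W @ x.

        All output elements accumulate in parallel, consuming one input
        element per step; in hardware this avoids waiting for a full
        row-wise dot-product reduction, hiding the recurrent dependency.
        """
        y = np.zeros(W.shape[0])
        for j in range(W.shape[1]):         # one weight column per step
            y += W[:, j] * x[j]             # element-parallel multiply-accumulate
        return y

    W = np.arange(6.0).reshape(2, 3)
    x = np.array([1.0, 2.0, 3.0])
    assert np.allclose(mv_columnwise(W, x), W @ x)
    ```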

    Boosting the hardware-efficiency of cascade support vector machines for embedded classification applications

    Support Vector Machines (SVMs) are considered a state-of-the-art classification algorithm capable of high accuracy rates across a wide range of applications. When arranged in a cascade structure, SVMs can efficiently handle problems where the majority of data belongs to one of the two classes, such as image object classification, and hence can provide speedups over monolithic (single) SVM classifiers. However, the SVM classification process is still computationally demanding due to the number of support vectors. Consequently, in this paper we propose a hardware architecture optimized for cascaded SVM processing to boost performance and hardware efficiency, along with a hardware reduction method to reduce the overheads from the implementation of additional stages in the cascade, leading to significant resource and power savings. The architecture was evaluated for the application of object detection on 800×600 resolution images on a Spartan-6 Industrial Video Processing FPGA platform, achieving over 30 frames per second. Moreover, by utilizing the proposed hardware reduction method we were able to reduce the utilization of FPGA custom-logic resources by ∼30%, and simultaneously observed a ∼20% peak power reduction compared to a baseline implementation.
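
    The paper does not spell out its decision flow, but a cascade classifier generically works by early rejection: cheap stages with few support vectors discard most samples of the dominant class, and only survivors reach the expensive final SVM. A minimal sketch with hypothetical stage functions:

    ```python
    def cascade_classify(x, stages, threshold=0.0):
        """Cascade SVM: early stages cheaply reject the dominant class.

        stages: decision functions ordered from few to many support
                vectors; each returns a signed SVM score f(x).
        """
        for stage in stages[:-1]:
            if stage(x) < threshold:        # confident negative: stop early
                return -1
        return 1 if stages[-1](x) >= threshold else -1

    # Toy two-stage cascade with made-up linear decision functions
    stages = [lambda x: 0.9 * x[0] - 0.2,           # cheap first stage
              lambda x: 0.5 * x[0] + 0.4 * x[1]]    # full final stage
    print(cascade_classify((1.0, -0.5), stages))    # -> 1
    ```

    Because most image windows contain no object, the average classification cost per window approaches that of the first stage alone.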

    Control Strategies for Open-End Winding Drives Operating in the Flux-Weakening Region

    This paper presents and compares control strategies for three-phase open-end winding drives operating in the flux-weakening region. A six-leg inverter with a single dc-link is associated with the machine in order to use a single energy source. With this topology, the zero-sequence circuit has to be considered, since a zero-sequence current can circulate in the windings. Therefore, conventional over-modulation strategies are not appropriate when the machine enters the flux-weakening region. A few solutions dealing with the zero-sequence circuit have been proposed in the literature. They use either a modified space vector modulation or a conventional modulation with additional voltage limitations. The paper describes the aforementioned strategies, and then a new strategy is proposed. This new strategy takes into account the magnitudes and phase angles of the voltage harmonic components. This yields better voltage utilization in the dq frame. Furthermore, inverter saturation is avoided in the zero-sequence frame and therefore zero-sequence current control is maintained. Three methods are implemented on a test bed composed of a three-phase permanent-magnet synchronous machine, a six-leg inverter and a hybrid DSP/FPGA controller. Experimental results are presented and compared for all strategies. A performance analysis is conducted with regard to the region of operation and the machine parameters.
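
    The paper's new strategy accounts for both magnitudes and phase angles of the harmonic components; as a far simpler point of reference, the "conventional modulation with additional voltage limitations" it compares against can be sketched as reserving dc-link headroom for the zero-sequence component and clipping the dq reference. All quantities and the voltage-limit model below are assumptions for illustration only.

    ```python
    import math

    def limit_dq_refs(v_d, v_q, v0_peak, v_max_phase):
        """Clip dq voltage references, reserving headroom for zero-sequence.

        v_d, v_q:    dq-frame voltage references from the current controllers
        v0_peak:     peak zero-sequence voltage (e.g. third-harmonic
                     compensation) that must still fit within the inverter limit
        v_max_phase: maximum phase voltage the six-leg inverter can apply in
                     its linear range (model assumed for illustration)
        """
        headroom = max(v_max_phase - v0_peak, 0.0)  # what remains for the dq vector
        v_mag = math.hypot(v_d, v_q)
        if v_mag > headroom:                        # flux-weakening: clip magnitude,
            scale = headroom / v_mag                # preserve the voltage angle
            v_d, v_q = v_d * scale, v_q * scale
        return v_d, v_q
    ```

    This amplitude-only limitation is pessimistic because it ignores the relative phase of the zero-sequence harmonic, which is precisely the voltage headroom the proposed phase-aware strategy recovers.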

    Multi-Tenant Cloud FPGA: A Survey on Security

    With the exponentially increasing demand for performance and scalability in cloud applications and systems, data center architectures have evolved to integrate heterogeneous computing fabrics that leverage CPUs, GPUs, and FPGAs. FPGAs differ from traditional processing platforms such as CPUs and GPUs in that they are reconfigurable at run-time, providing increased and customized performance, flexibility, and acceleration. FPGAs can perform large-scale search, optimization, and signal-processing tasks with advantages in power, latency, and processing speed. Many public cloud provider giants, including Amazon, Huawei, Microsoft, and Alibaba, have already started integrating FPGA-based cloud acceleration services. While FPGAs in cloud applications enable customized acceleration with low power consumption, they also introduce new security challenges that still need to be reviewed. Allowing cloud users to reconfigure the hardware design after deployment could open backdoors for malicious attackers, potentially putting the cloud platform at risk. Considering these security risks, public cloud providers still do not offer multi-tenant FPGA services. This paper analyzes the security concerns of multi-tenant cloud FPGAs, gives a thorough description of the security problems associated with them, and discusses upcoming challenges in this field of study.