228 research outputs found

    Low power processor architecture and multicore approach for embedded systems

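    13301 Kō No. 4319, Doctor of Engineering, Kanazawa University. Doctoral dissertation, full text. Published in: 1. IEICE Transactions Vol. E98-C(7), pp. 544-549, 2015. IEICE. Co-authors: S. Otani, H. Kondo. / 2. Reuse permission evidence sent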

    Digital implementation of an upstream DOCSIS QAM modulator and channel emulator

    The concept of cable television, originally called community antenna television (CATV), began in the 1940's. The information and services provided by cable operators have changed drastically since the early days. Cable service providers are no longer simply providing their customers with broadcast television but are providing a multi-purpose, two-way link to the digital world. Custom programming, telephone service, radio, and high-speed internet access are just a few of the services offered by cable service providers in the 21st century. At the dawn of the internet the dominant mode of access was through telephone lines. Despite advances in dial-up modem technology, the telephone system was unable to keep pace with the demand for data throughput. In the late 1990's an industry consortium known as Cable Television Laboratories, Inc. developed a standard protocol for providing high-speed internet access through the existing CATV infrastructure. This protocol is known as Data Over Cable Service Interface Specification (DOCSIS) and it helped to usher in the era of the information superhighway. CATV systems use different parts of the radio frequency (RF) spectrum for communication to and from the user. The downstream portion (data destined for the user) consumes the bulk of the spectrum and is located at relatively high frequencies. The upstream portion (data destined to the network from the user) of the spectrum is smaller and located at the low end of the spectrum. This lower frequency region of the RF spectrum is particularly prone to impairments such as micro-reflections, which can be viewed as a type of multipath interference. Upstream data transfer in the presence of these impairments is therefore problematic and requires complex signal correction algorithms to be employed in the receiver. The quality of a receiver is largely determined by how well it mitigates the signal impairments introduced by the channel. 
For this reason, engineers developing a receiver require a piece of equipment that can emulate the channel impairments in any permutation in order to test their receiver. The conventional test methodology uses a hardware RF channel emulator connected between the transmitter and the receiver under test. This method not only requires an expensive RF channel emulator, but a functioning analog front-end as well. Of these two problems, the expense of the hardware emulator is likely less important than the delay in development caused by waiting for a functional analog front-end. Receiver design is an iterative, time consuming process that requires the receiver's digital signal processing (DSP) algorithms be tested as early as possible to reduce the time-to-market. This thesis presents a digital implementation of a DOCSIS-compliant channel emulator whereby cable micro-reflections and thermal noise at the analog front-end of the receiver are modelled digitally at baseband. The channel emulator and the modulator are integrated into a single hardware structure to produce a compact circuit that, during receiver testing, resides inside the same field programmable gate array (FPGA) as the receiver. This approach removes the dependence on the analog front-end allowing it to be developed concurrently with the receiver's DSP circuits, thus reducing the time-to-market. The approach taken in this thesis produces a fully programmable channel emulator that can be loaded onto FPGAs as needed by engineers working independently on different receiver designs. The channel emulator uses 3 independent data streams to produce a 3-channel signal, whereby a main channel with micro-reflections is flanked on either side by adjacent channels. Thermal noise normally generated by the receiver's analog front-end is emulated and injected into the signal. The resulting structure utilizes 43 dedicated multipliers and 401.125 KB of RAM, and achieves a modulation error ratio (MER) of 55.29 dB
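The baseband channel model described in this abstract (micro-reflections plus thermal noise, scored by MER) can be illustrated with a minimal sketch. It is not the thesis's FPGA design: it assumes 16-QAM symbols, echoes at whole-symbol delays, and floating-point arithmetic, and the names `qam16_symbols`, `emulate_channel`, and `mer_db` are hypothetical.

```python
import math
import random

def qam16_symbols(n, rng):
    """Draw n random 16-QAM symbols scaled to unit average power."""
    levels = (-3, -1, 1, 3)
    scale = 1 / math.sqrt(10)  # E[|s|^2] = 10 for levels {-3, -1, 1, 3}
    return [complex(rng.choice(levels), rng.choice(levels)) * scale
            for _ in range(n)]

def emulate_channel(tx, echoes, snr_db, rng):
    """Add micro-reflections and thermal noise to a baseband signal.

    echoes: list of (delay_in_symbols, complex_gain) pairs (assumed model).
    snr_db: ratio of transmitted signal power to added noise power.
    """
    signal_power = sum(abs(s) ** 2 for s in tx) / len(tx)
    # Noise power is split equally between the I and Q components.
    sigma = math.sqrt(signal_power / 10 ** (snr_db / 10) / 2)
    rx = []
    for k, s in enumerate(tx):
        y = s
        for delay, gain in echoes:
            if k >= delay:
                y += gain * tx[k - delay]  # delayed, attenuated copy
        y += complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
        rx.append(y)
    return rx

def mer_db(ideal, received):
    """Modulation error ratio: ideal symbol power over error power, in dB."""
    sig = sum(abs(s) ** 2 for s in ideal)
    err = sum(abs(r - s) ** 2 for s, r in zip(ideal, received))
    return 10 * math.log10(sig / err)

rng = random.Random(0)
tx = qam16_symbols(20000, rng)
noisy = emulate_channel(tx, [], 30.0, rng)             # noise only
echoed = emulate_channel(tx, [(3, 0.1)], 100.0, rng)   # one -20 dBc echo
print(round(mer_db(tx, noisy), 1), round(mer_db(tx, echoed), 1))
```

With no equalizer in the loop, a single echo at -20 dBc limits the MER to roughly 20 dB, which is why the receiver under test needs the correction algorithms the abstract mentions.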

    Low power signal processing research at Stanford

    This paper gives an overview of the research being conducted at Stanford University's Space, Telecommunications, and Radioscience Laboratory in the area of low energy computation. It discusses the work we are doing in large scale digital VLSI neural networks, interleaved processor and pipelined memory architectures, energy estimation and optimization, multichip module packaging, and low voltage digital logic

    FPGA-based EMT simulation

    With the massive integration of renewable energy and power electronics, the structure of modern power systems is becoming increasingly complex. This poses a great challenge for the secure and reliable operation of large power systems. Electromagnetic Transients (EMT) simulations are recognised as playing an important role in analysing the operation, control and protection of large power systems. Hence it is desirable to investigate faster and more efficient simulation methods for large-scale power systems. Existing off-line software packages, such as MATLAB and PSCAD, are slow when simulating EMTs of large-scale power systems, and there is a growing demand for cheaper and faster EMT simulation techniques. The rapid evolution of computing technologies makes it possible to bring EMT simulation into the real-time domain. With its fast calculation speed, configurable resources and programmable structure, the Field-Programmable Gate Array (FPGA) is a powerful platform for EMT simulation. This thesis therefore proposes high-performance FPGA-based EMT simulation models and algorithms with improved accuracy, speed, and efficiency. First, Single-Precision, Double-Precision and Mixed-Precision algorithms are proposed and compared, for the first time, to enhance numerical accuracy. Existing publications consider only Single-Precision as the default option and ignore the possible numerical phase shift in the FPGA. As a basis for EMT simulation, a library of fundamental power system components is built, including linear and nonlinear components. The Single-Precision and Double-Precision algorithms use the same precision for all components. Taking component dynamics into account, the key to the Mixed-Precision algorithm is to simulate linear and nonlinear components using Single-Precision and Double-Precision respectively, in order to eliminate the phase shift. The hardware structure is also optimized so that these different precision algorithms can be implemented on a single FPGA board. 
By comparison with the same reference models in MATLAB, the three algorithms are tested in case studies that include linear components and rotating nonlinear components on smaller and larger systems. The adaptability of these algorithms is verified in terms of accuracy, resource utilization, and timing. Second, four initialization methods are developed, for the first time, to minimize accumulated error in FPGA-based EMT simulation. Most existing studies start from a non-initialized time point, which increases random error. To simulate from an initial steady state, fast and reliable initialization is worth investigating. As background, the key issues for FPGA-based initialization are discussed first, including the initialization model type, memory unit and sequence. Both VHDL (.VHD) files and Coefficient (.COE) files allow initial data to be defined. To maximize flexibility, four initialization methods are proposed and provided with detailed program code: Method 1 (physical interface), Method 2 (signal declaration), Method 3 (signal assignment) and Method 4 (COE file). For ahead-of-time evaluation, the routing and timing performance of the four methods is compared, and Method 4 is the simplest. The implementation structure and algorithm of Method 4 are developed to allow flexible data transfer between different platforms. To demonstrate scalability, device-level and system-level case studies are both provided to compare the practical performance of Methods 1-4. Third, a generic MATLAB-to-FPGA toolbox is developed to simplify hardware design for users. The motivation is that FPGA programming is complex, and using an FPGA to model EMT is especially time-consuming for beginners. Existing translation toolboxes have no EMT functionality and focus on direct translation from other languages to VHDL/Verilog. The developed toolbox can therefore help beginners become familiar with FPGA-based EMT more quickly. 
Within a user-friendly environment, the development framework, design features and requirements are established for fast processing. For a general processing form, the data format, structure and partition are developed using MATLAB built-in functions. To support low-level calculations, translation and resource reuse with IP cores are presented. To integrate and control the low-level calculations, high-level main controllers based on a Finite State Machine (FSM) set up sequencers for pipelined and non-pipelined stages. A 39-bus network case study is provided to verify the effectiveness of the proposed MATLAB-to-FPGA toolbox. To support high-frequency switching, power electronic devices and control systems are also developed for FPGA-based EMT simulation. Whereas existing research focuses on using multiple FPGAs to simulate an HVDC-MMC system, this research aims to implement the operation and control of a whole HVDC-MMC system on a single FPGA platform. This can help reduce FPGA area cost and improve resource utilization efficiency. As a supplement to the existing power system components, power electronic devices, such as the IGBT and the MMC (Modular Multi-level Converter), are built as discrete-time mathematical models. Based on the trapezoidal rule, an aggregate model of the MMC is derived that uses a single equivalent module to represent all modules, regardless of the number of modules. The control system, including PWM control and the current control loop, is also modelled and simplified to make it more suitable for hardware implementation. Optimization strategies, such as shift memory and interpolation, are also proposed to achieve lower resource utilization and faster calculation speed in the FPGA. With these strategies, the PWM control block and the HVDC-MMC case studies can both be implemented on a single FPGA board with high accuracy and efficient resource utilization
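The abstract above builds its discrete-time component models on the trapezoidal rule. As a hedged illustration of that discretization only (not the thesis's MMC model or its FPGA arithmetic), the sketch below steps a series RL branch governed by L di/dt = v - R i. The function name `simulate_rl_trapezoidal` and all component values are assumptions chosen for the example.

```python
import math

def simulate_rl_trapezoidal(v_source, r, l, dt, steps):
    """Integrate L di/dt = v(t) - R i(t) with the trapezoidal rule.

    Averaging the right-hand side over one step gives
        i[n] = ((1 - a) * i[n-1] + b * (v[n] + v[n-1])) / (1 + a)
    with a = R*dt/(2L) and b = dt/(2L).
    """
    a = r * dt / (2 * l)
    b = dt / (2 * l)
    i = 0.0                      # branch current starts at rest
    v_prev = v_source(0.0)
    history = [i]
    for n in range(1, steps + 1):
        v_now = v_source(n * dt)
        i = ((1 - a) * i + b * (v_now + v_prev)) / (1 + a)
        v_prev = v_now
        history.append(i)
    return history

# Hypothetical values: 1 V step into R = 1 ohm, L = 1 mH, 1 us time step.
r, l, dt, steps = 1.0, 1e-3, 1e-6, 5000
current = simulate_rl_trapezoidal(lambda t: 1.0, r, l, dt, steps)
exact = (1.0 / r) * (1 - math.exp(-r * steps * dt / l))
print(current[-1], exact)
```

The update needs only multiplications and additions per step once `a` and `b` are precomputed, which is one reason this rule maps well onto fixed hardware pipelines.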

    A configurable vector processor for accelerating speech coding algorithms

    The growing demand for voice-over-packet (VoIP) services and multimedia-rich applications has made the efficient, real-time implementation of low-bit-rate speech coders on embedded VLSI platforms increasingly important. Such speech coders are designed to substantially reduce bandwidth requirements, thus enabling dense multichannel gateways in a small form factor. This, however, comes at a high computational cost, which mandates the use of very high performance embedded processors. This thesis investigates the potential acceleration of two major ITU-T speech coding algorithms, namely G.729A and G.723.1, through their efficient implementation on a configurable, extensible vector embedded CPU architecture. New scalar and vector ISAs were introduced, which resulted in up to an 80% reduction in the dynamic instruction count of both workloads. These instructions were subsequently encapsulated into a parametric, hybrid SISD (scalar processor)–SIMD (vector) processor. This work presents the research and implementation of the vector datapath of this vector coprocessor, which is tightly coupled to a Sparc-V8 compliant CPU, the optimization and simulation methodologies employed, and the use of Electronic System Level (ESL) techniques to rapidly design SIMD datapaths

    Superscalar RISC-V Processor with SIMD Vector Extension

    With the increasing number of digital products on the market, the need for robust and highly configurable processors is rising. This demand is met by the stable and extensible open-source RISC-V instruction set architecture. RISC-V processors are becoming popular in many fields of application and research. This thesis presents a dual-issue superscalar RISC-V processor design with dynamic execution. The proposed design employs the global sharing scheme for branch prediction and the Tomasulo algorithm for out-of-order execution. The processor is capable of speculative execution with five checkpoints. Data flow in the instruction dispatch and commit stages is optimized to achieve higher instruction throughput. The superscalar processor is extended with a customized vector instruction set of single-instruction-multiple-data computations, specifically to improve performance on machine learning tasks. Following the definition of the proposed vector instruction set, the scratchpad memory and element-wise arithmetic units are implemented in the vector co-processor. Different test programs are evaluated on the fully tested superscalar processor. Compared to the reference work, the proposed design improves average instruction throughput by 18.9% and average prediction hit rate by 4.92%, with a 16.9% higher operating clock frequency when synthesized on the Intel Arria 10 FPGA board. The forward propagation of a convolutional neural network model is evaluated on the standalone superscalar processor and with the vector co-processor integrated. The vector program with software-level optimizations achieves a 9.53× improvement in instruction throughput and a 10.18× improvement in real-time throughput. Moreover, the integration also provides 2.22× the energy efficiency of the superscalar processor alone
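The "global sharing scheme" in this abstract is read here as the widely used gshare predictor, which is an assumption about the thesis rather than something the abstract spells out. In gshare, the branch address is XORed with a global history register to index a table of 2-bit saturating counters. The sketch below is a software model only; the class name, table size, and the test trace are hypothetical.

```python
class GsharePredictor:
    """Software model of a gshare branch predictor (assumed configuration)."""

    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        self.history = 0                          # global branch history
        self.counters = [1] * (1 << index_bits)   # 2-bit counters, weakly not-taken

    def _index(self, pc):
        # Drop the two alignment bits of the address, then XOR with history.
        return ((pc >> 2) ^ self.history) & self.mask

    def predict(self, pc):
        """Return True when the branch is predicted taken."""
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Train the selected counter, then shift the outcome into history."""
        idx = self._index(pc)
        if taken:
            self.counters[idx] = min(3, self.counters[idx] + 1)
        else:
            self.counters[idx] = max(0, self.counters[idx] - 1)
        self.history = ((self.history << 1) | int(taken)) & self.mask

def hit_rate(predictor, trace):
    """Fraction of (pc, taken) pairs in trace that were predicted correctly."""
    hits = 0
    for pc, taken in trace:
        hits += predictor.predict(pc) == taken
        predictor.update(pc, taken)
    return hits / len(trace)

# A loop branch taken seven times and then not taken, repeated 100 times.
loop_trace = [(0x400, i % 8 != 7) for i in range(800)]
print(hit_rate(GsharePredictor(), loop_trace))
```

Because the history register records where the loop is in its cycle, the predictor can learn the final not-taken iteration, which a per-address 2-bit counter on its own would keep mispredicting.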

    FPGA based technical solutions for high throughput data processing and encryption for 5G communication: A review

    Field programmable gate array (FPGA) devices are ideal solutions for high-speed processing applications, given their flexibility, parallel processing capability, and power efficiency. This review paper first presents an overview of the key applications of FPGA-based platforms in 5G networks and systems, which exploit the improved performance offered by such devices. The main applications considered are FPGA-based implementations of cloud radio access network (C-RAN) accelerators, network function virtualization (NFV)-based network slicers, cognitive radio systems, and multiple input multiple output (MIMO) channel characterizers, all of which can benefit from the high processing rate, power efficiency and flexibility of FPGAs. Furthermore, implementations of encryption/decryption algorithms on the Xilinx Zynq Ultrascale+MPSoC ZCU102 FPGA platform are discussed. We then introduce our high-speed and lightweight implementation of the well-known AES-128 algorithm, developed on the same FPGA platform, and compare it with similar solutions already published in the literature. The comparison results indicate that our AES-128 implementation enables efficient hardware usage for a given data rate (up to 28.16 Gbit/s), resulting in higher efficiency (8.64 Mbps/slice) than the other solutions considered. Finally, applications of the ZCU102 platform for high-speed processing are explored, such as image and signal processing, visual recognition, and hardware resource management

    A Survey on FPGA-Based Sensor Systems: Towards Intelligent and Reconfigurable Low-Power Sensors for Computer Vision, Control and Signal Processing

    The current trend in the evolution of sensor systems seeks ways to provide more accuracy and resolution while decreasing size and power consumption. Field Programmable Gate Arrays (FPGAs) provide reprogrammable hardware technology that can be exploited to obtain a reconfigurable sensor system. This adaptation capability enables the implementation of complex applications using partial reconfigurability at very low power consumption. For highly demanding tasks, FPGAs have been favored for the high efficiency provided by their architectural flexibility (parallelism, on-chip memory, etc.), their reconfigurability, and their superb performance in the development of algorithms. FPGAs have improved the performance of sensor systems and have triggered a clear increase in their use in new fields of application. A new generation of smarter, reconfigurable and lower power consumption sensors based on FPGAs is being developed in Spain. This paper presents a review of these developments, describes the FPGA technologies employed by the different research groups, and provides an overview of future research within this field. The research leading to these results has received funding from the Spanish Government and European FEDER funds (DPI2012-32390), the Valencia Regional Government (PROMETEO/2013/085) and the University of Alicante (GRE12-17)

    Fast Implementation of Lifting Based DWT Architecture For Image Compression

    Technological growth in the semiconductor industry has led to unprecedented demand for faster, area-efficient and low-power VLSI circuits for complex image processing applications. DWT-IDWT is one of the most popular IPs used for image transformation. In this work, a high-speed, low-power DWT-IDWT architecture is designed and implemented on ASIC using 130 nm technology. A 2D DWT architecture based on the lifting scheme uses multipliers and adders, thus consuming power. This paper addresses power reduction in the multiplier by proposing a modified algorithm for the BZFAD multiplier. The proposed BZFAD multiplier is 65% faster and occupies 44% less area compared with generic multipliers. The DWT architecture designed around the modified BZFAD multiplier achieves a 35% power reduction and operates at a frequency of 200 MHz with a latency of 1536 clock cycles for a 512x512 image. The developed DWT can be used as an IP for VLSI implementation
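The lifting scheme named in this abstract splits a signal into even and odd samples, predicts the odd samples from the even ones, and then updates the even samples from the prediction residuals. The abstract does not say which wavelet the paper uses, so the sketch below assumes the integer LeGall 5/3 filter because it needs only additions and shifts. The function names are hypothetical, and this is a one-dimensional software model rather than the paper's 2D hardware architecture.

```python
def lifting53_forward(x):
    """One level of the integer 5/3 lifting DWT on an even-length list of ints.

    Returns (s, d): approximation and detail coefficients.
    Borders use symmetric extension (assumed convention).
    """
    n = len(x)
    assert n >= 2 and n % 2 == 0, "sketch handles even-length inputs only"
    even, odd = x[0::2], x[1::2]
    half = n // 2
    # Predict step: each odd sample minus the mean of its even neighbours.
    d = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1)
         for i in range(half)]
    # Update step: each even sample plus a rounded quarter of nearby details.
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2)
         for i in range(half)]
    return s, d

def lifting53_inverse(s, d):
    """Undo lifting53_forward by reversing the update and predict steps."""
    half = len(s)
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2)
            for i in range(half)]
    odd = [d[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1)
           for i in range(half)]
    x = [0] * (2 * half)
    x[0::2], x[1::2] = even, odd
    return x

row = [10, 12, 14, 13, 9, 7, 8, 8]
approx, detail = lifting53_forward(row)
assert lifting53_inverse(approx, detail) == row  # lossless round trip
```

A 2D transform would apply the same step to every row and then to every column. Filters with non-trivial coefficients replace the shifts with multiplications, which is where a multiplier design such as the one in the abstract would come in.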