1,917 research outputs found
Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review
The technological evolution has increased the number of transistors for a given die area significantly and increased the switching speed from few MHz to GHz range. Such inversely proportional decline in size and boost in performance consequently demands shrinking of supply voltage and effective power dissipation in chips with millions of transistors. This has triggered substantial amount of research in power reduction techniques into almost every aspect of the chip and particularly the processor cores contained in the chip. This paper presents an overview of techniques for achieving the power efficiency mainly at the processor core level but also visits related domains such as buses and memories. There are various processor parameters and features such as supply voltage, clock frequency, cache and pipelining which can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Also, emerging power efficient processor architectures are overviewed and research activities are discussed which should help reader identify how these factors in a processor contribute to power consumption. Some of these concepts have been already established whereas others are still active research areas. © 2009 ACADEMY PUBLISHER
Baseband Processing for 5G and Beyond: Algorithms, VLSI Architectures, and Co-design
In recent years the number of connected devices and the demand for high data-rates have been significantly increased. This enormous growth is more pronounced by the introduction of the Internet of things (IoT) in which several devices are interconnected to exchange data for various applications like smart homes and smart cities. Moreover, new applications such as eHealth, autonomous vehicles, and connected ambulances set new demands on the reliability, latency, and data-rate of wireless communication systems, pushing forward technology developments. Massive multiple-input multiple-output (MIMO) is a technology, which is employed in the 5G standard, offering the benefits to fulfill these requirements. In massive MIMO systems, base station (BS) is equipped with a very large number of antennas, serving several users equipments (UEs) simultaneously in the same time and frequency resource. The high spatial multiplexing in massive MIMO systems, improves the data rate, energy and spectral efficiencies as well as the link reliability of wireless communication systems. The link reliability can be further improved by employing channel coding technique. Spatially coupled serially concatenated codes (SC-SCCs) are promising channel coding schemes, which can meet the high-reliability demands of wireless communication systems beyond 5G (B5G). Given the close-to-capacity error correction performance and the potential to implement a high-throughput decoder, this class of code can be a good candidate for wireless systems B5G. In order to achieve the above-mentioned advantages, sophisticated algorithms are required, which impose challenges on the baseband signal processing. In case of massive MIMO systems, the processing is much more computationally intensive and the size of required memory to store channel data is increased significantly compared to conventional MIMO systems, which are due to the large size of the channel state information (CSI) matrix. In addition to the high computational complexity, meeting latency requirements is also crucial. Similarly, the decoding-performance gain of SC-SCCs also do come at the expense of increased implementation complexity. Moreover, selecting the proper choice of design parameters, decoding algorithm, and architecture will be challenging, since spatial coupling provides new degrees of freedom in code design, and therefore the design space becomes huge. The focus of this thesis is to perform co-optimization in different design levels to address the aforementioned challenges/requirements. To this end, we employ system-level characteristics to develop efficient algorithms and architectures for the following functional blocks of digital baseband processing. First, we present a fast Fourier transform (FFT), an inverse FFT (IFFT), and corresponding reordering scheme, which can significantly reduce the latency of orthogonal frequency-division multiplexing (OFDM) demodulation and modulation as well as the size of reordering memory. The corresponding VLSI architectures along with the application specific integrated circuit (ASIC) implementation results in a 28 nm CMOS technology are introduced. In case of a 2048-point FFT/IFFT, the proposed design leads to 42% reduction in the latency and size of reordering memory. Second, we propose a low-complexity massive MIMO detection scheme. The key idea is to exploit channel sparsity to reduce the size of CSI matrix and eventually perform linear detection followed by a non-linear post-processing in angular domain using the compressed CSI matrix. The VLSI architecture for a massive MIMO with 128 BS antennas and 16 UEs along with the synthesis results in a 28 nm technology are presented. As a result, the proposed scheme reduces the complexity and required memory by 35%–73% compared to traditional detectors while it has better detection performance. Finally, we perform a comprehensive design space exploration for the SC-SCCs to investigate the effect of different design parameters on decoding performance, latency, complexity, and hardware cost. Then, we develop different decoding algorithms for the SC-SCCs and discuss the associated decoding performance and complexity. Also, several high-level VLSI architectures along with the corresponding synthesis results in a 12 nm process are presented, and various design tradeoffs are provided for these decoding schemes
Center for Aeronautics and Space Information Sciences
This report summarizes the research done during 1991/92 under the Center for Aeronautics and Space Information Science (CASIS) program. The topics covered are computer architecture, networking, and neural nets
Edge computing infrastructure for 5G networks: a placement optimization solution
This thesis focuses on how to optimize the placement of the Edge Computing infrastructure for upcoming 5G networks. To this aim, the core contributions of this research are twofold: 1) a novel heuristic called Hybrid Simulated Annealing to tackle the NP-hard nature of the problem and, 2) a framework called EdgeON providing a practical tool for real-life deployment optimization.
In more detail, Edge Computing has grown into a key solution to 5G latency, reliability and scalability requirements. By bringing computing, storage and networking resources to the edge of the network, delay-sensitive applications, location-aware systems and upcoming real-time services leverage the benefits of a reduced physical and logical path between the end-user and the data or service host.
Nevertheless, the edge node placement problem raises critical concerns regarding deployment and operational expenditures (i.e., mainly due to the number of nodes to be deployed), current backhaul network capabilities and non-technical placement limitations. Common approaches to the placement of edge nodes are based on: Mobile Edge Computing (MEC), where the processing capabilities are deployed at the Radio Access Network nodes and Facility Location Problem variations, where a simplistic cost function is used to determine where to optimally place the infrastructure. However, these methods typically lack the flexibility to be used for edge node placement under the strict technical requirements identified for 5G networks. They fail to place resources at the network edge for 5G ultra-dense networking environments in a network-aware manner.
This doctoral thesis focuses on rigorously defining the Edge Node Placement Problem (ENPP) for 5G use cases and proposes a novel framework called EdgeON aiming at reducing the overall expenses when deploying and operating an Edge Computing network, taking into account the usage and characteristics of the in-place backhaul network and the strict requirements of a 5G-EC ecosystem. The developed framework implements several placement and optimization strategies thoroughly assessing its suitability to solve the network-aware ENPP. The core of the framework is an in-house developed heuristic called Hybrid Simulated Annealing (HSA), seeking to address the high complexity of the ENPP while avoiding the non-convergent behavior of other traditional heuristics (i.e., when applied to similar problems).
The findings of this work validate our approach to solve the network-aware ENPP, the effectiveness of the heuristic proposed and the overall applicability of EdgeON. Thorough performance evaluations were conducted on the core placement solutions implemented revealing the superiority of HSA when compared to widely used heuristics and common edge placement approaches (i.e., a MEC-based strategy). Furthermore, the practicality of EdgeON was tested through two main case studies placing services and virtual network functions over the previously optimally placed edge nodes.
Overall, our proposal is an easy-to-use, effective and fully extensible tool that can be used by operators seeking to optimize the placement of computing, storage and networking infrastructure at the users’ vicinity. Therefore, our main contributions not only set strong foundations towards a cost-effective deployment and operation of an Edge Computing network, but directly impact the feasibility of upcoming 5G services/use cases and the extensive existing research regarding the placement of services and even network service chains at the edge
Recommended from our members
Efficient FPGA implementation and power modelling of image and signal processing IP cores
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.Field Programmable Gate Arrays (FPGAs) are the technology of choice in a number ofimage
and signal processing application areas such as consumer electronics, instrumentation,
medical data processing and avionics due to their reasonable energy consumption, high performance, security, low design-turnaround time and reconfigurability. Low power FPGA
devices are also emerging as competitive solutions for mobile and thermally constrained platforms. Most computationally intensive image and signal processing algorithms also consume a lot of power leading to a number of issues including reduced mobility, reliability concerns and increased design cost among others. Power dissipation has become one of the most important challenges, particularly for FPGAs. Addressing this problem requires optimisation and awareness at all levels in the design flow. The key achievements of the
work presented in this thesis are summarised here. Behavioural level optimisation strategies have been used for implementing matrix product and inner product through the use of mathematical techniques such as Distributed Arithmetic (DA) and its variations including offset binary coding, sparse factorisation and novel vector level transformations. Applications to test the impact of these algorithmic and arithmetic transformations include the fast Hadamard/Walsh transforms and Gaussian mixture models. Complete design space exploration has been performed on these cores, and where appropriate, they have been shown to clearly outperform comparable existing implementations. At the architectural level, strategies such as parallelism, pipelining and systolisation have been successfully applied for the design and optimisation of a number of
cores including colour space conversion, finite Radon transform, finite ridgelet transform and circular convolution. A pioneering study into the influence of supply voltage scaling for FPGA based designs, used in conjunction with performance enhancing strategies such as parallelism and pipelining has been performed. Initial results are very promising and indicated significant potential for future research in this area.
A key contribution of this work includes the development of a novel high level power macromodelling technique for design space exploration and characterisation of custom IP cores for FPGAs, called Functional Level Power Analysis and Modelling (FLPAM). FLPAM
is scalable, platform independent and compares favourably with existing approaches. A hybrid, top-down design flow paradigm integrating FLPAM with commercially available design tools for systematic optimisation of IP cores has also been developed
Recommended from our members
ENERGY EFFICIENCY EXPLORATION OF COARSE-GRAIN RECONFIGURABLE ARCHITECTURE WITH EMERGING NONVOLATILE MEMORY
With the rapid growth in consumer electronics, people expect thin, smart and powerful devices, e.g. Google Glass and other wearable devices. However, as portable electronic products become smaller, energy consumption becomes an issue that limits the development of portable systems due to battery lifetime. In general, simply reducing device size cannot fully address the energy issue.
To tackle this problem, we propose an on-chip interconnect infrastructure and pro- gram storage structure for a coarse-grained reconfigurable architecture (CGRA) with emerging non-volatile embedded memory (MRAM). The interconnect is composed of a matrix of time-multiplexed switchboxes which can be dynamically reconfigured with the goal of energy reduction. The number of processors performing computation can also be adapted. The use of MRAM provides access to high-density storage and lower memory energy consumption versus more standard SRAM technologies. The combination of CGRA, MRAM, and flexible on-chip interconnection is considered for signal processing. This application domain is of interest based on its time-varying computing demands.
To evaluate CGRA architectural features, prototype architectures have been pro- totyped in a field-programmable gate array (FPGA). Measurements of energy, power, instruction count, and execution time performance are considered for a scalable num- ber of processors. Applications such as adaptive Viterbi decoding and Reed Solomon coding are used for evaluation. To complete this thesis, a time-scheduled switchbox was integrated into our CGRA model. This model was prototyped on an FPGA. It is shown that energy consumption can be reduced by about 30% if dynamic design reconfiguration is performed
A Case Study of Edge Computing Implementations: Multi-access Edge Computing, Fog Computing and Cloudlet
With the explosive growth of intelligent and mobile devices, the current centralized cloud computing paradigm is encountering difficult challenges. Since the primary requirements have shifted towards implementing real-time response and supporting context awareness and mobility, there is an urgent need to bring resources and functions of centralized clouds to the edge of networks, which has led to the emergence of the edge computing paradigm. Edge computing increases the responsibilities of network edges by hosting computation and services, therefore enhancing performances and improving quality of experience (QoE). Fog computing, multi-access edge computing (MEC), and cloudlet are three typical and promising implementations of edge computing. Fog computing aims to build a system that enables cloud-to-thing service connectivity and works in concert with clouds, MEC is seen as a key technology of the fifth generation (5G) system, and Cloudlet is a micro-data center deployed in close proximity. In terms of deployment scenarios, Fog computing focuses on the Internet of Things (IoT), MEC mainly provides mobile RAN application solutions for 5G systems, and cloudlet offloads computing power at the network edge. In this paper, we present a comprehensive case study on these three edge computing implementations, including their architectures, differences, and their respective application scenario in IoT, 5G wireless systems, and smart edge. We discuss the requirements, benefits, and mechanisms of typical co-deployment cases for each paradigm and identify challenges and future directions in edge computing
- …