156 research outputs found
State of the art baseband DSP platforms for Software Defined Radio: A survey
Software Defined Radio (SDR) is an innovative approach which is becoming a more and more promising technology for future mobile handsets. Several proposals in the field of embedded systems have been introduced by different universities and industries to support SDR applications. This article presents an overview of current platforms and analyzes the related architectural choices, the current issues in SDR, as well as potential future trends.Peer reviewe
Multi-core architectures with coarse-grained dynamically reconfigurable processors for broadband wireless access technologies
Broadband Wireless Access technologies have significant market potential, especially the
WiMAX protocol which can deliver data rates of tens of Mbps. Strong demand for high
performance WiMAX solutions is forcing designers to seek help from multi-core processors
that offer competitive advantages in terms of all performance metrics, such as speed, power
and area. Through the provision of a degree of flexibility similar to that of a DSP and
performance and power consumption advantages approaching that of an ASIC,
coarse-grained dynamically reconfigurable processors are proving to be strong candidates
for processing cores used in future high performance multi-core processor systems.
This thesis investigates multi-core architectures with a newly emerging dynamically
reconfigurable processor – RICA, targeting WiMAX physical layer applications. A novel
master-slave multi-core architecture is proposed, using RICA processing cores. A SystemC
based simulator, called MRPSIM, is devised to model this multi-core architecture. This
simulator provides fast simulation speed and timing accuracy, offers flexible architectural
options to configure the multi-core architecture, and enables the analysis and investigation
of multi-core architectures. Meanwhile a profiling-driven mapping methodology is
developed to partition the WiMAX application into multiple tasks as well as schedule and
map these tasks onto the multi-core architecture, aiming to reduce the overall system
execution time. Both the MRPSIM simulator and the mapping methodology are seamlessly
integrated with the existing RICA tool flow.
Based on the proposed master-slave multi-core architecture, a series of diverse
homogeneous and heterogeneous multi-core solutions are designed for different fixed
WiMAX physical layer profiles. Implemented in ANSI C and executed on the MRPSIM
simulator, these multi-core solutions contain different numbers of cores, combine various memory architectures and task partitioning schemes, and deliver high throughputs at
relatively low area costs. Meanwhile a design space exploration methodology is developed
to search the design space for multi-core systems to find suitable solutions under certain
system constraints. Finally, laying a foundation for future multithreading exploration on the
proposed multi-core architecture, this thesis investigates the porting of a real-time operating
system – Micro C/OS-II to a single RICA processor. A multitasking version of WiMAX is
implemented on a single RICA processor with the operating system support
Building Programmable Wireless Networks: An Architectural Survey
In recent times, there have been a lot of efforts for improving the ossified
Internet architecture in a bid to sustain unstinted growth and innovation. A
major reason for the perceived architectural ossification is the lack of
ability to program the network as a system. This situation has resulted partly
from historical decisions in the original Internet design which emphasized
decentralized network operations through co-located data and control planes on
each network device. The situation for wireless networks is no different
resulting in a lot of complexity and a plethora of largely incompatible
wireless technologies. The emergence of "programmable wireless networks", that
allow greater flexibility, ease of management and configurability, is a step in
the right direction to overcome the aforementioned shortcomings of the wireless
networks. In this paper, we provide a broad overview of the architectures
proposed in literature for building programmable wireless networks focusing
primarily on three popular techniques, i.e., software defined networks,
cognitive radio networks, and virtualized networks. This survey is a
self-contained tutorial on these techniques and its applications. We also
discuss the opportunities and challenges in building next-generation
programmable wireless networks and identify open research issues and future
research directions.Comment: 19 page
System-on-chip Computing and Interconnection Architectures for Telecommunications and Signal Processing
This dissertation proposes novel architectures and design techniques targeting SoC building blocks for telecommunications and signal processing applications.
Hardware implementation of Low-Density Parity-Check decoders is approached at both the algorithmic and the architecture level. Low-Density Parity-Check codes are a promising coding scheme for future communication standards due to their outstanding error correction performance.
This work proposes a methodology for analyzing effects of finite precision arithmetic on error correction performance and hardware complexity. The methodology is throughout employed for co-designing the decoder. First, a low-complexity check node based on the P-output decoding principle is designed and characterized on a CMOS standard-cells library. Results demonstrate implementation loss below 0.2 dB down to BER of 10^{-8} and a saving in complexity up to 59% with respect to other works in recent literature. High-throughput and low-latency issues are addressed with modified single-phase decoding schedules. A new "memory-aware" schedule is proposed requiring down to 20% of memory with respect to the traditional two-phase flooding decoding. Additionally, throughput is doubled and logic complexity reduced of 12%. These advantages are traded-off with error correction performance, thus making the solution attractive only for long codes, as those adopted in the DVB-S2 standard. The "layered decoding" principle is extended to those codes not specifically conceived for this technique. Proposed architectures exhibit complexity savings in the order of 40% for both area and power consumption figures, while implementation loss is smaller than 0.05 dB.
Most modern communication standards employ Orthogonal Frequency Division Multiplexing as part of their physical layer. The core of OFDM is the Fast Fourier Transform and its inverse in charge of symbols (de)modulation. Requirements on throughput and energy efficiency call for FFT hardware implementation, while ubiquity of FFT suggests the design of parametric, re-configurable and re-usable IP hardware macrocells. In this context, this thesis describes an FFT/IFFT core compiler particularly suited for implementation of OFDM communication systems. The tool employs an accuracy-driven configuration engine which automatically profiles the internal arithmetic and generates a core with minimum operands bit-width and thus minimum circuit complexity. The engine performs a closed-loop optimization over three different internal arithmetic models (fixed-point, block floating-point and convergent block floating-point) using the numerical accuracy budget given by the user as a reference point. The flexibility and re-usability of the proposed macrocell are illustrated through several case studies which encompass all current state-of-the-art OFDM communications standards (WLAN, WMAN, xDSL, DVB-T/H, DAB and UWB). Implementations results are presented for two deep sub-micron standard-cells libraries (65 and 90 nm) and commercially available FPGA devices. Compared with other FFT core compilers, the proposed environment produces macrocells with lower circuit complexity and same system level performance (throughput, transform size and numerical accuracy).
The final part of this dissertation focuses on the Network-on-Chip design paradigm whose goal is building scalable communication infrastructures connecting hundreds of core. A low-complexity link architecture for mesochronous on-chip communication is discussed. The link enables skew constraint looseness in the clock tree synthesis, frequency speed-up, power consumption reduction and faster back-end turnarounds. The proposed architecture reaches a maximum clock frequency of 1 GHz on 65 nm low-leakage CMOS standard-cells library. In a complex test case with a full-blown NoC infrastructure, the link overhead is only 3% of chip area and 0.5% of leakage power consumption.
Finally, a new methodology, named metacoding, is proposed. Metacoding generates correct-by-construction technology independent RTL codebases for NoC building blocks. The RTL coding phase is abstracted and modeled with an Object Oriented framework, integrated within a commercial tool for IP packaging (Synopsys CoreTools suite). Compared with traditional coding styles based on pre-processor directives, metacoding produces 65% smaller codebases and reduces the configurations to verify up to three orders of magnitude
Recommended from our members
Photonic Interconnection Networks for Applications in Heterogeneous Utility Computing Systems
Growing demands in heterogeneous utility computing systems in future cloud and high performance computing systems are driving the development of processor-hardware accelerator interconnects with greater performance, flexibility, and dynamism. Recent innovations in the field of utility computing have led to an emergence in the use of heterogeneous compute elements. By leveraging the computing advantages of hardware accelerators alongside typical general purpose processors, performance efficiency can be maximized. The network linking these compute nodes is increasingly becoming the bottleneck in these architectures, limiting the hardware accelerators to be restricted to localized computing.
A high-bandwidth, agile interconnect is an imperative enabler for hardware accelerator delocalization in heterogeneous utility computing. A redesign of these systems' interconnect and architecture will be essential to establishing high-bandwidth, low-latency, efficient, and dynamic heterogeneous systems that can meet the challenges of next-generation utility computing.
By leveraging an optics-based approach, this dissertation presents the design and implementation of optically-connected hardware accelerators (OCHA) that exploit the distance-independent energy dissipation and bandwidth density of photonic transceivers, in combination with the flexibility, efficiency and data parallelization offered by optical networks. By replacing the electronic buses with an optical interconnection network, architectures that delocalize hardware accelerators can be created that are otherwise infeasible.
With delocalized optically-connected hardware accelerator nodes accessible by processors at run time, the system can alleviate the network latency issues plague current heterogeneous systems. Accelerators that would otherwise sit idle, waiting for it's master CPU to feed it data, can instead operate at high utilization rates, leading to dramatic improvements in overall system performance.
This work presents a prototype optically-connect hardware accelerator module and custom optical-network-aware, dynamic hardware accelerator allocator that communicate transparently and optically across an optical interconnection network. The hardware accelerators and processor are optimized to enable hardware acceleration across an optical network using fast packet-switching. The versatility of the optical network enables additional performance benefits including optical multicasting to exploit the data parallelism found in many accelerated data sets. The integration of hardware acceleration, heterogeneous computing, and optics constitutes a critical step for both computing and optics.
The massive data parallelism, application dependent-location and function, as well as network latency, and bandwidth limitations facing networks today complement well with the strength of optical communications-based systems. Moreover, ongoing efforts focusing on development of low-cost optical components and subsystems that are suitable for computing environment may benefit from the high-volume heterogeneous computing market. This work, therefore, takes the first steps in merging the areas of hardware acceleration and optics by developing architectures, protocols, and systems to interface with the two technologies and demonstrating areas of potential benefits and areas for future work. Next-generation heterogeneous utility computing systems will indubitably benefit from the use of efficient, flexible and high-performance optically connect hardware acceleration
Architecture and Analysis for Next Generation Mobile Signal Processing.
Mobile devices have proliferated at a spectacular rate, with more than 3.3 billion active cell phones in the world. With sales totaling hundreds of billions every year, the mobile phone has arguably become the dominant computing platform, replacing the personal computer. Soon, improvements to today’s smart phones, such as high-bandwidth internet access, high-definition video processing, and human-centric interfaces that integrate voice recognition and video-conferencing will be commonplace. Cost effective and power efficient support for these applications will be required.
Looking forward to the next generation of mobile computing, computation requirements will increase by one to three orders of magnitude due to higher
data rates, increased complexity algorithms, and greater computation diversity but the power requirements will be
just as stringent to ensure reasonable battery lifetimes. The design of the next generation of mobile platforms must address three critical challenges: efficiency, programmability, and adaptivity. The computational efficiency of existing solutions is inadequate and straightforward scaling by increasing the number of cores or the amount of data-level parallelism will not suffice. Programmability provides the opportunity for a single platform to support multiple applications and even multiple standards within each application domain. Programmability also provides: faster time to market as hardware and software development can proceed in parallel; the ability to fix bugs and add features after manufacturing; and, higher chip volumes as a single platform can support a family of mobile devices. Lastly, hardware adaptivity is necessary to maintain efficiency as the computational characteristics of the applications change. Current solutions are tailored specifically for wireless signal processing algorithms, but lose their efficiency when other application domains like high definition video are processed.
This thesis addresses these challenges by presenting analysis of next generation mobile signal processing applications and proposing an advanced signal processing architecture to deal with the stringent requirements. An application-centric design approach is taken to design our architecture. First, a next generation wireless protocol and high definition video is analyzed and algorithmic characterizations discussed. From these characterizations, key architectural implications are presented, which form the basis for the advanced signal processor architecture, AnySP.Ph.D.Electrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/86344/1/mwoh_1.pd
Reconfigurable architectures for the next generation of mobile device telecommunications systems
Mobile devices have become a dominant tool in our daily lives. Business and
personal usage has escalated tremendously since the emergence of smartphones
and tablets. The combination of powerful processing in mobile devices, such as
smartphones and the Internet, have established a new era for communications
systems. This has put further pressure on the performance and efficiency of
telecommunications systems in delivering the aspirations of users. Mobile device
users no longer want devices that merely perform phone calls and messaging.
Rather, they look for further interactive applications such as video streaming,
navigation and real time social interaction. Such applications require a new set of
hardware and standards. The WiFi (IEEE 802.11) standard has been at the forefront
of reliable and high-speed internet access telecommunications. This is due to its
high signal quality (quality of service) and speed (throughput). However, its limited
availability and short range highlights the need for further protocols, in particular
when far away from access points or base stations. This led to the emergence of 3G
followed by 4G and the upcoming 5G standard that, if fully realised, will provide
another dimension in “anywhere, anytime internet connectivity.” On the other
hand, the WiMAX (IEEE 802.16) standard promises to exceed the WiFi signal
coverage range. The coverage range could be extended to kilometres at least with a
better or similar WiFi signal level.
This thesis considers a dynamically reconfigurable architecture that is capable of
processing various modules within telecommunications systems. Forward error
correction, coder and navigation modules are deployed in a unified low power
communication platform. These modules have been selected since they are among
those with the highest demand in terms of processing power, strict processing time
or throughput. The modules are mainly realised within WiFi and WiMAX systems
in addition to global positioning systems (GPS). The idea behind the selection of
these modules is to investigate the possibility of designing an architecture capable
of processing various systems and dynamically reconfiguring between them. The
GPS system is a power-hungry application and, at the same time, it is not needed
all of the time. Hence, one key idea presented in this thesis is to effectively exploit
the dynamic reconfiguration capability so as to reconfigure the architecture (GPS)
when it is not needed in order to process another needed application or function
such as WiFi or WiMAX. This will allow lower energy consumption and the
optimum usage of the hardware available on the device.
This work investigates the major current coarse-grain reconfigurable architectures.
A novel multi-rate convolution encoder is then designed and realised as a
reconfigurable fabric. This demonstrates the ability to adapt the algorithms
involved to meet various requirements. A throughput of between 200 and 800
Mbps has been achieved for the rates 1/2 to 7/8, which is a great achievement for
the proposed novel architecture. A reconfigurable interleaver is designed as a
standalone fabric and on a dynamically reconfigurable processor. High throughputs
exceeding 90 Mbps are achieved for the various supported block sizes. The Reed
Solomon coder is the next challenging system to be designed into a dynamically
reconfigurable processor. A novel Galois Field multiplier is designed and
integrated into the developed Reed Solomon reconfigurable processor. As a result
of this work, throughputs of 200Mbps and 93Mbps respectively for RS encoding
and decoding are achieved. A GPS correlation module is also investigated in this
work. This is the main part of the GPS receiver responsible for continuously
tracking GPS satellites and extracting messages from them. The challenging aspect
of this part is its real-time nature and the associated critical time constraints. This
work resulted in a novel dynamically reconfigurable multi-channel GPS correlator
with up to 72 simultaneous channels.
This work is a contribution towards a global unified processing platform that is
capable of processing communication-related operations efficiently and
dynamically with minimum energy consumption
- …