Search CORE

11 research outputs found

VLSI implementation of a multi-mode turbo/LDPC decoder architecture

Author: Condo Carlo
Masera Guido
Maurizio Martina
Publication venue: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855 USA
Publication date: 01/01/2013
Field of study

Flexible and reconfigurable architectures have gained wide popularity in the communications field. In particular, reconfigurable architectures for the physical layer are an attractive solution not only to switch among different coding modes but also to achieve interoperability. This work concentrates on the design of a reconfigurable architecture for both turbo and LDPC codes decoding. The novel contributions of this paper are: i) tackling the reconfiguration issue introducing a formal and systematic treatment that, to the best of our knowledge, was not previously addressed; ii) proposing a reconfigurable NoCbased turbo/LDPC decoder architecture and showing that wide flexibility can be achieved with a small complexity overhead. Obtained results show that dynamic switching between most of considered communication standards is possible without pausing the decoding activity. Moreover, post-layout results show that tailoring the proposed architecture to the WiMAX standard leads to an area occupation of 2.75 mm2 and a power consumption of 101.5 mW in the worst case

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Domain specific high performance reconfigurable architecture for a communication platform

Author: Ahmed Imran
Publication venue: The University of Edinburgh
Publication date: 01/01/2007
Field of study

Edinburgh Research Archive

VLSI decoding architectures: flexibility, robustness and performance

Author: Condo Carlo
Publication venue
Publication date: 01/01/2014
Field of study

Stemming from previous studies on flexible LDPC decoders, this thesis work has been mainly focused on the development of flexible turbo and LDPC decoder designs, and on the narrowing of the power, area and speed gap they might present with respect to dedicated solutions. Additional studies have been carried out within the field of increased code performance and of decoder resiliency to hardware errors. The first chapter regroups several main contributions in the design and implementation of flexible channel decoders. The first part concerns the design of a Network-on-Chip (NoC) serving as an interconnection network for a partially parallel LDPC decoder. A best-fit NoC architecture is designed and a complete multi-standard turbo/LDPC decoder is designed and implemented. Every time the code is changed, the decoder must be reconfigured. A number of variables influence the duration of the reconfiguration process, starting from the involved codes down to decoder design choices. These are taken in account in the flexible decoder designed, and novel traffic reduction and optimization methods are then implemented. In the second chapter a study on the early stopping of iterations for LDPC decoders is presented. The energy expenditure of any LDPC decoder is directly linked to the iterative nature of the decoding algorithm. We propose an innovative multi-standard early stopping criterion for LDPC decoders that observes the evolution of simple metrics and relies on on-the-fly threshold computation. Its effectiveness is evaluated against existing techniques both in terms of saved iterations and, after implementation, in terms of actual energy saving. The third chapter portrays a study on the resilience of LDPC decoders under the effect of memory errors. Given that the purpose of channel decoders is to correct errors, LDPC decoders are intrinsically characterized by a certain degree of resistance to hardware faults. This characteristic, together with the soft nature of the stored values, results in LDPC decoders being affected differently according to the meaning of the wrong bits: ad-hoc error protection techniques, like the Unequal Error Protection devised in this chapter, can consequently be applied to different bits according to their significance. In the fourth chapter the serial concatenation of LDPC and turbo codes is presented. The concatenated FEC targets very high error correction capabilities, joining the performance of turbo codes at low SNR with that of LDPC codes at high SNR, and outperforming both current deep-space FEC schemes and concatenation-based FECs. A unified decoder for the concatenated scheme is subsequently propose

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Architectures multi-Asip pour turbo récepteur flexible

Author: Jafri Atif Raza
Publication venue: HAL CCSD
Publication date: 28/01/2011
Field of study

Rapidly evolving wireless standards use modern techniques such as turbo codes, Bit Interleaved coded Modulation (BICM), high order QAM constellation, Signal Space Diversity (SSD), Multi-Input Multi-Output (MIMO) Spatial Multiplexing (SM) and Space Time Codes (STC) with different parameters for reliable high rate data transmissions. Adoption of such techniques in the transmitter can impact the receiver architecture in three ways: (1) the complex processing related to advanced techniques such as turbo codes, encourage to perform iterative processing in the receiver to improve error rate performance (2) to satisfy high throughput requirement for an iterative receiver, parallel processing is mandatory and finally (3) to allow the support of different techniques and parameters imposed, programmable yet high throughput hardware processing elements are required. In this thesis, to address the high throughput requirement with turbo processing, first of all a study of parallelism on turbo decoding is extended for turbo demodulation and turbo equalization. Based on the results acquired from the parallelism study a flexible high throughput heterogeneous multi-ASIP NoC based unified turbo receiver is proposed. The proposed architecture fulfils the target requirements in a way that: (a) Application Specific Instruction-set Processor (ASIP) exploits metric generation level parallelism and implements the required flexibility, (b) throughputs beyond the capacity of single ASIP in a turbo process are achieved through multiple ASIP elements implementing sub-block parallelism and shuffled processing and finally (c) Network on Chip is used to handle communication conflicts during parallel processing of multiple ASIPs. In pursuit to achieve a hardware model of the proposed architecture two ASIPs are conceived where the first one, namely EquASIP, is dedicated for MMSE-IC equalization and provides a flexible solution for multiple MIMO techniques adopted in multiple wireless standards with a capability to work in turbo equalization context. The second ASIP, named as DemASIP, is a flexible demapper which can be used in MIMO or single antenna environment for any modulation till 256-QAM with or without iterative demodulation. Using available TurbASIP and NoC components, the thesis concludes on an FPGA prototype of heterogeneous multi-ASIP NoC based unified turbo receiver which integrates 9 instances of 3 different ASIPs with 2 NoCs.Les normes de communication sans fil, sans cesse en évolution, imposent l'utilisation de techniques modernes telles que les turbocodes, modulation codée à entrelacement bit (BICM), constellation MAQ d'ordre élevé, diversité de constellation (SSD), multiplexage spatial et codage espace-temps multi-antennes (MIMO) avec des paramètres différents pour des transmissions fiables et de haut débit. L'adoption de ces techniques dans l'émetteur peut influencer l'architecture du récepteur de trois façons: (1) les traitement complexes relatifs aux techniques avancées comme les turbocodes, encourage à effectuer un traitement itératif dans le récepteur pour améliorer la performance en termes de taux d'erreur (2) pour satisfaire l'exigence de haut débit avec un récepteur itératif, le recours au parallélisme est obligatoire et enfin (3) pour assurer le support des différentes techniques et paramètres imposées, des processeurs de traitement matériel flexibles, mais aussi de haute performance, sont nécessaires. Dans cette thèse, pour répondre aux besoins de haut débit dans un contexte de traitement itératif, tout d'abord une étude de parallélisme sur le turbo décodage a été étendue aux applications de turbo démodulation et turbo égalisation. Partant des résultats obtenus à partir de l'étude du parallélisme, un récepteur itératif unifié basé sur un modèle d'architecture multi-ASIP hétérogène intégrant un réseau sur puce (NoC) a été proposé. L'architecture proposée répond aux exigences visées d'une manière où: (a) le concept de processeur à jeu d'instruction dédié à l'application (ASIP) exploite le parallélisme du niveau de génération de métriques et met en oeuvre la flexibilité nécessaire, (b) les débits au-delà de la capacité d'un seul ASIP dans un processus itératif sont obtenus au moyen de multiples ASIP implémentant le parallélisme de sous-blocs et le traitement combiné et enfin (c) le concept de réseau sur puce (NoC) est utilisé pour gérer les conflits de communication au cours du traitement parallèle itératif multi-ASIP. Dans le but de parvenir à un modèle matériel de l'architecture proposée, deux ASIP ont été conçus où le premier, nommé EquASIP, est dédié à l'égalisation MMSE-IC et fournit une solution flexible pour de multiples techniques multi-antennes adoptés dans plusieurs normes sans fil avec la capacité de travailler dans un contexte de turbo égalisation. Le deuxième ASIP, nommé DemASIP, est un démappeur flexible qui peut être utilisé dans un environnement multi-antennes et pour tout type de modulation jusqu'à MAQ-256 avec ou sans démodulation itérative. En intégrant ces ASIP, en plus des NoC et TurbASIP disponibles à Télécom Bretagne, la thèse conclut sur un prototype FPGA d'un récepteur itératif unifié multi-ASIP qui intègre 9 coeurs de 3 différents types d'ASIP avec 2 NoC

Thèses en Ligne

HAL-Université de Bretagne Occidentale

HAL Descartes

Techniques d'exploration architecturale de design à usage spécifique pour l'accélération de boucles

Author: Mbaye Mame Maria
Publication venue
Publication date: 01/08/2010
Field of study

RÉSUMÉ De nos jours, les industriels privilégient les architectures flexibles afin de réduire le temps et les coûts de conception d’un système. Les processeurs à usage spécifique (ASIP) fournissent beaucoup de flexibilité, tout en atteignant des performances élevées. Une tendance qui a de plus en plus de succès dans le processus de conception d’un système sur puce consiste à spécifier le comportement du système en langage évolué tel que le C, SystemC, etc. La spécification est ensuite utilisée durant le partitionement pour déterminer les composantes logicielles et matérielles du système. Avec la maturité des générateurs automatiques de ASIP, les concepteurs peuvent rajouter dans leurs boîtes à outils un nouveau type d’architecture, à savoir les ASIP, en sachant que ces derniers sont conçus à partir d’une spécification décrite en langage évolué. D’un autre côté, dans le monde matériel, et cela depuis très longtemps, les chercheurs ont vu l’avantage de baser le processus de conception sur un langage évolué. Cette recherche a abouti à l’avénement de générateurs automatiques de matériel sur le marché qui sont des outils d’aide à la conception comme CapatultC, Forte’s Cynthetizer, etc. Ainsi, avec tous ces outils basés sur le langage C, les concepteurs ont un choix de types de design élargi mais, d’un autre côté, les options de designs possibles explosent, ce qui peut allonger au lieu de réduire le temps de conception. C’est dans ce cadre que notre thèse doctorale s’inscrit, puisqu’elle présente des méthodologies d’exploration architecturale de design à usage spécifique pour l’accélération de boucles afin de réduire le temps de conception, entre autres. Cette thèse a débuté par l’exploration de designs de ASIP. Les boucles de traitement sont de bonnes candidates à l’accélération, si elles comportent de bonnes possibilités de parallélisme et si ces dernières sont bien exploitées. Le matériel est très efficace à profiter des possibilités de parallélisme au niveau instruction, donc, une méthode de conception a été proposée. Cette dernière extrait le parallélisme d’une boucle afin d’exécuter plus d’opérations concurrentes dans des instructions spécialisées. Notre méthode se base aussi sur l’optimisation des données dans l’architecture du processeur.---------- ABSTRACT Time to market is a very important concern in industry. That is why the industry always looks for new CAD tools that contribute to reducing design time. Application-specific instruction-set processors (ASIPs) provide flexibility and they allow reaching good performance if they are well designed. One trend that gains more and more success is C-based design that uses a high level language such as C, SystemC, etc. The C-based specification is used during the partitionning phase to determine the software and hardware components of the system. Since automatic processor generators are mature now, designers have a new type of tool they can rely on during architecture design. In the hardware world, high level synthesis was and is still a hot research topic. The advances in ESL lead to commercial high-level synthesis tools such as CapatultC, Forte’s Cynthetizer, etc. The designers have more tools in their box but they have more solutions to explore, thus their use can have a reverse effect since the design time can increase instead of being reduced. Our doctoral research tackles this issue by proposing new methodologies for design space exploration of application specific architecture for loop acceleration in order to reduce the design time while reaching some targeted performances. Our thesis starts with the exploration of ASIP design. We propose a method that targets loop acceleration with highly coupled specialized-instructions executing loop operations. Loops are good candidates for acceleration when the parallelism they offer is well exploited (if they have any parallelization opportunities). Hardware components such as specialized-instructions can leverage parallelization opportunities at low level. Thus, we propose to extract loop parallelization opportunities and to execute more concurrent operations in specialized-instructions. The main contribution of this method is a new approach to specialized-instruction (SI) design based on loop acceleration where loop optimization and transformation are done in SIs directly, instead of optimizing the software code. Another contribution is the design of tightly-coupled specialized-instructions associated with loops based on a 5-pattern representation

PolyPublie

Flexible scheduling of turbo decoding on a multiprocessor platform

Author: Sahraii Negin
Publication venue
Publication date: 01/01/2009
Field of study

Basic concepts and literature review -- Universal mobile telecommunication system (UMTS) -- The vocallo architecture -- Performance modeling -- Mapping the system level models into MPSoC platforms -- Multiprocessor scheduling and synchronization -- Worst case execution time (WCET) based design -- Scheduling flexible applications -- Mapping and scheduling of turbo decoding in MPSoC platforms -- Performance modeling -- Steps to create a performance model -- Detailed description of the performance model -- One performance model example -- Scheduling of turbo decoding -- Mapping the uplink WCDMA processing on an MPSoC platform -- Processing variability of the studied turbo decoder -- BER performance of the studied turbo decoder -- Proposed methods for scheduling the turbo decoding -- Simulation results -- Validating investigation -- Elapsed simulation time

PolyPublie

Embedded electronic systems driven by run-time reconfigurable hardware

Author: Fons Lluís Francisco
Publication venue: 'Universitat Rovira I Virgili'
Publication date: 01/01/2012
Field of study

Abstract This doctoral thesis addresses the design of embedded electronic systems based on run-time reconfigurable hardware technology –available through SRAM-based FPGA/SoC devices– aimed at contributing to enhance the life quality of the human beings. This work does research on the conception of the system architecture and the reconfiguration engine that provides to the FPGA the capability of dynamic partial reconfiguration in order to synthesize, by means of hardware/software co-design, a given application partitioned in processing tasks which are multiplexed in time and space, optimizing thus its physical implementation –silicon area, processing time, complexity, flexibility, functional density, cost and power consumption– in comparison with other alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of such technology is evaluated through the prototyping of several engineering applications (control systems, mathematical coprocessors, complex image processors, etc.), showing a high enough level of maturity for its exploitation in the industry.Resumen Esta tesis doctoral abarca el diseño de sistemas electrónicos embebidos basados en tecnología hardware dinámicamente reconfigurable –disponible a través de dispositivos lógicos programables SRAM FPGA/SoC– que contribuyan a la mejora de la calidad de vida de la sociedad. Se investiga la arquitectura del sistema y del motor de reconfiguración que proporcione a la FPGA la capacidad de reconfiguración dinámica parcial de sus recursos programables, con objeto de sintetizar, mediante codiseño hardware/software, una determinada aplicación particionada en tareas multiplexadas en tiempo y en espacio, optimizando así su implementación física –área de silicio, tiempo de procesado, complejidad, flexibilidad, densidad funcional, coste y potencia disipada– comparada con otras alternativas basadas en hardware estático (MCU, DSP, GPU, ASSP, ASIC, etc.). Se evalúa el flujo de diseño de dicha tecnología a través del prototipado de varias aplicaciones de ingeniería (sistemas de control, coprocesadores aritméticos, procesadores de imagen, etc.), evidenciando un nivel de madurez viable ya para su explotación en la industria.Resum Aquesta tesi doctoral està orientada al disseny de sistemes electrònics empotrats basats en tecnologia hardware dinàmicament reconfigurable –disponible mitjançant dispositius lògics programables SRAM FPGA/SoC– que contribueixin a la millora de la qualitat de vida de la societat. S’investiga l’arquitectura del sistema i del motor de reconfiguració que proporcioni a la FPGA la capacitat de reconfiguració dinàmica parcial dels seus recursos programables, amb l’objectiu de sintetitzar, mitjançant codisseny hardware/software, una determinada aplicació particionada en tasques multiplexades en temps i en espai, optimizant així la seva implementació física –àrea de silici, temps de processat, complexitat, flexibilitat, densitat funcional, cost i potència dissipada– comparada amb altres alternatives basades en hardware estàtic (MCU, DSP, GPU, ASSP, ASIC, etc.). S’evalúa el fluxe de disseny d’aquesta tecnologia a través del prototipat de varies aplicacions d’enginyeria (sistemes de control, coprocessadors aritmètics, processadors d’imatge, etc.), demostrant un nivell de maduresa viable ja per a la seva explotació a la indústria

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Tesis Doctorals en Xarxa

Repositori Institucional URV

Dependable Embedded Systems

Author: Dutt Nikil
Henkel Jörg
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. This book introduces the most prominent reliability concerns from today’s points of view and roughly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level such circuit level or system level alone, the focus of this book is to deal with the different reliability challenges across different levels starting from the physical level all the way to the system level (cross-layer approaches). The book aims at demonstrating how new hardware/software co-design solution can be proposed to ef-fectively mitigate reliability degradation such as transistor aging, processor variation, temperature effects, soft errors, etc. Provides readers with latest insights into novel, cross-layer methods and models with respect to dependability of embedded systems; Describes cross-layer approaches that can leverage reliability through techniques that are pro-actively designed with respect to techniques at other layers; Explains run-time adaptation and concepts/means of self-organization, in order to achieve error resiliency in complex, future many core systems

OAPEN Library

Speech Recognition

Author
Publication venue: 'IntechOpen'
Publication date: 20/04/2021
Field of study

Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation for speech signals and the methods for speech-features extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition for speaker identification and tracking, for prosody modeling in emotion-detection systems and in other speech processing applications that are able to operate in real-world environments, like mobile communication services and smart homes

Directory of Open Access Books (DOAB)

Stopping-free dynamic configuration of a multi-ASIP turbo decoder

Author: Baghdadi Amer
DIGUET Jean-Philippe
Gogniat Guy
HUBNER Michael
Lapotre Vianney
MURUGAPPA VELAYUTHAN Purushotham
Publication venue: HAL CCSD
Publication date: 04/09/2013
Field of study

International audienceThe multiplication of wireless standards is introducing the need of ﬂexible and reconfigurable multistandard baseband receivers. At the physical layer, multiprocessor turbo decoders have been recently developed in order to provide an answer to the increasing throughput requirement of emerging standards. However these solutions do not sufficiently address reconfiguration performance issues which can be a limiting factor in the future. This work focuses on the design of a reconfigurable multiprocessor architecture for turbo decoding achieving very fast reconfiguration without compromising decoding performances. Dynamic reconfiguration can be performed within a single frame decoding duration opening new perspective for reconfigurable multistandard baseband receivers. For that purpose, optimizations at the processing element level and a novel bus-based configuration infrastructure are proposed. Results show that up to 64 processings elements can be dynamically configured in 5.352 µs. This low configuration latency corresponds to a single frame decoding duration when performing 6 decoding iterations for a throughput up to 666 Mbps

HAL-Université de Bretagne Occidentale