32 research outputs found

    Exploring High Level Synthesis to Improve the Design of Turbo Code Error Correction in a Software Defined Radio Context

    With the ever-improving progress of technology, Software Defined Radio (SDR) has become a more widely available technique for implementing radio communication. SDRs are sought after for their advantages over traditional radio communication, mostly in flexibility and hardware simplification. The greatest challenges SDRs face are often their real-time performance requirements. Forward error correction is an example of an SDR block that exemplifies these challenges, as the error correction can be very computationally intensive. Due to these constraints, SDR implementations are commonly found in or alongside Field Programmable Gate Arrays (FPGAs) to enable performance that general purpose processors alone cannot achieve. The main challenge with FPGAs, however, is Register Transfer Level (RTL) development. High Level Synthesis (HLS) tools create hardware descriptions from high-level code in an effort to ease this development process. In this work a turbo code decoder, a computationally intensive form of error correction code, was accelerated on an FPGA using HLS tools. The accelerator was implemented on a Xilinx Zynq platform, which integrates a hard-core ARM processor alongside programmable logic on a single chip. Important aspects of the HLS design process were identified and explained. The design process emphasizes that, for the best results, the high-level code should be written with a hardware mindset, as an attempt to describe a hardware design. The flexibility of the HLS tools was demonstrated by tailoring the hardware parameters simply by changing values in a macro file, and by exploring the design space through different data types and three successive designs, each improving on what was learned from the previous implementation. Ultimately, the best hardware implementation was over 56 times faster than the optimized software implementation. Compared with a manually optimized design, the HLS implementation was able to achieve over 19% of its throughput, with many areas for further improvement identified, demonstrating the competitiveness of the HLS tools.
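To make the macro-driven parameterization concrete, the following is a minimal sketch of how an HLS kernel might expose its design parameters through a macro file and describe a pipelined loop with a hardware mindset. All names, values, and the placeholder arithmetic are illustrative assumptions, not taken from the thesis.

```cpp
// Hypothetical "macro file" values; in practice these would live in a
// separate header so the hardware can be re-tailored by editing one file.
#define TC_BLOCK_LEN  1024   // information bits per turbo code block
#define TC_NUM_STATES    8   // trellis states of a constituent code

#include <cstdint>

typedef int16_t llr_t;       // narrow integer stand-in for a fixed-point LLR

// Skeleton of one half-iteration sweep. The #pragma lines mirror common
// HLS directives (loop pipelining, array partitioning); an HLS tool would
// honor them, while a plain C++ compiler simply ignores unknown pragmas.
// The loop body is a structural placeholder, not a functional max-log-MAP
// recursion.
void half_iteration(const llr_t sys[TC_BLOCK_LEN],
                    const llr_t par[TC_BLOCK_LEN],
                    const llr_t apriori[TC_BLOCK_LEN],
                    llr_t extrinsic[TC_BLOCK_LEN]) {
    llr_t alpha[TC_NUM_STATES] = {0};          // forward state metrics
#pragma HLS ARRAY_PARTITION variable=alpha complete

    for (int k = 0; k < TC_BLOCK_LEN; ++k) {
#pragma HLS PIPELINE II=1
        // Placeholder branch metric and state update; a real decoder would
        // run an add-compare-select over all TC_NUM_STATES states here.
        llr_t gamma = static_cast<llr_t>(sys[k] + par[k] + apriori[k]);
        int s = k % TC_NUM_STATES;
        alpha[s] = static_cast<llr_t>(alpha[s] + gamma);
        extrinsic[k] = static_cast<llr_t>(alpha[s] - apriori[k]);
    }
}

// Tiny host-side harness (in an HLS flow this would be a separate testbench).
int main() {
    static llr_t sys[TC_BLOCK_LEN] = {0}, par[TC_BLOCK_LEN] = {0},
                 apr[TC_BLOCK_LEN] = {0}, ext[TC_BLOCK_LEN];
    half_iteration(sys, par, apr, ext);
    return 0;
}
```

Writing the loop so that each iteration touches only a small, fixed set of array elements is what allows the tool to pipeline it; this is the kind of "describe the hardware in C" discipline the abstract refers to.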

    Efficient algorithms for iterative detection and decoding in Multiple-Input and Multiple-Output Communication Systems

    This thesis is framed within the field of Multiple-Input Multiple-Output (MIMO) communication systems. Nowadays, these schemes are among the most promising technologies in wireless communications. The use of multiple antennas at the transmitter and receiver sides increases both the rate and the quality of the transmission. Furthermore, MIMO technology can also be used in a multiuser scenario, where a Base Station (BS) equipped with several antennas serves several users that share the spatial dimension, causing interference; by employing precoding algorithms, this multiuser interference can be mitigated. For these reasons, MIMO technology has become an essential element of many new-generation communication standards. In Massive MIMO (or Large MIMO) systems, the BS is equipped with a very large number of antennas (hundreds or thousands) and serves many users in the same time-frequency resource. Nevertheless, the advantages provided by MIMO technology entail a substantial increase in computational cost. Therefore, the design of low-complexity receivers is an important issue, and it is tackled throughout this thesis. To this end, one of the main contributions of this dissertation is the implementation of efficient soft-output detectors and precoding schemes. First, the problem of efficient soft detection with no iteration at the receiver is addressed. A detailed overview of the most commonly employed soft detectors is provided, and the complexity and performance of these methods are evaluated and compared. Additionally, two low-complexity algorithms are proposed. The first is based on the efficient Box Optimization Hard Detector (BOHD) algorithm and provides a low-complexity implementation with suitable performance. The second aims to reduce the computational cost of the Subspace Marginalization with Interference Suppression (SUMIS) algorithm. Second, soft-input soft-output (SISO) detectors, which operate within an iterative receiver structure, are investigated. An iterative receiver improves performance with respect to a non-iterative one, approaching the channel capacity, but its computational cost becomes prohibitive. In this context, three algorithms are presented: two achieve max-log performance while reducing the complexity of standard SISO detectors, and the third achieves near-max-log performance with low complexity. The precoding problem is addressed in the third part of this thesis. Some of the most commonly employed precoding techniques are analyzed and compared in terms of performance and complexity. In this context, the impact of the channel-matrix condition number on the performance of the precoders is analyzed, and this impact is exploited to propose a hybrid precoding scheme that reduces the complexity of the previously proposed precoders. In addition, an alternative precoder scheme is proposed for Large MIMO systems. In the last part of the thesis, parallel implementations of the SUMIS algorithm are presented. Several strategies for parallelizing the algorithm are proposed and evaluated on two different platforms: a multicore central processing unit (CPU) and a graphics processing unit (GPU). The parallel implementations achieve a significant speedup compared to the CPU version.
Therefore, these implementations make it possible to simulate a scalable, quasi-optimal soft detector in a Large MIMO system much faster than with conventional simulation.
Simarro Haro, MDLA. (2017). Efficient algorithms for iterative detection and decoding in Multiple-Input and Multiple-Output Communication Systems [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/86186
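As background for the max-log SISO detection the abstract refers to (standard material, not a formula taken from the thesis itself), the max-log approximation of the bit log-likelihood ratio in a MIMO detector is commonly written as

\[
L(b_k) \;\approx\; \frac{1}{\sigma^2}\left(\min_{\mathbf{s}\in\mathcal{S}_k^{0}}\lVert\mathbf{y}-\mathbf{H}\mathbf{s}\rVert^2 \;-\; \min_{\mathbf{s}\in\mathcal{S}_k^{1}}\lVert\mathbf{y}-\mathbf{H}\mathbf{s}\rVert^2\right),
\]

where \(\mathbf{y}\) is the received vector, \(\mathbf{H}\) the channel matrix, \(\sigma^2\) the noise variance, and \(\mathcal{S}_k^{b}\) the set of candidate transmit vectors whose k-th bit equals b (the exact scaling and sign depend on the chosen conventions). In a soft-input soft-output detector, a priori LLRs fed back from the channel decoder are added inside the two minimizations; since those minimizations run over exponentially many candidate vectors, low-complexity approximations such as the ones developed in the thesis are needed.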

    Joint source and channel coding


    Hardware-Conscious Wireless Communication System Design

    The work at hand is a selection of topics in efficient wireless communication system design, with the topics logically divided into two groups.

    One group can be described as hardware designs that are conscious of their own possibilities and limitations. In other words, it is about hardware that chooses its configuration and properties depending on the performance that needs to be delivered and on the influence of external factors, with the goal of keeping the energy consumption as low as possible. Design parameters that trade off power against complexity are identified for analog, mixed-signal and digital circuits, and the implications of these tradeoffs are analyzed in detail. An analog front end and an LDPC channel decoder that adapt their parameters to the environment (e.g. a fluctuating power level due to fading) are proposed, and it is analyzed how much power/energy these environment-adaptive structures save compared to non-adaptive designs made for the worst-case scenario. Additionally, the impact of ADC bit resolution on the energy efficiency of a massive MIMO system is examined in detail, with the goal of finding bit resolutions that maximize the energy efficiency under various system setups.

    In the other group of themes, one can recognize systems whose architect was conscious of fundamental limitations stemming from the hardware. Put another way, in these designs there is no attempt to tweak or tune the hardware; instead, the system is designed so as to work around an existing and unchangeable hardware limitation. As a workaround for the problematic centralized topology, a massive MIMO base station based on a daisy-chain topology is proposed, together with a signal-processing method tailored to the daisy-chain setup. In another example, a large group of cooperating relays is split into several smaller groups, each performing relaying cooperatively and independently of the others. As cooperation consumes resources (such as bandwidth), splitting the system into smaller, independent cooperative parts helps save resources and is again a workaround for an inherent limitation.

    From the analyses performed in this thesis, promising observations about hardware consciousness can be made. Adapting the structure of a hardware block to the environment can bring massive savings in energy, and simple workarounds perform almost as well as the inherently limited designs while successfully bypassing the limitation. As a general observation, it can be concluded that hardware consciousness pays off.
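To illustrate the ADC-resolution tradeoff the abstract analyzes, a common first-order textbook model (standard background, not the thesis's own model) relates the bit resolution b to quantization quality and converter power as

\[
\mathrm{SQNR} \;\approx\; 6.02\,b + 1.76\ \text{dB}, \qquad P_{\mathrm{ADC}} \;\approx\; \mathrm{FOM}\cdot f_s \cdot 2^{b},
\]

where \(f_s\) is the sampling rate and FOM an energy-per-conversion-step figure of merit. Each extra bit adds roughly 6 dB of quantization SNR but roughly doubles converter power, which is why an energy-efficiency-maximizing bit resolution exists for a given massive MIMO setup.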

    Representation learning on complex data

    Machine learning has enabled remarkable progress in various fields of research and application in recent years. The primary objective of machine learning is to develop algorithms that can learn and improve through observation and experience. Machine learning algorithms learn from data, which may exhibit various forms of complexity that pose fundamental challenges. In this thesis, we address two major types of data complexity. First, data is often inherently connected and can be modeled by a single graph or by multiple graphs. Machine learning methods could potentially exploit these connections, for instance, to find groups of similar users in a social network for targeted marketing or to predict functional properties of proteins for drug design. Second, data is often high-dimensional, for instance, due to a large number of recorded features or induced by a quadratic pixel grid on images. Classical machine learning methods perennially fail when exposed to high-dimensional data, as several key assumptions cease to be satisfied. Therefore, a major challenge associated with machine learning on graphs and high-dimensional data is to derive meaningful representations of this data that allow models to learn effectively. In contrast to conventional manual feature-engineering methods, representation learning aims at automatically learning data representations that are particularly suitable for the specific task at hand. Driven by a rapidly increasing availability of data, these methods have celebrated tremendous success for tasks such as object detection in images and speech recognition. However, a considerable amount of research remains to be done to fully leverage such techniques for learning on graphs and high-dimensional data. In this thesis, we address the problem of learning meaningful representations for highly effective machine learning on complex data, in particular graph data and high-dimensional data. Additionally, most of our proposed methods are highly scalable, allowing them to learn from massive amounts of data. While we address a wide range of general learning problems with different modes of supervision, ranging from unsupervised problems on unlabeled data to (semi-)supervised learning on annotated data sets, we evaluate our models on specific tasks from fields such as social network analysis, information security, and computer vision. The first part of this thesis addresses representation learning on graphs. While existing graph neural network models commonly perform synchronous message passing between nodes and thus struggle with long-range dependencies and efficiency issues, our first proposed method performs fast asynchronous message passing and therefore supports adaptive and efficient learning while scaling to large graphs. Another contribution is a novel graph-based approach to malware detection and classification based on network traffic. While existing methods classify individual network flows between two endpoints, our algorithm collects all traffic in a monitored network within a specific time frame and builds a communication graph, which is then classified using a novel graph neural network model. The developed model can be applied generally to further graph classification or anomaly detection tasks. Two further contributions challenge a common assumption made by graph learning methods, termed homophily, which states that nodes with similar properties are usually closely connected in the graph.
To this end, we develop a method that predicts node-level properties by leveraging the distribution of class labels appearing in the neighborhood of the respective node. This allows our model to learn general relations between a node and its neighbors that are not limited to homophily. Another proposed method specifically captures structural similarity between nodes in order to model different roles, for instance, influencers and followers in a social network. In particular, we develop an unsupervised algorithm for deriving node descriptors based on how nodes spread probability mass to their neighbors and aggregate these descriptors to represent entire graphs. The second part of this thesis addresses representation learning on high-dimensional data. Specifically, we consider the problem of clustering high-dimensional data, such as images, texts, or gene expression profiles. Classical clustering algorithms struggle with this type of data since it can usually not be assumed that data objects are similar with respect to all attributes, but only within a particular subspace of the full-dimensional ambient space. Subspace clustering is an approach to clustering high-dimensional data based on this assumption. While powerful neural network-based subspace clustering methods already exist, they commonly suffer from scalability issues and lack a theoretical foundation. To this end, we propose a novel metric learning approach to subspace clustering, which can provably recover linear subspaces under suitable assumptions and, at the same time, tremendously reduces the required number of model parameters and memory compared to existing algorithms.
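For context on the synchronous message passing that the thesis's asynchronous approach departs from (generic background, not the thesis's specific model), graph neural networks typically update all node states in lockstep at every layer k:

\[
\mathbf{h}_v^{(k)} \;=\; \phi\!\left(\mathbf{h}_v^{(k-1)},\; \bigoplus_{u\in\mathcal{N}(v)} \psi\!\left(\mathbf{h}_v^{(k-1)},\,\mathbf{h}_u^{(k-1)}\right)\right),
\]

where \(\mathcal{N}(v)\) is the neighborhood of node v, \(\psi\) computes messages, \(\bigoplus\) is a permutation-invariant aggregation such as sum or mean, and \(\phi\) is the update function. Under this lockstep schedule information travels only one hop per layer, which is the long-range-dependency and efficiency bottleneck that asynchronous message passing relaxes.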