MATLAB as a Design and Verification Tool for the Hardware Prototyping of Wireless Communication Systems
Peer Reviewed. Postprint (published version)
Driving the Network-on-Chip Revolution to Remove the Interconnect Bottleneck in Nanoscale Multi-Processor Systems-on-Chip
The sustained demand for faster, more powerful chips has been met by the
availability of chip manufacturing processes allowing for the integration of increasing
numbers of computation units onto a single die. The resulting outcome,
especially in the embedded domain, has often been called a system-on-chip (SoC) or multi-processor system-on-chip (MPSoC).
MPSoC design brings to the foreground a large number of challenges, one of
the most prominent of which is the design of the chip interconnection. With a
number of on-chip blocks presently ranging in the tens, and quickly approaching
the hundreds, the novel issue of how to best provide on-chip communication
resources is clearly felt.
Networks-on-Chip (NoCs) are the most comprehensive and scalable
answer to this design concern. By bringing large-scale networking concepts to
the on-chip domain, they guarantee a structured answer to present and future
communication requirements. The point-to-point connection and packet switching
paradigms they involve are also of great help in minimizing wiring overhead
and physical routing issues. However, as with any technology of recent inception,
NoC design is still an evolving discipline. Several main areas of interest
require deep investigation for NoCs to become viable solutions:
• The design of the NoC architecture needs to strike the best tradeoff among
performance, features and the tight area and power constraints of the on-chip domain.
• Simulation and verification infrastructure must be put in place to explore,
validate and optimize the NoC performance.
• NoCs offer a huge design space, thanks to their extreme customizability in
terms of topology and architectural parameters. Design tools are needed
to prune this space and pick the best solutions.
• Especially given their global, distributed nature, it is essential to study the physical implementation of NoCs to assess their suitability for next-generation designs and their area and power costs.
This dissertation performs a design space exploration of network-on-chip architectures, in order to point out the trade-offs associated with the design of each individual network building block and with the design of the network topology overall. The design space exploration is preceded by a comparative analysis of state-of-the-art interconnect fabrics with one another and with early network-on-chip prototypes. The ultimate objective is to point out the key advantages that NoC realizations provide with respect to state-of-the-art communication infrastructures and the challenges that lie ahead in making this new interconnect technology a reality. Among the latter, technology-related challenges are emerging that call for dedicated design techniques at all levels of the design hierarchy; in particular, leakage power dissipation and the containment of process variations and their effects. The achievement of the above objectives was enabled by a NoC simulation environment for cycle-accurate modelling and simulation and by a back-end facility for the study of NoC physical implementation effects. Overall, all the results provided by this work have been validated on actual silicon layouts.
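A design space exploration of the kind described above can be illustrated with a toy script that enumerates 2D-mesh topologies and prunes them under an area budget. The cost model and every constant in it are hypothetical placeholders for illustration, not the dissertation's actual models:

```python
# Illustrative sketch of NoC topology design space pruning.
# All constants and the cost model itself are invented placeholders.
from itertools import product

def mesh_cost(rows, cols, flit_width):
    """Toy cost model: area grows with router count and flit width,
    worst-case latency with the mesh diameter (in hops)."""
    routers = rows * cols
    area = routers * (0.01 + 0.0005 * flit_width)   # mm^2, invented constants
    latency = (rows - 1) + (cols - 1)               # worst-case hop count
    return area, latency

def explore(cores=16, max_dim=8, widths=(32, 64, 128), area_budget=1.5):
    """Enumerate meshes large enough to host `cores` blocks, discard
    those over the area budget, and return the lowest-latency survivor
    (ties broken by area)."""
    feasible = []
    for rows, cols, w in product(range(2, max_dim + 1),
                                 range(2, max_dim + 1), widths):
        if rows * cols < cores:
            continue                                # too few routers
        area, lat = mesh_cost(rows, cols, w)
        if area <= area_budget:
            feasible.append(((rows, cols, w), area, lat))
    return min(feasible, key=lambda t: (t[2], t[1]))

best_cfg, best_area, best_lat = explore()
print(best_cfg, best_lat)   # smallest-latency mesh within the budget
```

Even this trivial pruner shows the flavour of the problem: the feasible set shrinks quickly as the area budget tightens or the flit width grows, which is why automated tools are needed for realistic parameter spaces.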
Methoden und Beschreibungssprachen zur Modellierung und Verifikation von Schaltungen und Systemen: MBMV 2015 - Tagungsband, Chemnitz, 03. - 04. März 2015
The workshop Methoden und Beschreibungssprachen zur Modellierung und Verifikation von Schaltungen und Systemen (MBMV 2015) is now being held for the 18th time. This year's hosts are the Chair of Circuit and System Design at Technische Universität Chemnitz and the Steinbeis Research Center for System Design and Test.
The workshop's goal is to discuss the latest trends, results and open problems in the field of methods for modelling and verification as well as description languages for digital, analogue and mixed-signal circuits. It is thus intended as a forum for the exchange of ideas.
Furthermore, the workshop offers a platform for exchange between research and industry as well as for maintaining existing contacts and establishing new ones. It allows young scientists to present their ideas and approaches to a broad audience from academia and industry and to discuss them in depth during the event. Its long history has made it a fixture in many event calendars. Traditionally, the meetings of the ITG expert groups are also affiliated with the workshop.
This year, two projects funded by the German Federal Ministry of Education and Research within the InnoProfile-Transfer initiative are using the workshop to present their research results to a broad audience in two dedicated tracks. Representatives of the projects Generische Plattform für Systemzuverlässigkeit und Verifikation (GPZV) and GINKO - Generische Infrastruktur zur nahtlosen energetischen Kopplung von Elektrofahrzeugen present parts of their current work. This enriches the workshop with additional focus topics and provides a valuable complement to the authors' contributions. [... from the preface]
RISC-V Core Instruction Extension Sets M and F
This thesis project presents the hardware design of components implementing a 5-stage RV32I core, RV32IM with the integer multiplication and division extension, and RV32IMF with partial single-precision floating-point support. These have been developed in Verilog HDL, based on the RISC-V ISA, and have been verified and synthesised "bare-metal" on the FPGA of the DE0 development board. In addition, a custom variety of division modules has been produced to offer performance diversity in operating frequency, resource allocation and the number of clock cycles per division operation. This selection of modules provides implementation options that allow the product to be tailored to customer needs.
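The multi-cycle division modules mentioned above can be illustrated with a behavioural model of classic restoring division, which retires one quotient bit per clock cycle. This Python sketch (including the division-by-zero result the RISC-V spec defines for DIVU/REMU) illustrates the algorithm only; it is not the thesis's Verilog:

```python
def restoring_div_u32(dividend, divisor, width=32):
    """Bit-serial restoring division: one quotient bit per 'clock cycle',
    as a minimal multi-cycle RV32M unsigned divider would compute it.
    Inputs are assumed to fit in `width` bits."""
    if divisor == 0:
        # RISC-V DIVU/REMU semantics: quotient all ones, remainder = dividend
        return (1 << width) - 1, dividend
    remainder, quotient = 0, 0
    for i in range(width - 1, -1, -1):            # one iteration per cycle
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        if remainder >= divisor:                  # restoring step
            remainder -= divisor
            quotient |= 1 << i
    return quotient, remainder

print(restoring_div_u32(100, 7))
```

A hardware version of this loop needs `width` cycles per operation; the trade-offs the thesis explores (radix, pipelining, early termination) trade that cycle count against clock frequency and resource usage.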
Sistema de predicción epileptogénica en lazo cerrado basado en matrices subdurales
The human brain is the most complex organ in the human body, which consists of
approximately 100 billion neurons. These cells effortlessly communicate over multiple
hemispheres to deliver our everyday sensorimotor and cognitive abilities.
Although the underlying principles of neuronal communication are not well understood,
there is evidence to suggest precise synchronisation and/or de-synchronisation
of neuronal clusters could play an important role. Furthermore, new evidence suggests
that these patterns of synchronisation could be used as an identifier for the detection
of a variety of neurological disorders including Alzheimer's disease (AD), schizophrenia (SZ) and epilepsy (EP), where neural degradation or hypersynchronous networks have been detected.
Over the years many different techniques have been proposed for the detection of
synchronisation patterns, in the form of spectral analysis, transform approaches and
statistics-based studies. Nonetheless, most are confined to software-based implementations
as opposed to hardware realisations due to their complexity. Furthermore, the
few hardware implementations which do exist, suffer from a lack of scalability, in terms
of brain area coverage, throughput and power consumption.
Here we introduce the design and implementation of a hardware-efficient algorithm, named Delay Difference Analysis (DDA), for the identification of patient-specific
synchronisation patterns. The design is remarkably hardware friendly when compared
with other algorithms. In fact, we can reduce hardware requirements by as much as
80% and power consumption as much as 90%, when compared with the most common
techniques. In terms of absolute sensitivity the DDA produces an average sensitivity
of more than 80% for a false positive rate of 0.75 FP/h and indeed up to a maximum
of 90% for confidence levels of 95%.
This thesis presents two integer-based digital processors for the calculation of
phase synchronisation between neural signals. It is based on the measurement of time
periods between two consecutive minima. The simplicity of the approach allows for
the use of elementary digital blocks, such as registers, counters or adders. In fact,
the first introduced processor was fabricated in a 0.18 μm CMOS process, occupies only 0.05 mm², and consumes 15 nW from a 0.5 V supply voltage at a signal input rate of 1024 S/s. These low-area and low-power features make the proposed circuit a
valuable computing element in closed-loop neural prosthesis for the treatment of neural
disorders, such as epilepsy, or for measuring functional connectivity maps between
different recording sites in the brain.
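The period-between-minima measurement underlying the processors can be sketched in software with only the integer operations the abstract mentions (compares, counters, adders). The synchrony score below is a hypothetical stand-in for illustration, not the thesis's exact DDA formulation:

```python
def local_minima(samples):
    """Indices of strict local minima in an integer sample stream --
    the events a hardware minimum detector would flag."""
    return [i for i in range(1, len(samples) - 1)
            if samples[i - 1] > samples[i] < samples[i + 1]]

def inter_minima_periods(samples):
    """Sample counts between consecutive minima: what a counter would
    accumulate between two minimum-detect events."""
    idx = local_minima(samples)
    return [b - a for a, b in zip(idx, idx[1:])]

def synchrony(ch_a, ch_b):
    """Toy integer-friendly synchrony score in [0, 1]: 1.0 when the two
    channels' inter-minima periods match exactly. Hypothetical metric."""
    pa, pb = inter_minima_periods(ch_a), inter_minima_periods(ch_b)
    n = min(len(pa), len(pb))
    if n == 0:
        return 0.0
    diff = sum(abs(x - y) for x, y in zip(pa, pb))
    total = sum(pa[:n]) + sum(pb[:n])
    return 1.0 - diff / total if total else 0.0
```

Because everything reduces to comparisons and running counts, the datapath maps directly onto the registers, counters and adders named in the abstract, with no multipliers or transforms required.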
A second VLSI implementation was designed and integrated as a mass-integrated 16-channel design. Incorporated into the design were 16 individual synchronisation
processors (15 on-line processors and 1 test processor) each with a dedicated training
and calculation module, used to build a specialised epileptic detection system based on patient-specific synchrony thresholds. Each of the main processors is capable of
calculating the phase synchrony between 9 independent electroencephalography (EEG)
signals over 8 epochs of time, totalling 120 EEG combinations. Remarkably, the entire circuit occupies a total area of only 3.64 mm².
This design was implemented with a multi-purpose focus in mind. Firstly, as a
clinical aid to help physicians detect pathological brain states, where the small area
would allow the patient to wear the device for home trials. Moreover, the small power consumption would allow it to run from standard batteries for long periods. The trials
could produce important patient specific information which could be processed using
mathematical tools such as graph theory. Secondly, the design was focused towards the
use as an in-vivo device to detect phase synchrony in real time for patients who suffer from neurological disorders such as EP, which require constant monitoring and feedback.
In future developments this synchronisation device would make a good contribution to a full system-on-chip device for detection and stimulation.
inSense: A Variation and Fault Tolerant Architecture for Nanoscale Devices
Transistor technology scaling has been the driving force in improving the size, speed, and power consumption of digital systems. As devices approach atomic size, however, their reliability and performance are increasingly compromised due to reduced noise margins, difficulties in fabrication, and emergent nano-scale phenomena. Scaled CMOS devices, in particular, suffer from process variations such as random dopant fluctuation (RDF) and line edge roughness (LER), transistor degradation mechanisms such as negative-bias temperature instability (NBTI) and hot-carrier injection (HCI), and increased sensitivity to single event upsets (SEUs). Consequently, future devices may exhibit reduced performance, diminished lifetimes, and poor reliability.
This research proposes a variation and fault tolerant architecture, the inSense architecture, as a circuit-level solution to the problems induced by the aforementioned phenomena. The inSense architecture entails augmenting circuits with introspective and sensory capabilities which are able to dynamically detect and compensate for process variations, transistor degradation, and soft errors. This approach creates "smart" circuits able to function despite the use of unreliable devices and is applicable to current CMOS technology as well as next-generation devices using new materials and structures. Furthermore, this work presents an automated prototype implementation of the inSense architecture targeted to CMOS devices, which is evaluated via implementation in the ISCAS '85 benchmark circuits. The automated prototype implementation is functionally verified and characterized: it is found that error detection capability (with error windows from 30-400 ps) can be added for less than 2% area overhead for circuits of non-trivial complexity. Single event transient (SET) detection capability (configurable with target set-points) is found to be functional, although it generally tracks the standard DMR implementation with respect to overheads.
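The error-window idea can be illustrated with a toy behavioural model in the spirit of shadow-latch timing-error detection: a path whose output settles after the main clock edge but within the detection window raises a flag, while one that settles even later escapes detection. This is a loose, hypothetical analogue for intuition only, not the inSense circuitry:

```python
def timing_error_flags(settle_times_ps, clock_period_ps, window_ps):
    """Classify each path by when its output settles relative to the
    clock edge and a detection window after it. All values in ps;
    the scenario and naming are illustrative, not the inSense design."""
    flags = []
    for t in settle_times_ps:
        if t <= clock_period_ps:
            flags.append("ok")          # main flop already sampled the right value
        elif t <= clock_period_ps + window_ps:
            flags.append("detected")    # main and delayed samples disagree
        else:
            flags.append("escaped")     # degradation exceeded the window
    return flags

# A 1 ns clock with a 400 ps window: slight slowdown is caught,
# severe slowdown is not.
print(timing_error_flags([900, 1050, 1500], 1000, 400))
```

The model makes the reported trade-off concrete: widening the window (30-400 ps in the prototype) catches larger degradations but costs more area in the sensing logic.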
Feasibility Study of High-Level Synthesis : Implementation of a Real-Time HEVC Intra Encoder on FPGA
High-Level Synthesis (HLS) is an automated design process that seeks to improve productivity over traditional design methods by increasing design abstraction from register transfer level (RTL) to behavioural level. Various commercial HLS tools have been available on the market since the 1990s, but only recently have they started to gain adoption across industry and academia. The slow adoption rate has mainly stemmed from lower quality of results (QoR) than obtained with conventional hardware description languages (HDLs). However, the latest HLS tool generations have substantially narrowed the QoR gap.
This thesis studies the feasibility of HLS in video codec development. It introduces several HLS implementations for High Efficiency Video Coding (HEVC), which is the key enabling technology for numerous modern media applications. HEVC doubles the coding efficiency over its predecessor, the Advanced Video Coding (AVC) standard, for the same subjective visual quality, but typically at the cost of considerably higher computational complexity. Therefore, real-time HEVC calls for automated design methodologies that can be used to minimize the hardware (HW) implementation and verification effort.
This thesis proposes to use HLS throughout the whole encoder design process: from data-intensive coding tools, like intra prediction and discrete transforms, to more control-oriented tools, such as entropy coding. The C source code of the open-source Kvazaar HEVC encoder serves as a design entry point for the HLS flow, and it is also utilized in design verification. The performance results are gathered with, and reported for, a field-programmable gate array (FPGA).
The main contribution of this thesis is an HEVC intra encoder prototype that is built on a Nokia AirFrame Cloud Server equipped with dual 14-core 2.4 GHz Intel Xeon processors and two Intel Arria 10 GX FPGA Development Kits, which can be connected to the server via peripheral component interconnect express (PCIe) generation 3 or 40 Gigabit Ethernet. The proof-of-concept system achieves real-time 4K coding speed up to 120 fps, which can be further scaled up by adding practically any number of network-connected FPGA cards.
Overcoming the complexity of HEVC and customizing its rich features for a real-time HEVC encoder implementation on hardware is not a trivial task, as hardware development has traditionally turned out to be very time-consuming. This thesis shows that HLS is able to shorten the development time, provide previously unseen design scalability, and still result in competitive performance and QoR over state-of-the-art hardware implementations.
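As an example of the data-intensive intra coding tools mentioned above, HEVC's DC intra prediction mode fills an NxN block with the rounded mean of its N top and N left reference samples. The sketch below follows the standard's DC formula but omits the luma boundary smoothing of the first row and column; it is a software illustration, not the Kvazaar or HLS implementation:

```python
def hevc_dc_pred(top_refs, left_refs):
    """HEVC intra DC prediction for an NxN block: every predicted sample
    equals (sum(top) + sum(left) + N) >> (log2(N) + 1), i.e. the rounded
    mean of the 2N reference samples. Boundary filtering omitted."""
    n = len(top_refs)
    assert len(left_refs) == n and n & (n - 1) == 0   # power-of-two size
    # n.bit_length() == log2(n) + 1 for a power of two
    dc = (sum(top_refs) + sum(left_refs) + n) >> n.bit_length()
    return [[dc] * n for _ in range(n)]

# 4x4 block with flat top/left neighbours predicts their rounded mean.
print(hevc_dc_pred([100] * 4, [120] * 4)[0])
```

Kernels of exactly this shape (a reduction followed by a fill) are why intra prediction maps well to HLS: the tool can unroll the sums and pipeline block generation without hand-written RTL.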
- …