Efficient Hardware Implementation of Probabilistic Gradient Descent Bit Flipping
This paper presents a new Bit Flipping (BF) decoder, called Probabilistic Parallel Bit Flipping (PPBF), for Low-Density Parity-Check (LDPC) codes on the Binary Symmetric Channel. In PPBF, the flipping operation is governed by a probabilistic behavior, which is shown to significantly improve the error correction performance. The advantage of PPBF comes from the fact that no global computation is required during the decoding process, so all computations can be executed in local computing units, in parallel. PPBF provides a considerable improvement in decoding frequency and complexity compared to other known BF decoders, while obtaining a significant gain in error correction. An improved version of PPBF, called non-syndrome PPBF (NS-PPBF), is also introduced, in which the global syndrome check is moved out of the critical path and a new termination mechanism is proposed. To demonstrate the superiority of the new decoders in terms of hardware efficiency and decoding throughput, the corresponding hardware architectures are presented in the second part of the paper. The ASIC synthesis results confirm that the decoding frequency of the proposed decoders is significantly higher than that of BF decoders from the literature, while requiring lower complexity for an efficient implementation.
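The local, fully parallel flip rule can be sketched in a few lines of Python (a toy model: the small Hamming-code parity-check matrix and the flip probabilities are illustrative assumptions, not the tuned values or the hardware described in the paper):

```python
import numpy as np

def ppbf_decode(y, H, p=0.7, max_iters=100, rng=None):
    """Sketch of probabilistic parallel bit flipping over the BSC.

    Every bit uses only local information (how many of its checks are
    unsatisfied) and flips with a probability that grows with that
    count -- no global maximum search is needed, so all bits can be
    updated in parallel.  Flip probabilities here are illustrative.
    """
    if rng is None:
        rng = np.random.default_rng()
    x = y.copy()
    deg = H.sum(axis=0)                  # variable-node degrees
    for _ in range(max_iters):
        syndrome = (H @ x) % 2           # unsatisfied parity checks
        if not syndrome.any():
            return x, True               # valid codeword found
        unsat = syndrome @ H             # per-bit unsatisfied-check count
        flip = rng.random(x.size) < p * unsat / deg
        x ^= flip.astype(x.dtype)        # flip all candidates at once
    return x, False

# Hamming(7,4) parity-check matrix as a toy stand-in for an LDPC code.
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]], dtype=np.uint8)
```

A correct hard-decision decode terminates as soon as the syndrome is all-zero, which is the same stopping rule the NS-PPBF variant moves out of the critical path.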
Low-Density Parity-Check Code Decoder Design and Error Characterization on an FPGA Based Framework
Low-Density Parity-Check (LDPC) codes have gained popularity in communication systems and standards due to their capacity-approaching error correction performance. Among hard-decision LDPC decoders, Gallager B (GaB), due to the simplicity of its operations, is the most hardware-friendly algorithm and an attractive solution for meeting the high-throughput demand in communication systems. However, GaB suffers from poor error correction performance. In this work, we first propose a resource-efficient GaB hardware architecture that delivers the best throughput while using the fewest Field Programmable Gate Array (FPGA) resources with respect to comparable state-of-the-art LDPC decoding algorithms. We then introduce a Probabilistic GaB (PGaB) algorithm that randomly disturbs the decisions made during the decoding iterations, with a probability value determined through experimental studies. We achieve up to four orders of magnitude better error correction performance than GaB, with a 3.4% improvement in normalized throughput. PGaB requires around 40% less energy than GaB, as the probabilistic execution reduces the average iteration count by up to 62% compared to GaB. We also show that PGaB consistently improves the maximum operational clock rate compared to state-of-the-art implementations.
In this dissertation, we also present a high-throughput FPGA-based framework to accelerate error characterization of LDPC codes. Our flexible framework allows the end user to adjust the simulation parameters and rapidly study various LDPC codes and decoders. We first show that the connection-intensive bipartite-graph-based LDPC decoder hardware architecture creates routing stress for the longer codewords used in today's communication systems and standards. We address this problem by partitioning each processing element (PE) in the bipartite graph in such a way that the inputs of a PE are evenly distributed over its partitions. This depopulates the Look-Up Table (LUT) resources utilized for the decoder architecture by spreading the logic across the FPGA. We show that even though LUT usage increases, the critical path delay is reduced by the depopulation. More importantly, with the depopulation technique an unroutable design becomes routable, which allows longer codewords to be mapped onto the FPGA. We then conduct two experiments on error correction performance analysis for the GaB and PGaB algorithms, demonstrating our framework's ability to reach a resolution level that is not attainable with general-purpose processor (GPP) based simulations; this reduces the simulation time scale from an estimated 199 years to 24 hours. Finally, we conduct the first study identifying all codewords with four errors that are not correctable by GaB. With our framework, we reduce the time scale of this simulation, which requires processing 117 billion codewords, from an estimated 7800 days on a single GPP to 4 hours and 38 minutes.
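The depopulation idea, spreading a PE's fan-in evenly over partitions, can be sketched abstractly (a hypothetical round-robin split; the actual mapping onto FPGA LUTs is a hardware concern not modeled here):

```python
def depopulate(pe_inputs, k):
    """Round-robin split of one processing element's input list into k
    partitions, so every partition receives an even share of the fan-in
    (toy software model of the depopulation technique)."""
    return [pe_inputs[i::k] for i in range(k)]
```

Each partition can then be realized as a smaller logic block, trading a modest LUT increase for shorter routing paths, mirroring the effect reported in the work.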
Decoding of low-density parity-check codes in the presence of errors in logic gates
Due to the huge increase in integration density, lower supply voltages, and variations in the technological
process, complementary metal-oxide-semiconductor (CMOS) and emerging nanoelectronic devices
are inherently unreliable. Moreover, the demands for energy efficiency require reduction
of energy consumption by several orders of magnitude, which can be done only by aggressive
supply voltage scaling. Consequently, the signal levels are much lower and closer to the noise
level, which reduces the component noise immunity and leads to unreliable behavior. It is
widely accepted that future generations of circuits and systems must be designed to deal with
unreliable components.
Hardware-Conscious Wireless Communication System Design
The work at hand is a selection of topics in efficient wireless communication system design, logically divided into two groups. One group can be described as hardware designs conscious of their possibilities and limitations. In other words, it is about hardware that chooses its configuration and properties depending on the performance that needs to be delivered and the influence of external factors, with the goal of keeping energy consumption as low as possible. Design parameters that trade off power with complexity are identified for analog, mixed-signal, and digital circuits, and the implications of these tradeoffs are analyzed in detail. An analog front end and an LDPC channel decoder that adapt their parameters to the environment (e.g. a fluctuating power level due to fading) are proposed, and it is analyzed how much power/energy these environment-adaptive structures save compared to non-adaptive designs made for the worst-case scenario. Additionally, the impact of ADC bit resolution on the energy efficiency of a massive MIMO system is examined in detail, with the goal of finding bit resolutions that maximize the energy efficiency under various system setups. In the other group of themes, one can recognize systems whose architect was conscious of fundamental limitations stemming from hardware. Put another way, in these designs there is no attempt to tweak or tune the hardware. On the contrary, the system design works around an existing and unchangeable hardware limitation. As a workaround for the problematic centralized topology, a massive MIMO base station based on a daisy chain topology is proposed, and a method for signal processing tailored to the daisy chain setup is designed. In another example, a large group of cooperating relays is split into several smaller groups, each cooperatively performing relaying independently of the others.
As cooperation consumes resources (such as bandwidth), splitting the system into smaller, independent cooperative parts helps save resources and is again an example of a workaround for an inherent limitation. From the analyses performed in this thesis, promising observations about hardware consciousness can be made. Adapting the structure of a hardware block to the environment can bring massive savings in energy, and simple workarounds prove to perform almost as well as the inherently limited designs, with the limitation successfully bypassed. As a general observation, it can be concluded that hardware consciousness pays off.
A PUF based Lightweight Hardware Security Architecture for IoT
With an increasing number of hand-held electronics, gadgets, and other smart devices, data is present on a large number of platforms, increasing the risk of security, privacy, and safety breaches more than ever before. Due to the extremely lightweight nature of these devices, commonly referred to as the IoT or 'Internet of Things', providing any kind of security is prohibitive due to the high overhead associated with traditional, mathematically robust cryptographic techniques. Therefore, researchers have searched for alternative solutions for such devices. Hardware security, unlike traditional cryptography, can provide unique device-specific security solutions with little overhead and address vulnerabilities in hardware, and is therefore attractive in this domain. As Moore's law is almost at its end, emerging devices are being explored by researchers, as they present opportunities to build better application-specific devices, along with challenges compared to CMOS technology. In this work, we propose emerging nanotechnology-based hardware security as a solution for the resource-constrained IoT domain. Specifically, we build two hardware security primitives, a physical unclonable function (PUF) and a true random number generator (TRNG), and use these components as part of a security protocol also proposed in this work. Both the PUF and the TRNG are built from metal-oxide memristors, an emerging nanoscale device, and are generally lightweight compared to their CMOS counterparts in terms of area, power, and delay. Design challenges associated with these hardware security primitives and with memristive devices are properly addressed. Finally, a complete security protocol is proposed in which all of these pieces come together to provide practical, robust, and device-specific security for resource-limited IoT systems.
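The PUF idea, deriving a device-unique response from uncontrollable fabrication variation, can be illustrated with a toy software model (the Gaussian resistance spread and the pairwise-comparison readout are assumptions for illustration, not the memristor circuit proposed in this work):

```python
import numpy as np

class ToyMemristorPUF:
    """Toy model of a memristor-based PUF: each cell's resistance is
    perturbed by fabrication variation (modeled as Gaussian noise), and
    a response bit compares two challenged cells.  Illustrative only."""

    def __init__(self, n_cells=64, seed=None):
        rng = np.random.default_rng(seed)       # stands in for process variation
        self.R = rng.normal(1.0e3, 50.0, n_cells)  # nominal 1 kOhm cells

    def response(self, challenge):
        """challenge: iterable of (i, j) cell-index pairs.
        Each bit says which of the two cells drifted to the larger
        resistance -- stable per device, unpredictable across devices."""
        return [int(self.R[i] > self.R[j]) for i, j in challenge]
```

The same challenge always yields the same response on one device (reproducibility), while independently fabricated devices diverge, which is the property the protocol builds on.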
On the Design of Future Communication Systems with Coded Transport, Storage, and Computing
Communication systems are experiencing a fundamental change. Novel applications require increased performance not only in throughput but also in latency, reliability, security, and heterogeneity support. To fulfil these requirements, future systems understand communication not only as the transport of bits but also as their storage, processing, and relation. In these systems, every network node has transport, storage, and computing resources that the network operator and its users can exploit through virtualisation and softwarisation of the resources. It is within this context that this work presents its results. We propose distributed coded approaches to improve communication systems. Our results improve the reliability and latency performance of the transport of information. They also increase the reliability, flexibility, and throughput of storage applications. Furthermore, based on the lesson that coded approaches improve the transport and storage performance of communication systems, we propose a distributed coded approach for the computing of novel in-network applications such as the steering and control of cyber-physical systems. Our approach can increase the reliability and latency performance of distributed in-network computing in the presence of errors, erasures, and attackers.
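As a minimal illustration of how coding adds reliability to storage (a toy single-parity scheme sketched here for intuition; the work itself develops more capable distributed codes):

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(shards):
    """Append one XOR parity shard to k equal-length data shards.
    Any single lost shard can then be rebuilt from the survivors."""
    parity = shards[0]
    for s in shards[1:]:
        parity = xor_bytes(parity, s)
    return shards + [parity]

def rebuild(stored, lost_index):
    """Recover the shard at lost_index by XOR-ing all surviving shards."""
    survivors = [s for i, s in enumerate(stored) if i != lost_index]
    out = survivors[0]
    for s in survivors[1:]:
        out = xor_bytes(out, s)
    return out
```

One parity shard tolerates a single erasure; the distributed codes studied in the work generalize this tradeoff between redundancy and the number of tolerable failures.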
Flexible encoder and decoder of low density parity check codes
The dissertation proposes high-speed, flexible, and hardware-efficient solutions for encoding and
decoding of highly irregular low-density parity-check (LDPC) codes, required by many modern
communication standards.
The first contribution of the dissertation is a novel partially parallel LDPC encoder architecture for 5G. The architecture is built around a flexible shifting network that enables parallel processing of multiple parity-check matrix elements for short to medium code lengths, thus providing almost the same level of parallelism as for long-code encoding. In addition, the processing schedule was optimized for minimal encoding time using a genetic algorithm. The optimization procedure contributes to achieving high throughput, low latency, and the best hardware usage efficiency (HUE) to date.
The second part proposes a new algorithmic and architectural solution for structured LDPC
code decoding. A widely used approach in LDPC decoders is a layered decoding schedule, which
frequently suffers from pipeline data hazards that reduce the throughput. The decoder proposed in
the dissertation conveniently incorporates both the layered and the flooding schedules in cases when
hazards occur and thus facilitates LDPC decoding without stall cycles caused by pipeline hazards.
Therefore, the proposed architecture enables insertion of many pipeline stages, which consequently
provides a high operating clock frequency. Additionally, the decoding schedule was optimized for better signal-to-noise ratio (SNR) performance using a genetic algorithm. The obtained results show that the proposed decoder achieves a substantial throughput increase and the best HUE compared with the state of the art for the same SNR performance.
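The hybrid schedule can be modeled abstractly: if the freshest update of a variable is still in the pipeline, the decoder reads the older, flooding-style value instead of stalling. A toy Python model of that rule (the `pipeline_depth` window and the layer/variable representation are simplifying assumptions, not the proposed architecture):

```python
def schedule_reads(layers, pipeline_depth):
    """For each layer (a list of variable-node indices), classify every
    read as 'layered' (freshest value already committed) or 'flooding'
    (the latest update is still in the pipeline, so the previous value
    is used instead of inserting a stall cycle)."""
    last_write = {}                       # variable -> layer index that last wrote it
    plan = []
    for t, layer in enumerate(layers):
        reads = {}
        for v in layer:
            w = last_write.get(v)
            in_flight = w is not None and (t - w) < pipeline_depth
            reads[v] = "flooding" if in_flight else "layered"
        for v in layer:                   # this layer's updates enter the pipeline
            last_write[v] = t
        plan.append(reads)
    return plan
```

Because hazards are resolved by falling back to old values rather than stalling, the pipeline can be made deep, which is what enables the high clock frequency reported above.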
Risk prediction analysis for post-surgical complications in cardiothoracic surgery
Cardiothoracic surgery patients are at risk of developing surgical site infections (SSIs), which cause hospital readmissions, increase healthcare costs, and may lead to mortality. The first 30 days after hospital discharge are crucial for preventing this kind of infection. As an alternative to a hospital-based diagnosis, an automatic digital monitoring system can help with the early detection of SSIs by analyzing daily images of patients' wounds. However, analyzing a wound automatically is one of the biggest challenges in medical image analysis.
The proposed system is integrated into a research project called CardioFollow.AI, which developed a digital telemonitoring service to follow up on the recovery of cardiothoracic surgery patients. The present work aims to tackle the problem of SSIs by predicting the existence of worrying alterations in wound images taken by patients, with the help of machine learning and deep learning algorithms. The developed system is divided into a segmentation model, which detects the wound region and categorizes the wound type, and a classification model, which predicts the occurrence of alterations in the wounds.
The dataset consists of 1337 images of chest wounds (WC), drainage wounds (WD), and leg wounds (WL) from 34 cardiothoracic surgery patients. For segmenting the images, an architecture with a Mobilenet encoder and a Unet decoder was used to obtain the regions of interest (ROIs) and attribute the wound class. The classification model was divided into three sub-classifiers, one for each wound type, in order to improve performance. Color and textural features were extracted from the wound ROIs to feed one of three machine learning classifiers (random forest, support vector machine, and k-nearest neighbors), which predict the final output.
The segmentation model achieved a final mean IoU of 89.9%, a Dice coefficient of 94.6%, and a mean average precision of 90.1%, showing good results. As for the classification algorithms, the WL classifier exhibited the best results, with 87.6% recall and 52.6% precision, while the WC classifier achieved 71.4% recall and 36.0% precision. The WD classifier had the worst performance, with 68.4% recall and 33.2% precision. The obtained results demonstrate the feasibility of this solution, which can be a starting point for
preventing SSIs through image analysis with artificial intelligence.
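The two-stage design, segment and type the wound, then route the ROI to a per-type classifier, can be sketched as follows; the feature choices and the nearest-centroid stand-in are illustrative assumptions, not the thesis's Mobilenet/Unet segmenter or its random forest/SVM/k-NN models:

```python
import numpy as np

def color_texture_features(roi):
    """Mean colour per channel plus a crude texture proxy (spread of
    the grey-level gradient) -- stand-ins for the extracted features."""
    means = roi.mean(axis=(0, 1))                 # one mean per channel
    gy, gx = np.gradient(roi.mean(axis=2))        # grey-level gradients
    texture = np.hypot(gx, gy).std()
    return np.append(means, texture)

class NearestCentroid:
    """Minimal classifier standing in for the per-wound-type models."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0)
                                    for c in self.classes_])
        return self
    def predict(self, X):
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None], axis=2)
        return self.classes_[d.argmin(axis=1)]

# One classifier per wound type, as in the thesis.
classifiers = {t: NearestCentroid() for t in ("WC", "WD", "WL")}
```

At inference time the (hypothetical) segmenter would supply the ROI and its wound type, and `classifiers[wound_type].predict(...)` would produce the alteration decision, mirroring the routing described above.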