12 research outputs found
Dynamically and partially reconfigurable hardware architectures for high performance microarray bioinformatics data analysis
The field of Bioinformatics and Computational Biology (BCB) is a multidisciplinary field
that has emerged due to the computational demands of current state-of-the-art biotechnology.
BCB deals with the storage, organization, retrieval, and analysis of biological datasets,
which have grown in size and complexity in recent years, especially after the completion of
the human genome project. The advent of microarray technology in the 1990s introduced the
concept of the high-throughput experiment, a biotechnology that measures the gene expression
profiles of thousands of genes simultaneously. As such, microarray analysis requires
high computational power to extract the biological relevance from its high-dimensional data.
Current general purpose processors (GPPs) have been unable to keep up with the increasing
computational demands of microarrays and have reached a limit in terms of clock speed.
Consequently, Field Programmable Gate Arrays (FPGAs) have been proposed as a viable,
low-power solution to overcome the computational limitations of GPPs and other methods.
The research presented in this thesis harnesses current state-of-the-art FPGAs and tools to
accelerate some of the most widely used data mining methods for the analysis of microarray
data, in an effort to investigate the viability of the technology as an efficient, low-power,
and economical solution. Three widely used methods have been selected for the FPGA
implementations: one is the unsupervised K-means clustering algorithm, while the other two
are supervised classification methods, namely the K-Nearest Neighbour (K-NN) classifier and
Support Vector Machines (SVM). These methods are thought to benefit from parallel
implementation. This thesis presents detailed designs and implementations of these three BCB
applications on FPGA, captured in Verilog HDL, whose performance is compared with equivalent
implementations running on GPPs. In addition to acceleration, the benefits of the dynamic
partial reconfiguration (DPR) capability of modern Xilinx FPGAs are investigated with
reference to the aforementioned data mining methods.
The K-means clustering implementation on FPGA using a non-DPR design flow outperformed
equivalent GPP and GPU implementations by two orders and one order of magnitude,
respectively, while being eight times more power efficient than the GPP implementation and
four times more than the GPU implementation. In terms of energy efficiency, the FPGA
implementation was 615 times more efficient than GPPs and 31 times more than GPUs. Moreover,
the FPGA implementation increasingly outperformed the GPP and GPU implementations as the
dimensionality of the microarray data increased. Additionally, the DPR implementations of
K-means clustering achieved speed-ups in partial reconfiguration time of ~5x and ~17x over
full-chip reconfiguration for single-core and eight-core implementations, respectively.
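For reference, a minimal software sketch of the K-means iteration that such accelerators target is shown below; the Euclidean distance metric, the random initialisation, and the convergence test are standard textbook choices and are assumptions rather than details taken from the thesis.

```python
import numpy as np

def kmeans(data, k, n_iter=100, seed=0):
    """Minimal K-means clustering; data is an (n_samples, n_features) array."""
    rng = np.random.default_rng(seed)
    # Initialise centroids by picking k distinct random samples.
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: each sample joins its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: recompute each centroid as the mean of its assigned samples.
        new_centroids = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```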
Two architectures of the K-NN classifier have been implemented on FPGA, namely A1 and A2.
The K-NN implementation based on the A1 architecture achieved a speed-up of ~76x over an
equivalent GPP implementation, whereas the A2 architecture achieved a ~68x speed-up.
Furthermore, the FPGA implementation outperformed the equivalent GPP implementation as the
dimensionality of the data increased. In addition, the DPR implementations of the K-NN
classifier achieved speed-ups in reconfiguration time of ~4x to ~10x over full-chip
reconfiguration when reconfiguring either a portion of the classifier or the complete
classifier.
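For context, a minimal software sketch of the K-NN decision rule underlying both architectures is given below; the Euclidean distance and majority vote are the textbook formulation and are assumed rather than drawn from the thesis.

```python
import numpy as np
from collections import Counter

def knn_classify(train_x, train_y, query, k=3):
    """Classify one query vector by majority vote among its k nearest
    training samples, using Euclidean distance."""
    dists = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]
```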
Similar to K-NN, two architectures of the SVM classifier were implemented on FPGA, whereby
the former outperformed an equivalent GPP implementation by ~61x and the latter by ~49x. The
DPR implementation of the SVM classifier showed a speed-up of ~8x in reconfiguration time
when reconfiguring the complete core or when exchanging it with a K-NN core to form a
multi-classifier.
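For context, the sketch below evaluates a trained kernel-SVM decision function for a single query; the RBF kernel, the parameter names, and the focus on the classification phase are illustrative assumptions rather than details from the thesis.

```python
import numpy as np

def svm_decision(support_vectors, dual_coefs, bias, query, gamma=0.1):
    """Evaluate a trained binary kernel-SVM decision function for one query.
    The per-support-vector terms are independent, which is what makes this
    sum amenable to parallel evaluation."""
    # RBF kernel between the query and every support vector.
    k = np.exp(-gamma * np.sum((support_vectors - query) ** 2, axis=1))
    score = np.dot(dual_coefs, k) + bias
    return 1 if score >= 0 else -1
```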
The aforementioned implementations clearly show FPGAs to be an efficacious, efficient, and
economical solution for bioinformatics microarray data analysis.
Towards the development of a reliable reconfigurable real-time operating system on FPGAs
In the last two decades, Field Programmable Gate Arrays (FPGAs) have developed rapidly
from simple "glue logic" to a powerful platform capable of implementing a System on Chip
(SoC). Modern FPGAs achieve not only high performance compared with General Purpose
Processors (GPPs), thanks to hardware parallelism and dedicated logic, but also better
programming flexibility in comparison with Application Specific Integrated Circuits (ASICs).
Moreover, the hardware programming flexibility of FPGAs is further harnessed for both
performance and manipulability, which makes Dynamic Partial Reconfiguration (DPR) possible.
DPR allows a part or parts of a circuit to be reconfigured at run-time without interrupting
the operation of the rest of the chip. As a result, hardware resources can be exploited more
efficiently, since chip resources can be reused by swapping hardware tasks in and out of the
chip in a time-multiplexed fashion. In addition, DPR improves fault tolerance: transient
errors and permanent damage, such as Single Event Upsets (SEUs), can be mitigated by
reconfiguring the FPGA to avoid error accumulation. Furthermore, power and heat can be
reduced by removing finished or idle tasks from the chip. For all these reasons, DPR has
significantly promoted Reconfigurable Computing (RC) and has become a very active research
topic. However, since hardware integration is increasing at an exponential rate, and
applications are becoming more complex with the growth of user demands, high-level
application design and low-level hardware implementation are increasingly separated and
layered. As a consequence, users can obtain little advantage from DPR without the support of
system-level middleware.
To bridge the gap between high-level applications and low-level hardware implementation,
this thesis presents important contributions towards a Reliable, Reconfigurable and
Real-Time Operating System (R3TOS), which facilitates the exploitation of DPR from the
application level by managing the complex hardware in the background. In R3TOS, hardware
tasks behave just like software tasks: they can be created, scheduled, and mapped to
different computing resources on the fly. The novel contributions of this work are: 1) a
novel implementation of an efficient task scheduler and allocator; 2) the implementation of
a novel real-time scheduling algorithm (FAEDF) and two efficacious allocating algorithms
(EAC and EVC), which schedule tasks in real-time and circumvent emerging faults while
maintaining more compact empty areas; 3) the design and implementation of a fault-tolerant
microprocessor by harnessing existing FPGA resources, such as Error Correction Code (ECC)
and configuration primitives; 4) a novel symmetric multiprocessing (SMP)-based architecture
that supports a shared-memory programming interface; and 5) two demonstrations of the
integrated system, including a) the K-Nearest Neighbour classifier, a non-parametric
classification algorithm widely used in various fields of data mining, and b) pairwise
sequence alignment, namely the Smith-Waterman algorithm, used for identifying similarities
between two biological sequences.
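As a point of reference for the second demonstration, a minimal Smith-Waterman local alignment scorer is sketched below; the match/mismatch scores and the linear gap penalty are illustrative defaults, not parameters reported for R3TOS.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b
    using the Smith-Waterman recurrence with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: scores never drop below zero.
            h[i][j] = max(0, diag, h[i - 1][j] + gap, h[i][j - 1] + gap)
            best = max(best, h[i][j])
    return best
```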
R3TOS gives considerably higher flexibility to support scalable multi-user, multi-tasking
applications, whereby resources can be dynamically managed with respect to user requirements
and hardware availability. Benefiting from this, not only can hardware resources be used
more efficiently, but system performance can also be significantly increased. Results show
that scheduling and allocation efficiency have been improved by up to 2x, and that overall
system performance is further improved by ~2.5x. Future work includes the development of a
Network on Chip (NoC), which is expected to further increase communication throughput, as
well as the standardization and automation of our system design, which will be carried out
in line with the enablement of other high-level synthesis tools, to allow application
developers to benefit from the system in a more efficient manner.
Biometrics
Biometrics: Unique and Diverse Applications in Nature, Science, and Technology provides a unique sampling of the diverse ways in which biometrics is integrated into our lives and our technology. From time immemorial, we as humans have been intrigued by, perplexed by, and entertained by observing and analyzing ourselves and the natural world around us. Science and technology have evolved to a point where we can empirically record a measure of a biological or behavioral feature and use it for recognizing patterns, trends, and discrete phenomena such as individuals; this is what biometrics is all about. Understanding some of the ways in which we use biometrics and for what specific purposes is what this book is all about.
Models, Optimizations, and Tools for Large-Scale Phylogenetic Inference, Handling Sequence Uncertainty, and Taxonomic Validation
The concept of evolution is of central importance in modern biology. Phylogenetics, the study of the relationships and ancestry of organisms and species, therefore provides decisive clues for deciphering a wide range of biological processes. Phylogenetic trees are important for basic research, with applications ranging from studies of the diversification and environmental adaptation of individual groups of organisms (e.g., insects or birds) to the grand challenge of representing the origin and development of all life forms in a comprehensive evolutionary tree (the so-called Tree of Life). Phylogenetic methods are also employed in more applied settings, for example to understand the transmission dynamics of HIV infections or the heterogeneity of the cancer cells within a tumour.
The current state of the art in tree reconstruction is represented by Maximum Likelihood (ML) and Bayesian Inference (BI) methods, which are based on the analysis of molecular sequence data (DNA and proteins) under probabilistic models of evolution. These methods exhibit a high runtime complexity (the underlying problem is NP-hard), which makes the development of efficient heuristics indispensable. In addition, evaluating the objective function (the so-called Phylogenetic Likelihood Function, PLF) requires a large amount of memory as well as a vast number of floating-point operations, and is therefore extremely computationally demanding.
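To illustrate where this cost arises, a minimal sketch of the PLF's inner kernel, the Felsenstein pruning step for a single DNA site, is given below; the state ordering and the assumption of precomputed transition-probability matrices are illustrative simplifications rather than the exact formulation used in the software discussed here.

```python
import numpy as np

def parent_conditional_likelihoods(child_left, child_right, p_left, p_right):
    """Felsenstein pruning step for one alignment site: combine the conditional
    likelihood vectors of two child nodes (length-4 arrays over A, C, G, T)
    into the parent's vector, given the 4x4 transition-probability matrices
    along the two branches."""
    # For each parent state i: sum_j P[i, j] * L_child[j], per child, then multiply.
    left_term = p_left @ child_left
    right_term = p_right @ child_right
    return left_term * right_term
```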
The latest developments in DNA sequencing (Next Generation Sequencing, NGS) continuously increase throughput while reducing sequencing costs many times over. For phylogenetics, the consequence is that the dimensions of the datasets to be analysed grow by an order of magnitude every 2-3 years. Whereas it used to be common to analyse a few dozen to a few hundred species using one or a few genes (sequence length: 1-10 kilobases), studies with thousands of sequences or genes are no longer a rarity today. Within the next 1-2 years, analyses of thousands to tens of thousands of complete genomes or transcriptomes (sequence length: 1-100 megabases and more) can be expected. To cope with these tasks, existing methods must be further developed and optimised, above all in order to make optimal use of supercomputers and new hardware architectures.
Furthermore, the accelerating deposition of sequences in public databases such as NCBI GenBank (and its derivatives) means that a high quality of the sequence annotations (e.g., organism or species name, taxonomic classification, gene name, etc.) is not necessarily guaranteed. Among other reasons, this is because timely correction by the relevant experts is no longer feasible as long as no adequate software tools are available to them.
In this dissertation, we make several contributions towards addressing the challenges outlined above.
First, we ported ExaML, a dedicated code for ML-based tree reconstruction on supercomputers, to the Intel Xeon Phi hardware accelerator. Compared with classical x86 CPUs, the Xeon Phi offers higher computational performance, which can, however, only be fully exploited through architecture-specific optimisations. For this reason, we restructured and optimised the PLF computation for the 512-bit vector unit of the Xeon Phi. In addition, we replaced the pure MPI parallelisation previously used in ExaML with a hybrid MPI/OpenMP solution. This hybrid solution scales substantially better for large numbers of cores or threads within a compute node (>100 hardware threads on the Xeon Phi).
Furthermore, we developed a new software package for ML tree reconstruction called RAxML-NG. Apart from minor adaptations, it implements the same search algorithm as the widely used RAxML program, but offers several advantages over RAxML: (a) thanks to careful optimisation of the PLF computation, runtimes were reduced by a factor of 2 to 3; (b) scalability on extremely large input datasets was improved by eliminating or optimising inefficient topological operations; (c) features relevant for large datasets that were previously only available in ExaML, such as checkpointing and a dedicated data distribution algorithm, were re-implemented; (d) the user is offered a larger choice of statistical DNA evolution models, which can also be combined and parameterised more flexibly; and (e) further development of the software is considerably easier thanks to its modular architecture (the PLF computation routines were moved out into a separate library).
Next, we investigated how sequencing errors affect the accuracy of phylogenetic tree reconstruction. We modified the RAxML and RAxML-NG code such that both the explicit specification of error probabilities and the automatic estimation of error rates via the ML method are possible. Our simulations show that: (a) if the errors are uniformly distributed, the error rate can be estimated directly from the sequence data; and (b) from an error rate of roughly 1% upwards, tree reconstruction under the error model yields more accurate results than the classical approach, which assumes the input to be error-free.
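As a concrete illustration of such an error model, the sketch below builds the per-site tip likelihood vector for an observed base under a uniform error rate; the exact parameterisation used in RAxML-NG may differ, so this is only an assumed, simplified form.

```python
def tip_likelihoods(observed_base, error_rate):
    """Per-site tip likelihood vector over the states A, C, G, T when the
    observed base may be wrong with a uniform error rate: the true state
    equals the observed one with probability (1 - error_rate), otherwise it
    is one of the three remaining states with equal probability."""
    bases = "ACGT"
    return [
        (1.0 - error_rate) if b == observed_base else error_rate / 3.0
        for b in bases
    ]
```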
A further contribution of this work is the SATIVA software pipeline for the computer-assisted identification and correction of erroneous taxonomic annotations in large sequence databases. The algorithm works as follows: for each sequence, the placement in the tree with the highest likelihood is determined, and it is then checked whether this placement is consistent with the given taxonomic classification. If it is not, that is, if a sequence is, for example, placed within a different genus, the sequence is reported as misannotated and a corresponding reclassification is suggested. On simulated datasets with randomly introduced errors, our pipeline achieved a high identification rate (>90%) as well as high precision (>95%). For the evaluation on empirical data, we examined four public rRNA databases that are frequently used as references for the classification of bacteria. Depending on the database, we identified between 0.2% and 2.5% of all sequences as potential misannotations.
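A minimal sketch of the core consistency check is given below; it assumes precomputed mappings from sequence identifiers to the annotated genus and to the genus implied by the best maximum-likelihood placement, which are hypothetical stand-ins for the actual SATIVA machinery.

```python
def find_misannotations(sequence_ids, annotated_genus, placement_genus):
    """Flag sequences whose best tree placement disagrees with their recorded
    genus. Both dictionaries map a sequence id to a genus name and are
    hypothetical inputs produced by earlier pipeline stages."""
    report = {}
    for seq_id in sequence_ids:
        recorded = annotated_genus[seq_id]
        placed = placement_genus[seq_id]
        if recorded != placed:
            # Mismatch: report the sequence and suggest a reclassification.
            report[seq_id] = {"annotated": recorded, "suggested": placed}
    return report
```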
Ant Colony Optimization
Ant Colony Optimization (ACO) is the best example of how studies aimed at understanding and modeling the behavior of ants and other social insects can provide inspiration for the development of computational algorithms for the solution of difficult mathematical problems. Introduced by Marco Dorigo in his PhD thesis (1992) and initially applied to the travelling salesman problem, the ACO field has experienced tremendous growth and stands today as an important nature-inspired stochastic metaheuristic for hard optimization problems. This book presents state-of-the-art ACO methods and is divided into two parts: (I) Techniques, which includes parallel implementations, and (II) Applications, where recent contributions of ACO to diverse fields, such as traffic congestion and control, structural optimization, manufacturing, and genomics, are presented.
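To give a flavour of the metaheuristic, a heavily simplified tour-construction and pheromone-update step for the travelling salesman problem is sketched below; the parameter values and the update rule are generic textbook choices rather than those of any particular chapter of the book.

```python
import numpy as np

def ant_tour(dist, pheromone, alpha=1.0, beta=2.0, rng=None):
    """Construct one ant's tour: at each step the next city is chosen with
    probability proportional to pheromone^alpha * (1/distance)^beta."""
    rng = rng or np.random.default_rng()
    n = len(dist)
    tour = [rng.integers(n)]
    while len(tour) < n:
        current = tour[-1]
        choices = [c for c in range(n) if c not in tour]
        weights = np.array([
            pheromone[current, c] ** alpha * (1.0 / dist[current, c]) ** beta
            for c in choices
        ])
        tour.append(rng.choice(choices, p=weights / weights.sum()))
    return tour

def update_pheromone(pheromone, tours, dist, evaporation=0.5):
    """Evaporate existing pheromone, then deposit 1/length on each tour's edges."""
    pheromone *= (1.0 - evaporation)
    for tour in tours:
        length = sum(dist[tour[i], tour[(i + 1) % len(tour)]] for i in range(len(tour)))
        for i in range(len(tour)):
            pheromone[tour[i], tour[(i + 1) % len(tour)]] += 1.0 / length
    return pheromone
```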
Secure and Usable Behavioural User Authentication for Resource-Constrained Devices
Robust user authentication on small form-factor and resource-constrained smart devices, such as smartphones, wearables and IoT devices, remains an important problem, especially as such devices are increasingly becoming stores of sensitive personal data, such as daily digital payment traces, health/wellness records and contact e-mails. Hence, a secure, usable and practical authentication mechanism to restrict access by unauthorized users is a basic requirement for such devices. Existing password-based user authentication methods place a mental demand on the user and are not secure. Behavioural biometric based authentication provides an attractive alternative that can replace passwords and offer high security and usability. To this end, we devise and study novel schemes and modalities and investigate how behaviour-based user authentication can be practically realized on resource-constrained devices.
In the first part of the thesis, we implemented and evaluated the performance of touch-based behavioural biometrics on wearables and smartphones. Our results show that touch-based behavioural authentication can yield very high accuracy and a small inference time without imposing large resource requirements on the wearable devices. The second part of the thesis focuses on designing a novel hybrid scheme named BehavioCog, which combines touch gestures (a behavioural biometric) with challenge-response based cognitive authentication. Touch-based behavioural authentication is highly usable but is prone to observation attacks, whereas cognitive authentication schemes are highly resistant to observation attacks but not highly usable. The hybrid scheme improves the usability of cognitive authentication and, at the same time, the security of touch-based behavioural biometrics.
Next, we introduce and evaluate a novel behavioural biometric modality named BreathPrint, based on the acoustics of an individual's breathing gestures. Breathing-based authentication is highly usable, since it only requires a person to breathe, and its low observability makes it secure against spoofing and replay attacks. Our investigation with BreathPrint showed that it can be used for efficient real-time authentication on multiple standalone smart devices, especially using deep learning models.
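As a rough illustration of how such an acoustic modality could be processed, the sketch below converts breathing clips into spectrogram features and trains a simple classifier; the sampling rate, feature choice, synthetic data, and classifier are assumptions for illustration and not the deep learning models evaluated in the BreathPrint work.

```python
import numpy as np
from scipy.signal import spectrogram
from sklearn.linear_model import LogisticRegression

def breath_features(audio, sample_rate=16000):
    """Summarise a mono breathing clip as the mean log-power per frequency bin."""
    _, _, sxx = spectrogram(audio, fs=sample_rate, nperseg=512)
    return np.log(sxx + 1e-10).mean(axis=1)

# Placeholder data: synthetic one-second "clips" per user; real systems use recordings.
rng = np.random.default_rng(0)
clips = [rng.normal(scale=s, size=16000) for s in (0.5, 0.5, 1.0, 1.0)]
users = ["alice", "alice", "bob", "bob"]

features = np.array([breath_features(c) for c in clips])
classifier = LogisticRegression(max_iter=1000).fit(features, users)
print(classifier.predict([breath_features(clips[0])]))
```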
Personality Identification from Social Media Using Deep Learning: A Review
Social media helps in the sharing of ideas and information among people scattered around the world and thus helps in creating communities, groups, and virtual networks. Identification of personality is significant in many types of applications, such as detecting the mental state or character of a person, predicting job satisfaction and professional and personal relationship success, and in recommendation systems. Personality is also an important factor in determining individual variation in thoughts, feelings, and conduct. According to the Global social media research survey of 2018, there are approximately 3.196 billion social media users worldwide, and the numbers are estimated to grow rapidly with the use of mobile smart devices and advances in technology. Support vector machines (SVM), Naive Bayes (NB), multilayer perceptron neural networks, and convolutional neural networks (CNN) are some of the machine learning techniques used for personality identification in the reviewed literature. This paper presents various studies conducted on identifying the personality of social media users with the help of machine learning approaches and reviews recent studies that aim to predict the personality of online social media (OSM) users.
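As a minimal illustration of the kind of machine learning approach surveyed, the sketch below trains a linear SVM on TF-IDF features of user posts to predict a personality label; the example texts and labels are placeholders rather than data from any reviewed study.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Placeholder training data: posts paired with a personality trait label
# (e.g., one pole of a Big Five trait); real studies use annotated corpora.
posts = ["I love meeting new people at events!",
         "I prefer quiet evenings alone with a book."]
labels = ["extraversion-high", "extraversion-low"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(posts, labels)
print(model.predict(["Had a great time at the party last night"]))
```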
High-Performance Modelling and Simulation for Big Data Applications
This open access book was prepared as a Final Publication of the COST Action IC1406 "High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)" project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. As their level of abstraction rises to allow a better discernment of the domain at hand, their representation becomes increasingly demanding for computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. It is then arguably required to have a seamless interaction of High Performance Computing with Modelling and Simulation in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for its members and distinguished guests to openly discuss novel perspectives and topics of interest for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications.