15 research outputs found
URI Undergraduate and Graduate Course Catalog 2021-2022
This is a downloadable PDF version of the University of Rhode Island course catalog.
Bioinformatics
This book is divided into different research areas relevant to Bioinformatics, such as biological networks, next generation sequencing, high performance computing, molecular modeling, structural bioinformatics and intelligent data analysis. Each book section introduces the basic concepts and then explains their application to problems of great relevance, so both novice and expert readers can benefit from the information and research works presented here.
URI Undergraduate and Graduate Course Catalog 2020-2021
This is a downloadable PDF version of the University of Rhode Island course catalog.
Calcul approximatif à haute efficacité énergétique pour des applications de l'internet des objets (Energy-efficient approximate computing for Internet of Things applications)
Reduced-width units are one of the methods proposed for power reduction. However, such units have mostly been evaluated separately, i.e. not within complete applications. In this thesis, we extend the RISC-V processor with reduced-width computation and memory units, in which only a number of most significant bits (MSBs), configurable at runtime, is active. The trade-offs between energy reduction and output quality for applications executed on the extended RISC-V are studied. The results indicate that the energy can be reduced by up to 14% for an error ≤ 0.1%. Moreover, we propose a generic energy model that includes both software parameters and hardware architecture parameters. It gives software and hardware designers early insight into the effects of optimizations on software and/or hardware units.
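As a rough software illustration of the reduced-width idea described in this abstract, the sketch below keeps only a configurable number of most significant bits of an unsigned word and measures the resulting relative error. The function names, the 32-bit default width, and the use of plain truncation are assumptions made for illustration; the thesis's hardware units may behave differently.

```python
def keep_msbs(value: int, active_bits: int, width: int = 32) -> int:
    """Zero all but the `active_bits` most significant bits of an unsigned
    `width`-bit word (plain truncation; an assumed behaviour, not the thesis's)."""
    if not 0 <= active_bits <= width:
        raise ValueError("active_bits must be between 0 and width")
    # Mask with `active_bits` ones at the top of the word, zeros below.
    mask = ((1 << active_bits) - 1) << (width - active_bits)
    return value & mask


def relative_error(exact: int, approx: int) -> float:
    """Relative error of the approximate value with respect to the exact one."""
    return abs(exact - approx) / abs(exact) if exact else 0.0


if __name__ == "__main__":
    x = 0x12345678
    approx = keep_msbs(x, 16)  # only the upper 16 bits stay active
    print(hex(approx), relative_error(x, approx))
```

Lowering `active_bits` trades output quality for fewer active bits, which is the kind of trade-off the abstract reports on.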
URI Undergraduate and Graduate Course Catalog 2019-2020
This is a downloadable PDF version of the University of Rhode Island course catalog.
URI Undergraduate and Graduate Course Catalog 2018-2019
This is a downloadable PDF version of the University of Rhode Island course catalog.
Residue Number System Based Building Blocks for Applications in Digital Signal Processing
This doctoral thesis deals with designing residue number system based building blocks to enhance the performance of digital signal processing applications. The residue number system (RNS) is a non-weighted number system that provides carry-free, parallel, high-speed, secure and fault-tolerant arithmetic operations. These features make it very attractive for use in high-performance and fault-tolerant digital signal processing (DSP) applications. A typical RNS system consists of three main components. The first is the binary-to-residue converter, which computes the RNS equivalent of the inputs represented in the binary number system. The second is a set of parallel residue arithmetic units that perform arithmetic operations on the operands already represented in RNS. The last is the residue-to-binary converter, which converts the outputs back into their binary representation. The main aim of this thesis was to propose novel structures for the basic components of this system, to be used later as fundamental units in DSP applications. The thesis covers improving and designing novel structures for these components, and simulating and verifying their efficiency via FPGA implementation. In addition to suggesting novel structures for basic RNS components, it presents a detailed study of different moduli sets that compares them and determines the most efficient one for different dynamic range requirements. One of the main outcomes of this thesis is deriving and verifying the main condition that should be met when choosing a moduli set in order to improve the timing performance of a DSP application. An RNS-based image processing application is also proposed. Its efficiency, in terms of timing performance and power consumption, is demonstrated by comparing it with a binary-based one. Finally, the main considerations that should be taken into account when choosing between the binary number system and RNS are discussed in detail.
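The three-stage RNS pipeline described in this abstract (binary-to-residue conversion, parallel residue arithmetic, residue-to-binary conversion) can be sketched in software as follows. This is a minimal illustration only: the moduli set (7, 8, 9), the function names, and the use of the Chinese Remainder Theorem for the reverse conversion are assumptions chosen for clarity, not the structures proposed in the thesis.

```python
from math import prod

# Illustrative pairwise-coprime moduli of the form 2^n - 1, 2^n, 2^n + 1 with n = 3.
# The dynamic range is 7 * 8 * 9 = 504, so results must stay in [0, 504).
MODULI = (7, 8, 9)


def to_rns(x, moduli=MODULI):
    """Binary-to-residue conversion: one residue per modulus."""
    return tuple(x % m for m in moduli)


def rns_add(a, b, moduli=MODULI):
    """Channel-wise addition; no carries propagate between channels."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))


def rns_mul(a, b, moduli=MODULI):
    """Channel-wise multiplication; each channel is independent."""
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))


def from_rns(residues, moduli=MODULI):
    """Residue-to-binary conversion using the Chinese Remainder Theorem."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # pow(Mi, -1, m) is the inverse of Mi mod m
    return total % M


if __name__ == "__main__":
    print(to_rns(100))                                 # residues of 100
    print(from_rns(rns_add(to_rns(100), to_rns(23))))  # 100 + 23
    print(from_rns(rns_mul(to_rns(20), to_rns(25))))   # 20 * 25
```

Each residue channel works on small operands independently of the others, which is what makes the arithmetic carry-free and parallel; the conversions at either end are the overhead that the choice of moduli set tries to minimise.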
High performance reconfigurable architectures for biological sequence alignment
Bioinformatics and computational biology (BCB) is a rapidly developing
multidisciplinary field which encompasses a wide range of domains, including genomic
sequence alignment. Sequence alignment is a fundamental tool in molecular biology for finding
homology between sequences. Sequence alignments are currently gaining close attention due
to their great impact on quality of life, for example by facilitating early disease diagnosis,
identifying the characteristics of a newly discovered sequence, and drug engineering. With
the vast growth of genomic data, searching for sequence homology over huge databases
(often measured in gigabytes) cannot produce results within a realistic time, hence the
need for acceleration. Since the exponential increase of biological databases as a result of the
Human Genome Project (HGP), supercomputers and other parallel architectures such as
special-purpose Very Large Scale Integration (VLSI) chips, Graphics Processing Units (GPUs)
and Field Programmable Gate Arrays (FPGAs) have become popular acceleration platforms.
Nevertheless, there is always a trade-off between area, speed, power, cost, development time
and reusability when selecting an acceleration platform. FPGAs generally offer more
flexibility, higher performance and lower overheads. However, they suffer from a relatively
low-level programming model compared with off-the-shelf processors such as
standard microprocessors and GPUs. Due to these limitations, the need has
arisen for optimized FPGA core implementations, which are crucial for this technology to
become viable in high performance computing (HPC).
This research proposes the use of state-of-the-art reprogrammable system-on-chip
technology on FPGAs to accelerate three widely used sequence alignment algorithms: the
Smith-Waterman with affine gap penalty algorithm, the profile hidden Markov model
(HMM) algorithm and the Basic Local Alignment Search Tool (BLAST) algorithm. This
research has three novel aspects. Firstly, the algorithms are designed and
implemented in hardware, with each core achieving the highest performance compared to the
state of the art. Secondly, an efficient scheduling strategy based on the double buffering
technique is adopted into the hardware architectures. Here, when the alignment matrix
computation task is overlapped with the processing element (PE) configuration in a folded
systolic array, the overall throughput of the core is significantly increased. This is due to
the bounded PE configuration time and the parallel PE configuration approach, irrespective
of the number of PEs in a systolic array. In addition, the use of only two configuration
elements in the PE optimizes hardware resources and enables the scalability of PE systolic
arrays without relying on restricted onboard memory resources. Finally, a new performance
metric is devised, which facilitates the effective comparison of design performance between
different FPGA devices and families. The normalized performance indicator (speed-up per area
per process technology) factors out the advantages of the area and lithography technology of
any FPGA, resulting in fairer comparisons.
The cores have been designed using Verilog HDL and prototyped on the Alpha Data
ADM-XRC-5LX card with the Virtex-5 XC5VLX110-3FF1153 FPGA. The implementation
results show that the proposed architectures achieved giga cell updates per second (GCUPS)
performances of 26.8, 29.5 and 24.2, respectively, for the acceleration of the Smith-Waterman
with affine gap penalty algorithm, the profile HMM algorithm and the BLAST algorithm. In
terms of speed-up, the performance of the designed cores was compared against
their corresponding software and previously reported FPGA implementations. In the
case of comparison with equivalent software execution, acceleration of the optimal
alignment algorithm in hardware yielded an average speed-up of 269x as compared to the
SSEARCH 35 software. For the profile HMM-based sequence alignment, the designed core
achieved speed-ups of 103x and 8.3x against HMMER 2.0 and the latest version of
HMMER (version 3.0), respectively. The implementation of the gapped
BLAST with the two-hit method in hardware achieved a greater than tenfold speed-up
compared to the latest NCBI BLAST software. In terms of comparison against other reported
FPGA implementations, the proposed normalized performance indicator was used to
evaluate the designed architectures fairly. The results showed that the first architecture
achieved more than 50 percent improvement, while acceleration of the profile HMM
sequence alignment in hardware gained a normalized speed-up of 1.34. In the case of the
gapped BLAST with the two-hit method, the designed core achieved 11x speed-up after
factoring out the advantages of the Virtex-5 FPGA. In addition, further analysis was conducted in
terms of cost and power performance; it was noted that the core achieved 0.46 MCUPS per
dollar spent and 958.1 MCUPS per watt. This shows that FPGAs can be an attractive
platform for high performance computation, with the advantages of a smaller area footprint as well
as representing an economical "green" solution compared to the other acceleration platforms. Higher
throughput can be achieved by redeploying the cores on newer, bigger and faster FPGAs
with minimal design effort.
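For readers unfamiliar with the first algorithm named in this abstract, the following is a minimal software sketch of Smith-Waterman local alignment scoring with an affine gap penalty (the Gotoh recurrence), plus a helper for the cell-updates-per-second metric. It is not the thesis's systolic-array hardware design; the scoring parameters, function names, and the convention that a gap of length k costs gap_open + (k - 1) * gap_extend are illustrative assumptions.

```python
def smith_waterman_affine(a, b, match=2, mismatch=-1, gap_open=3, gap_extend=1):
    """Best local alignment score of strings a and b with affine gaps.

    Assumed convention: a gap of length k costs gap_open + (k - 1) * gap_extend.
    """
    neg_inf = float("-inf")
    n = len(b)
    h_prev = [0] * (n + 1)        # H: best score of an alignment ending at (i-1, j)
    f_prev = [neg_inf] * (n + 1)  # F: best score ending with a gap in b (vertical move)
    best = 0
    for i in range(1, len(a) + 1):
        h_cur = [0] * (n + 1)
        f_cur = [neg_inf] * (n + 1)
        e = neg_inf               # E: best score ending with a gap in a (horizontal move)
        for j in range(1, n + 1):
            e = max(h_cur[j - 1] - gap_open, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            h_cur[j] = max(0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


def gcups(query_len, db_len, seconds):
    """Giga cell updates per second: alignment matrix cells computed per second / 1e9."""
    return query_len * db_len / seconds / 1e9


if __name__ == "__main__":
    print(smith_waterman_affine("ACGTACGT", "ACGTTACGT"))
    print(gcups(1000, 1_000_000, 1.0))
```

Each matrix cell depends only on its left, upper and upper-left neighbours, so cells along an anti-diagonal can be computed in parallel; this is the property that hardware designs such as systolic arrays exploit.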
Virginia Commonwealth University Undergraduate Bulletin
Undergraduate bulletin for Virginia Commonwealth University for the academic year 2003-2004. It includes information on academic regulations, degree requirements, course offerings, faculty, academic calendar, and tuition and expenses for undergraduate programs.