
    Efficient implementation of elliptic curve cryptography.

    Elliptic Curve Cryptosystems (ECC) were introduced in 1985 by Neal Koblitz and Victor Miller. Their small key sizes make elliptic curves attractive for public-key cryptosystem implementations. This thesis presents solutions for the efficient implementation of ECC at both the algorithmic level and the computation level. At the algorithmic level, a fast parallel elliptic curve scalar multiplication algorithm based on a dual-processor hardware system is developed. The method has an average computation time of n/3 Elliptic Curve Point Additions on an n-bit scalar, an improvement of n Elliptic Curve Point Doublings over conventional methods. When a proper coordinate system and binary representation for the scalar k are used, the average execution time is as low as n Elliptic Curve Point Doublings, which makes this method about two times faster than conventional single-processor multipliers using the same coordinate system. At the computation level, a high-performance elliptic curve processor (ECP) architecture is presented. The processor exploits parallelism in finite-field calculation to achieve high-speed execution of the scalar multiplication algorithm. The architecture relies on compile-time rather than run-time detection of parallelism, which results in less hardware. Implemented on an FPGA, the proposed processor operates at 66 MHz in GF(2^167) and performs a scalar multiplication in 100 µs, which is considerably faster than recent implementations.
    Dept. of Electrical and Computer Engineering. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2004 .A57. Source: Masters Abstracts International, Volume: 44-03, page: 1446. Thesis (M.A.Sc.)--University of Windsor (Canada), 2005
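
    For context, the conventional single-processor baseline that the parallel method is measured against is the double-and-add loop, which costs about n point doublings plus n/2 point additions on an n-bit scalar. The sketch below illustrates that baseline over a toy prime-field curve; the curve parameters are hypothetical, and the thesis itself targets a binary field, GF(2^167), not a prime field.

        # Hypothetical illustration: conventional left-to-right double-and-add,
        # the single-processor baseline the dual-processor method improves on.
        # Toy short Weierstrass curve y^2 = x^3 + ax + b over GF(p); the modulus
        # and coefficients below are assumptions for illustration only.
        P_MOD = 97
        A, B = 2, 3

        def inv(x):
            return pow(x, P_MOD - 2, P_MOD)      # modular inverse via Fermat

        def ec_add(P, Q):
            """Affine point addition; None represents the point at infinity."""
            if P is None: return Q
            if Q is None: return P
            (x1, y1), (x2, y2) = P, Q
            if x1 == x2 and (y1 + y2) % P_MOD == 0:
                return None                       # P + (-P) = infinity
            if P == Q:                            # point doubling
                lam = (3 * x1 * x1 + A) * inv(2 * y1) % P_MOD
            else:
                lam = (y2 - y1) * inv(x2 - x1) % P_MOD
            x3 = (lam * lam - x1 - x2) % P_MOD
            return (x3, (lam * (x1 - x3) - y1) % P_MOD)

        def scalar_mul(k, P):
            """Left-to-right double-and-add: about n doublings plus n/2
            additions on average for an n-bit scalar k, the serial cost
            that the thesis' dual-processor scheme reduces."""
            R = None
            for bit in bin(k)[2:]:
                R = ec_add(R, R)                  # one doubling per scalar bit
                if bit == '1':
                    R = ec_add(R, P)              # one addition per set bit
            return R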

    Embedded Zerotree Codec

    This thesis discusses the findings of a final-year project involving the VHDL (VHSIC Hardware Description Language, where VHSIC stands for Very High Speed Integrated Circuit) design and simulation of an EZT (Embedded Zerotree) codec. The basis of image compression and the various image compression techniques available today have been explored, providing a clear understanding of image compression as a whole. An in-depth understanding of wavelet transform theory was vital to appreciating the edge this transform provides over other transforms for image compression; both its mathematics and its implementation using sets of high-pass and low-pass filters have been studied and presented. At the heart of the EZT codec is the EZW (Embedded Zerotree Wavelet) algorithm, as this is the algorithm implemented in the codec; this required a thorough study of the algorithm and the various terms used in it. A generic single-processor codec capable of handling any size of zerotree coefficients of images was designed. Once the coding and decoding strategy of this single processor had been established, it was easily extended to a codec with three parallel processors. This parallel architecture uses the same coding and decoding methods as the single processor, except that each processor now handles only a third of the coefficients, promising a much faster codec than the first one. Both designs were then translated into VHDL behavioral-level code, which was simulated and the results verified. Once the simulations were completed, the next aim of the project, synthesizing the design, was embarked upon. Of the two logical parts of the encoder, only the significance map generator has been synthesized.
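
    The dominant pass of the EZW algorithm, which produces the significance map mentioned above, can be sketched in software as follows. This is a simplified illustration only, assuming a raster scan and a basic quadtree parent-child mapping; the thesis' actual design is in VHDL and real EZW scans subbands in a specific order.

        # Minimal EZW dominant-pass sketch: classify each coefficient against
        # threshold T as significant positive 'P', significant negative 'N',
        # zerotree root 'T' or isolated zero 'Z'.

        def children(i, j, n):
            """Quadtree children of (i, j) in an n x n transform (simplified:
            the three high-pass roots hang off the single LL coefficient)."""
            if i == 0 and j == 0:
                return [(0, 1), (1, 0), (1, 1)]
            kids = [(2*i, 2*j), (2*i, 2*j+1), (2*i+1, 2*j), (2*i+1, 2*j+1)]
            return [(a, b) for a, b in kids if a < n and b < n]

        def all_insignificant(c, i, j, T):
            """True if (i, j) and all of its descendants are below T in magnitude."""
            return abs(c[i][j]) < T and all(
                all_insignificant(c, a, b, T) for a, b in children(i, j, len(c)))

        def dominant_pass(c, T):
            """Emit one significance symbol per coefficient not inside a zerotree."""
            n, symbols, skipped = len(c), [], set()
            for i in range(n):
                for j in range(n):
                    if (i, j) in skipped:
                        continue
                    if c[i][j] >= T:
                        symbols.append('P')
                    elif c[i][j] <= -T:
                        symbols.append('N')
                    elif all_insignificant(c, i, j, T):
                        symbols.append('T')       # zerotree root: prune descendants
                        stack = children(i, j, n)
                        while stack:
                            a, b = stack.pop()
                            skipped.add((a, b))
                            stack.extend(children(a, b, n))
                    else:
                        symbols.append('Z')       # isolated zero
            return symbols

        # e.g. dominant_pass([[34, -31], [23, 14]], 32) -> ['P', 'T', 'T', 'T']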

    Fast online predictive compression of radio astronomy data

    This report investigates the fast, lossless compression of 32-bit single-precision floating-point values. High-speed compression is critical in the context of the MeerKAT radio telescope under construction in South Africa, which will produce data at rates up to 1 Petabyte every 20 seconds. The compression technique investigated is based on predictive compression, which has proven successful at achieving high-speed compression in previous research. Several predictive techniques (including polynomial extrapolation) are discussed, along with CPU- and GPU-based parallelization approaches. The implementation achieves throughput rates in excess of 6 GiB/s for compression, and much higher rates for decompression, on a 64-core AMD Opteron machine, with file-size reductions of 9% on average. Furthermore, the results of concurrent investigations into block-based parallel Huffman encoding and Zero-Length Encoding are compared to the predictive scheme: the predictive scheme obtains approximately 4%-5% better compression ratios than the Zero-Length Encoder and is 25 times faster than Huffman encoding on an Intel Xeon E5 processor. The scheme may be well suited to address the large network bandwidth requirements of the MeerKAT project.
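
    The core predictive idea can be sketched as follows; this is a minimal single-threaded illustration, and the report's actual predictors, parallelization scheme and residual coder are assumptions beyond this sketch. Each 32-bit value is predicted by linear extrapolation from the two previous samples; prediction and sample are XORed, so accurate predictions leave residuals with long runs of leading zeros that a back-end encoder can store compactly.

        import struct

        def f2u(x):
            """Bit pattern of x as a 32-bit unsigned int (rounds x to float32)."""
            return struct.unpack('<I', struct.pack('<f', x))[0]

        def u2f(u):
            return struct.unpack('<f', struct.pack('<I', u))[0]

        def encode(samples):
            """XOR residuals under a two-point linear-extrapolation predictor.
            Lossless for float32-valued inputs."""
            res, prev2, prev1 = [], 0.0, 0.0
            for x in samples:
                pred = 2.0 * prev1 - prev2    # degree-1 polynomial extrapolation
                res.append(f2u(x) ^ f2u(pred))
                # keep state in float32 precision so the decoder's predictions
                # match the encoder's bit for bit
                prev2, prev1 = prev1, u2f(f2u(x))
            return res

        def decode(res):
            """Exact inverse: regenerate each prediction and XOR it back in."""
            out, prev2, prev1 = [], 0.0, 0.0
            for r in res:
                pred = 2.0 * prev1 - prev2
                x = u2f(f2u(pred) ^ r)
                out.append(x)
                prev2, prev1 = prev1, x
            return out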

    Development of Lifting-based VLSI Architectures for Two-Dimensional Discrete Wavelet Transform

    Two-dimensional discrete wavelet transform (2-D DWT) has evolved into an essential part of a modern compression system. It offers superior compression with good image quality and overcomes the disadvantage of the discrete cosine transform, which suffers from block artifacts that reduce the quality of the image. The amount of computation involved in 2-D DWT is enormous and cannot be handled by general-purpose processors when real-time processing is required. Therefore, high-speed, low-power VLSI architectures that compute 2-D DWT effectively are needed. In this research, several VLSI architectures have been developed that meet real-time requirements for 2-D DWT applications. The research initially started with a software simulation program that decorrelates the original image and reconstructs it from the decorrelated image. Then, based on the information gained from this simulation program, a new approach for designing lifting-based VLSI architectures for the 2-D forward DWT is introduced. As a result, two high-performance VLSI architectures that perform 2-D DWT for the 5/3 and 9/7 filters are developed, based on overlapped and non-overlapped scan methods. An intermediate architecture is then developed, which aims at reducing the power consumption of the overlapped areas without using the expensive line buffer. To best meet real-time 2-D DWT applications with demanding speed and throughput requirements, parallelism is explored: the single-pipelined intermediate and overlapped architectures are extended to 2-, 3-, and 4-parallel architectures to achieve speedup factors of 2, 3, and 4, respectively. To further demonstrate the effectiveness of the approach, single and parallel VLSI architectures for the 2-D inverse discrete wavelet transform (2-D IDWT) are developed. Furthermore, 2-D DWT memory architectures, which have been overlooked in the literature, are also developed. Finally, to show that the architectural models developed for 2-D DWT are simple to control, the control algorithms for the 4-parallel architecture based on the first scan method are developed. To validate the architectures developed in this work, five architectures are implemented and simulated on an Altera FPGA. In compliance with the terms of the Copyright Act 1987 and the IP Policy of the university, the copyright of this thesis has been reassigned by the author to the legal entity of the university, Institute of Technology PETRONAS Sdn Bhd. Due acknowledgement shall always be made of the use of any material contained in, or derived from, this thesis.
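
    For reference, the lifting steps of the reversible 5/3 filter can be sketched in software as a single 1-D pass; the 2-D DWT applies this along rows and then columns. The sketch illustrates the arithmetic only (even-length integer input and symmetric extension assumed), not the thesis' pipelined VLSI architectures.

        def dwt53(x):
            """One level of 5/3 lifting: returns (approximation s, detail d)."""
            n = len(x)
            xv = lambda i: x[i] if i < n else x[2*(n-1) - i]   # mirror at the edge
            # predict: each odd sample minus the mean of its even neighbours
            d = [xv(2*i+1) - (xv(2*i) + xv(2*i+2)) // 2 for i in range(n // 2)]
            # update: smooth each even sample with the neighbouring details
            # (under symmetric extension d[-1] equals d[0], hence max(i-1, 0))
            s = [x[2*i] + (d[max(i-1, 0)] + d[i] + 2) // 4 for i in range(n // 2)]
            return s, d

        def idwt53(s, d):
            """Exact inverse: undo the update, then the predict, and interleave."""
            h = len(s)
            ev = [s[i] - (d[max(i-1, 0)] + d[i] + 2) // 4 for i in range(h)]
            evv = lambda i: ev[i] if i < h else ev[2*h - 1 - i]
            x = []
            for i in range(h):
                x.append(ev[i])
                x.append(d[i] + (evv(i) + evv(i+1)) // 2)
            return x

        # integer-to-integer, hence reversible:
        # idwt53(*dwt53([1, 2, 3, 4, 3, 2])) == [1, 2, 3, 4, 3, 2]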

    The MANGO clockless network-on-chip: Concepts and implementation


    3rd EGEE User Forum

    We have organized this book as a sequence of chapters, each associated with an application or technical theme and introduced by an overview of its contents and a summary of the main conclusions coming from the Forum on that topic. The first chapter gathers all the plenary-session keynote addresses; following this there is a sequence of chapters covering the application-flavoured sessions. These are followed by chapters with the flavour of Computer Science and Grid Technology. The final chapter covers the large number of practical demonstrations and posters exhibited at the Forum. Much of the work presented has a direct link to specific areas of Science, and so we have created a Science Index, presented below. In addition, at the end of this book, we provide a complete list of the institutes and countries involved in the User Forum.

    Towards all-optical label switching nodes with multicast

    Fiber optics has developed so rapidly during the last decades that it has become the backbone of our communication systems. Evolved from initially static single-channel point-to-point links, the current advanced optical backbone network consists mostly of wavelength-division multiplexed (WDM) networks with optical add/drop multiplexing nodes and optical cross-connects that can switch data in the optical domain. However, commercially implemented optical network nodes still perform optical circuit switching using wavelength routing. The dedicated use of wavelengths and infrequent reconfiguration result in relatively poor bandwidth utilization. The success of electronic packet switching has inspired researchers to improve the flexibility, efficiency, granularity and network utilization of optical networks by introducing optical packet switching, using short, local optical labels for forwarding decisions at intermediate optical core network nodes, a technique referred to as optical label switching (OLS).

    Various research demonstrations of OLS systems have been reported, with transparent optical packet payload forwarding based on electronic packet label processing, taking advantage of the mature technologies of electronic logic circuitry. This approach requires optic-electronic-optic (OEO) conversion of the optical labels, a costly and power-consuming procedure, particularly for high-speed labels. As optical packet payload bit rates increase from gigabits per second (Gb/s) to terabits per second (Tb/s) or higher, the increased speed of the optical labels will eventually face the electronic bottleneck, so that OEO conversion and electronic label processing will no longer be efficient. OLS with label processing in the optical domain, namely all-optical label switching (AOLS), will become necessary. Different AOLS techniques have been proposed in the last five years.

    In this thesis, AOLS node architectures based on optical time-serial label processing are presented for WDM optical packets. The unicast node architecture, where each optical packet is sent to only one output port of the node, has been investigated and partially demonstrated in the EU IST-LASAGNE project. This thesis contributes to the multicast aspects of AOLS nodes, where optical packets can be forwarded to multiple or all output ports of a node. Multicast-capable AOLS nodes are becoming increasingly interesting due to the exponential growth of emerging multicast Internet and modern data services, such as video streaming, high-definition TV and multi-party online games, and enterprise applications such as video conferencing and optical storage area networks. Current electronic routers implement multicast in the Internet Protocol (IP) layer, which requires not only OEO conversion of the optical packets, but also exhaustive routing table lookup of the globally unique IP addresses. Despite that, there have been no extensive studies of AOLS multicast nodes, technologies and traffic performance, apart from a few proof-of-principle experimental demonstrations. In this thesis, three aspects of multicast-capable AOLS nodes are addressed:

    1. Logical design of the AOLS multicast node architectures, as well as functional subsystems and interconnections, based on state-of-the-art literature research of the field and the subject.

    2. Computer simulations of the traffic performance of different AOLS unicast and multicast node architectures, using a custom-developed AOLS simulator, AOLSim.

    3. Experimental demonstrations in the laboratory and computer simulations using the commercially available simulator VPItransmissionMaker™, to evaluate the physical-layer performance of the required all-optical multicast technologies.

    A few selected multi-wavelength conversion (MWC) techniques are particularly looked into. MWC is an essential subsystem of the AOLS node for realizing optical packet multicast by making multiple copies of the optical packet all-optically onto different wavelength channels. In this thesis, the MWC techniques based on cross-phase modulation and four-wave mixing are extensively investigated. The former technique offers more wavelength flexibility and good conversion efficiency, but it is only applicable to intensity-modulated signals. The latter technique, on the other hand, offers strict transparency in data rate and modulation format, but its working wavelengths are limited by the device or component used, and its conversion efficiency is considerably lower.

    The proposals and results presented in this thesis show the feasibility of all-optical packet switching and multicasting at line speed without any OEO conversion or electronic processing. The scalability and the costly optical components of AOLS nodes have so far been two of the major obstacles to commercialization of the AOLS concept. This thesis also introduces a novel, scalable optical labeling concept and a label processing scheme for AOLS multicast nodes. The proposed scheme makes use of the spatial position of each label bit instead of the total absolute value of all the label bits; thus, for an n-bit label, the complexity of the label processor is determined by n instead of 2^n.
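
    The scaling argument in the last paragraph can be illustrated with a purely hypothetical software analogy (the thesis realizes this in all-optical hardware, not code): value-based matching needs one table entry, i.e. one correlator, per possible label value, while position-based processing needs one processing element per label bit.

        def route_by_value(label_bits, table):
            """Value-based: one entry per possible label value, i.e. up to
            2**n entries (correlators) for an n-bit label."""
            return table[int(''.join(map(str, label_bits)), 2)]

        def route_by_position(label_bits, bit_rules):
            """Position-based: one rule per label bit, i.e. n elements; here
            each set bit simply enables one output port, so a label can
            address multiple ports at once (multicast)."""
            return {port for bit, port in zip(label_bits, bit_rules) if bit}

        # e.g. a 3-bit label where bit i drives output port i (a hypothetical
        # mapping): label 101 forwards the packet to ports 0 and 2.
        ports = route_by_position([1, 0, 1], bit_rules=[0, 1, 2])   # -> {0, 2}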

    Performance prediction of image processing algorithms on different hardware architectures

    In computer vision, the choice of a computing architecture is becoming increasingly difficult for image processing experts. The number of architectures capable of running image processing algorithms is growing, and so is the number of computer vision applications constrained by computing capacity, power consumption and size. Selecting a hardware architecture, such as a CPU, GPU or FPGA, is therefore an important issue for computer vision applications. The main goal of this study is to predict system performance at the beginning of a computer vision project: for a manufacturer, or even a researcher, the computing architecture should be selected as early as possible to minimize the impact on development costs. A large variety of methods and tools has been developed to predict the performance of computing systems, but they rarely target a specific domain, and they cannot predict performance without analyzing the code or running benchmarks on the architectures under study. In this work, we focus specifically on predicting the performance of computer vision algorithms without any benchmarking, which is made possible by splitting image processing algorithms into primitive blocks. In this context, a new paradigm based on decomposing every image processing algorithm into such primitive blocks has been developed, together with a method to model the primitive blocks according to software and hardware parameters. The decomposition into primitive blocks and their modeling were shown to be feasible: experiments on different architectures, with real data, using algorithms such as convolution and wavelets, validated the proposed paradigm. This approach is a first step towards a tool that helps choose a hardware architecture for image processing and optimize programs in this domain.
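
    The primitive-block idea can be sketched as follows. All parameter names and numbers here are hypothetical, not the thesis' calibrated models: each block estimates its operation count and memory traffic from software parameters, and a hardware profile converts these into a roofline-style time estimate.

        # Hypothetical hardware profiles: sustained ops/s and memory bandwidth.
        PROFILES = {
            "cpu": {"flops": 50e9,  "mem_bw": 20e9},
            "gpu": {"flops": 900e9, "mem_bw": 300e9},
        }

        def conv2d_block(width, height, k):
            """Primitive-block model for a k x k convolution: operation count
            and bytes moved (4-byte pixels, one read and one write)."""
            ops = 2.0 * width * height * k * k      # multiply-accumulate count
            bytes_moved = 8.0 * width * height
            return ops, bytes_moved

        def predict_time(blocks, profile):
            """Roofline-style estimate: each block is bound by either compute
            or memory; a full algorithm is the sum of its primitive blocks."""
            hw = PROFILES[profile]
            return sum(max(ops / hw["flops"], mem / hw["mem_bw"])
                       for ops, mem in blocks)

        # e.g. a 1080p pipeline of a 3x3 then a 5x5 convolution on the GPU profile:
        t = predict_time([conv2d_block(1920, 1080, 3),
                          conv2d_block(1920, 1080, 5)], "gpu")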

    High performance disk array architectures.

    Yeung Kai-hau, Alan. Thesis (Ph.D.)--Chinese University of Hong Kong, 1995. Includes bibliographical references.
    Contents: Acknowledgments; Abstract
    Chapter 1: Introduction (1.1 The Information Age; 1.2 The Importance of Input/Output; 1.3 Redundant Arrays of Inexpensive Disks; 1.4 Outline of the Thesis)
    Chapter 2: Selective Broadcast Data Distribution Systems (2.1 Introduction; 2.2 The Distributed Architecture; 2.3 Mean Block Acquisition Delay for Uniform Request Distribution; 2.4 Mean Block Acquisition Delay for General Request Distributions; 2.5 Optimal Choice of Block Sizes; 2.6 Chapter Summary)
    Chapter 3: Dynamic Multiple Parity Disk Arrays (3.1 Introduction; 3.2 DMP Disk Array; 3.3 Average Delay; 3.4 Maximum Throughput; 3.5 Simulation with Precise Disk Model; 3.6 Chapter Summary; Appendix)
    Chapter 4: Dynamic Parity Logging Disk Arrays (4.1 Introduction; 4.2 DPL Disk Array Architecture; 4.3 DPL Disk Array Operation; 4.4 Performance of DPL Disk Array; 4.5 Chapter Summary; Appendix)
    Chapter 5: Performance Analysis of Mirrored Disk Array (5.1 Introduction; 5.2 Queueing Model; 5.3 Delay Analysis; 5.4 Numerical Examples and Simulation Results)
    Chapter 6: State Reduction in the Exact Analysis of Fork/Join Queues (6.1 Introduction; 6.2 State Reduction for Closed Fork/Join Queueing Systems; 6.3 Extension to Open Fork/Join Queueing Systems; 6.4 Chapter Summary)
    Chapter 7: Conclusion and Future Research (7.1 Summary; 7.2 Future Researches)

    Robust computational intelligence techniques for visual information processing

    This Ph.D. thesis is about image processing by computational intelligence techniques. Firstly, a general overview of the work is given, describing the motivation, the hypothesis, the objectives, and the methodology employed; the use and analysis of different mathematical norms is our goal. After that, the state of the art focused on the applications of the image processing proposals is presented. In addition, the fundamentals of the image modalities, with particular attention to magnetic resonance, and the learning techniques used in this research, mainly based on neural networks, are summarized. To end, the mathematical framework on which this work is based, ℓp-norms, is defined. Three parts associated with image processing techniques follow. The first non-introductory part collects the developments concerning image segmentation. Two of them are applications for video surveillance tasks that try to model the background of a scenario using a specific camera; the other work is centered on the medical field, where the goal of segmenting diabetic wounds in a very heterogeneous dataset is addressed. The second part focuses on the optimization and implementation of new models for curve and surface fitting in two and three dimensions, respectively. The first work presents a parabola fitting algorithm based on measuring the distances of the interior and exterior points to the focus and the directrix. The second work changes to an ellipse shape and ensembles the information of multiple fitting methods. Last, the ellipsoid problem is addressed in a similar way to the parabola. The third part is exclusively dedicated to the super-resolution of Magnetic Resonance Images. In one of these works, an algorithm based on the random shifting technique is developed; besides, noise removal and resolution enhancement are studied simultaneously. To end, the cost function of deep networks is modified with different combinations of norms in order to improve their training. Finally, the general conclusions of the research are presented and discussed, as well as possible future research lines that can make use of the results obtained in this Ph.D. thesis.
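
    The geometric idea behind the parabola fitting work can be sketched as follows; the parameterization and optimizer here are assumptions, not the thesis' exact algorithm. A parabola is the locus of points equidistant from its focus and its directrix, so the difference between those two distances serves as a natural residual for fitting.

        import numpy as np
        from scipy.optimize import minimize

        def residuals(params, pts):
            """Signed residual per point: distance to the focus (fx, fy) minus
            distance to the directrix ax + by + c = 0; zero on the parabola.
            pts is an (m, 2) array of 2-D points."""
            fx, fy, a, b, c = params
            norm = np.hypot(a, b) + 1e-12          # guard against a = b = 0
            d_focus = np.hypot(pts[:, 0] - fx, pts[:, 1] - fy)
            d_directrix = np.abs(a * pts[:, 0] + b * pts[:, 1] + c) / norm
            return d_focus - d_directrix

        def fit_parabola(pts, init):
            """Least-squares fit of focus and directrix to the points, starting
            from an initial guess init = [fx, fy, a, b, c]."""
            cost = lambda p: float(np.sum(residuals(p, pts) ** 2))
            return minimize(cost, init, method="Nelder-Mead").x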