
    Random Number Generators for Parallel Computers

    Random number generators are used in many applications, from slot machines to simulations of nuclear reactors. For many computational science applications, such as Monte Carlo simulation, it is crucial that the generators have good randomness properties. This is particularly true for large-scale simulations done on high-performance parallel computers. Good random number generators are hard to find, and many widely used techniques have been shown to be inadequate. Finding high-quality, efficient algorithms for random number generation on parallel computers is even more difficult. Here we present a review of the most commonly used random number generators for parallel computers, and evaluate each generator based on theoretical knowledge and empirical tests. We conclude with recommendations for using random number generators on parallel computers.
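    For a concrete flavour of the techniques under review, the sketch below shows one classic parallelisation strategy, leapfrog partitioning of a multiplicative linear congruential generator, in which stream k of P takes every P-th element of the underlying sequence. It is a generic illustration using the well-known MINSTD parameters, not any specific generator evaluated in the review.

```python
# Leapfrog partitioning of the MINSTD multiplicative LCG
# (x_{n+1} = 16807 * x_n mod (2^31 - 1)). Illustrative sketch only.

M = 2**31 - 1   # Mersenne prime modulus
A = 16807       # MINSTD multiplier

class LeapfrogLCG:
    def __init__(self, seed, rank, nprocs):
        # Stream `rank` of `nprocs` starts at the rank-th element of the
        # global sequence and then jumps nprocs steps at a time.
        self.mult = pow(A, nprocs, M)              # a^P mod m: P-step jump
        self.state = (seed * pow(A, rank, M)) % M  # a^rank offset

    def next(self):
        self.state = (self.state * self.mult) % M
        return self.state / M                      # uniform in (0, 1)

# Four "processors" sharing one global sequence with no overlap:
streams = [LeapfrogLCG(seed=12345, rank=r, nprocs=4) for r in range(4)]
print([s.next() for s in streams])
```

    A known caveat is that the effective per-stream multiplier a^P can have much worse spectral properties than a itself, which is one reason parallel generators require the kind of theoretical and empirical evaluation described here.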

    Evaluation of alternative discrete-event simulation experimental methods

    The aim of the research was to assist non-experts to produce meaningful, non-terminating discrete-event simulation studies. The exemplar used was manufacturing applications, in particular sequential production lines. The thesis addressed the selection of methods for introducing randomness, setting the length of individual simulation runs, and determining the conditions for starting measurements. "Received wisdom" in these aspects of simulation experimentation was not accepted. The research made use of a Markov chain queuing model and statistical analysis of exhaustive computer-based experimentation using test models. A specific production-line model drawn from the motor industry was used as a point of reference. A distinctive, quality-control-like process for facilitating the controlled introduction of "representative randomness" from a pseudo-random number generator was developed, rather than relying on a generator's a priori performance in standard statistical tests of randomness. This approach proved to be effective and practical. Other results included: the distortion in measurements due to the initial conditions of a simulation run of a queue was only corrected by a lengthy run, not by discarding early results; simulation experiments on the same queue demonstrated that a single long run gave greater accuracy than multiple runs; and the choice of random number generator is less important than the choice of seed. Notably, RANDU (a "discredited" MLCG) with careful seed selection was able to outperform, in 99.8% of tests, both real random numbers and other MLCGs whose seeds were chosen randomly. Similar results were obtained for the Mersenne Twister and Descriptive Sampling. Descriptive Sampling was found to provide the best samples and was less susceptible to errors in the forecast of the required sample size. A method of determining the run length of the simulation that would ensure the run was representative of the true conditions was proposed. An interactive computer program was created to assist in the calculation of the run length of a simulation and to determine seeds so as to obtain "highly representative" samples, demonstrating the facility required in simulation software to support these selected methods.
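    RANDU's poor reputation is easy to demonstrate: it is the multiplicative congruential generator x_{n+1} = 65539 x_n mod 2^31, and every three consecutive outputs satisfy a fixed lattice relation, so all triples fall on just 15 planes in the unit cube. The sketch below reproduces the generator and that relation; the thesis's seed-selection procedure, which is its actual contribution, is not reproduced here.

```python
# RANDU, the "discredited" MLCG discussed above:
# x_{n+1} = 65539 * x_n mod 2^31, seeded with an odd integer.
# Because 65539^2 = 6*65539 - 9 (mod 2^31), consecutive triples obey
# x_{n+2} = 6*x_{n+1} - 9*x_n (mod 2^31): the famous 15-plane defect.

def randu(seed, n):
    x, out = seed, []
    for _ in range(n):
        x = (65539 * x) % 2**31
        out.append(x)
    return out

x1, x2, x3 = randu(seed=1, n=3)
assert (x3 - 6 * x2 + 9 * x1) % 2**31 == 0   # the lattice relation holds
print(x1 / 2**31, x2 / 2**31, x3 / 2**31)    # normalised to (0, 1)
```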

    A formalism for describing and simulating systems with interacting components.

    This thesis addresses the problem of descriptive complexity presented by systems involving a high number of interacting components. It investigates the evaluation measure of performability and its application to such systems. A new description and simulation language, ICE, is presented, together with its application to performability modelling. ICE (Interacting ComponEnts) is based upon an earlier description language which was first proposed for defining reliability problems. ICE is declarative in style and has a limited number of keywords. The ethos in the development of the language has been to provide an intuitive formalism with a powerful descriptive space. The full syntax of the language is presented with discussion as to its philosophy. The implementation of a discrete event simulator using an ICE interface is described, with use being made of examples to illustrate the functionality of the code and the semantics of the language. Random numbers are used to provide the required stochastic behaviour within the simulator. The behaviour of an industry-standard generator within the simulator and different methods of number allocation are shown. A new generator is proposed that is a development of a fast hardware shift register generator and is demonstrated to possess good statistical properties and operational speed. For the purpose of providing a rigorous description of the language and clarification of its semantics, a computational model is developed using the formalism of extended coloured Petri nets. This model also gives an indication of the language's descriptive power relative to that of a recognised and well-developed technique. Some recognised temporal and structural problems of system event modelling are identified, and ICE solutions are given. The growing research area of ATM communication networks is introduced and a sophisticated top-down model of an ATM switch presented. This model is simulated and interesting results are given. A generic ICE framework for performability modelling is developed and demonstrated. This is considered a positive contribution to the general field of performability research.
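    For background, the sketch below shows a textbook 32-bit Galois linear-feedback shift register, the family of fast hardware shift-register generators that the proposed generator develops; it is a generic illustration, not the new generator itself.

```python
# One step of a 32-bit Galois LFSR with feedback polynomial
# x^32 + x^22 + x^2 + x + 1, a standard maximal-length choice, giving a
# period of 2^32 - 1 for any nonzero seed. Generic textbook sketch.

def lfsr32_step(state):
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= 0x80200003   # tap mask for the polynomial above
    return state

# Each step shifts out one pseudo-random bit (the low bit of the state):
state, bits = 0xACE1, []
for _ in range(16):
    bits.append(state & 1)
    state = lfsr32_step(state)
print(bits)
```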

    VLSI architectures for public key cryptology


    The hardware implementation of an artificial neural network using stochastic pulse rate encoding principles

    In this thesis the development of a hardware artificial neuron device and artificial neural network using stochastic pulse rate encoding principles is considered. After a review of neural network architectures and algorithmic approaches suitable for hardware implementation, a critical review of hardware techniques which have been considered in analogue and digital systems is presented. New results are presented demonstrating the potential of two learning schemes which adapt by the use of a single reinforcement signal. The techniques for computation using stochastic pulse rate encoding are presented and extended with novel circuits relevant to the hardware implementation of an artificial neural network. The generation of random numbers is the key to the encoding of data into the stochastic pulse rate domain. The formation of random numbers and multiple random bit sequences from a single PRBS generator have been investigated. Two techniques, Simulated Annealing and Genetic Algorithms, have been applied successfully to the problem of optimising the configuration of a PRBS random number generator for the formation of multiple random bit sequences and hence random numbers. A complete hardware design for an artificial neuron using stochastic pulse rate encoded signals has been described, designed, simulated, fabricated and tested before configuration of the device into a network to perform simple test problems. The implementation has shown that the processing elements of the artificial neuron are small and simple, but that there can be a significant overhead for the encoding of information into the stochastic pulse rate domain. The stochastic artificial neuron has the capability of on-line weight adaptation. The implementation of reinforcement schemes using the stochastic neuron as a basic element is discussed.
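    The appeal of the pulse-rate representation is that arithmetic collapses to single gates: a value p in [0, 1] is encoded as a bit stream with Pr[bit = 1] = p, and one AND gate multiplies two independently encoded values. The sketch below illustrates this principle generically, using Python's generator in place of the PRBS hardware described in the thesis; the long streams needed for accuracy are exactly the encoding overhead noted above.

```python
# Stochastic pulse-rate arithmetic: Pr[a AND b] = Pr[a] * Pr[b] for
# independent streams, so an AND gate acts as a multiplier.
import random

def encode(p, n, rng):
    """Encode value p in [0, 1] as an n-bit Bernoulli pulse stream."""
    return [rng.random() < p for _ in range(n)]

def decode(stream):
    """Estimate the encoded value as the pulse rate (fraction of 1s)."""
    return sum(stream) / len(stream)

rng = random.Random(42)
n = 100_000                                 # accuracy scales as ~1/sqrt(n)
a = encode(0.8, n, rng)
b = encode(0.5, n, rng)
product = [x and y for x, y in zip(a, b)]   # one AND gate per bit pair
print(decode(product))                      # ~0.4 = 0.8 * 0.5
```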

    Scalability in extensible and heterogeneous storage systems

    The evolution of computer systems has brought an exponential growth in data volumes, which pushes the capabilities of current storage architectures to organize and access this information effectively: as the unending creation and demand of computer-generated data grows at an estimated rate of 40-60% per year, storage infrastructures need increasingly scalable data distribution layouts that are able to adapt to this growth with adequate performance. In order to provide the required performance and reliability, large-scale storage systems have traditionally relied on multiple RAID-5 or RAID-6 storage arrays, interconnected with high-speed networks like FibreChannel or SAS. Unfortunately, the performance of the current, most commonly used storage technology, the magnetic disk drive, cannot keep up with this explosive growth. Moreover, storage architectures based on solid-state devices (the successors of current magnetic drives) do not seem poised to replace HDD-based storage for the next 5-10 years, at least in data centers: though the performance of SSDs significantly exceeds that of hard drives, it would cost the NAND industry hundreds of billions of dollars to build enough manufacturing plants to satisfy the forecasted demand. Besides the problems derived from technological and mechanical limitations, the massive data growth poses further challenges: to build a storage infrastructure, the most flexible approach consists in using pools of storage devices that can be expanded as needed by adding new devices or replacing older ones, thus seamlessly increasing the system's performance and capacity. This approach, however, needs data layouts that can adapt to these topology changes and also exploit the potential performance offered by the hardware. Such strategies should be able to rebuild the data layout to accommodate new devices in the infrastructure, extracting the utmost performance from the hardware and offering a balanced workload distribution. An inadequate data layout might not effectively use the enlarged capacity or better performance provided by newer devices, leading to imbalance problems like bottlenecks or resource underuse. Besides, massive storage systems will inevitably be composed of a collection of heterogeneous hardware: as capacity and performance requirements grow, new storage devices must be added to cope with demand, but it is unlikely that these devices will have the same capacity or performance as those already installed. Moreover, upon failure, disks are most commonly replaced by faster and larger ones, since it is not always easy (or cheap) to find a particular model of drive. In the long run, any large-scale storage system will have to cope with a myriad of devices. The title of this dissertation, "Scalability in Extensible and Heterogeneous Storage Systems", refers to the main focus of our contributions in scalable data distributions that can adapt to increasing volumes of data. Our first contribution is the design of a scalable data layout that can adapt to hardware changes while redistributing only the minimum data needed to keep a balanced workload. With the second contribution, we perform a comparative study of the influence of pseudo-random number generators on the performance and distribution quality of randomized layouts, and prove that a badly chosen generator can degrade the quality of the strategy.
    Our third contribution is an analysis of long-term data access patterns in several real-world traces to determine whether it is possible to offer high performance and a balanced load with less than minimal data rebalancing. In our final contribution, we apply the knowledge learnt about long-term access patterns to design an extensible RAID architecture that can adapt to changes in the number of disks without migrating large amounts of data, and prove that it can be competitive with current RAID arrays with an overhead of at most 1.28% of the storage capacity.
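    As an illustration of the kind of randomized layout whose balance hinges on the quality of the underlying pseudo-random function, the sketch below uses rendezvous (highest-random-weight) hashing; it is a stand-in example under that assumption, not the dissertation's own layout or its generator comparison.

```python
# Rendezvous (highest-random-weight) hashing: each block goes to the disk
# with the highest pseudo-random score. Adding or removing a disk only
# moves blocks whose top-scoring disk changed, so redistribution stays
# near the minimum. A weak pseudo-random function would skew the balance.
import hashlib

def place(block_id, disks):
    def score(disk):
        h = hashlib.sha256(f"{disk}:{block_id}".encode()).digest()
        return int.from_bytes(h[:8], "big")
    return max(disks, key=score)

disks = ["disk0", "disk1", "disk2", "disk3"]
counts = {d: 0 for d in disks}
for block in range(100_000):
    counts[place(block, disks)] += 1
print(counts)   # near-uniform with a good hash function
```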

    Pseudorandom Bit Generation with Asymmetric Numeral Systems

    The generation of pseudorandom binary sequences is of great importance in numerous applications, stretching from simulation and gambling to cryptography. Pseudorandom bit generators (PRBGs) can be split into two classes depending on their claimed security. The first includes PRBGs that are provably secure (such as the Blum-Blum-Shub one); security of the second class rests on heuristic arguments. Unfortunately, PRBGs from the first class are inherently inefficient, and some are insecure against quantum attacks, while their siblings from the second class are very efficient but their security relies only on resistance against known cryptographic attacks. This work presents a construction of a PRBG from the asymmetric numeral system (ANS) compression algorithm. We define a family of PRBGs for 2^R ANS states and prove that it is indistinguishable from a truly random one for a sufficiently large R. To make our construction efficient, we investigate PRBGs built for smaller R = 7, 8, 9 and show how to remove local correlations from the output stream. We permute output bits using rotation and Keccak transformations and show that the permuted bits pass all NIST tests. Our PRBG design is provably secure (for a large enough R) and heuristically secure (for smaller R). Besides, we claim that our PRBG is secure against quantum adversaries.
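    For reference, a toy sketch of the Blum-Blum-Shub generator cited above as the classic provably secure construction; the parameters here are deliberately tiny and insecure, for illustration only.

```python
# Blum-Blum-Shub PRBG: x_{i+1} = x_i^2 mod n, where n = p*q with
# p ≡ q ≡ 3 (mod 4), outputting the least significant bit each step.
# The repeated modular squaring is what makes the construction slow.

def bbs_bits(p, q, seed, nbits):
    n = p * q
    x = (seed * seed) % n        # seed must be coprime to n
    bits = []
    for _ in range(nbits):
        x = (x * x) % n
        bits.append(x & 1)       # extract the least significant bit
    return bits

# Toy Blum primes (both congruent to 3 mod 4); real use needs
# cryptographic-size moduli.
print(bbs_bits(p=10007, q=10039, seed=2357, nbits=32))
```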

    Elliptic Curve Cryptography on Modern Processor Architectures

    Elliptic Curve Cryptography (ECC) has been adopted by the US National Security Agency (NSA) in Suite B as part of its "Cryptographic Modernisation Program". Additionally, it has been favoured by an entire host of mobile devices due to its superior performance characteristics. ECC is also the building block on which the exciting field of pairing/identity-based cryptography is based. This widespread use means that there is potentially a lot to be gained by researching efficient implementations on modern processors such as IBM's Cell Broadband Engine and Philips' next-generation smart card cores. ECC operations can be thought of as a pyramid of building blocks, from instructions on a core, through modular operations on a finite field, point addition and doubling, and elliptic curve scalar multiplication, up to application-level protocols. In this thesis we examine an implementation of these components for ECC, focusing on a range of optimising techniques for the Cell's SPU and the MIPS smart card. We show significant performance improvements that can be achieved through the adoption of ECC.
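    The middle layers of that pyramid can be illustrated with a short double-and-add scalar multiplication over a toy short-Weierstrass curve. The curve parameters below are hypothetical and for illustration only; the optimised Cell SPU and MIPS implementations examined in the thesis work very differently at the instruction level.

```python
# Double-and-add scalar multiplication built from point addition and
# doubling on y^2 = x^3 + ax + b (mod p). Toy curve: a=2, b=3, p=97.

P_MOD, A = 97, 2

def point_add(P, Q):
    """Add two affine points; None represents the point at infinity."""
    if P is None: return Q
    if Q is None: return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None                           # P + (-P) = infinity
    if P == Q:                                # doubling slope: (3x^2 + a)/2y
        m = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:                                     # addition slope: (y2-y1)/(x2-x1)
        m = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD
    x3 = (m * m - x1 - x2) % P_MOD
    return (x3, (m * (x1 - x3) - y1) % P_MOD)

def scalar_mul(k, P):
    """Compute k*P, consuming one bit of the scalar k per iteration."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, P)
        P = point_add(P, P)                   # double for the next bit
        k >>= 1
    return result

print(scalar_mul(20, (3, 6)))   # (3, 6) lies on the toy curve
```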