
    Development of maths capabilities and confidence in primary school


    Multilevel Monte Carlo methods for highly heterogeneous media

    We discuss the application of multilevel Monte Carlo methods to elliptic partial differential equations with random coefficients. Such problems arise, for example, in uncertainty quantification in subsurface flow modeling. We give a brief review of recent advances in the numerical analysis of the multilevel algorithm under minimal assumptions on the random coefficient, and extend the analysis to cover tensor-valued coefficients as well as point evaluations. Our analysis includes as an example log-normal random coefficients, which are frequently used in applications.
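
    The core of the method is a telescoping sum over discretisation levels: many cheap coarse samples absorb most of the variance, while a few expensive fine-level samples correct the bias. The Python sketch below is a minimal illustration on a toy problem; the random field, the functional, and the per-level sample counts are illustrative choices, not the paper's subsurface-flow setting.

        import numpy as np

        rng = np.random.default_rng(0)
        J = 8  # length of a truncated random-field expansion (toy choice)

        def coefficient(x, xi):
            # log-normal coefficient k(x) = exp(sum_j xi_j sin(j*pi*x)/j)
            j = np.arange(1, J + 1)
            return np.exp(np.sin(np.pi * np.outer(x, j)) @ (xi / j))

        def P(l, xi):
            # level-l functional: midpoint-rule average of k over 2**l cells
            m = 2 ** l
            x = (np.arange(m) + 0.5) / m
            return coefficient(x, xi).mean()

        def corrections(l, n):
            # samples of Y_0 = P_0 or Y_l = P_l - P_{l-1}, with both levels
            # evaluated on the SAME realisation xi: this coupling is what
            # makes the correction variance shrink as l grows
            ys = np.empty(n)
            for i in range(n):
                xi = rng.standard_normal(J)
                ys[i] = P(l, xi) - (P(l - 1, xi) if l > 0 else 0.0)
            return ys

        # Telescoping MLMC estimator: many cheap coarse samples, few fine ones.
        n_per_level = [4000, 2000, 1000, 500, 250]
        estimate = sum(corrections(l, n).mean() for l, n in enumerate(n_per_level))
        print(f"MLMC estimate of E[P]: {estimate:.4f}")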

    Rethinking Arithmetic for Deep Neural Networks

    We consider efficiency in the implementation of deep neural networks. Hardware accelerators are gaining interest as machine learning becomes one of the drivers of high-performance computing. In these accelerators, the directed graph describing a neural network can be implemented as a directed graph describing a Boolean circuit. We make this observation precise, leading naturally to an understanding of practical neural networks as discrete functions, and show that so-called binarised neural networks are functionally complete. In general, our results suggest that it is valuable to consider Boolean circuits as neural networks, leading to the question of which circuit topologies are promising. We argue that continuity is central to generalisation in learning, explore the interaction between data coding, network topology, and node functionality for continuity, and pose some open questions for future research. As a first step toward bridging the gap between continuous and Boolean views of neural network accelerators, we present some recent results from our work on LUTNet, a novel Field-Programmable Gate Array inference approach. Finally, we conclude with further possibly fruitful avenues for research bridging the continuous and discrete views of neural networks.
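
    The functional-completeness claim is easy to make concrete: over inputs in {-1, +1}, a binarised neuron is a Boolean function, and a suitable choice of weights realises NAND, from which any circuit follows. The Python sketch below is a toy illustration of that view; the weights and the {-1, +1} encoding are illustrative choices, not LUTNet's architecture.

        from itertools import product

        def bin_neuron(w, b):
            # a binarised neuron over inputs in {-1, +1}: sign(w . x + b)
            return lambda x: 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

        # With w = (-1, -1), b = 1 the neuron realises NAND (reading -1 as False,
        # +1 as True): the output is -1 only when both inputs are +1. NAND is
        # functionally complete, so networks of such nodes are too. Tabulating
        # the neuron over all inputs is exactly a LUT entry.
        nand = bin_neuron((-1, -1), 1)
        lut = {x: nand(x) for x in product((-1, 1), repeat=2)}
        print(lut)  # {(-1, -1): 1, (-1, 1): 1, (1, -1): 1, (1, 1): -1}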

    Thermodynamic-RAM Technology Stack

    We introduce a technology stack, or specification, describing the multiple levels of abstraction and specialization needed to implement a neuromorphic processing unit (NPU) based on the previously described concept of AHaH Computing and to integrate it into today's digital computing systems. The general-purpose NPU implementation described here is called Thermodynamic-RAM (kT-RAM) and is just one of many possible architectures, each with varying advantages and trade-offs. Bringing us closer to brain-like neural computation, kT-RAM will provide a general-purpose adaptive hardware resource to existing computing platforms, enabling fast and low-power machine learning capabilities that are currently hampered by the separation of memory and processing, a.k.a. the von Neumann bottleneck. Because a processor based on non-traditional principles can be difficult to understand, we present the various levels of the stack from the bottom up, layer by layer, which makes explaining kT-RAM a much easier task. The levels of the Thermodynamic-RAM technology stack include the memristor, synapse, AHaH node, kT-RAM, instruction set, sparse spike encoding, kT-RAM emulator, and SENSE server.
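
    To give a feel for the synapse and AHaH-node layers, here is a deliberately simplified, hypothetical emulation in Python. The differential conductance pair and the sparse spike indexing follow the general AHaH idea, but the specific update rule, constants, and class layout are toy choices for illustration, not the kT-RAM instruction set.

        import numpy as np

        class AHaHNode:
            # Each synapse is a differential pair of conductances (Ga, Gb);
            # its effective weight is w = Ga - Gb.
            def __init__(self, n, lr=0.05, decay=0.01, seed=0):
                rng = np.random.default_rng(seed)
                self.ga = rng.uniform(0.4, 0.6, n)
                self.gb = rng.uniform(0.4, 0.6, n)
                self.lr, self.decay = lr, decay

            def read(self, spikes):
                # spikes: indices of active inputs (sparse spike encoding)
                return float(np.sum(self.ga[spikes] - self.gb[spikes]))

            def update(self, spikes, y):
                # Toy rule: Hebbian reinforcement of active synapses toward
                # the sign of the output, plus a global anti-Hebbian decay.
                if y >= 0:
                    self.ga[spikes] += self.lr
                else:
                    self.gb[spikes] += self.lr
                self.ga *= 1 - self.decay
                self.gb *= 1 - self.decay

        node = AHaHNode(16)
        y = node.read([1, 4, 7])
        node.update([1, 4, 7], y)
        print(y)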

    Data Protection: Combining Fragmentation, Encryption, and Dispersion, a final report

    Hardening data protection using multiple methods rather than 'just' encryption is of paramount importance when considering continuous and powerful attacks that aim to observe, steal, alter, or even destroy private and confidential information. Our purpose is to look at cost-effective data protection by way of combining fragmentation, encryption, and dispersion over several physical machines. This involves deriving general schemes to protect data everywhere throughout a network of machines where they are being processed, transmitted, and stored during their entire life cycle. This is enabled by a number of parallel and distributed architectures using various sets of cores or machines, ranging from General Purpose GPUs to multiple clouds. In this report, we first present a general and conceptual description of what a fragmentation, encryption, and dispersion system (FEDS) should be, including a number of high-level requirements such systems ought to meet. Then, we focus on two kinds of fragmentation. First, a selective separation of information into two fragments: a public one and a private one. We describe a family of processes and address not only performance but also memory occupation and the integrity, or quality of restitution, of the information, concluding with an analysis of the level of security provided by our algorithms. Second, we analyze works on general dispersion systems operating bit-wise without regard to data structure, and on fragmentation of information for data defined along an object-oriented data structure or a record structure to be stored in a relational database.
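
    As a minimal concrete instance of fragmentation and dispersion, the Python sketch below XOR-splits a record into two fragments so that neither store alone learns anything, then disperses them to separate stand-in stores. This is a toy scheme for illustration only; the report's FEDS algorithms add encryption, integrity checks, and selective, structure-aware separation.

        import secrets

        def fragment(data: bytes):
            # XOR split: fragment A is uniformly random; fragment B = A XOR data.
            # Neither fragment alone carries any information about the data.
            frag_a = secrets.token_bytes(len(data))
            frag_b = bytes(a ^ d for a, d in zip(frag_a, data))
            return frag_a, frag_b

        def reassemble(frag_a: bytes, frag_b: bytes) -> bytes:
            return bytes(a ^ b for a, b in zip(frag_a, frag_b))

        stores = {}  # stand-ins for physically separate machines or clouds
        a, b = fragment(b"confidential record")
        stores["site-1"], stores["site-2"] = a, b
        assert reassemble(stores["site-1"], stores["site-2"]) == b"confidential record"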

    High-performance computing selection of models of DNA substitution for multicore clusters

    This paper presents the high-performance computing (HPC) support of jModelTest2, the most popular bioinformatic tool for the statistical selection of models of DNA substitution. As this can demand vast computational resources, especially in terms of processing power, jModelTest2 implements three parallel algorithms for model selection: (1) a multithreaded implementation for shared memory architectures; (2) a message-passing implementation for distributed memory architectures, such as clusters; and (3) a hybrid shared/distributed memory implementation for clusters of multicore nodes, combining the workload distribution across cluster nodes with a multithreaded model optimization within each node. The main limitation of the shared and distributed versions is the workload imbalance that generally appears when using more than 32 cores, a direct consequence of the heterogeneity in the computational cost of the evaluated models. The hybrid shared/distributed memory version overcomes this issue by reducing the workload imbalance through a thread-based decomposition of the most costly model optimization tasks. The performance evaluation of this HPC application on a 40-core shared memory system and on a 528-core cluster has shown high scalability, with speedups of up to 32 for the multithreaded version and up to 257 for the hybrid shared/distributed memory implementation. This can represent a reduction in the execution time of some analyses from 4 days down to barely 20 minutes. The implementations of the three parallel execution strategies of jModelTest2 presented in this paper are available under a GPL license at http://code.google.com/jmodeltest2.
    Funding: European Research Council, ERC-2007-Stg 203161-PHYGENOM to D.P.; Ministerio de Ciencia y Educación, BFU2009-08611 to D.P.; Ministerio de Ciencia y Educación, TIN2010-16735 to R.D.
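
    The imbalance fix is essentially a scheduling problem: heterogeneous model-optimisation tasks should be dispatched dynamically, costliest first, so no worker idles behind a single expensive model. The Python sketch below illustrates that scheduling idea only; the model names, costs, and pool size are made up, and jModelTest2 itself is a Java/MPI code, not Python.

        from concurrent.futures import ProcessPoolExecutor
        import time

        # Toy per-model optimisation costs (illustrative, not jModelTest2's).
        MODELS = {"GTR+G+I": 8.0, "GTR+G": 6.0, "HKY+G": 3.0, "JC": 1.0}

        def optimise(task):
            model, cost = task
            time.sleep(cost * 0.01)       # stand-in for likelihood optimisation
            return model, -1000.0 / cost  # fake log-likelihood

        if __name__ == "__main__":
            # Dispatch costliest tasks first; the executor hands out work
            # dynamically, so cheap models fill in around expensive ones.
            tasks = sorted(MODELS.items(), key=lambda kv: kv[1], reverse=True)
            with ProcessPoolExecutor(max_workers=4) as pool:
                for model, lnl in pool.map(optimise, tasks):
                    print(model, lnl)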

    Beyond Powers of Two: Hexagonal Modulation and Non-Binary Coding for Wireless Communication Systems

    Adaptive modulation and coding (AMC) is widely employed in modern wireless communication systems to improve the transmission efficiency by adjusting the transmission rate according to the channel conditions. Thus, AMC can provide very efficient use of channel resources, especially over fading channels. Quadrature Amplitude Modulation (QAM) is an efficient and widely employed digital modulation technique. It typically employs a rectangular signal constellation, so the decision regions of the constellation are square partitions of the two-dimensional signal space. However, it is well known that hexagons rather than squares provide the most compact regular tiling in two dimensions. A compact tiling means a dense packing of the constellation points and thus more energy-efficient data transmission. Hexagonal modulation can be difficult to implement because it does not fit well with the usual power-of-two symbol sizes employed with binary data. To overcome this problem, non-binary coding is combined with hexagonal modulation in this paper to provide a system which is compatible with binary data. The feasibility and efficiency are evaluated using a software-defined radio (SDR) based prototype. Extensive simulation results are presented which show that this approach can provide improved energy efficiency and spectrum utilization in wireless communication systems.
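
    A small 7-point hexagonal constellation makes the pairing with non-binary symbols concrete: a centre point plus a ring of six matches a radix-7 alphabet that no power-of-two bit mapping fits cleanly. The Python sketch below is illustrative only; the layout, noise level, and nearest-neighbour demodulator are not the authors' SDR design.

        import numpy as np

        # 7-point hexagonal constellation: the origin plus a unit ring at
        # 60-degree steps; one point per radix-7 symbol.
        points = np.concatenate(([0j], np.exp(1j * np.arange(6) * np.pi / 3)))

        def modulate(symbols):           # symbols: integers in 0..6
            return points[np.asarray(symbols)]

        def demodulate(rx):              # nearest-neighbour decision
            return np.abs(rx[:, None] - points[None, :]).argmin(axis=1)

        rng = np.random.default_rng(1)
        tx = rng.integers(0, 7, 1000)
        noise = 0.1 * (rng.standard_normal(1000) + 1j * rng.standard_normal(1000))
        rx = modulate(tx) + noise
        print("symbol errors:", int(np.sum(demodulate(rx) != tx)))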

    Secure Payment System Utilizing MANET for Disaster Areas

    Mobile payment systems in a disaster area have the potential to provide electronic transactions for people purchasing recovery goods like foodstuffs, clothes, and medicine. However, to enable transactions in a disaster area, current payment systems need communication infrastructures (such as wired networks and cellular networks) which may be destroyed during disasters such as large-scale earthquakes and flooding, and thus cannot be depended on. In this paper, we introduce a new mobile payment system utilizing infrastructureless MANETs to enable transactions that permit users to shop in disaster areas. Specifically, we introduce an endorsement-based mechanism to provide payment guarantees for a customer-to-merchant transaction and a multilevel endorsement mechanism with a lightweight scheme based on Bloom filters and Merkle trees to reduce communication overheads. Our mobile payment system achieves secure transactions by adopting various schemes, such as a location-based mutual monitoring scheme and blind signatures, while our newly introduced event chain mechanism prevents double-spending attacks. As validated by simulations, the proposed mobile payment system is useful in a disaster area, achieving a high transaction completion ratio of 65%-90% across all scenarios tested, and is storage-efficient for mobile devices, with an overall average merchant message size of 7 MB.
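
    The storage savings rest on authenticated data structures: a merchant can keep a single Merkle root instead of every endorsement and verify individual endorsements on demand. The Python sketch below shows only the root computation; the leaf format is invented for illustration, and the Bloom-filter half of the scheme is not shown.

        import hashlib

        def h(data: bytes) -> bytes:
            return hashlib.sha256(data).digest()

        def merkle_root(leaves):
            level = [h(leaf) for leaf in leaves]
            while len(level) > 1:
                if len(level) % 2:  # duplicate the last node on odd levels
                    level.append(level[-1])
                level = [h(level[i] + level[i + 1])
                         for i in range(0, len(level), 2)]
            return level[0]

        # Hypothetical leaf format: one entry per endorsement of a transaction.
        endorsements = [b"endorser-1:tx42", b"endorser-2:tx42", b"endorser-3:tx42"]
        root = merkle_root(endorsements)
        # A merchant can store just this 32-byte root and later verify any
        # single endorsement against it with a logarithmic-size proof path.
        print(root.hex())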

    High Performance Evaluation of Helmholtz Potentials using the Multi-Level Fast Multipole Algorithm

    Evaluation of pair potentials is critical in a number of areas of physics. The classical N-body problem has its roots in evaluating the Laplace potential, and has spawned tree algorithms, the fast multipole method (FMM), and kernel-independent approaches. Over the years, the FMM for the Laplace potential has had a profound impact on a number of disciplines, as it has been possible to develop highly scalable parallel algorithms for these potential evaluators. This is in stark contrast to parallel algorithms for the Helmholtz (oscillatory) potential. The principal bottlenecks to scalable parallelism are the operations necessary to traverse up, across, and down the tree, affecting both computation and communication. In this paper, we describe techniques to overcome these bottlenecks and achieve high-performance evaluation of the Helmholtz potential for a wide spectrum of geometries. We demonstrate that the resulting implementation has a load-balancing effect that significantly reduces the time-to-solution and enhances the scale of problems that can be treated using full-wave physics.
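
    The traversals in question have a rigid structure: an upward pass aggregates child expansions level by level toward the root, and a mirrored downward pass distributes them back out. The Python skeleton below shows that structure only; the translation operators are stubbed out and the class layout is illustrative, not the paper's implementation. The parallelisation difficulty lives exactly in these level-to-level dependencies.

        class Node:
            def __init__(self, level, children=()):
                self.level, self.children = level, children
                self.multipole = None

        def p2m(node):
            return 1.0                    # particle-to-multipole stub

        def m2m(child_expansions):
            return sum(child_expansions)  # multipole-to-multipole stub

        def upward_pass(node):
            # post-order traversal: children first, so each parent aggregates
            # finished child expansions; levels complete from leaves to root
            if not node.children:
                node.multipole = p2m(node)
            else:
                for c in node.children:
                    upward_pass(c)
                node.multipole = m2m(c.multipole for c in node.children)

        leaves = [Node(2) for _ in range(4)]
        tree = Node(0, (Node(1, leaves[:2]), Node(1, leaves[2:])))
        upward_pass(tree)
        print(tree.multipole)  # 4.0: one unit from each leaf box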