373 research outputs found

    On the impact of communication complexity in the design of parallel numerical algorithms

    Get PDF
    This paper describes two models of the cost of data movement in parallel numerical algorithms. One model is a generalization of an approach due to Hockney, and is suitable for shared memory multiprocessors where each processor has vector capabilities. The other model is applicable to highly parallel nonshared memory MIMD systems. In the second model, algorithm performance is characterized in terms of the communication network design. Techniques used in VLSI complexity theory are also brought in, and algorithm independent upper bounds on system performance are derived for several problems that are important to scientific computation

    Evolution of Network Architecture in a Granular Material Under Compression

    Full text link
    As a granular material is compressed, the particles and forces within the system arrange to form complex and heterogeneous collective structures. Force chains are a prime example of such structures, and are thought to constrain bulk properties such as mechanical stability and acoustic transmission. However, capturing and characterizing the evolving nature of the intrinsic inhomogeneity and mesoscale architecture of granular systems can be challenging. A growing body of work has shown that graph theoretic approaches may provide a useful foundation for tackling these problems. Here, we extend the current approaches by utilizing multilayer networks as a framework for directly quantifying the progression of mesoscale architecture in a compressed granular system. We examine a quasi-two-dimensional aggregate of photoelastic disks, subject to biaxial compressions through a series of small, quasistatic steps. Treating particles as network nodes and interparticle forces as network edges, we construct a multilayer network for the system by linking together the series of static force networks that exist at each strain step. We then extract the inherent mesoscale structure from the system by using a generalization of community detection methods to multilayer networks, and we define quantitative measures to characterize the changes in this structure throughout the compression process. We separately consider the network of normal and tangential forces, and find that they display a different progression throughout compression. To test the sensitivity of the network model to particle properties, we examine whether the method can distinguish a subsystem of low-friction particles within a bath of higher-friction particles. We find that this can be achieved by considering the network of tangential forces, and that the community structure is better able to separate the subsystem than a purely local measure of interparticle forces alone. The results discussed throughout this study suggest that these network science techniques may provide a direct way to compare and classify data from systems under different external conditions or with different physical makeup

    Expanded delta networks for very large parallel computers

    Get PDF
    In this paper we analyze a generalization of the traditional delta network, introduced by Patel [21], and dubbed Expanded Delta Network (EDN). These networks provide in general multiple paths that can be exploited to reduce contention in the network resulting in increased performance. The crossbar and traditional delta networks are limiting cases of this class of networks. However, the delta network does not provide the multiple paths that the more general expanded delta networks provide, and crossbars are to costly to use for large networks. The EDNs are analyzed with respect to their routing capabilities in the MIMD and SIMD models of computation.The concepts of capacity and clustering are also addressed. In massively parallel SIMD computers, it is the trend to put a larger number processors on a chip, but due to I/O constraints only a subset of the total number of processors may have access to the network. This is introduced as a Restricted Access Expanded Delta Network of which the MasPar MP-1 router network is an example

    Maturation trajectories of cortical resting-state networks depend on the mediating frequency band

    Full text link
    The functional significance of resting state networks and their abnormal manifestations in psychiatric disorders are firmly established, as is the importance of the cortical rhythms in mediating these networks. Resting state networks are known to undergo substantial reorganization from childhood to adulthood, but whether distinct cortical rhythms, which are generated by separable neural mechanisms and are often manifested abnormally in psychiatric conditions, mediate maturation differentially, remains unknown. Using magnetoencephalography (MEG) to map frequency band specific maturation of resting state networks from age 7 to 29 in 162 participants (31 independent), we found significant changes with age in networks mediated by the beta (13–30 Hz) and gamma (31–80 Hz) bands. More specifically, gamma band mediated networks followed an expected asymptotic trajectory, but beta band mediated networks followed a linear trajectory. Network integration increased with age in gamma band mediated networks, while local segregation increased with age in beta band mediated networks. Spatially, the hubs that changed in importance with age in the beta band mediated networks had relatively little overlap with those that showed the greatest changes in the gamma band mediated networks. These findings are relevant for our understanding of the neural mechanisms of cortical maturation, in both typical and atypical development.This work was supported by grants from the Nancy Lurie Marks Family Foundation (TK, SK, MGK), Autism Speaks (TK), The Simons Foundation (SFARI 239395, TK), The National Institute of Child Health and Development (R01HD073254, TK), National Institute for Biomedical Imaging and Bioengineering (P41EB015896, 5R01EB009048, MSH), and the Cognitive Rhythms Collaborative: A Discovery Network (NFS 1042134, MSH). (Nancy Lurie Marks Family Foundation; Autism Speaks; SFARI 239395 - Simons Foundation; R01HD073254 - National Institute of Child Health and Development; P41EB015896 - National Institute for Biomedical Imaging and Bioengineering; 5R01EB009048 - National Institute for Biomedical Imaging and Bioengineering; NFS 1042134 - Cognitive Rhythms Collaborative: A Discovery Network

    Solving the Uncapacitated Single Allocation p-Hub Median Problem on GPU

    Full text link
    A parallel genetic algorithm (GA) implemented on GPU clusters is proposed to solve the Uncapacitated Single Allocation p-Hub Median problem. The GA uses binary and integer encoding and genetic operators adapted to this problem. Our GA is improved by generated initial solution with hubs located at middle nodes. The obtained experimental results are compared with the best known solutions on all benchmarks on instances up to 1000 nodes. Furthermore, we solve our own randomly generated instances up to 6000 nodes. Our approach outperforms most well-known heuristics in terms of solution quality and time execution and it allows hitherto unsolved problems to be solved

    A Lower Bound Technique for Communication in BSP

    Get PDF
    Communication is a major factor determining the performance of algorithms on current computing systems; it is therefore valuable to provide tight lower bounds on the communication complexity of computations. This paper presents a lower bound technique for the communication complexity in the bulk-synchronous parallel (BSP) model of a given class of DAG computations. The derived bound is expressed in terms of the switching potential of a DAG, that is, the number of permutations that the DAG can realize when viewed as a switching network. The proposed technique yields tight lower bounds for the fast Fourier transform (FFT), and for any sorting and permutation network. A stronger bound is also derived for the periodic balanced sorting network, by applying this technique to suitable subnetworks. Finally, we demonstrate that the switching potential captures communication requirements even in computational models different from BSP, such as the I/O model and the LPRAM
    • …
    corecore