
    The t-core of an s-core

    We consider the t-core of an s-core partition, when s and t are coprime positive integers. Olsson has shown that the t-core of an s-core is again an s-core, and we examine certain actions of the affine symmetric group on s-cores which preserve the t-core of an s-core. Along the way, we give a new proof of Olsson's result. We also give a new proof of a result of Vandehey, showing that there is a simultaneous s- and t-core which contains all others.
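
    The t-core itself is computable by a short routine. Below is a minimal sketch (my own illustrative code, not taken from the paper) using the standard t-runner abacus on beta-numbers: place the beta-numbers of the partition on t runners, push all beads up, and read the resulting beta-set back as a partition.

```python
def t_core(partition, t):
    """t-core of a partition via the t-runner abacus on beta-numbers."""
    k = len(partition)
    # beta-numbers (first-column hook lengths) of the partition
    betas = [partition[i] + (k - 1 - i) for i in range(k)]
    # place each bead on the runner given by its residue mod t
    runners = [[] for _ in range(t)]
    for b in betas:
        runners[b % t].append(b)
    # push the beads on each runner up as far as possible:
    # a runner holding m beads ends up with beads at r, r+t, ..., r+(m-1)t
    new_betas = sorted(
        (r + j * t for r in range(t) for j in range(len(runners[r]))),
        reverse=True,
    )
    # convert the new beta-set back into a partition
    core = [b - (k - 1 - i) for i, b in enumerate(new_betas)]
    return [p for p in core if p > 0]

# Example: the 2-core of (4, 2, 1) is (1)
print(t_core([4, 2, 1], 2))           # -> [1]

# Spot-check of Olsson's theorem: (2, 2) is a 5-core,
# and its 3-core, (1), is again a 5-core
lam = [2, 2]
assert t_core(lam, 5) == lam
mu = t_core(lam, 3)                   # -> [1]
assert t_core(mu, 5) == mu
```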

    Three-Level Parallel J-Jacobi Algorithms for Hermitian Matrices

    The paper describes several efficient parallel implementations of the one-sided hyperbolic Jacobi-type algorithm for computing eigenvalues and eigenvectors of Hermitian matrices. By appropriate blocking of the algorithms, an almost ideal load balance between all available processors/cores is obtained. A similar blocking technique can be used to exploit the local cache memory of each processor to further speed up the process. Due to the diversity of modern computer architectures, each of the algorithms described here may be the method of choice for particular hardware and a given matrix size. All proposed block algorithms compute the eigenvalues with relative accuracy similar to the original non-blocked Jacobi algorithm.
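
    For orientation, the following is a minimal, unoptimized sketch of the classical cyclic two-sided Jacobi eigenvalue iteration for a real symmetric matrix, i.e. the kind of serial, non-blocked baseline that blocked one-sided hyperbolic variants such as the paper's improve on; it is illustrative code, not the authors' algorithm.

```python
import numpy as np

def jacobi_eigenvalues(A, tol=1e-12, max_sweeps=30):
    """Cyclic two-sided Jacobi iteration for a real symmetric matrix A.

    Returns (eigenvalues, eigenvectors). Purely illustrative: the full
    rotation matrix is formed explicitly, so each rotation costs O(n^3).
    """
    A = np.array(A, dtype=float)
    n = A.shape[0]
    V = np.eye(n)
    for _ in range(max_sweeps):
        off = np.sqrt(np.sum(A * A) - np.sum(np.diag(A) ** 2))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(A[p, q]) < tol * 1e-2:
                    continue
                # rotation angle that zeroes A[p, q] (Golub & Van Loan convention)
                tau = (A[q, q] - A[p, p]) / (2.0 * A[p, q])
                t = 1.0 if tau == 0 else np.sign(tau) / (abs(tau) + np.sqrt(1.0 + tau * tau))
                c = 1.0 / np.sqrt(1.0 + t * t)
                s = t * c
                J = np.eye(n)
                J[p, p] = J[q, q] = c
                J[p, q], J[q, p] = s, -s
                A = J.T @ A @ J
                V = V @ J
    return np.diag(A), V

# usage: compare against LAPACK on a small random symmetric matrix
M = np.random.rand(5, 5)
S = M + M.T
w, _ = jacobi_eigenvalues(S)
print(np.sort(w), np.sort(np.linalg.eigvalsh(S)))
```

    Parallel blocked variants assign disjoint (block-)column pairs to different processors so that many such rotations proceed concurrently, which is where the load-balancing question discussed in the abstract arises.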

    Performing large full-wave simulations by means of a parallel MLFMA implementation

    In this paper, large full-wave simulations are performed using a parallel Multilevel Fast Multipole Algorithm (MLFMA) implementation. The data structures of the MLFMA tree are partitioned according to the so-called hierarchical partitioning scheme, while the radiation patterns are partitioned in a blockwise way. To test the implementation of the algorithm, a full-wave simulation of a canonical example with more than 50 million unknowns has been performed.

    Partition Statistics Equidistributed with the Number of Hook Difference One Cells

    Let λ be a partition, viewed as a Young diagram. We define the hook difference of a cell of λ to be the difference of its leg and arm lengths. Define h_{1,1}(λ) to be the number of cells of λ with hook difference one. In the paper of Buryak and Feigin (arXiv:1206.5640), algebraic geometry is used to prove a generating function identity which implies that h_{1,1} is equidistributed with a_2, the largest part of a partition that appears at least twice, over the partitions of a given size. In this paper, we propose a refinement of the theorem of Buryak and Feigin and prove some partial results using combinatorial methods. We also obtain a new formula for the q-Catalan numbers which naturally leads us to define a new q,t-Catalan number with a simple combinatorial interpretation.
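
    The two statistics are straightforward to compute directly, so the claimed equidistribution can be checked by brute force for small sizes. The sketch below (helper names are my own) computes h_{1,1}(λ) and a_2(λ) and verifies that their distributions over the partitions of n agree for n up to 10.

```python
from collections import Counter

def partitions(n, max_part=None):
    """All partitions of n as weakly decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            yield (k,) + rest

def conjugate(lam):
    """Conjugate (transposed) partition."""
    return tuple(sum(1 for p in lam if p > j) for j in range(lam[0])) if lam else ()

def h11(lam):
    """Number of cells of lam whose leg length minus arm length equals 1."""
    conj = conjugate(lam)
    total = 0
    for i, row in enumerate(lam):
        for j in range(row):
            arm = row - j - 1          # cells to the right in the same row
            leg = conj[j] - i - 1      # cells below in the same column
            if leg - arm == 1:
                total += 1
    return total

def a2(lam):
    """Largest part appearing at least twice (0 if all parts are distinct)."""
    repeated = [p for p, m in Counter(lam).items() if m >= 2]
    return max(repeated, default=0)

# brute-force check of the equidistribution of h_{1,1} and a_2 for small n
for n in range(1, 11):
    assert Counter(map(h11, partitions(n))) == Counter(map(a2, partitions(n)))
print("h_{1,1} and a_2 are equidistributed over partitions of n = 1..10")
```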

    Weak scalability analysis of the distributed-memory parallel MLFMA

    Distributed-memory parallelization of the multilevel fast multipole algorithm (MLFMA) relies on the partitioning of the internal data structures of the MLFMA among the local memories of networked machines. For three existing data partitioning schemes (spatial, hybrid and hierarchical partitioning), the weak scalability, i.e., the asymptotic behavior for proportionally increasing problem size and number of parallel processes, is analyzed. It is demonstrated that none of these schemes is weakly scalable. A nontrivial change to the hierarchical scheme is proposed, yielding a parallel MLFMA that does exhibit weak scalability. It is shown that, even for modest problem sizes and a modest number of parallel processes, the memory requirements of the proposed scheme are already significantly lower than those of existing schemes. Additionally, the proposed scheme is used to perform full-wave simulations of a canonical example, where the number of unknowns and CPU cores are proportionally increased up to more than 200 million unknowns and 1024 CPU cores. The time per matrix-vector multiplication for an increasing number of unknowns and CPU cores corresponds very well to the theoretical time complexity.
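
    As a rough illustration of the trade-off these partitioning schemes manage, the toy model below (my own simplification, closest in spirit to the hybrid scheme rather than the proposed hierarchical one) assigns whole boxes to processes at the lower, box-rich levels of the tree, and at the sparse upper levels shares each box among a group of processes that split its radiation-pattern samples.

```python
def partition_plan(finest_level_boxes, num_levels, num_processes):
    """Toy model (not the paper's algorithm) of distributing an MLFMA tree.

    Assumes a full octree (roughly 8x fewer boxes per level going up) and a
    power-of-two process count. At levels with many boxes, whole boxes are
    assigned to processes; at levels with fewer boxes than processes, each
    box is shared by a group of processes that split its radiation-pattern
    samples. Returns, per level: (level, boxes, boxes/process, processes/box).
    """
    plan = []
    boxes = finest_level_boxes
    for level in range(num_levels):
        if boxes >= num_processes:
            plan.append((level, boxes, boxes // num_processes, 1))
        else:
            plan.append((level, boxes, 1, num_processes // boxes))
        boxes = max(1, boxes // 8)
    return plan

for level, boxes, boxes_per_proc, procs_per_box in partition_plan(32768, 6, 64):
    print(f"level {level}: {boxes:6d} boxes | "
          f"{boxes_per_proc} box(es)/process | {procs_per_box} process(es)/box")
```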

    A scalable parallel finite element framework for growing geometries. Application to metal additive manufacturing

    This work introduces an innovative parallel, fully-distributed finite element framework for growing geometries and its application to metal additive manufacturing. It is well known that virtual part design and qualification in additive manufacturing require highly accurate multiscale and multiphysics analyses. Only high-performance computing tools are able to handle such complexity in time frames compatible with time-to-market. However, efficiency, without loss of accuracy, has rarely held centre stage in the numerical community. Here, in contrast, the framework is designed to adequately exploit the resources of high-end distributed-memory machines. It is grounded on three building blocks: (1) hierarchical adaptive mesh refinement with octree-based meshes; (2) a parallel strategy to model the growth of the geometry; (3) state-of-the-art parallel iterative linear solvers. Computational experiments consider the heat transfer analysis at the part scale of the printing process by powder-bed technologies. After verification against a 3D benchmark, a strong-scaling analysis assesses performance and identifies the major sources of parallel overhead. A third numerical example examines the efficiency and robustness of (2) in a curved 3D shape. Unprecedented parallelism and scalability were achieved in this work. Hence, this framework makes it possible to take on higher complexity and/or accuracy, not only in part-scale simulations of metal or polymer additive manufacturing, but also in welding, sedimentation, atherosclerosis, or any other physical problem where the physical domain of interest grows in time.
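
    To make the idea of a growing computational geometry concrete, here is a deliberately simple serial sketch (my own toy example, not the paper's parallel octree-based finite element framework): an explicit finite-difference heat solve on a 1D bar in which cells are activated over time at the deposition temperature, mimicking element birth during material deposition.

```python
import numpy as np

def growing_bar_heat(n_cells=100, n_steps=500, cells_per_step=1,
                     alpha=1.0, dx=1.0, dt=0.2,
                     t_deposit=1000.0, t_ambient=300.0):
    """Explicit heat conduction on a 1D bar that grows over time.

    Cells are 'born' at the deposition temperature as the domain grows, and
    only active cells participate in the update (a crude stand-in for the
    element-activation strategies used in additive-manufacturing FEM).
    dt must satisfy the explicit stability limit dt <= dx**2 / (2 * alpha).
    """
    T = np.full(n_cells, t_ambient)
    active = 1                        # number of currently active cells
    T[0] = t_deposit
    for _ in range(n_steps):
        # grow the domain: activate new cells at the deposition temperature
        new_active = min(n_cells, active + cells_per_step)
        T[active:new_active] = t_deposit
        active = new_active
        # explicit diffusion update on the active region only
        Ta = T[:active].copy()
        lap = np.zeros(active)
        lap[1:-1] = Ta[2:] - 2 * Ta[1:-1] + Ta[:-2]
        if active > 1:
            # insulated (zero-flux) ends of the active region
            lap[0] = Ta[1] - Ta[0]
            lap[-1] = Ta[-2] - Ta[-1]
        T[:active] = Ta + alpha * dt / dx**2 * lap
    return T[:active]

print(growing_bar_heat(n_cells=20, n_steps=50)[:5])
```

    In the paper, the growth of the geometry is instead handled in parallel on adaptively refined octree-based meshes (building block (2) above), but the basic ingredient illustrated here, updating only the portion of the domain that exists at the current time, is the same.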