306 research outputs found

    Joint Unitary Triangularization for MIMO Networks

    This work considers communication networks where individual links can be described as MIMO channels. Unlike orthogonal modulation methods (such as the singular-value decomposition), we allow interference between sub-channels, which the receivers can remove via successive cancellation. The degrees of freedom gained by this relaxation are used to obtain a basis that is simultaneously good for more than one link. Specifically, we derive necessary and sufficient conditions for shaping the ratio vector of sub-channel gains of two broadcast-channel receivers. We then apply this to two scenarios. First, in digital multicasting, we present a practical capacity-achieving scheme that uses only scalar codes and linear processing. Second, we consider the joint source-channel problem of transmitting a Gaussian source over a two-user MIMO channel, and show that there exist non-trivial cases in which the optimal distortion pair (which for high signal-to-noise ratios equals the optimal point-to-point distortions of the individual users) can be achieved by employing a hybrid digital-analog scheme over the induced equivalent channel. These scenarios demonstrate the advantage of choosing a modulation basis based upon multiple links in the network; we therefore call the approach "network modulation". Comment: Submitted to IEEE Tran. Signal Processing. Revised versio
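As a rough illustration of the idea of tolerating interference between sub-channels and removing it by successive cancellation, the sketch below triangularizes a single real 2 × 2 channel with a QR decomposition and then detects the two symbols in reverse order. It covers one link only and does not reproduce the joint two-receiver construction of the paper; the function names `qr_2x2` and `detect_by_successive_cancellation`, the noise-free setting, and the example matrix are all assumptions made for illustration.

```python
import math


def qr_2x2(h):
    """Gram-Schmidt QR of a real, invertible 2x2 matrix: h = q * r, r upper triangular."""
    (a, b), (c, d) = h
    r11 = math.hypot(a, c)                  # norm of the first column
    q1 = (a / r11, c / r11)
    r12 = q1[0] * b + q1[1] * d             # projection of column 2 on q1
    u = (b - r12 * q1[0], d - r12 * q1[1])  # residual of column 2
    r22 = math.hypot(u[0], u[1])
    q2 = (u[0] / r22, u[1] / r22)
    q = [[q1[0], q2[0]], [q1[1], q2[1]]]
    r = [[r11, r12], [0.0, r22]]
    return q, r


def detect_by_successive_cancellation(h, y):
    """Recover x from y = h x (noise-free sketch).

    The receiver applies q^T, which leaves the triangular system r x = z.
    The last symbol sees no interference; its contribution is then
    subtracted before the first symbol is detected.
    """
    q, r = qr_2x2(h)
    z0 = q[0][0] * y[0] + q[1][0] * y[1]
    z1 = q[0][1] * y[0] + q[1][1] * y[1]
    x1 = z1 / r[1][1]
    x0 = (z0 - r[0][1] * x1) / r[0][0]
    return [x0, x1]


if __name__ == "__main__":
    channel = [[2.0, 1.0], [1.0, 3.0]]   # hypothetical example channel
    sent = [1.0, -1.0]
    received = [
        channel[0][0] * sent[0] + channel[0][1] * sent[1],
        channel[1][0] * sent[0] + channel[1][1] * sent[1],
    ]
    print(detect_by_successive_cancellation(channel, received))
```

The diagonal entries of `r` play the role of the sub-channel gains; in this decomposition their product equals the absolute value of the determinant of the channel matrix.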

    Among graphs, groups, and latin squares

    A latin square of order n is an n × n array in which each row and each column contains each of the numbers {1, 2, ..., n}. A k-plex in a latin square is a collection of entries which intersects each row and column k times and contains k copies of each symbol. This thesis studies the existence of k-plexes and approximations of k-plexes in latin squares, paying particular attention to latin squares which correspond to multiplication tables of groups. The most commonly studied class of k-plex is the 1-plex, better known as a transversal. Although many latin squares do not have transversals, Brualdi conjectured that every latin square has a near transversal, i.e. a collection of entries with distinct symbols which intersects all but one row and all but one column. Our first main result confirms Brualdi’s conjecture in the special case of group-based latin squares. Then, using a well-known equivalence between edge-colorings of complete bipartite graphs and latin squares, we introduce Hamilton 2-plexes. We conjecture that every latin square of order n ≥ 5 has a Hamilton 2-plex and provide a range of evidence for this conjecture. In particular, we confirm our conjecture computationally for n ≤ 8 and show that a suitable analogue of Hamilton 2-plexes always occurs in n × n arrays with no symbol appearing more than n/√96 times. To study Hamilton 2-plexes in group-based latin squares, we generalize the notion of harmonious groups to what we call H2-harmonious groups. Our second main result classifies all H2-harmonious abelian groups. The last part of the thesis formalizes an idea which first appeared in a paper of Cameron and Wanless: a (k,l)-plex is a collection of entries which intersects each row and column k times and contains at most l copies of each symbol. We demonstrate the existence of (k, 4k)-plexes in all latin squares and (k, k + 1)-plexes in sufficiently large latin squares. We also find analogues of these theorems for Hamilton 2-plexes, including our third main result: every sufficiently large latin square has a Hamilton (2,3)-plex.
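The definitions of transversal and near transversal above can be made concrete with a small brute-force check on group-based latin squares. The sketch below is an illustration only, not a method from the thesis: it builds the addition table of the cyclic group of order n (using symbols 0, ..., n − 1 instead of 1, ..., n) and searches all permutations, so it is practical only for very small n. The helper names are assumptions.

```python
from itertools import permutations


def cyclic_latin_square(n):
    """Addition table of the cyclic group Z_n, with symbols 0..n-1."""
    return [[(row + col) % n for col in range(n)] for row in range(n)]


def has_transversal(square):
    """True if some choice of one entry per row and column has n distinct symbols."""
    n = len(square)
    for cols in permutations(range(n)):
        symbols = {square[row][cols[row]] for row in range(n)}
        if len(symbols) == n:
            return True
    return False


def has_near_transversal(square):
    """True if n-1 entries exist in distinct rows and columns with distinct symbols.

    Every such selection is a permutation with one row left out, so it is
    enough to try each permutation with each possible skipped row.
    """
    n = len(square)
    for cols in permutations(range(n)):
        for skipped in range(n):
            symbols = {
                square[row][cols[row]] for row in range(n) if row != skipped
            }
            if len(symbols) == n - 1:
                return True
    return False


if __name__ == "__main__":
    for order in (3, 4, 5):
        square = cyclic_latin_square(order)
        print(order, has_transversal(square), has_near_transversal(square))
```

For the cyclic group of order 4 the search finds no transversal but does find a near transversal, which is the kind of situation Brualdi's conjecture addresses.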

    Book of Abstracts of the Sixth SIAM Workshop on Combinatorial Scientific Computing

    Book of Abstracts of CSC14, edited by Bora Uçar. The Sixth SIAM Workshop on Combinatorial Scientific Computing, CSC14, was organized at the Ecole Normale Supérieure de Lyon, France, from 21 to 23 July 2014. This two-and-a-half-day event marked the sixth in a series that started ten years ago in San Francisco, USA. The CSC14 Workshop's focus was on combinatorial mathematics and algorithms in high-performance computing, broadly interpreted. The workshop featured three invited talks, 27 contributed talks, and eight poster presentations. All three invited talks focused on two fields of research: randomized algorithms for numerical linear algebra, and network analysis. The contributed talks and the posters targeted modeling, analysis, bisection, clustering, and partitioning of graphs, applied in the context of networks, sparse matrix factorizations, iterative solvers, fast multipole methods, automatic differentiation, high-performance computing, and linear programming. The workshop was held at the premises of the LIP laboratory of ENS Lyon and was generously supported by the LABEX MILYON (ANR-10-LABX-0070, Université de Lyon, within the program "Investissements d'Avenir" ANR-11-IDEX-0007 operated by the French National Research Agency), and by SIAM.

    Development of a high-order parallel solver for direct and large eddy simulations of turbulent flows

    Turbulence is inherent in fluid dynamics, in that laminar flows are the exception rather than the rule; hence the longstanding interest in the subject, both within the academic community and in industrial R&D laboratories. Since 1883, much progress has been made: statistics applied to turbulence have provided understanding of the scaling laws peculiar to several model flows, and experiments have given insight into the structure of real-world flows. Numerical approaches soon became the most promising, since they lay the ground for solving the unsteady Navier-Stokes equations at high Reynolds number by means of computer systems. Nevertheless, despite the exponential rise in computational capability over the last few decades, the more computer technology advances, the higher the Reynolds number sought for test cases of industrial interest: there is a natural tendency to perform simulations as large as possible, which leaves no room for wasting resources. Indeed, as the scale separation grows with Re, reducing the wall-clock time for a high-fidelity solution of the desired accuracy becomes increasingly important. To achieve this, a CFD solver should rely on appropriate physical models, consistent numerical methods to discretize the equations, accurate non-dissipative numerical schemes, efficient algorithms to solve the resulting systems, and fast routines implementing those algorithms. Two archetypal approaches to CFD are direct numerical simulation and large-eddy simulation (DNS and LES, respectively), which differ profoundly in several aspects but are both “eddy-resolving” methods, meant to resolve the structures of the flow field with the highest possible accuracy while introducing as little spurious dissipation as possible.
These two requirements, accurate resolution of scales and energy conservation, should be addressed by any numerical method, since they are essential to many real-world fluid flows of industrial interest. As a consequence, high-order numerical schemes, and compact schemes among them, have received much consideration, since they address both goals, at the cost of less straightforward application of boundary conditions and a higher computational cost. The latter problem is tackled with parallel computing, which also makes it possible to exploit currently available computing power to the fullest extent. The research activity conducted by the present author concerned the development, from scratch, of a three-dimensional, unsteady, incompressible Navier-Stokes parallel solver, which uses an advanced algorithm for the process-wise solution of the linear systems arising from high-order compact finite difference schemes, and hinges upon a three-dimensional decomposition of the Cartesian computational space. The code is written in modern Fortran 2003, plus a few features unique to the 2008 standard, and is parallelized using advanced routines of the MPI 3.1 standard, as implemented by the OpenMPI library project. The coding was carried out with the objective of creating an original high-order parallel CFD solver that is maintainable and extendable, within a well-defined range of possibilities.
With this main priority outlined, particular attention was paid to several key concepts: modularity and readability of the source code and, in turn, its reusability; ease of implementation of virtually any new explicit or implicit finite difference scheme; modern programming style and avoidance of deprecated legacy Fortran constructs and features, so that the world wide web is a reliable and active means for quickly solving coding problems that arise when implementing new modules; and, last but not least, thorough comments, especially in critical sections of the code, explaining motives and possible expected weak links. The design, production, and documentation of a program from scratch are almost never complete, and this is certainly true of the present effort. The method and the code are verified against the full three-dimensional Lid-Driven Cavity and Taylor-Green Vortex flows. The latter test is also used to assess scalability and parallel efficiency.
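To show what "linear systems arising from compact finite difference schemes" means in the simplest setting, the sketch below applies the classical fourth-order compact (Padé) first-derivative scheme on a one-dimensional periodic grid. It is an illustrative assumption, not the thesis code: the thesis solver is written in Fortran with MPI and uses its own parallel algorithm, whereas here the small cyclic tridiagonal system is solved serially with a plain Jacobi iteration, chosen only for brevity. The function name and parameters are hypothetical.

```python
import math


def compact_derivative_periodic(f, h, iterations=60):
    """Fourth-order compact first derivative on a uniform periodic grid.

    Solves, for the unknown derivatives d_i,
        (1/4) d_{i-1} + d_i + (1/4) d_{i+1} = (3/2) (f_{i+1} - f_{i-1}) / (2 h)
    with periodic wrap-around. The system is diagonally dominant, so a
    Jacobi iteration converges (error shrinks by at least 1/2 per sweep).
    """
    n = len(f)
    alpha = 0.25
    rhs = [1.5 * (f[(i + 1) % n] - f[i - 1]) / (2.0 * h) for i in range(n)]
    d = rhs[:]  # initial guess
    for _ in range(iterations):
        d = [rhs[i] - alpha * (d[i - 1] + d[(i + 1) % n]) for i in range(n)]
    return d


def central_derivative_periodic(f, h):
    """Second-order explicit central difference, for comparison."""
    n = len(f)
    return [(f[(i + 1) % n] - f[i - 1]) / (2.0 * h) for i in range(n)]


if __name__ == "__main__":
    n = 32
    h = 2.0 * math.pi / n
    x = [i * h for i in range(n)]
    f = [math.sin(xi) for xi in x]
    exact = [math.cos(xi) for xi in x]
    compact = compact_derivative_periodic(f, h)
    central = central_derivative_periodic(f, h)
    print("compact max error:", max(abs(a - b) for a, b in zip(compact, exact)))
    print("central max error:", max(abs(a - b) for a, b in zip(central, exact)))
```

The implicit coupling between neighbouring derivative values is what buys the extra accuracy on the same stencil width, and it is also why a linear solve (and hence a parallel solution strategy) is needed at all.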

    Capture, storage, and analysis of video images on the Alcator C-Mod tokamak


    Extracting Data-Level Parallelism in High-Level Synthesis for Reconfigurable Architectures

    High-Level Synthesis (HLS) tools are a set of algorithms that allow programmers to obtain implementable Hardware Description Language (HDL) code from specifications written in high-level, sequential languages such as C, C++, or Java. HLS allows programmers to code in their preferred language while still obtaining the benefits of hardware acceleration, without needing to be intimately familiar with the hardware platform of the accelerator. In this work we summarize and expand upon several of our approaches to improving the automatic memory banking capabilities of HLS tools targeting reconfigurable architectures, namely Field-Programmable Gate Arrays (FPGAs). We explored several approaches to automatically finding the optimal partition factor and a usable banking scheme for stencil kernels. These include a tessellation-based approach that partitions using multiple families of hyperplanes and finds a better banking factor than current state-of-the-art methods, and a graph-theoretic methodology that allowed us to mathematically prove the optimality of our banking solutions. For non-stencil kernels we relaxed some of the conditions in our graph-based model to propose a best-effort solution that reduces memory access conflicts (simultaneous accesses to the same memory bank). We also proposed a non-linear transformation using prime factorization to convert a small subset of non-stencil kernels into stencil memory accesses, allowing us to apply all previous work on memory partitioning to them. Our approaches obtained better results than commercial tools and state-of-the-art algorithms in terms of reduced resource utilization and increased frequency of operation. We were also able to obtain better partition factors for some stencil kernels, and usable banking schemes for non-stencil kernels with better performance than any applicable existing algorithm.
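The notion of a conflict-free banking scheme for a stencil kernel can be sketched with the simplest variant: a single family of hyperplanes, where element (x, y) is stored in bank (a·x + b·y) mod N. The code below is an assumption-laden illustration of that baseline idea, not the tessellation or graph-based methods of this work. It searches by brute force for the smallest N and coefficients (a, b) such that all accesses of a stencil fall in distinct banks; the names `bank_of`, `is_conflict_free`, and `find_banking` are hypothetical.

```python
from itertools import product

# Offsets read by a 5-point stencil around the current element.
FIVE_POINT_STENCIL = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]


def bank_of(point, coeffs, num_banks):
    """Bank index of an array element under a linear (hyperplane) mapping."""
    return sum(c * p for c, p in zip(coeffs, point)) % num_banks


def is_conflict_free(offsets, coeffs, num_banks):
    """True if all stencil offsets map to distinct banks.

    Because the mapping is linear, shifting every access by the same base
    position shifts every bank index by the same amount, so checking the
    offsets once covers every position in the array.
    """
    banks = {bank_of(offset, coeffs, num_banks) for offset in offsets}
    return len(banks) == len(offsets)


def find_banking(offsets, max_banks=64):
    """Smallest bank count (and first coefficients found) with no conflicts."""
    dims = len(offsets[0])
    for num_banks in range(len(offsets), max_banks + 1):
        for coeffs in product(range(num_banks), repeat=dims):
            if is_conflict_free(offsets, coeffs, num_banks):
                return num_banks, coeffs
    return None


if __name__ == "__main__":
    print(find_banking(FIVE_POINT_STENCIL))
```

For the 5-point stencil this search returns five banks with coefficients (1, 2), so each of the five simultaneous reads hits a different bank and the partition factor equals the number of accesses.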