101 research outputs found

    Simulated annealing based datapath synthesis

    A new polyhedral approach to combinatorial designs

    We consider combinatorial t-design problems as discrete optimization problems. Our motivation is that only a few studies have applied exact optimization techniques to designs, and that classical methods in design theory have left many existence questions open. Roughly defined, t-designs are pairs of discrete sets related by strict properties of size, balance, and replication. These highly structured relationships provide optimal solutions to a variety of problems in computer science, such as error-correcting codes, secure communications, network interconnection, and hardware design, and are applicable to other areas such as statistics, scheduling, and games. We give a new approach to combinatorial t-designs that is useful in constructing t-designs by polyhedral methods. The first contribution of our work is a new result on the equivalence of t-design problems with a graph theory problem. This equivalence leads to a novel integer programming formulation for t-designs, which we call GDP. We analyze the polyhedral properties of GDP and establish, among other results, the dimension of the associated polyhedron. We generate new classes of valid inequalities with the aim of approximating this integer program by a linear program that has the same optimal solution. Some new classes of valid inequalities are generated as Chvátal-Gomory cuts, other classes are generated by graph complements and combinatorial arguments, and others are generated by the use of incidence substructures in a t-design. In particular, we found a class of valid inequalities, which we call the stable-set class, that represents an alternative graph equivalence for the problem of finding a t-design. We analyze and give results on the strength of these new classes of valid inequalities. We propose a separation problem and give its integer programming formulation as a maximum (or minimum) edge-weight biclique subgraph problem. We implement a pure cutting-plane algorithm using one of the stronger classes of valid inequalities derived. Several instances of t-designs were solved efficiently by this algorithm at the root node of the search tree. We also implement a branch-and-cut algorithm and solve several instances of 2-designs, trying different base formulations. Computational results are included.
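The abstract above defines t-designs only roughly. As a minimal sketch, assuming the standard textbook definition of a t-(v, k, λ) design (every block has k points and every t-subset of the v points lies in exactly λ blocks), the check below verifies a candidate design by brute force. The function name `is_t_design` and the Fano-plane example are illustrative choices, not taken from the listed work, and this is not the GDP formulation the abstract describes.

```python
from itertools import combinations


def is_t_design(points, blocks, t, k, lam):
    """Return True if (points, blocks) is a t-(v, k, lam) design.

    Hypothetical helper: checks the textbook definition directly, which is
    only practical for small instances.
    """
    point_set = set(points)
    block_sets = [frozenset(b) for b in blocks]
    # Every block must be a k-subset of the point set.
    if any(len(b) != k or not b <= point_set for b in block_sets):
        return False
    # Every t-subset of points must be covered by exactly lam blocks.
    return all(
        sum(1 for b in block_sets if set(sub) <= b) == lam
        for sub in combinations(sorted(point_set), t)
    )


# The Fano plane, a well-known 2-(7, 3, 1) design.
FANO = [
    {1, 2, 3}, {1, 4, 5}, {1, 6, 7},
    {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6},
]

print(is_t_design(range(1, 8), FANO, t=2, k=3, lam=1))
```

Brute-force verification like this only confirms a given design; finding one is the hard existence problem that the abstract attacks with polyhedral methods.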

    Subject index volumes 1–92

    Efficient Quantum Circuit Simulation by Tensor Network Methods on Modern GPUs

    Efficient simulation of quantum circuits has become indispensable with the rapid development of quantum hardware. The primary simulation methods are based on state vectors and tensor networks. As the number of qubits and quantum gates grows in current quantum devices, traditional state-vector-based quantum circuit simulation methods prove inadequate due to the overwhelming size of the Hilbert space and extensive entanglement. Consequently, brute-force tensor network simulation algorithms become the only viable solution in such scenarios. The two main challenges faced in tensor network simulation algorithms are finding the optimal contraction path and executing efficiently on modern computing devices, with the latter determining the actual efficiency. In this study, we investigate the optimization of such tensor network simulations on modern GPUs and propose general optimization strategies from two aspects: computational efficiency and accuracy. Firstly, we propose to transform critical Einstein summation operations into GEMM operations, leveraging the specific features of tensor network simulations to amplify the efficiency of GPUs. Secondly, by analyzing the data characteristics of quantum circuits, we employ extended precision to ensure the accuracy of simulation results and mixed precision to fully exploit the potential of GPUs, resulting in faster and more precise simulations. Our numerical experiments demonstrate that our approach can achieve a 3.96x reduction in verification time for random quantum circuit samples in the 18-cycle case of Sycamore, with sustained performance exceeding 21 TFLOPS on one A100. This method can be easily extended to the 20-cycle case, maintaining the same performance, accelerating by 12.5x compared to the state-of-the-art CPU-based results and 4.48-6.78x compared to the state-of-the-art GPU-based results reported in the literature. Comment: 25 pages, 10 figures
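The abstract mentions transforming Einstein summation operations into GEMM operations. The following is a minimal NumPy sketch of that general idea on the CPU, not the authors' GPU implementation: the tensor shapes, index labels, and the function name `contract_as_gemm` are assumptions chosen for illustration. The contracted indices are grouped on one side of each tensor so that the contraction becomes a single matrix multiplication.

```python
import numpy as np


def contract_as_gemm(A, B):
    """Compute C[a,b,d] = sum over c,e of A[a,c,b,e] * B[e,d,c] as one GEMM.

    Hypothetical example contraction; the index layout is an assumption.
    """
    a, c, b, e = A.shape
    e2, d, c2 = B.shape
    assert (c, e) == (c2, e2), "contracted dimensions must match"
    # Permute A to (a, b | c, e) and flatten to a 2-D matrix.
    A_mat = A.transpose(0, 2, 1, 3).reshape(a * b, c * e)
    # Permute B to (c, e | d) so its rows line up with A_mat's columns.
    B_mat = B.transpose(2, 0, 1).reshape(c * e, d)
    # A single matrix multiplication performs the whole contraction.
    return (A_mat @ B_mat).reshape(a, b, d)


rng = np.random.default_rng(0)
A = rng.standard_normal((2, 4, 3, 6))   # axes (a, c, b, e)
B = rng.standard_normal((6, 5, 4))      # axes (e, d, c)

C_einsum = np.einsum("acbe,edc->abd", A, B)  # direct reference result
C_gemm = contract_as_gemm(A, B)
print(np.allclose(C_einsum, C_gemm))
```

The permutations cost extra memory traffic, so whether this form pays off depends on the tensor shapes and the hardware; the sketch only shows that the two formulations agree numerically.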

    A resource allocation mechanism based on cost function synthesis in complex systems

    While the management of resources in computer systems can greatly impact the usefulness and integrity of the system, finding an optimal solution to the management problem is unfortunately NP-hard. Adding to the complexity, today's 'modern' systems, such as multimedia, medical, and military systems, may be, and often are, composed of interacting real-time and non-real-time components. In addition, these systems can be driven by a host of non-functional objectives, often differing not only in nature, importance, and form, but also in dimensional units and range, and themselves interacting in complex ways. We refer to systems exhibiting such characteristics as Complex Systems (CS). We present a method for handling the multiple non-functional system objectives in CS by addressing decomposition, quantification, and evaluation issues. Our method will result in better allocations, improve objective satisfaction, improve the overall performance of the system, and reduce cost in a global sense. Moreover, we consider the problem of formulating the cost of an allocation driven by system objectives. We start by discussing issues and relationships among global objectives, their decomposition, and cost functions for the evaluation of system objectives. Then, as an example of objective and cost function development, we introduce the concept of deadline balancing. Next, we prove the existence of combining models and their underlying conditions. Then, we describe a hierarchical model for system objective function synthesis. This synthesis is performed solely for the purpose of measuring the level of objective satisfaction in a proposed hardware-to-software allocation, not for the design of individual software modules. Then, examples are given to show how the model applies to actual multi-objective problems. In addition, the concept of deadline balancing is extended to a new scheduling concept, namely Inter-Completion-Time Scheduling (ICTS). Finally, experiments based on simulation have been conducted to capture various properties of the synthesis approach as well as ICTS. A prototype implementation of the cost function synthesis and evaluation environment is described, highlighting the applicability and usefulness of the synthesis in realistic applications.

    GCA: Global Congestion Awareness for Load Balance in Networks-on-Chip

    As modern CMPs scale to ever-increasing core counts, Networks-on-Chip (NoCs) are emerging as an interconnection fabric, enabling communication between components. While NoCs are easy to implement and provide high and scalable bandwidth, current routing algorithms, such as dimension-ordered routing, suffer from poor load balance, leading to reduced throughput and high latencies. Improving load balance is therefore critical in future CMP designs, where increased latency leads to wasted power and energy while waiting for outstanding requests to resolve. Adaptive routing is a known technique to improve load balance; however, prior adaptive routing techniques use either local, myopic information or misinformed, regionally aggregated information to form their routing decisions. This thesis proposes a new, lightweight, adaptive routing algorithm for on-chip routers based on global link state and congestion information, Global Congestion Awareness (GCA). GCA leverages unused bits in existing packet header flits to "piggyback" congestion state information around the network and uses a simple, low-complexity route calculation unit to calculate optimal packet paths to their destination, with neither the myopia of local decisions nor the aggregation of unrelated status information found in prior designs. In particular, GCA outperforms local adaptive routing by up to 82%, Regional Congestion Awareness (RCA) by up to 51%, and a recent competing adaptive routing algorithm, DAR, by 8% on average on realistic workloads.