101 research outputs found
A new polyhedral approach to combinatorial designs
We consider combinatorial t-design problems as discrete optimization problems. Our motivation is that only a few studies have been done on the use of exact optimization techniques in designs, and that classical methods in design theory have still left many open existence questions. Roughly defined, t-designs are pairs of discrete sets that are related following some strict properties of size, balance, and replication. These highly structured relationships provide optimal solutions to a variety of problems in computer science like error-correcting codes, secure communications, network interconnection, design of hardware; and are applicable to other areas like statistics, scheduling, games, among others. We give a new approach to combinatorial t-designs that is useful in constructing t-designs by polyhedral methods. The first contribution of our work is a new result of equivalence of t-design problems with a graph theory problem. This equivalence leads to a novel integer programming formulation for t-designs, which we call GDP. We analyze the polyhedral properties of GDP and conclude, among other results, the associated polyhedron dimension. We generate new classes of valid inequalities to aim at approximating this integer program by a linear program that has the same optimal solution. Some new classes of valid inequalities are generated as Chv´atal-Gomory cuts, other classes are generated by graph complements and combinatorial arguments, and others are generated by the use of incidence substructures in a t-design. In particular, we found a class of valid inequalities that we call stable-set class that represents an alternative graph equivalence for the problem of finding a t-design. We analyze and give results on the strength of these new classes of valid inequalities. We propose a separation problem and give its integer programming formulation as a maximum (or minimum) edge-weight biclique subgraph problem. We implement a pure cutting-plane algorithm using one of the stronger classes of valid inequalities derived. Several instances of t-designs were solved efficiently by this algorithm at the root node of the search tree. Also, we implement a branch-and-cut algorithm and solve several instances of 2-designs trying different base formulations. Computational results are included
Efficient Quantum Circuit Simulation by Tensor Network Methods on Modern GPUs
Efficient simulation of quantum circuits has become indispensable with the
rapid development of quantum hardware. The primary simulation methods are based
on state vectors and tensor networks. As the number of qubits and quantum gates
grows larger in current quantum devices, traditional state-vector based quantum
circuit simulation methods prove inadequate due to the overwhelming size of the
Hilbert space and extensive entanglement. Consequently, brutal force tensor
network simulation algorithms become the only viable solution in such
scenarios. The two main challenges faced in tensor network simulation
algorithms are optimal contraction path finding and efficient execution on
modern computing devices, with the latter determines the actual efficiency. In
this study, we investigate the optimization of such tensor network simulations
on modern GPUs and propose general optimization strategies from two aspects:
computational efficiency and accuracy. Firstly, we propose to transform
critical Einstein summation operations into GEMM operations, leveraging the
specific features of tensor network simulations to amplify the efficiency of
GPUs. Secondly, by analyzing the data characteristics of quantum circuits, we
employ extended precision to ensure the accuracy of simulation results and
mixed precision to fully exploit the potential of GPUs, resulting in faster and
more precise simulations. Our numerical experiments demonstrate that our
approach can achieve a 3.96x reduction in verification time for random quantum
circuit samples in the 18-cycle case of Sycamore, with sustained performance
exceeding 21 TFLOPS on one A100. This method can be easily extended to the
20-cycle case, maintaining the same performance, accelerating by 12.5x compared
to the state-of-the-art CPU-based results and 4.48-6.78x compared to the
state-of-the-art GPU-based results reported in the literature.Comment: 25 pages, 10 figure
A resource allocation mechanism based on cost function synthesis in complex systems
While the management of resources in computer systems can greatly impact the usefulness and integrity of the system, finding an optimal solution to the management problem is unfortunately NP hard. Adding to the complexity, today\u27s \u27modern\u27 systems - such as in multimedia, medical, and military systems - may be, and often are, comprised of interacting real and non-real-time components. In addition, these systems can be driven by a host of non-functional objectives – often differing not only in nature, importance, and form, but also in dimensional units and range, and themselves interacting in complex ways. We refer to systems exhibiting such characteristics as Complex Systems (CS).
We present a method for handling the multiple non-functional system objectives in CS, by addressing decomposition, quantification, and evaluation issues. Our method will result in better allocations, improve objective satisfaction, improve the overall performance of the system, and reduce cost -in a global sense. Moreover, we consider the problem of formulating the cost of an allocation driven by system objectives. We start by discussing issues and relationships among global objectives, their decomposition, and cost functions for evaluation of system objective. Then, as an example of objective and cost function development, we introduce the concept of deadline balancing. Next, we proceed by proving the existence of combining models and their underlying conditions. Then, we describe a hierarchical model for system objective function synthesis. This synthesis is performed solely for the purpose of measuring the level of objective satisfaction in a proposed hardware to software allocation, not for design of individual software modules. Then, Examples are given to show how the model applies to actual multi-objective problems.
In addition the concept of deadline balancing is extended to a new scheduling concept, namely Inter-Completion-Time Scheduling (ICTS. Finally, experiments based on simulation have been conducted to capture various properties of the synthesis approach as well as ICTS. A prototype implementation of the cost functions synthesis and evaluation environment is described, highlighting the applicability and usefulness of the synthesis in realistic applications
GCA: Global Congestion Awareness for Load Balance in Networks-on-Chip
As modern CMPs scale to ever increasing core counts, Networks-on-Chip (NoCs) are emerging as an interconnection fabric, enabling communication between components. While NoCs are easy to implement and provide high and scalable bandwidth, current routing algorithms, such as dimension-ordered routing, suffer from poor load balance, leading to reduced throughput and high latencies. Improving load balance, hence, is critical in future CMP designs where increased latency leads to wasted power and energy waiting for outstanding requests to resolve. Adaptive routing is a known technique to improve load balance; however, prior adaptive routing techniques either use local, myopic information or misinformed, regionally-aggregated information to form their routing decisions. This thesis proposes a new, light-weight, adaptive routing algorithm for on-chip routers based on global link state and congestion information, Global Congestion Awareness (GCA). GCA leverages unused bits in existing packet header flits to "piggyback" congestion state information around the network and uses a simple, low-complexity route calculation unit, to calculate optimal packet paths to their destination without the myopia of local decisions, nor the aggregation of unrelated status information, found in prior designs. In particular GCA outperforms local adaptive routing by up to 82%, Regional Congestion Awareness (RCA) by up to 51%, and a recent competing adaptive routing algorithm, DAR, by 8% on average on realistic workloads
- …