507 research outputs found

    Towards a GPU SDN controller

    The SDN concept of separating and centralizing the control plane from the data plane has brought more flexibility and programmability to network deployment. On the other hand, the separation of the planes has raised scalability and performance questions, because the SDN controller becomes the bottleneck. In this paper we present an implementation of a GPU SDN controller. The goal is to mitigate the scalability problem of the SDN controller by offloading all packet inspection and creation to the GPU. Experimental evaluation shows that the controller can process 17 million flows/s in the worst-case scenario using only off-the-shelf GPUs.

    A Synergistic Compilation Workflow for Tackling Crosstalk in Quantum Machines

    Near-term quantum systems tend to be noisy. Crosstalk noise has been recognized as one of several major types of noise in superconducting Noisy Intermediate-Scale Quantum (NISQ) devices. Crosstalk arises from the concurrent execution of two-qubit gates, such as CX, on nearby qubits. It can significantly raise the error rate of gates compared to running them individually. Crosstalk can be mitigated through scheduling or hardware tuning. Prior studies, however, manage crosstalk at a very late phase in the compilation process, usually after hardware mapping is done. This may miss opportunities to optimize algorithm logic, routing, and crosstalk at the same time. In this paper, we consider all these factors simultaneously at a very early compilation stage. We propose a crosstalk-aware quantum program compilation framework called CQC that can enhance crosstalk mitigation while achieving satisfactory circuit depth. Moreover, we identify opportunities in the translation from intermediate representation to circuit for application-specific crosstalk mitigation, for instance, the CX ladder construction in variational quantum eigensolvers (VQE). Evaluations through simulation and on real IBM-Q devices show that our framework can reduce the error rate by up to 6×, with only ~60% circuit depth compared to state-of-the-art gate scheduling approaches. In particular, for VQE, we demonstrate a 49% circuit depth reduction with a 9.6% fidelity improvement over prior art on the H4 molecule using IBMQ Guadalupe. Our CQC framework will be released on GitHub.

    A Structured Method for Compilation of QAOA Circuits in Quantum Computing

    The Quantum Approximate Optimization Algorithm (QAOA) is a highly advocated variational algorithm for solving combinatorial optimization problems. One critical feature of the QAOA quantum circuit is that it consists of two-qubit operators that commute. The flexibility in reordering the two-qubit gates allows compiler optimizations to generate circuits with better depth, gate count, and fidelity. However, it also imposes significant challenges due to the additional freedom exposed in the compilation. Prior studies lack the following: (1) performance guarantees, (2) scalability, and (3) awareness of regularity in scalable hardware. We propose a structured method that ensures linear depth for any compiled QAOA circuit on multi-dimensional quantum architectures. We also demonstrate how our method runs on Google Sycamore and IBM non-linear architectures in a scalable manner and in linear time. Overall, we can compile a circuit with up to 1024 qubits in 10 seconds with a 3.8X speedup in depth, a 17% reduction in gate count, and an 18X improvement in circuit ESP.

    Tetris: A compilation Framework for VQE Applications

    Quantum computing has shown promise in solving complex problems by leveraging the principles of superposition and entanglement. The Variational Quantum Eigensolver (VQE) algorithm is a pivotal approach among quantum algorithms, enabling the simulation of quantum systems on quantum hardware. In this paper, we introduce two techniques, "Tetris" and "Fast Bridging," designed to improve the efficiency and effectiveness of VQE tasks. The Tetris technique uncovers cancellation opportunities within the logical circuit phase of the UCCSD ansatz. Tetris achieves a reduction of up to 20% in CNOT gate count, about 119048 CNOT gates, and a 30% depth reduction compared to the state-of-the-art compiler "Paulihedral". In addition to Tetris, we present the Fast Bridging technique as an alternative to conventional qubit routing methods that rely heavily on swap operations, mitigating the limitations associated with swap-heavy routing. By integrating fast bridging into the VQE framework, we observe further reductions in CNOT gate count and circuit depth. The bridging technique can achieve up to 27% CNOT gate reduction in the QAOA application. Through a combination of Tetris and fast bridging, we present a comprehensive strategy for enhancing VQE performance. Our experimental results showcase the effectiveness of Tetris in uncovering cancellation opportunities and demonstrate the symbiotic relationship between Tetris and fast bridging in minimizing gate count and circuit depth. This paper contributes not only to the advancement of VQE techniques but also to the broader field of quantum algorithm optimization.

    New-Sum: A Novel Online ABFT Scheme for General Iterative Methods

    Emerging high-performance computing platforms, with large component counts and lower power margins, are anticipated to be more susceptible to soft errors in both logic circuits and memory subsystems. We present an online algorithm-based fault tolerance (ABFT) approach to efficiently detect and recover from soft errors in general iterative methods. We design a novel checksum-based encoding scheme for matrix-vector multiplication that is resilient to both arithmetic and memory errors. Our design decouples the checksum updating process from the actual computation and allows adaptive control of the checksum overhead. Building on this new encoding mechanism, we propose two online ABFT designs that can effectively recover from errors when combined with a checkpoint/rollback scheme. These designs address scenarios with different error rates. Our ABFT approaches apply to a wide range of iterative solvers that primarily rely on matrix-vector multiplication and vector linear operations. We evaluate our designs through comprehensive analytical and empirical analysis. Experimental evaluation on the Stampede supercomputer demonstrates the low performance overheads incurred by our two ABFT schemes for preconditioned CG (0.4% and 2.2%) and preconditioned BiCGSTAB (1.0% and 4.0%) for the largest SPD matrix from the UFL Sparse Matrix Collection. The evaluation also demonstrates the flexibility and effectiveness of our proposed designs for detecting and recovering from various types of soft errors in general iterative methods.
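
    As a rough illustration of what a checksum on matrix-vector multiplication can look like, the sketch below uses the classical column-checksum identity sum(y) = (1^T A) x for y = A x. This is an assumption chosen for illustration and is not the New-Sum encoding described in the abstract; the function names, the tolerance, and the sample matrix are all hypothetical.

```python
def column_checksum(A):
    """Column sums of A, i.e. the row vector 1^T A (computed once per matrix)."""
    return [sum(col) for col in zip(*A)]


def matvec(A, x):
    """Plain dense matrix-vector product y = A x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def checksum_ok(c, x, y, tol=1e-9):
    """Check sum(y) against c . x; a mismatch signals a possible soft error."""
    expected = sum(ci * xi for ci, xi in zip(c, x))
    actual = sum(y)
    return abs(expected - actual) <= tol * max(1.0, abs(expected))


# Hypothetical small SPD-like matrix and vector, for illustration only.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
x = [1.0, 2.0, 3.0]

c = column_checksum(A)
y = matvec(A, x)
clean = checksum_ok(c, x, y)          # no fault injected

y_bad = list(y)
y_bad[1] += 0.5                       # simulate a soft error in one entry of y
corrupted = checksum_ok(c, x, y_bad)  # the checksum no longer matches
```

    In a scheme like the one the abstract describes, a failed check would trigger recovery, for example a rollback to a checkpoint; this sketch only covers detection for a single product.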

    QASMTrans: A QASM based Quantum Transpiler Framework for NISQ Devices

    The success of a quantum algorithm hinges on the ability to orchestrate a successful application induction. Detrimental overheads in mapping general quantum circuits to physically implementable routines can be the deciding factor between a successful and an erroneous circuit induction. In QASMTrans, we focus on the problem of rapid circuit transpilation. Transpilation plays a crucial role in converting high-level, machine-agnostic circuits into machine-specific circuits constrained by physical topology and supported gate sets. The efficiency of transpilation continues to be a substantial bottleneck, especially when dealing with larger circuits requiring high degrees of inter-qubit interaction. QASMTrans is a high-performance C++ quantum transpiler framework that demonstrates up to 369X speedups compared to the commonly used Qiskit transpiler. We observe speedups on large dense circuits such as uccsd_n24 and qft_n320, which require O(10^6) gates. QASMTrans successfully transpiles these circuits in 69s and 31s, whilst Qiskit exceeded an hour of transpilation time. With QASMTrans providing transpiled circuits in a fraction of the time of prior transpilers, design space exploration and heuristic-based transpiler design become substantially more tractable. QASMTrans is released at http://github.com/pnnl/qasmtrans

    Spag16, an Axonemal Central Apparatus Gene, Encodes a Male Germ Cell Nuclear Speckle Protein that Regulates SPAG16 mRNA Expression

    Spag16 is the murine orthologue of Chlamydomonas reinhardtii PF20, a protein known to be essential to the structure and function of the “9+2” axoneme. In Chlamydomonas, the PF20 gene encodes a single protein present in the central pair of the axoneme. Loss of PF20 prevents central pair assembly/integrity and results in flagellar paralysis. Here we demonstrate that the murine Spag16 gene encodes two proteins: 71 kDa SPAG16L, which is found in all murine cells with motile cilia or flagella, and 35 kDa SPAG16S, representing the C terminus of SPAG16L, which is expressed only in male germ cells, and is predominantly found in specific regions within the nucleus that also contain SC35, a known marker of nuclear speckles enriched in pre-mRNA splicing factors. SPAG16S expression precedes expression of SPAG16L. Mice homozygous for a knockout of SPAG16L alone are infertile, but show no abnormalities in spermatogenesis. Mice chimeric for a mutation deleting the transcripts for both SPAG16L and SPAG16S have a profound defect in spermatogenesis. We show here that transduction of SPAG16S into cultured dispersed mouse male germ cells and BEAS-2B human bronchial epithelial cells increases SPAG16L expression, but has no effect on the expression of several other axoneme components. We also demonstrate that the Spag16L promoter shows increased activity in the presence of SPAG16S. The distinct nuclear localization of SPAG16S and its ability to modulate Spag16L mRNA expression suggest that SPAG16S plays an important role in the gene expression machinery of male germ cells. This is a unique example of a highly conserved axonemal protein gene that encodes two protein products with different functions

    Universal architecture of bacterial chemoreceptor arrays

    Chemoreceptors are key components of the high-performance signal transduction system that controls bacterial chemotaxis. Chemoreceptors are typically localized in a cluster at the cell pole, where interactions among the receptors in the cluster are thought to contribute to the high sensitivity, wide dynamic range, and precise adaptation of the signaling system. Previous structural and genomic studies have produced conflicting models, however, for the arrangement of the chemoreceptors in the clusters. Using whole-cell electron cryo-tomography, here we show that chemoreceptors of different classes and in many different species representing several major bacterial phyla are all arranged into a highly conserved, 12-nm hexagonal array consistent with the proposed “trimer of dimers” organization. The various observed lengths of the receptors confirm current models for the methylation, flexible bundle, signaling, and linker sub-domains in vivo. Our results suggest that the basic mechanism and function of receptor clustering is universal among bacterial species and was thus conserved during evolution