
    QAmplifyNet: Pushing the Boundaries of Supply Chain Backorder Prediction Using Interpretable Hybrid Quantum-Classical Neural Network

    Supply chain management relies on accurate backorder prediction for optimizing inventory control, reducing costs, and enhancing customer satisfaction. However, traditional machine-learning models struggle with large-scale datasets and complex relationships, hindering real-world data collection. This research introduces a novel methodological framework for supply chain backorder prediction, addressing the challenge of handling large datasets. Our proposed model, QAmplifyNet, employs quantum-inspired techniques within a quantum-classical neural network to predict backorders effectively on short and imbalanced datasets. Experimental evaluations on a benchmark dataset demonstrate QAmplifyNet's superiority over classical models, quantum ensembles, quantum neural networks, and deep reinforcement learning. Its proficiency in handling short, imbalanced datasets makes it an ideal solution for supply chain management. To enhance model interpretability, we use Explainable Artificial Intelligence techniques. Practical implications include improved inventory control, reduced backorders, and enhanced operational efficiency. QAmplifyNet integrates seamlessly into real-world supply chain management systems, enabling proactive decision-making and efficient resource allocation. Future work involves exploring additional quantum-inspired techniques, expanding the dataset, and investigating other supply chain applications. This research unlocks the potential of quantum computing in supply chain optimization and paves the way for further exploration of quantum-inspired machine learning models in supply chain management. Our framework and QAmplifyNet model offer a breakthrough approach to supply chain backorder prediction, providing superior performance and opening new avenues for leveraging quantum-inspired techniques in supply chain management.
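
    The abstract does not spell out QAmplifyNet's architecture, so purely as an illustration of the "quantum-classical neural network" idea it refers to, the sketch below trains a generic hybrid binary classifier with PennyLane: classical features are angle-encoded into a small parameterized circuit whose expectation value is squashed into a backorder probability. The encoding, ansatz, sizes, data, and training loop are all assumptions for illustration, not the paper's model.

        # Generic hybrid quantum-classical binary classifier (illustrative only;
        # NOT the QAmplifyNet architecture, whose details are not given above).
        import pennylane as qml
        from pennylane import numpy as np

        n_qubits, n_layers = 4, 2                          # assumed toy sizes
        dev = qml.device("default.qubit", wires=n_qubits)

        @qml.qnode(dev)
        def quantum_score(x, weights):
            # Encode 4 features as rotation angles, apply a trainable entangling
            # ansatz, and read out a single Pauli-Z expectation as the raw score.
            qml.AngleEmbedding(x, wires=range(n_qubits))
            qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
            return qml.expval(qml.PauliZ(0))

        def predict(x, weights):
            return 1.0 / (1.0 + np.exp(-quantum_score(x, weights)))   # sigmoid

        def cost(weights, X, y):
            # Binary cross-entropy over a (tiny) labelled batch.
            preds = np.array([predict(x, weights) for x in X])
            return -np.mean(y * np.log(preds + 1e-9) + (1 - y) * np.log(1 - preds + 1e-9))

        # Placeholder data: 2 samples x 4 features, binary backorder labels.
        X = np.array([[0.1, 0.4, 0.3, 0.9], [0.8, 0.2, 0.7, 0.1]])
        y = np.array([0.0, 1.0])

        shape = qml.StronglyEntanglingLayers.shape(n_layers=n_layers, n_wires=n_qubits)
        weights = np.random.uniform(0, np.pi, size=shape)

        opt = qml.GradientDescentOptimizer(stepsize=0.2)
        for _ in range(20):
            weights, loss = opt.step_and_cost(lambda w: cost(w, X, y), weights)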

    A Tutorial on Clique Problems in Communications and Signal Processing

    Since its first use by Euler on the problem of the seven bridges of Königsberg, graph theory has shown excellent abilities in solving and unveiling the properties of multiple discrete optimization problems. The study of the structure of some integer programs reveals an equivalence with graph theory problems, making a large body of the literature readily available for solving and characterizing the complexity of these problems. This tutorial presents a framework for utilizing a particular graph theory problem, known as the clique problem, for solving communications and signal processing problems. In particular, the paper aims to illustrate the structural properties of integer programs that can be formulated as clique problems through multiple examples in communications and signal processing. To that end, the first part of the tutorial provides various optimal and heuristic solutions for the maximum clique, maximum weight clique, and k-clique problems. The tutorial further illustrates the use of the clique formulation through numerous contemporary examples in communications and signal processing, mainly in maximum access for non-orthogonal multiple access networks, throughput maximization using index and instantly decodable network coding, collision-free radio frequency identification networks, and resource allocation in cloud-radio access networks. Finally, the tutorial sheds light on recent advances in such applications and provides technical insights on ways of dealing with mixed discrete-continuous optimization problems.
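
    As a concrete instance of the integer-program structure the tutorial exploits, the maximum weight clique problem on a graph G = (V, E) with vertex weights w_v admits the standard textbook binary-program formulation below (not quoted from the tutorial itself); setting every w_v = 1 recovers the maximum clique problem, and conversely a binary program whose only constraints are pairwise conflicts of this form can be read directly as a clique problem.

        \max_{x} \; \sum_{v \in V} w_v\, x_v
        \quad \text{s.t.} \quad x_u + x_v \le 1 \;\; \forall\, \{u, v\} \notin E,\ u \neq v,
        \qquad x_v \in \{0, 1\} \;\; \forall\, v \in V.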

    OS Scheduling Algorithms for Memory Intensive Workloads in Multi-socket Multi-core servers

    Major chip manufacturers have all introduced multicore microprocessors. Multi-socket systems built from these processors are routinely used for running various server applications. Depending on the application that is run on the system, remote memory accesses can impact overall performance. This paper presents a new operating system (OS) scheduling optimization to reduce the impact of such remote memory accesses. By observing the pattern of local and remote DRAM accesses for every thread in each scheduling quantum and applying different algorithms, we come up with a new schedule of threads for the next quantum. This new schedule potentially cuts down remote DRAM accesses for the next scheduling quantum and improves overall performance. We present three such new algorithms of varying complexity, followed by a fourth that adapts the Hungarian algorithm. We used three different synthetic workloads to evaluate the algorithms. We also performed sensitivity analysis with respect to varying DRAM latency. We show that these algorithms can cut down DRAM access latency by up to 55%, depending on the algorithm used. The benefit gained from the algorithms depends on their complexity: in general, the higher the complexity, the higher the benefit. The Hungarian algorithm yields an optimal assignment. We find that two out of the four algorithms provide a good trade-off between performance and complexity for the workloads we studied.
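
    To make the Hungarian-algorithm variant concrete, the sketch below assigns threads to cores for one scheduling quantum by minimizing the remote DRAM accesses predicted from the previous quantum's counters; the cost matrix, core counts, and use of scipy's linear_sum_assignment are illustrative assumptions, not the paper's implementation.

        # Hungarian-style thread placement for one scheduling quantum (toy sketch).
        import numpy as np
        from scipy.optimize import linear_sum_assignment

        # remote_accesses[t, s]: DRAM accesses thread t would make to remote memory
        # if it runs on socket s next quantum (assumed to come from per-thread
        # access counters observed during the previous quantum).
        remote_accesses = np.array([
            [120,  10],   # thread 0: most of its pages live on socket 1
            [ 15, 300],   # thread 1: most of its pages live on socket 0
            [ 80,  90],   # thread 2: pages split roughly evenly
            [200,  20],   # thread 3: most of its pages live on socket 1
        ])
        cores_per_socket = 2                      # assumed: 4 cores, 2 per socket

        # Expand sockets into cores so threads map one-to-one onto cores.
        cost = np.repeat(remote_accesses, cores_per_socket, axis=1)

        threads, cores = linear_sum_assignment(cost)   # optimal assignment
        for t, c in zip(threads, cores):
            print(f"thread {t} -> socket {c // cores_per_socket} "
                  f"(expected remote accesses: {cost[t, c]})")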

    Performance Models for Split-execution Computing Systems

    Split-execution computing leverages the capabilities of multiple computational models to solve problems, but splitting program execution across different computational models incurs costs associated with the translation between domains. We analyze the performance of a split-execution computing system built from conventional processors and quantum processing units (QPUs) by using behavioral models that track resource usage. We focus on asymmetric processing models built using conventional CPUs and a family of special-purpose QPUs that employ quantum computing principles. Our performance models account for the translation of a classical optimization problem into the physical representation required by the quantum processor, as well as for hardware limitations and conventional processor speed and memory. We conclude that the bottleneck in this split-execution computing system lies at the quantum-classical interface and that the primary time cost is independent of quantum processor behavior.
    Comment: Presented at the 18th Workshop on Advances in Parallel and Distributed Computational Models [APDCM2016] on 23 May 2016; 10 pages
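
    A toy version of the kind of behavioral model the abstract describes is sketched below: end-to-end time is split into a classical interface term (problem translation, embedding, readout) and a quantum term (anneals), which makes the claimed interface bottleneck easy to see. All constants and term names are illustrative assumptions, not the paper's calibrated model.

        # Toy split-execution timing model (assumed constants, not the paper's model).
        def split_execution_time(n_vars,
                                 t_translate_per_var=5e-3,  # classical problem translation (s/variable)
                                 t_embed=0.5,               # embedding/programming overhead (s)
                                 t_anneal=20e-6,            # single anneal (s)
                                 num_reads=1000,
                                 t_readout=0.3):            # readout + post-processing (s)
            t_interface = n_vars * t_translate_per_var + t_embed + t_readout
            t_quantum = num_reads * t_anneal
            return t_interface, t_quantum

        for n in (100, 500, 2000):
            ti, tq = split_execution_time(n)
            print(f"{n:5d} variables: interface {ti:7.2f} s vs. quantum {tq:6.3f} s")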

    Flight Gate Assignment with a Quantum Annealer

    Optimal flight gate assignment is a highly relevant optimization problem from airport management. Among others, an important goal is the minimization of the total transit time of the passengers. The corresponding objective function is quadratic in the binary decision variables encoding the flight-to-gate assignment. Hence, it is a quadratic assignment problem, which is hard to solve in general. In this work we investigate the solvability of this problem with a D-Wave quantum annealer. These machines are optimizers for quadratic unconstrained binary optimization (QUBO) problems, so the flight gate assignment problem seems well suited to them. We use real-world data from a mid-sized German airport as well as simulation-based data to extract typical instances small enough to be amenable to the D-Wave machine. To mitigate precision problems, we apply bin packing to the passenger numbers, reducing the precision requirements of the extracted instances. We find that, for the instances we investigated, the bin packing has little effect on the solution quality. Hence, we were able to solve small problem instances extracted from real data with the D-Wave 2000Q quantum annealer.
    Comment: Updated figure
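
    To illustrate why the problem maps naturally onto such a machine, the sketch below builds a tiny flight-to-gate QUBO: one binary variable per (flight, gate) pair, a transit-time objective on the diagonal, and a quadratic penalty enforcing one gate per flight. The two-flight data and penalty weight are toy assumptions, not the paper's airport instances or full constraint set.

        # Minimal flight-gate-assignment QUBO (toy data, one-gate-per-flight penalty only).
        from itertools import product

        flights, gates = ["F1", "F2"], ["G1", "G2"]
        # Assumed cost of assigning flight f to gate g (e.g. total passenger transit time).
        transit = {("F1", "G1"): 3.0, ("F1", "G2"): 7.0,
                   ("F2", "G1"): 6.0, ("F2", "G2"): 2.0}
        P = 10.0                                   # penalty weight for the constraint

        var = {fg: i for i, fg in enumerate(product(flights, gates))}  # x_{f,g} -> index
        Q = {}                                     # QUBO as {(i, j): coefficient}

        def add(i, j, c):
            key = (min(i, j), max(i, j))
            Q[key] = Q.get(key, 0.0) + c

        # Objective: sum over f, g of transit(f, g) * x_{f,g}.
        for (f, g), i in var.items():
            add(i, i, transit[(f, g)])

        # Penalty P * (sum_g x_{f,g} - 1)^2 per flight, expanded with x^2 = x:
        # -P on each diagonal term, +2P on each off-diagonal pair, constant dropped.
        for f in flights:
            idx = [var[(f, g)] for g in gates]
            for i in idx:
                add(i, i, -P)
            for a in range(len(idx)):
                for b in range(a + 1, len(idx)):
                    add(idx[a], idx[b], 2 * P)

        print(Q)   # ready for any QUBO solver (e.g. a D-Wave sampler or brute force)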

    Noise-Adaptive Compiler Mappings for Noisy Intermediate-Scale Quantum Computers

    A massive gap exists between current quantum computing (QC) prototypes and the size and scale required for many proposed QC algorithms. Current QC implementations are prone to noise and variability, which affect their reliability, and yet with fewer than 80 quantum bits (qubits) in total, they are too resource-constrained to implement error correction. The term Noisy Intermediate-Scale Quantum (NISQ) refers to these current and near-term systems with 1000 qubits or fewer. Given NISQ's severe resource constraints, low reliability, and high variability in physical characteristics such as coherence time or error rates, it is of pressing importance to map computations onto them in ways that use resources efficiently and maximize the likelihood of successful runs. This paper proposes and evaluates backend compiler approaches to map and optimize high-level QC programs to execute with high reliability on NISQ systems with diverse hardware characteristics. Our techniques all start from an LLVM intermediate representation of the quantum program (such as would be generated from high-level QC languages like Scaffold) and generate QC executables runnable on the IBM Q public QC machine. We then use this framework to implement and evaluate several optimal and heuristic mapping methods. These methods vary in how they account for the availability of dynamic machine calibration data, the relative importance of various noise parameters, the different possible routing strategies, and the relative importance of compile-time scalability versus runtime success. Using real-system measurements, we show that fine-grained spatial and temporal variations in hardware parameters can be exploited to obtain an average 2.9x (and up to 18x) improvement in program success rate over the industry standard IBM Qiskit compiler.
    Comment: To appear in ASPLOS'1
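
    The core idea of calibration-aware mapping can be sketched in a few lines: score each candidate logical-to-physical layout by the success probability estimated from per-pair gate error rates, and keep the best one. The calibration numbers, toy program, and brute-force search below are illustrative assumptions; they are not the paper's compiler passes, which also handle routing and other noise parameters.

        # Calibration-aware qubit mapping sketch (toy; not the paper's compiler).
        from itertools import permutations

        # Assumed calibration data: CNOT error rate per coupled physical-qubit pair.
        cnot_error = {(0, 1): 0.02, (1, 2): 0.05, (2, 3): 0.01, (1, 3): 0.08}

        # Program: CNOTs between logical qubits (a, b).
        program_cnots = [(0, 1), (1, 2), (1, 2)]

        def success_probability(layout):
            """layout[logical] = physical; estimated probability every gate succeeds."""
            p = 1.0
            for a, b in program_cnots:
                pair = tuple(sorted((layout[a], layout[b])))
                if pair not in cnot_error:
                    return 0.0    # uncoupled pair; a real compiler would insert SWAPs
                p *= 1.0 - cnot_error[pair]
            return p

        n_logical, physical = 3, [0, 1, 2, 3]
        best = max(permutations(physical, n_logical), key=success_probability)
        print("best layout:", dict(enumerate(best)),
              "estimated success:", round(success_probability(best), 4))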

    Towards Lattice Quantum Chromodynamics on FPGA devices

    In this paper we describe a single-node, double-precision Field Programmable Gate Array (FPGA) implementation of the Conjugate Gradient algorithm in the context of Lattice Quantum Chromodynamics. As a benchmark of our proposal we numerically invert the Dirac-Wilson operator on a 4-dimensional grid on three Xilinx hardware solutions: the Zynq UltraScale+ evaluation board, the Alveo U250 accelerator, and the largest device available on the market, the VU13P. In our implementation we partition the work between software and hardware such that the entire multiplication by the Dirac operator is performed in hardware, while the rest of the algorithm runs on the host. We find that the FPGA implementation can offer performance comparable to that obtained using current CPUs or Intel's many-core Xeon Phi accelerators. A possible multi-node FPGA-based system is discussed, and we argue that power-efficient High Performance Computing (HPC) systems can be implemented using FPGA devices only.
    Comment: 17 pages, 4 figures
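
    The software/hardware split described above can be pictured with a plain host-side Conjugate Gradient loop in which the operator application is the single offloaded call: below, apply_dirac() is only a NumPy stand-in marking where the FPGA kernel would be invoked, and the toy matrix is symmetric positive definite (lattice codes typically run CG on the normal-equation operator instead). Everything here is an illustrative sketch, not the paper's implementation.

        # Host-side Conjugate Gradient; apply_dirac() stands in for the FPGA kernel.
        import numpy as np

        def apply_dirac(x):
            # Placeholder: the real system dispatches this product to the FPGA.
            return A @ x

        def conjugate_gradient(b, x0, tol=1e-10, max_iter=1000):
            x, r = x0.copy(), b - apply_dirac(x0)
            p, rs = r.copy(), r @ r
            for _ in range(max_iter):
                Ap = apply_dirac(p)            # the one operator call per iteration
                alpha = rs / (p @ Ap)
                x += alpha * p
                r -= alpha * Ap
                rs_new = r @ r
                if np.sqrt(rs_new) < tol:
                    break
                p = r + (rs_new / rs) * p
                rs = rs_new
            return x

        # Toy symmetric positive-definite system standing in for the operator.
        rng = np.random.default_rng(0)
        M = rng.standard_normal((16, 16))
        A = M @ M.T + 16 * np.eye(16)
        b = rng.standard_normal(16)
        x = conjugate_gradient(b, np.zeros(16))
        print("residual norm:", np.linalg.norm(b - A @ x))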