933 research outputs found
Preemptive Software Transactional Memory
In state-of-the-art Software Transactional Memory (STM) systems, threads execute transactions as non-interruptible tasks. Hence, a thread can react to the injection of a higher-priority transactional task, and process it, only at the end of the transaction it is currently executing. In this article we pursue a paradigm shift in which an in-memory transaction executes as a preemptable task, so that a thread can start processing a higher-priority transactional task before finalizing its current transaction. We achieve this goal in an application-transparent manner, relying only on Operating System facilities that we include in our preemptive STM architecture. With our approach, CPU assignment across transactions along the same thread can be re-evaluated every few tens of microseconds. This is mandatory for an effective priority-aware architecture, given the typically finer-grained nature of in-memory transactions compared to their counterparts in database systems. We integrated our preemptive STM architecture with the TinySTM package and released it as open source. We also provide the results of an experimental assessment of our proposal based on a port of the TPC-C benchmark to the STM environment.
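The preemptable-transaction idea can be sketched with a toy cooperative scheduler in Python: transactions expose explicit preemption points, and at every point the runtime re-evaluates which ready task has the highest priority. This is an illustrative single-threaded model only; the names (`transaction`, `run`) are invented for the sketch and do not reflect the TinySTM-based implementation, which preempts via Operating System facilities rather than cooperative yields.

```python
def transaction(name, steps, log):
    """A transaction modeled as a generator; each yield is a preemption point."""
    for i in range(steps):
        log.append((name, i))  # one "slice" of transactional work
        yield                  # preemption point: the scheduler may switch here

def run(initial, arrivals):
    """initial: [(priority, name, steps)]; arrivals: {step: (priority, name, steps)}."""
    log, step = [], 0
    ready = [(prio, transaction(name, steps, log)) for prio, name, steps in initial]
    while ready:
        if step in arrivals:                       # a new transactional task arrives
            prio, name, steps = arrivals.pop(step)
            ready.append((prio, transaction(name, steps, log)))
        ready.sort(key=lambda t: -t[0])            # always run the highest priority
        try:
            next(ready[0][1])                      # execute one slice
        except StopIteration:
            ready.pop(0)                           # transaction finished
            continue
        step += 1
    return log

trace = run([(1, "low", 4)], {2: (9, "high", 2)})
# "high" arrives at step 2 and preempts "low" mid-transaction:
# [('low', 0), ('low', 1), ('high', 0), ('high', 1), ('low', 2), ('low', 3)]
```

Without the preemption points, "high" could only start after "low" completed all four slices, which is exactly the latency the preemptive design removes.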
Lattice QCD Thermodynamics on the Grid
We describe how we have simultaneously used nodes of the EGEE Grid, accumulating ca. 300 CPU-years in 2-3 months, to determine an important property of Quantum Chromodynamics. We explain how Grid resources were exploited efficiently and with ease, using a user-level overlay based on the Ganga and DIANE tools on top of the standard Grid software stack. Application-specific scheduling and resource selection based on simple but powerful heuristics allowed us to improve the efficiency of the processing and obtain the desired scientific results by a specified deadline. This is also a demonstration of the combined use of supercomputers, to calculate the initial state of the QCD system, and Grids, to perform the subsequent massively distributed simulations. The QCD simulation was performed on a lattice. Keeping the strange quark mass at its physical value, we reduced the masses of the up and down quarks until, under an increase of temperature, the system underwent a second-order phase transition to a quark-gluon plasma. We then measured the response of this system to an increase in the quark density. We find that the transition is smoothened rather than sharpened. If confirmed on a finer lattice, this finding makes it unlikely that ongoing experimental searches will find a QCD critical point at small chemical potential.
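The kind of simple scheduling heuristic alluded to above can be illustrated with a toy late-binding "pull" model, in which tasks are handed out as workers become free, so faster Grid nodes automatically process more work. The function below is a hypothetical sketch; the worker rates and task sizes are made-up inputs, not measurements from EGEE, and the real Ganga/DIANE overlay involves far more machinery.

```python
import heapq

def pull_schedule(task_sizes, worker_rates):
    """Return the makespan when each freed worker pulls the next task.

    task_sizes: amount of work per task; worker_rates: work per unit time.
    Late binding: no task is assigned until a worker is actually free.
    """
    free_at = [(0.0, w) for w in range(len(worker_rates))]
    heapq.heapify(free_at)                    # workers keyed by when they free up
    makespan = 0.0
    for size in task_sizes:
        t, w = heapq.heappop(free_at)         # the worker that frees up first
        done = t + size / worker_rates[w]
        makespan = max(makespan, done)
        heapq.heappush(free_at, (done, w))
    return makespan
```

With equal workers, four unit tasks on two workers finish at time 2.0; if one worker is twice as fast, the pull model ends at 1.5 because the fast worker naturally takes three of the four tasks.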
Optimization of a Quantum Cascade Laser Operating in the Terahertz Frequency Range Using a Multiobjective Evolutionary Algorithm
A quantum cascade (QC) laser is a specific type of semiconductor laser that operates through principles of quantum mechanics. In less than a decade, QC lasers have become able to outperform previously designed double-heterostructure semiconductor lasers. Because there is a genuine lack of compact, coherent devices that can operate in the far-infrared region, the motivation exists for designing a terahertz QC laser. A device operating at this frequency is expected to be more efficient and cost-effective than currently existing devices. It has potential applications in the fields of spectroscopy, astronomy, medicine and free-space communication, as well as applications to near-space radar and chemical/biological detection. The overarching goal of this research was to find QC laser parameter combinations that can be used to fabricate viable structures. To ensure operation in the THz region, the device must conform to an extremely small energy-level spacing range of ~10-15 meV. The time and expense of the design and production process are prohibitive, so an alternative to fabrication was necessary. To accomplish this goal, a model of a QC laser, developed at Worcester Polytechnic Institute with sponsorship from the Air Force Research Laboratory Sensors Directorate, and the General Multiobjective Parallel Genetic Algorithm (GenMOP), developed at the Air Force Institute of Technology, were integrated to form a computer simulation that stochastically searches for feasible solutions.
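A minimal single-objective genetic algorithm conveys the search idea. The physics here is a toy particle-in-a-box stand-in for the real band-structure model (the constant in `spacing_meV` is purely illustrative, not a calibrated value), and GenMOP itself is multiobjective and parallel; this sketch only shows evolutionary search for a design parameter whose energy spacing lands in the ~10-15 meV THz window.

```python
import random

random.seed(0)

def spacing_meV(width_nm):
    # Toy stand-in: the E2 - E1 spacing of an infinite well scales as 1/L^2.
    # The constant is illustrative only, not a calibrated material value.
    return 1128.0 / width_nm ** 2

def fitness(width_nm):
    # Reward spacings near the middle of the ~10-15 meV THz design window.
    return -abs(spacing_meV(width_nm) - 12.5)

def evolve(pop_size=30, gens=60, lo=5.0, hi=20.0):
    """Tournament selection + Gaussian mutation over one design parameter."""
    pop = [random.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(gens):
        nxt = []
        for _ in range(pop_size):
            a, b = random.sample(pop, 2)            # binary tournament
            parent = a if fitness(a) > fitness(b) else b
            child = min(hi, max(lo, parent + random.gauss(0.0, 0.3)))
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

best = evolve()  # well width (nm) whose toy spacing lands in the THz window
```

The same loop generalizes to multiple objectives (gain, threshold current, spacing) by replacing the scalar fitness with Pareto ranking, which is the step GenMOP handles.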
A quantum active learning algorithm for sampling against adversarial attacks
Adversarial attacks represent a serious menace for learning algorithms and may compromise the security of future autonomous systems. A theorem by Khoury and Hadfield-Menell (KH) provides sufficient conditions to guarantee the robustness of machine learning algorithms, but comes with a caveat: it is crucial to know the smallest distance between the classes of the corresponding classification problem. We propose a theoretical framework that allows us to think of active learning as sampling the most promising new points to be classified, so that the minimum distance between classes can be found and the KH theorem applied. Additionally, we introduce a quantum active learning algorithm that makes use of this framework and whose complexity is polylogarithmic in the dimension of the space and in the size of the initial training data, provided qRAMs are used, and polynomial in the precision, achieving an exponential speedup over the equivalent classical algorithm in both quantities. This algorithm may nevertheless be `dequantized', reducing the advantage to polynomial. Comment: Contains an additional dequantization appendix E that does not appear in the published version.
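A purely classical sketch of the sampling framework: always query the unlabelled point closest to the other class's labelled set, since such points tighten the minimum-distance estimate the KH theorem needs. The data and query rule below are illustrative inventions; the quantum algorithm and its qRAM-based speedup are not modeled here.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two synthetic point clouds stand in for the classes; the quantity of
# interest is the smallest distance between them (the KH condition input).
class_a = rng.normal(-2.0, 0.3, size=(200, 2))
class_b = rng.normal(+2.0, 0.3, size=(200, 2))

def min_gap(A, B):
    """Smallest pairwise Euclidean distance between two labelled sets."""
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1).min()

# Start from a small labelled seed of each class.
seen_a = np.zeros(len(class_a), bool); seen_a[:10] = True
seen_b = np.zeros(len(class_b), bool); seen_b[:10] = True
init_gap = min_gap(class_a[seen_a], class_b[seen_b])

# Active loop: label the unlabelled point closest to the *other* class's
# labelled set -- the "most promising" query for the min-distance estimate.
for _ in range(30):
    da = np.linalg.norm(class_a[~seen_a][:, None] - class_b[seen_b][None],
                        axis=-1).min(axis=1)
    db = np.linalg.norm(class_b[~seen_b][:, None] - class_a[seen_a][None],
                        axis=-1).min(axis=1)
    if da.min() <= db.min():
        seen_a[np.flatnonzero(~seen_a)[da.argmin()]] = True
    else:
        seen_b[np.flatnonzero(~seen_b)[db.argmin()]] = True

active_est = min_gap(class_a[seen_a], class_b[seen_b])
true_gap = min_gap(class_a, class_b)  # oracle value, for comparison only
```

Because labelled sets only grow, the estimate can only shrink toward the true gap, and it always upper-bounds it; the quantum version's contribution is performing this search with polylogarithmic dependence on dimension and training-set size.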
Neural network encoded variational quantum algorithms
We introduce a general framework called neural network (NN) encoded
variational quantum algorithms (VQAs), or NN-VQA for short, to address the
challenges of implementing VQAs on noisy intermediate-scale quantum (NISQ)
computers. Specifically, NN-VQA feeds input (such as parameters of a
Hamiltonian) from a given problem to a neural network and uses its outputs to
parameterize an ansatz circuit for the standard VQA. Combining the strengths of
NN and parameterized quantum circuits, NN-VQA can dramatically accelerate the
training process of VQAs and handle a broad family of related problems with
varying input parameters using the pre-trained NN. To concretely illustrate the
merits of NN-VQA, we present results on NN-variational quantum eigensolver
(VQE) for solving the ground state of parameterized XXZ spin models. Our
results demonstrate that NN-VQE is able to estimate the ground-state energies
of parameterized Hamiltonians with high precision without fine-tuning, and
significantly reduce the overall training cost to estimate ground-state
properties across the phases of the XXZ Hamiltonian. We also employ an
active-learning strategy to further increase the training efficiency while
maintaining prediction accuracy. These encouraging results demonstrate that
NN-VQAs offer a new hybrid quantum-classical paradigm to utilize NISQ resources
for solving more realistic and challenging computational problems. Comment: 4.4 pages, 5 figures, with supplemental material
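The NN-VQA pipeline (problem parameter → neural network → ansatz angles → circuit → energy) can be sketched exactly for a 2-qubit XXZ model. The network weights below are random and untrained, and the hardware-efficient ansatz is invented for illustration, not the circuit from the paper; in NN-VQA the network would be trained so the output energies approach the ground state across the parameter range. By the variational principle, the computed energy always upper-bounds the true ground-state energy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pauli matrices and a 2-qubit XXZ Hamiltonian H(d) = XX + YY + d*ZZ.
X = np.array([[0, 1], [1, 0]], complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0, -1.0]).astype(complex)

def H(delta):
    return np.kron(X, X) + np.kron(Y, Y) + delta * np.kron(Z, Z)

def ry(t):
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -s], [s, c]], complex)

CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]], complex)

def ansatz_state(theta):
    """(RY x RY) CNOT (RY x RY) |00>: a toy hardware-efficient ansatz."""
    psi = np.zeros(4, complex); psi[0] = 1.0
    psi = np.kron(ry(theta[0]), ry(theta[1])) @ psi
    psi = CNOT @ psi
    psi = np.kron(ry(theta[2]), ry(theta[3])) @ psi
    return psi

# Tiny NN: Hamiltonian parameter delta -> the 4 ansatz angles (untrained).
W1, b1 = rng.normal(size=(8, 1)), rng.normal(size=8)
W2, b2 = rng.normal(size=(4, 8)), rng.normal(size=4)

def nn_angles(delta):
    h = np.tanh(W1 @ np.array([delta]) + b1)
    return W2 @ h + b2

def nn_vqe_energy(delta):
    """Energy of the NN-parameterized ansatz state for H(delta)."""
    psi = ansatz_state(nn_angles(delta))
    return float(np.real(psi.conj() @ H(delta) @ psi))
```

Training would minimize `nn_vqe_energy` over the network weights for many sampled values of `delta` at once, which is what lets a single pre-trained network cover a whole family of Hamiltonians.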
Optimizing the Performance of Parallel and Concurrent Applications Based on Asynchronous Many-Task Runtimes
Nowadays, high-performance computing (HPC) scientific applications often face performance challenges when running on heterogeneous supercomputers, along with scalability, portability, and efficiency issues. For years, supercomputer architectures have been rapidly changing and becoming more complex, and this challenge will become even more complicated as we enter the exascale era, when computers will exceed one quintillion calculations per second. Software adaptation and optimization are needed to address these challenges. Asynchronous many-task (AMT) systems show promise against the exascale challenge, as they combine the advantages of multi-core architectures with lightweight threads, asynchronous execution, smart scheduling, and portability across diverse architectures.
In this research, we optimize the performance of a highly scalable scientific application using HPX, an AMT runtime system, and address its performance bottlenecks on supercomputers. We use DCA++ (Dynamical Cluster Approximation) as a research vehicle for studying performance bottlenecks in parallel and concurrent applications. DCA++ is a high-performance research software application that provides a modern C++ implementation for solving quantum many-body problems with a Quantum Monte Carlo (QMC) kernel. QMC solver applications are widely used and mission-critical across the US Department of Energy's (DOE's) application landscape.
Throughout the research, we implement several optimization techniques. First, we add an HPX threading backend to DCA++ and achieve significant performance speedup. Second, we address a memory-bound challenge in DCA++ by developing ring-based communication algorithms using GPU RDMA technology, which allow much larger scientific simulation cases. Third, we explore a methodology for using LLVM-based tools to tune DCA++ for the new ARM A64FX processor. We profile all implementations in depth and observe significant performance improvements throughout.
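The ring-based communication idea can be sketched without MPI or RDMA: at each step every rank forwards exactly one block to its right neighbour, so no rank ever buffers the whole data set at once, which is the memory win behind the scheme. This single-process simulation is illustrative only and does not reflect the actual GPU RDMA implementation in DCA++.

```python
def ring_allgather(blocks):
    """Simulate a ring all-gather over len(blocks) ranks.

    At step s, rank r forwards block (r - s) mod p to rank (r + 1) mod p;
    after p - 1 steps every rank holds all p blocks, while per-step
    buffering stays O(one block) instead of O(total data).
    """
    p = len(blocks)
    have = [{r: blocks[r]} for r in range(p)]   # blocks held by each rank
    for s in range(p - 1):
        # All sends of a step happen "in parallel": collect, then deliver.
        moves = []
        for r in range(p):
            blk = (r - s) % p                   # the block rank r forwards now
            moves.append(((r + 1) % p, blk, have[r][blk]))
        for dst, blk, data in moves:
            have[dst][blk] = data
    return [[have[r][k] for k in range(p)] for r in range(p)]
```

Each rank sends and receives exactly p - 1 blocks over nearest-neighbour links, which is why the pattern maps well onto RDMA: transfers are point-to-point, fixed-size, and overlappable with compute.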