Quantum Algorithmic Techniques for Fault-Tolerant Quantum Computers
Quantum computers have the potential to push the limits of computation in areas such as quantum chemistry, cryptography, optimization, and machine learning. Even though many quantum algorithms show asymptotic improvements over classical ones, the overhead of running quantum computers limits the point at which quantum computing becomes useful. Thus, by optimizing components of quantum algorithms, we can bring the regime of quantum advantage closer. My work focuses on developing efficient subroutines for quantum computation, specifically algorithms for scalable, fault-tolerant quantum computers. While it is possible that even noisy quantum computers can outperform classical ones on specific tasks, high circuit depth, and therefore fault tolerance, is likely required for most applications. In this thesis, I introduce three sets of techniques that can be used by themselves or as subroutines in other algorithms.
The first components are coherent versions of classical sort and shuffle. We require that a quantum shuffle prepare a uniform superposition over all permutations of a sequence. The quantum sort is used within the shuffle as well as in the next algorithm in this thesis. The quantum shuffle is an essential part of state preparation for quantum chemistry computation in first quantization.
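For intuition, the classical counterpart of such a shuffle is the Fisher-Yates algorithm, which outputs each of the n! orderings of a sequence with equal probability; the quantum shuffle instead prepares all of them coherently in superposition. A minimal classical sketch (for illustration only, not from the thesis):

```python
import random

def fisher_yates(seq, rng=random):
    """Classical Fisher-Yates shuffle: every one of the n! permutations
    of the input is produced with equal probability 1/n!."""
    a = list(seq)
    for i in range(len(a) - 1, 0, -1):
        j = rng.randrange(i + 1)   # uniform over positions 0..i
        a[i], a[j] = a[j], a[i]    # swap current element into place
    return a
```

The quantum version replaces the random choices with coherent branching, so the output is a superposition over all permutations rather than a single sample.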
Second, I review progress in Hamiltonian simulation and give a new algorithm for simulating time-dependent Hamiltonians. This algorithm scales polylogarithmically in the inverse error, and its query complexity does not depend on the derivatives of the Hamiltonian. Time-dependent Hamiltonian simulation was recently used for interaction-picture simulation with applications to quantum chemistry.
Next, I present a fully quantum Boltzmann machine. I show that this algorithm can train on quantum data and learn a classical description of quantum states. This type of machine learning can be used for tomography, Hamiltonian learning, and approximate quantum cloning.
An Investigation of Orthogonal Wavelet Division Multiplexing Techniques as an Alternative to Orthogonal Frequency Division Multiplex Transmissions and Comparison of Wavelet Families and Their Children
Recently, issues surrounding wireless communications have risen to prominence because of the increasing popularity of wireless applications. Bandwidth limitations and the difficulty of modulating signals across carriers represent significant challenges. Every modulation scheme used to date has had limitations, and the use of the Discrete Fourier Transform in OFDM (Orthogonal Frequency Division Multiplex) is no exception. The restriction on further development of OFDM lies primarily in the transform at the heart of its system, the Fourier transform. OFDM suffers from a high peak-to-average power ratio, sensitivity to carrier frequency offset, and bandwidth wasted on guarding successive OFDM symbols. The discovery of the wavelet transform has opened up a number of potential applications, from image compression to watermarking and encryption. Very recently, work has been done to investigate the potential of using wavelet transforms within the communication space. This research further investigates a recently proposed, innovative modulation technique, Orthogonal Wavelet Division Multiplex, which utilises the wavelet transform and opens a new avenue for an alternative modulation scheme with some interesting potential characteristics. The wavelet transform has many families, and each family has children that differ in filter length. This research comprehensively investigates the new modulation scheme and proposes multi-level dynamic sub-banding as a tool to adapt to variable signal bandwidths. Furthermore, all compactly supported wavelet families and their associated children are investigated, evaluated against each other, and compared with OFDM. The linear, O(N), computational complexity of the wavelet transform is lower than the O(N log N) complexity of the fast Fourier transform used in OFDM.
More important still is the operational complexity, which determines cost effectiveness: the time response of the system, the memory consumption, and the number of iterative operations required for data processing. These complexities are investigated for all available compactly supported wavelet families and their children and compared with OFDM. The evaluation reveals which wavelet families perform more effectively than OFDM and, for each wavelet family, identifies which of its children performs best. Based on these results, it is concluded that the wavelet modulation scheme has some interesting advantages over OFDM, such as lower complexity and bandwidth savings of up to 25% due to the elimination of guard intervals and dynamic bandwidth allocation, which result in better cost effectiveness.
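The linear-complexity claim is easiest to see in the simplest compactly supported family, the Haar wavelet (chosen here purely for illustration): each decomposition level processes a signal half the length of the previous one, so the total work is N + N/2 + ... + 2 = O(N). A minimal sketch:

```python
import math

def haar_dwt(x):
    """Full Haar wavelet decomposition of a length-2^k signal.
    Each level does work proportional to the current length, and the
    length halves every level, so the total is N + N/2 + ... = O(N),
    versus O(N log N) for the FFT used in OFDM."""
    approx = list(x)
    details = []                      # detail coefficients, finest first
    s2 = math.sqrt(2.0)
    while len(approx) > 1:
        half = len(approx) // 2
        avg = [(approx[2*i] + approx[2*i+1]) / s2 for i in range(half)]
        det = [(approx[2*i] - approx[2*i+1]) / s2 for i in range(half)]
        details.append(det)
        approx = avg
    return approx[0], details         # coarsest approximation + details
```

Because the 1/sqrt(2) normalisation makes the transform orthonormal, the signal energy is preserved exactly across the decomposition.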
The Effectiveness of t-Way Test Data Generation
Modern society is increasingly dependent on the correct functioning of software and increasingly so in areas that are considered safety related or safety critical. Therefore, there is an increasing need to be able to verify and validate that the software is in fact correct and will perform its intended function. Many approaches to this problem have been proposed; however, none seems likely to supplant the role of testing in the near future.
If we accept that there is, and will be, a continuing need to test software, then the question becomes how this can be done effectively, both in terms of the ability to detect errors and in terms of cost. One avenue of research that offers prospects of improving both of these aspects is the automatic generation of test data.
There has recently been a large amount of work conducted in this area. One particularly promising direction has been the application of ideas from the field of experimental design and in particular, the field of t-way adequate factorial designs.
The area, however, is not without issues; there is evidence that the technique is capable of detecting errors, but that evidence is not unequivocal. Moreover, as with almost all work in the area of automatic test generation, there has been very little work comparing the technique with other test data generation techniques. Worse, there has been effectively no work comparing any automatic test data generation technique with the effectiveness of tests generated by humans. Another major issue is the number of tests that applying the technique can produce. This implies that an automated oracle is needed if the technique is to be applied successfully. The flaw with this is, of course, that in most situations the oracle is the human conducting the tests, a point often ignored in testing research.
The work presented here addresses both of these points. To do this I have used a code base taken from an industrial engine control system that has an existing set of high-quality unit tests developed by hand. To complement this, several other techniques for automatically generating test data have been applied, namely random testing, random experimental designs, and a technique for generating single-factor experiments. To compare the error detection ability of all of the sets of test vectors, rather than using the usual effectiveness surrogate of code coverage, I have used mutation analysis on the code base to directly measure the ability of each set of test vectors to discover common coding errors. The results presented here show that test data generation techniques based on t-way factorial designs are at least as effective as hand-generated tests and superior to random testing and the single-factor experimental technique.
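For concreteness, a t-way design with t = 2 ("pairwise" testing) requires that every pair of values of every pair of parameters appears in at least one test vector. A naive greedy generator, sketched here for illustration and not the tooling used in this work, might look like:

```python
from itertools import combinations, product

def pairwise_tests(domains):
    """Greedy 2-way ("pairwise") test generation: repeatedly pick the
    candidate test covering the most not-yet-covered value pairs.
    Exhaustive over candidates, so only suitable for small examples;
    real covering-array tools are far more sophisticated."""
    uncovered = set()
    for (i, di), (j, dj) in combinations(enumerate(domains), 2):
        for vi, vj in product(di, dj):
            uncovered.add((i, vi, j, vj))     # pair still to be covered
    candidates = list(product(*domains))      # all possible test vectors
    tests = []
    while uncovered:
        def gain(t):
            return sum(1 for (i, vi, j, vj) in uncovered
                       if t[i] == vi and t[j] == vj)
        best = max(candidates, key=gain)
        if gain(best) == 0:
            break
        tests.append(best)
        uncovered = {(i, vi, j, vj) for (i, vi, j, vj) in uncovered
                     if not (best[i] == vi and best[j] == vj)}
    return tests
```

For three boolean parameters this covers all 12 value pairs with far fewer than the 8 exhaustive tests, which is exactly the economy that makes t-way designs attractive.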
The oracle problem associated with the factorial design techniques was addressed using a test set minimisation approach. The mutation tool monitored which vectors could “kill” which code mutants. After a subset of the test vectors had been run, the most effective vectors were retained and the rest discarded. Likewise, mutants that were killed were removed from further consideration and the process repeated. Experimental results show that this minimisation procedure is effective at reducing computational overhead and is capable of producing final sets of test vectors that are comparable in size to the sets of hand-generated tests and so amenable to final hand checking.
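The minimisation loop described above is essentially greedy set cover over a kill matrix. A minimal sketch, with a hypothetical matrix (test and mutant identifiers are illustrative):

```python
def minimise(kills):
    """Greedy test-set minimisation over a kill matrix.
    kills: dict mapping test id -> set of mutant ids that test kills.
    Repeatedly keep the test that kills the most still-live mutants,
    remove those mutants from consideration, and repeat; tests that
    add no new kills are discarded."""
    live = set().union(*kills.values())   # mutants not yet killed
    remaining = dict(kills)
    kept = []
    while live and remaining:
        best = max(remaining, key=lambda t: len(remaining[t] & live))
        if not remaining[best] & live:
            break                          # no test adds anything new
        kept.append(best)
        live -= remaining[best]
        del remaining[best]
    return kept
```

With kills = {"t1": {1, 2, 3}, "t2": {3, 4}, "t3": {4, 5}, "t4": {1}}, the loop keeps only t1 and t3, discarding the redundant t2 and t4.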
Similarity Search over Network Structure
With the advent of the Internet, graph-structured data are ubiquitous. An essential task in graph-structured data management is similarity search based on graph topology, with a wide spectrum of applications, e.g., web search, outlier detection, co-citation analysis, and collaborative filtering. These graph topology data arrive from multiple sources at an astounding velocity, volume, and veracity. As the scale of network-structured data increases, existing similarity search algorithms on large graphs become impractical due to their expensive costs in computation time and memory space. Moreover, dynamic changes (e.g., noise and abnormality) exist in network data, arising from many factors such as data loss in transfer, data incompleteness, and dirty reads; these dynamic changes have become the main barrier to obtaining accurate results in efficient network analysis. In real Web applications, CoSimRank has been proposed as a robust measure of node-pair similarity based on graph topology. It follows the SimRank-like notion that “two nodes are considered similar if their in-neighbours are similar”, but, unlike SimRank, the similarity of each node with itself is not constantly 1. However, existing work on CoSimRank is restricted to static graphs. Each node-pair CoSimRank score is retrieved from the sum of dot products of two Personalised PageRank vectors. When the graph is updated over time with edge (node) additions and deletions, it is prohibitively expensive to recompute all CoSimRank scores from scratch. RoleSim is a popular graph-structural role similarity measure with many applications (e.g., sociometry); it captures the automorphic equivalence of node-pair similarity, which SimRank and CoSimRank lack, but the accuracy of the RoleSim algorithm can be improved. In this study, (1) we propose fast dynamic schemes, D-CoSim and D-deCoSim, for accurate CoSimRank search over large-scale evolving graphs.
(2) Based on D-CoSim, we also propose fast schemes, F-CoSim and Opt_F-CoSim, which greatly accelerate CoSimRank search over static graphs. Our theoretical analysis shows that D-CoSim, D-deCoSim, F-CoSim, and Opt_F-CoSim guarantee the exactness of CoSimRank scores. Experimental evaluations verify the superiority of D-CoSim and D-deCoSim over evolving graphs, and the fast speedup of F-CoSim and Opt_F-CoSim on large-scale static graphs against their competitors, without any loss of accuracy. (3) We propose a novel role similarity search algorithm, FaRS, and a speedup algorithm, Opt_FaRS, which guarantee the capture of automorphic equivalence and exploit information from neighbours’ classes. The experimental results for FaRS and Opt_FaRS show that our algorithms achieve higher accuracy than baseline algorithms.
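The node-pair formulation mentioned above, a damped sum of dot products of iterated walk vectors, can be sketched as follows; the damping factor, iteration count, and in-neighbour normalisation here are illustrative assumptions, not the thesis's exact scheme:

```python
import numpy as np

def cosimrank(adj, u, v, c=0.8, iters=30):
    """Single-pair CoSimRank as the damped sum of dot products of
    iterated personalised-PageRank-style vectors:
        s(u, v) = sum_k c^k <p_u^(k), p_v^(k)>.
    adj[i][j] = 1 encodes an edge i -> j; one step moves a walker from
    a node to one of its in-neighbours uniformly at random.  Note the
    k = 0 term makes s(u, u) >= 1 rather than constantly 1."""
    A = np.asarray(adj, dtype=float)
    indeg = A.sum(axis=0)
    indeg[indeg == 0] = 1.0        # avoid division by zero at sources
    M = A / indeg                  # column-stochastic step matrix
    n = A.shape[0]
    p, q = np.zeros(n), np.zeros(n)
    p[u], q[v] = 1.0, 1.0          # one-hot starting vectors
    score, damp = 0.0, 1.0
    for _ in range(iters):
        score += damp * float(p @ q)
        p, q = M @ p, M @ q        # step both walks to in-neighbours
        damp *= c
    return score
```

On a toy graph where node 2 points to both 0 and 1, the walks from 0 and 1 meet at their common in-neighbour after one step, giving s(0, 1) = c, while s(0, 0) = 1 + c, illustrating that self-similarity is not fixed at 1.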
LIPIcs, Volume 261, ICALP 2023, Complete Volume
Parameter-free agglomerative hierarchical clustering to model learners' activity in online discussion forums
The analysis of learners' activity in online discussion forums leads to a highly context-dependent modelling problem, which can be posed from both theoretical and empirical approaches. When this problem is tackled from the data mining field, a clustering-based perspective is usually adopted, thus giving rise to a clustering scenario where the real number of clusters is a priori unknown. Hence, this approach reveals an underlying problem, one of the best-known issues of the clustering paradigm: the estimation of the number of clusters, habitually selected by the user according to some kind of subjective criterion that may easily lead to the appearance of undesired biases in the obtained models.
With the aim of avoiding any user intervention in the cluster analysis stage, two new cluster merging criteria are proposed in the present thesis, which make it possible to implement a novel parameter-free agglomerative hierarchical algorithm. A complete set of experiments indicates that the new clustering algorithm is able to provide optimal clustering solutions across a great variety of clustering scenarios, both dealing with different kinds of data and outperforming the clustering algorithms most widely used in practice.
Finally, a two-stage analysis strategy based on the subspace clustering paradigm is proposed to properly tackle the issue of modelling learners' participation in asynchronous discussions. In combination with the new clustering algorithm, the proposed strategy proves able to limit the user's subjective intervention to the interpretation stages of the analysis process and to lead to a complete modelling of the activity performed by learners in online discussion forums.
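The abstract does not specify the two new merging criteria, but the generic agglomerative loop they plug into can be sketched as follows; single-linkage merging and a fixed target cluster count k stand in here for the thesis's parameter-free criteria:

```python
def agglomerative(points, dist, k):
    """Generic agglomerative hierarchical clustering skeleton.
    Starts from singleton clusters and repeatedly merges the closest
    pair under single linkage until k clusters remain.  The thesis
    replaces both the linkage criterion and the user-supplied k with
    parameter-free merge criteria, which this sketch does not model."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: distance between closest members
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]   # merge b into a
        del clusters[b]
    return clusters
```

Deciding when to stop merging without a user-supplied k is precisely the gap the proposed parameter-free criteria are designed to close.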