23 research outputs found

    The Application of Spectral Clustering in Drug Discovery

    The application of clustering algorithms to chemical datasets is well established and has been reviewed extensively. Recently, a number of ‘modern’ clustering algorithms have been reported in other fields. One example is spectral clustering, which has yielded promising results in areas such as protein library analysis. The term spectral clustering describes any clustering algorithm that uses the eigenpairs of a matrix as the basis for partitioning a dataset. This thesis describes the development and optimisation of a non-overlapping spectral clustering method based upon a study by Brewer. The initial version of the algorithm was closely related to Brewer’s method and used a full matrix diagonalisation procedure to identify the eigenpairs of an input matrix. This method was compared to the k-means and Ward’s algorithms with encouraging results: for example, when coupled with extended connectivity fingerprints, it outperformed the other clustering algorithms according to the QCI measure. Although the spectral clustering algorithm showed promising results, its computational costs restricted its application to small datasets, so the method was optimised in successive studies. First, the effect of matrix sparsity was examined, showing that spectral clustering with sparse input matrices can improve the results. Despite this improvement, the costs of spectral clustering remained prohibitive, so the full matrix diagonalisation procedure was replaced with the Lanczos algorithm, which has lower associated costs, as suggested by Brewer. This led to a significant decrease in computational cost when identifying a small number of clusters; however, a number of issues remained, leading to the adoption of an SVD-based eigendecomposition method. The SVD-based algorithm was shown to be highly efficient, accurate and scalable across a number of studies.
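    A minimal sketch of the eigenpair-based partitioning described above may be useful. The helper names, the Tanimoto similarity, the normalised-Laplacian formulation and the toy data are illustrative assumptions rather than the thesis's exact pipeline; the full diagonalisation step is precisely the cost bottleneck the abstract describes.

    ```python
    import numpy as np
    from scipy.linalg import eigh
    from scipy.cluster.vq import kmeans2

    def tanimoto(a, b):
        # Tanimoto similarity between two binary fingerprint vectors
        ab = float(np.dot(a, b))
        return ab / (a.sum() + b.sum() - ab)

    def spectral_cluster(fps, k):
        """Partition fingerprints into k clusters using the eigenvectors
        of a symmetrically normalised similarity matrix."""
        n = len(fps)
        S = np.array([[tanimoto(fps[i], fps[j]) for j in range(n)]
                      for i in range(n)])
        d_inv = 1.0 / np.sqrt(S.sum(axis=1))
        L = np.eye(n) - d_inv[:, None] * S * d_inv[None, :]  # normalised Laplacian
        _, vecs = eigh(L)          # full diagonalisation: O(n^3), small datasets only
        U = vecs[:, :k]            # eigenvectors of the k smallest eigenvalues
        _, labels = kmeans2(U, k, minit='++', seed=0)
        return labels

    # toy usage: 8 random 64-bit fingerprints, 2 clusters
    rng = np.random.default_rng(0)
    fps = rng.integers(0, 2, size=(8, 64)).astype(float)
    print(spectral_cluster(fps, 2))
    ```

    Replacing the `eigh` call with a Lanczos or truncated-SVD routine (e.g. `scipy.sparse.linalg.eigsh`/`svds`) is the kind of optimisation path the abstract outlines.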

    Basic transformations in linear algebra for vector computing


    Polynomial matrix eigenvalue decomposition techniques for multichannel signal processing

    Polynomial eigenvalue decomposition (PEVD) is an extension of the eigenvalue decomposition (EVD) to para-Hermitian polynomial matrices, and it has been shown to be a powerful tool for broadband extensions of narrowband signal processing problems. In the context of broadband sensor arrays, the PEVD allows the para-Hermitian matrix that results from the calculation of a space-time covariance matrix of the convolutively mixed signals to be diagonalised. Once the matrix is diagonalised, not only can the correlation between different sensor signals be removed, but the signal and noise subspaces can also be identified. This process is referred to as broadband subspace decomposition, and it plays a very important role in many areas that require signal separation techniques for multichannel convolutive mixtures, such as speech recognition, radar clutter suppression and underwater acoustics. The multiple shift second order sequential best rotation (MS-SBR2) algorithm, built on the well-established SBR2 algorithm, is proposed to compute the PEVD of para-Hermitian matrices. By annihilating multiple off-diagonal elements per iteration, the MS-SBR2 algorithm shows a potential advantage over its predecessor (SBR2) in terms of computational speed. Furthermore, the MS-SBR2 algorithm permits the order growth of the polynomial matrices to be minimised by shifting rows (or columns) in the same direction across iterations, which can potentially reduce the computational load of the algorithm. The effectiveness of the proposed MS-SBR2 algorithm is demonstrated on various para-Hermitian matrix examples, including randomly generated matrices of different sizes and matrices generated from source models with different dynamic ranges and relations between the sources’ power spectral densities. A worked example is presented to demonstrate how the MS-SBR2 algorithm can be used to strongly decorrelate a set of convolutively mixed signals. Furthermore, the performance metrics and computational complexity of MS-SBR2 are analysed and compared to other existing PEVD algorithms by means of numerical examples. Finally, two potential applications of the MS-SBR2 algorithm, including multichannel spectral factorisation and the decoupling of broadband multiple-input multiple-output (MIMO) systems, are demonstrated in this dissertation.
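    As a point of reference for the terminology above, the sketch below (the function name and the toy FIR mixing are illustrative assumptions, not taken from the dissertation) estimates the space-time covariance matrix whose z-transform is the para-Hermitian polynomial matrix that a PEVD algorithm such as MS-SBR2 diagonalises.

    ```python
    import numpy as np

    def space_time_covariance(X, max_lag):
        """Estimate R[tau] = E[x(t) x(t-tau)^H] for an M-channel signal X (M x T).
        The collection {R[tau]} is the para-Hermitian quantity a PEVD diagonalises."""
        M, T = X.shape
        R = {}
        for tau in range(max_lag + 1):
            R[tau] = X[:, tau:] @ X[:, :T - tau].conj().T / (T - tau)
        for tau in range(1, max_lag + 1):
            R[-tau] = R[tau].conj().T   # para-Hermitian symmetry: R[-tau] = R[tau]^H
        return R

    # toy usage: two convolutively mixed channels from one source
    rng = np.random.default_rng(0)
    s = rng.standard_normal(10_000)
    X = np.stack([s, np.convolve(s, [0.5, 0.0, 0.8], mode='same')])
    R = space_time_covariance(X, max_lag=4)
    print(R[2].round(3))   # non-zero off-diagonal entries: correlation across lags
    ```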

    Adaptive heterogeneous parallelism for semi-empirical lattice dynamics in computational materials science.

    With the variability in performance of the multitude of parallel environments available today, the conceptual overhead created by the need to anticipate runtime information to make design-time decisions has become overwhelming. Performance-critical applications and libraries carry implicit assumptions based on incidental metrics that are not portable to emerging computational platforms or even to alternative contemporary architectures. Furthermore, the significance of runtime concerns such as makespan, energy efficiency and fault tolerance depends on the situational context. This thesis presents a case study in the application of both Mattson's prescriptive pattern-oriented approach and the more principled structured parallelism formalism to the computational simulation of inelastic neutron scattering spectra on hybrid CPU/GPU platforms. The original ad hoc implementation, as well as new pattern-based and structured implementations, are evaluated for relative performance and scalability. Two new structural abstractions are introduced to facilitate adaptation by lazy optimisation and runtime feedback. A deferred-choice abstraction represents a unified space of alternative structural program variants, allowing static adaptation through model-specific exhaustive calibration with regard to the extrafunctional concerns of runtime, average instantaneous power and total energy usage. Instrumented queues serve as a mechanism for structural composition and provide a representation of extrafunctional state that allows the realisation of a market-based decentralised coordination heuristic for competitive resource allocation and of the Lyapunov drift algorithm for cooperative scheduling.
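    The deferred-choice idea can be illustrated with a small sketch. The class below is a hypothetical reading of the abstraction, not the thesis's API: it holds alternative structural variants of one computation and binds to a variant only after calibrating against a measured extrafunctional metric (here runtime; power or energy meters would slot in the same way).

    ```python
    import time
    import numpy as np

    class DeferredChoice:
        """Hypothetical sketch of a deferred-choice abstraction: a unified space
        of alternative program variants, bound to one variant only after
        exhaustive calibration against a measured extrafunctional metric."""

        def __init__(self, variants):
            self.variants = list(variants)   # callables with identical signatures
            self.choice = None

        def calibrate(self, sample_input):
            """Run every variant once and keep the cheapest by runtime."""
            best, best_cost = None, float('inf')
            for f in self.variants:
                t0 = time.perf_counter()
                f(sample_input)
                cost = time.perf_counter() - t0   # swap in power/energy here
                if cost < best_cost:
                    best, best_cost = f, cost
            self.choice = best

        def __call__(self, x):
            if self.choice is None:   # lazy optimisation: calibrate on first use
                self.calibrate(x)
            return self.choice(x)

    # usage: choose between two structural variants of the same reduction
    dc = DeferredChoice([lambda a: sum(a.tolist()), lambda a: float(np.sum(a))])
    print(dc(np.arange(100_000)))
    ```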

    Context adaptivity for selected computational kernels with applications in optoelectronics and in phylogenetics

    Computational kernels are the crucial part of computationally intensive software, where most of the computing time is spent; hence, their design and implementation have to be accomplished carefully. Two scientific application problems from optoelectronics and from phylogenetics, together with the corresponding computational kernels, motivate this thesis. In the first application problem, components for the computational solution of complex symmetric eigenvalue problems (EVPs) are discussed, arising in the simulation of waveguides in optoelectronics. LAPACK and ScaLAPACK contain highly effective reference implementations for certain numerical problems in linear algebra. With respect to EVPs, only real symmetric and complex Hermitian codes are available; therefore, efficient codes for complex symmetric (non-Hermitian) EVPs are highly desirable. In the second application problem, a parallel scientific workflow for computing phylogenies is designed, implemented, and evaluated. The reconstruction of phylogenetic trees is an NP-hard problem that demands huge-scale computing capabilities, and therefore a parallel approach is necessary. One idea underlying this thesis is to investigate the interaction between the context of the kernels considered and their efficiency. The context of a computational kernel comprises model aspects (for instance, the structure of the input data), software aspects (for instance, computational libraries), hardware aspects (for instance, available RAM and supported precision), and certain requirements or constraints. Constraints may exist with respect to runtime, memory usage, required accuracy, etc. The concept of context adaptivity is demonstrated for selected problems in computational science. The method proposed here is a meta-algorithm that utilizes aspects of the context to achieve optimal performance with respect to the applied metric. It is important to consider the context, because requirements may be traded off against each other, resulting in higher performance. For instance, where a low accuracy suffices, a faster algorithmic approach may be favored over an established but slower method. With respect to EVPs, prototypical codes especially targeted at complex symmetric EVPs aim at trading accuracy for speed. The innovation is evidenced by the implementation of new algorithmic approaches that exploit the algebraic structure. Concerning the computation of phylogenetic trees, the mapping of a scientific workflow onto a campus grid system is demonstrated. The adaptive implementation of the workflow features concurrent instances of a computational kernel on a distributed system. Here, adaptivity refers to the ability of the workflow to vary the computational load in terms of available computing resources, available time, and the quality of the reconstructed phylogenetic trees.
    Context adaptivity is discussed by means of computational problems from optoelectronics and from phylogenetics. For the field of optoelectronics, a family of implemented algorithms aims at solving generalized complex symmetric EVPs. Our alternative approach exploits the structural symmetry and trades accuracy for speed; hence, it is faster but (usually) less accurate than the conventional approach. In addition to a complete sequential solver, a parallel variant is discussed and partly evaluated on a cluster utilizing up to 1024 CPU cores. The achieved runtimes evidence the superiority of our approach; however, further investigations into improving accuracy are suggested. For the field of phylogenetics, we show that phylogenetic tree reconstruction can be efficiently parallelized on a Condor-based campus grid infrastructure. The parallel scientific workflow features a low parallel overhead, resulting in excellent efficiency.
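    A brief sketch may clarify why the complex symmetric case needs dedicated codes. Complex symmetric means A equals its transpose rather than its conjugate transpose, so the fast Hermitian solvers silently give wrong answers and one must fall back on the general non-symmetric solver; the structure-exploiting codes of this thesis sit between these extremes. The example below uses standard SciPy routines only, not the thesis's solver.

    ```python
    import numpy as np
    from scipy.linalg import eig, eigh

    rng = np.random.default_rng(1)
    n = 5
    B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    A = B + B.T                # complex symmetric: A == A.T, but A != A.conj().T

    w_general = eig(A, right=False)            # general non-symmetric solver: correct
    w_hermitian = eigh(A, eigvals_only=True)   # reads one triangle and assumes the
                                               # matrix is Hermitian: real values,
                                               # wrong for this complex symmetric A
    print(np.sort_complex(w_general))
    print(w_hermitian)
    ```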

    A variational approach to linear control structure problems


    The Foundations of Infinite-Dimensional Spectral Computations

    Spectral computations in infinite dimensions are ubiquitous in the sciences. However, their many applications and theoretical studies depend on computations which are infamously difficult. This thesis, therefore, addresses the broad question, “What is computationally possible within the field of spectral theory of separable Hilbert spaces?” The boundaries of what computers can achieve in computational spectral theory and mathematical physics are unknown, leaving many open questions that have been unsolved for decades. This thesis provides solutions to several such long-standing problems. To determine these boundaries, we use the Solvability Complexity Index (SCI) hierarchy, an idea which has its roots in Smale's comprehensive programme on the foundations of computational mathematics. The Smale programme led to a real-number counterpart of the Turing machine, yet left a substantial gap between theory and practice. The SCI hierarchy encompasses both these models and provides universal bounds on what is computationally possible. What makes spectral problems particularly delicate is that many of them can only be computed by using several limits, a phenomenon also shared by the foundations of polynomial root-finding, as shown by McMullen. We develop and extend the SCI hierarchy to prove the optimality of algorithms and construct a myriad of different methods for infinite-dimensional spectral problems, solving many computational spectral problems for the first time. For arguably almost any operator of applicable interest, we solve the long-standing computational spectral problem and construct algorithms that compute spectra with error control. This is done for partial differential operators with coefficients of locally bounded total variation and also for discrete infinite matrix operators. We also show how to compute spectral measures of normal operators (when the spectrum is a subset of a sufficiently regular Jordan curve), including spectral measures of classes of self-adjoint operators with error control, and we construct high-order rational kernel methods. We classify the problems of computing measures, measure decompositions, types of spectra (pure point, absolutely continuous, singular continuous), the functional calculus, and Radon–Nikodym derivatives in the SCI hierarchy. We construct algorithms for, and classify, the following: fractal dimensions of spectra, Lebesgue measures of spectra, spectral gaps, discrete spectra, eigenvalue multiplicities, capacity, different spectral radii, and the problem of detecting the algorithmic failure of previous methods (the finite section method). The infinite-dimensional QR algorithm is also analysed, recovering extremal parts of spectra, the corresponding eigenvectors, and invariant subspaces, with convergence rates and error control. Finally, we analyse pseudospectra of pseudoergodic operators (a generalisation of random operators) on vector-valued l^p spaces. All of the algorithms developed in this thesis are sharp in the sense of the SCI hierarchy; in other words, we prove that they are optimal, realising the boundaries of what digital computers can achieve. They are also implementable and practical, and the majority are parallelisable. Extensive numerical examples are given throughout, demonstrating efficiency and tackling difficult problems taken from mathematics as well as physical applications. In summary, this thesis allows scientists to rigorously and efficiently compute many spectral properties for the first time. The framework provided by this thesis also encompasses a vast number of areas in computational mathematics, including the classical problem of polynomial root-finding, as well as optimisation, neural networks, PDEs and computer-assisted proofs. This framework will be explored in the future work of the author within these settings.
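    As a concrete example of the kind of method the thesis analyses, the sketch below applies the finite section method: truncate an infinite tridiagonal matrix operator and take the eigenvalues of its n x n sections. For this self-adjoint toy operator the sections fill out the true spectrum [-2, 2], but in general finite sections can fail or pollute the spectrum, which is exactly the failure mode the abstract says the thesis shows how to detect. The operator and names here are illustrative assumptions, not the thesis's code.

    ```python
    import numpy as np

    def finite_section_eigs(n, potential):
        """Eigenvalues of the n x n finite section of the tridiagonal operator
        (Hx)_k = x_{k-1} + x_{k+1} + v_k x_k acting on l^2(N)."""
        v = np.array([potential(k) for k in range(n)])
        H = np.diag(v) + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
        return np.linalg.eigvalsh(H)

    # truncations of the free Jacobi operator (v = 0); its true spectrum is [-2, 2]
    for n in (10, 100, 1000):
        e = finite_section_eigs(n, lambda k: 0.0)
        print(n, e.min(), e.max())
    ```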

    Mathematical modelling in neurophysiology: Neuronal geometry in the construction of neuronal models

    The underlying theme of this thesis is that neuronal morphology influences neuronal behaviour. Three distinct but related projects in the application of mathematical models to neurophysiology are presented. The first problem is an investigation into the source of the discrepancy between the observed conduction speed of the propagated action potential in the squid giant axon and its value predicted on the basis of the Hodgkin-Huxley membrane model. It is shown that measurement error and biological variability cannot explain the discrepancy, nor can the use of a three-dimensional model to represent the squid giant axon. If the propagated action potential achieved the travelling-wave speed in the experimental apparatus, as assumed implicitly by Hodgkin and Huxley, then it is suggested that the model of the membrane kinetics requires modification. The second problem involves the generalisation of Rall's equivalent cylinder to the equivalent cable. The equivalent cable is an unbranched structure with an electrotonic length equal to the sum of the electrotonic lengths of the segments of the original branched structure, together with an associated bijective mapping relating currents on the original branched structure to those on the cable. The equivalent cable is derived analytically and can be applied to any branched dendrite, unlike the Rall equivalent cylinder, which exists only for dendrites satisfying very restrictive morphological constraints. Furthermore, the bijective mapping generated in the construction of the equivalent cable can be used to investigate the role of dendritic morphology in shaping neuronal behaviour. Examples of equivalent cables are given for spinal interneurons from the dorsal horn of the spinal cord. The third problem develops a new procedure to simulate neuronal morphology from a sample of neurons of the same type. It is conjectured that neurons may be simulated on the basis of the single assumption that they are composed of uniform dendritic sections whose joint distribution of diameter and length is independent of location in a dendritic tree. This assumption, in combination with the kernel density estimation technique, is used to construct samples of simulated interneurons from samples of real interneurons, and the procedure is successful in predicting features of the original samples that are not assumed by the construction process.
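    For readers unfamiliar with Rall's reduction, the "very restrictive morphological constraints" can be stated concretely. The following is the standard textbook statement of the conditions, not a result of the thesis.

    ```latex
    % A branched dendrite collapses to Rall's equivalent cylinder only if,
    % at every branch point, the parent and daughter diameters satisfy
    \[
      d_p^{3/2} = \sum_{j} d_j^{3/2},
    \]
    % and every terminal tip lies at the same total electrotonic distance
    \[
      L = \sum_{i \in \text{path}} \frac{\ell_i}{\lambda_i},
      \qquad
      \lambda_i = \sqrt{\frac{R_m\, d_i}{4 R_a}},
    \]
    % where \ell_i and d_i are the length and diameter of segment i, R_m is the
    % specific membrane resistance and R_a the axial resistivity. The equivalent
    % cable described in this thesis removes both constraints.
    ```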

    Application of quantum mechanics to the solution of spectroscopy problems: development of methods for computing properties of metastable states

    Thesis digitized by the Direction des bibliothèques de l'Université de Montréal.