
    A Note on Randomized Kaczmarz Algorithm for Solving Doubly-Noisy Linear Systems

    Large-scale linear systems, $Ax = b$, frequently arise in practice and demand effective iterative solvers. Often, these systems are noisy due to operational errors or faulty data-collection processes. In the past decade, the randomized Kaczmarz (RK) algorithm has been studied extensively as an efficient iterative solver for such systems. However, the convergence study of RK in the noisy regime is limited and considers measurement noise only in the right-hand side vector, $b$. Unfortunately, in practice, that is not always the case; the coefficient matrix $A$ can also be noisy. In this paper, we analyze the convergence of RK for noisy linear systems when the coefficient matrix $A$ is corrupted with both additive and multiplicative noise, along with the noisy vector $b$. In our analyses, the quantity $\tilde R = \|\tilde A^{\dagger}\|_2^2 \, \|\tilde A\|_F^2$ influences the convergence of RK, where $\tilde A$ represents a noisy version of $A$. We claim that our analysis is robust and realistically applicable, as we do not require information about the noiseless coefficient matrix $A$, and by considering different conditions on the noise, we can control the convergence of RK. We substantiate our theoretical findings by performing comprehensive numerical experiments.
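    As a concrete illustration, here is a minimal sketch of RK applied to a doubly-noisy system, with the quantity $\tilde R$ computed from the noisy matrix. The problem sizes, noise levels, and stopping rule are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def randomized_kaczmarz(A, b, n_iters=10_000, seed=0):
    """Randomized Kaczmarz: sample row i with probability ||a_i||^2 / ||A||_F^2,
    then project the iterate onto the hyperplane <a_i, x> = b_i."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms_sq = np.einsum('ij,ij->i', A, A)
    probs = row_norms_sq / row_norms_sq.sum()
    x = np.zeros(n)
    for _ in range(n_iters):
        i = rng.choice(m, p=probs)
        x += (b[i] - A[i] @ x) / row_norms_sq[i] * A[i]
    return x

# Doubly-noisy system: A_tilde = A + E (additive noise), b_tilde = b + e.
rng = np.random.default_rng(1)
A = rng.standard_normal((500, 50))
x_true = rng.standard_normal(50)
b = A @ x_true
A_tilde = A + 1e-3 * rng.standard_normal(A.shape)
b_tilde = b + 1e-3 * rng.standard_normal(b.shape)

x_hat = randomized_kaczmarz(A_tilde, b_tilde)
# The quantity that governs the convergence rate in the analysis:
R_tilde = np.linalg.norm(np.linalg.pinv(A_tilde), 2)**2 \
          * np.linalg.norm(A_tilde, 'fro')**2
print(np.linalg.norm(x_hat - x_true), R_tilde)
```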

    On Vector Sequence Transforms and Acceleration Techniques

    This dissertation is devoted to the acceleration of convergence of vector sequences, that is, to producing a replacement sequence from the original sequence with a higher rate of convergence. It is assumed that the sequence is generated from a linear matrix iteration $x_{i+1} = G x_i + k$, where $G$ is an $n \times n$ square matrix and $x_{i+1}$, $x_i$, and $k$ are $n \times 1$ vectors. Acceleration of convergence is obtained when we are able to resolve approximations to low-dimensional invariant subspaces of $G$ which contain large components of the error. When this occurs, simple weighted averages of iterates $x_{i+1}$, $i = 1, 2, \dots, k$ where $k < n$ are used to produce iterates which contain approximately no error in those same low-dimensional invariant subspaces. We begin with simple techniques based upon the resolution of a single dominant eigenvalue/eigenvector pair and extend the notion to higher-dimensional invariant subspaces. Discussion is given to using various subspace iteration methods and their convergence. These ideas are again generalized by solving the eigenproblem for a projection of $G$ onto an appropriate subspace. The use of Lanczos-type methods for establishing these projections is discussed. We produce acceleration techniques based on the process of generalized inversion. The relationship between the minimal polynomial extrapolation technique (MPE) for acceleration of convergence and conjugate gradient type methods is explored. Further acceleration techniques are formed from conjugate gradient type techniques and a generalized inverse Newton's method. An exposition is given of accelerations based upon generalizations of rational interpolation and Padé approximation. Further acceleration techniques using Sherman-Morrison-Woodbury type formulas are formulated and suggested as a replacement for the E-transform. We contrast the effect of several extrapolation techniques drawn from the dissertation on a nonsymmetric linear iteration. We pick the minimal polynomial extrapolation (MPE) as a representative of techniques based on orthogonal residuals, the vector $\epsilon$-algorithm (VEA) as a representative vector interpolation technique, and a technique formulated in this dissertation based on solving a projected eigenproblem. The results show the projected eigenproblem technique to be superior for certain iterations.
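    A minimal sketch of MPE on such a linear iteration, assuming the standard least-squares formulation of the extrapolation weights; the dissertation's own variants may differ in detail.

```python
import numpy as np

def mpe(xs):
    """Minimal Polynomial Extrapolation: combine iterates x_0, ..., x_{k+1}
    of x_{j+1} = G x_j + k into a weighted average that approximately
    annihilates the dominant components of the error."""
    X = np.column_stack(xs)               # n x (k+2) matrix of iterates
    U = np.diff(X, axis=1)                # differences u_j = x_{j+1} - x_j
    # Solve U[:, :k] c ~ -u_k in the least-squares sense, then fix c_k = 1.
    c, *_ = np.linalg.lstsq(U[:, :-1], -U[:, -1], rcond=None)
    c = np.append(c, 1.0)
    gamma = c / c.sum()                   # extrapolation weights, sum to one
    return X[:, :-1] @ gamma

# Demo on a contractive nonsymmetric iteration x_{j+1} = G x_j + k_vec.
rng = np.random.default_rng(0)
n = 50
G = 0.9 * rng.standard_normal((n, n)) / np.sqrt(n)  # spectral radius < 1 w.h.p.
k_vec = rng.standard_normal(n)
x_star = np.linalg.solve(np.eye(n) - G, k_vec)      # fixed point of the iteration

xs = [np.zeros(n)]
for _ in range(12):
    xs.append(G @ xs[-1] + k_vec)
print(np.linalg.norm(xs[-1] - x_star))   # plain iteration error
print(np.linalg.norm(mpe(xs) - x_star))  # extrapolated error, typically much smaller
```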

    Computing the singular value decomposition with high relative accuracy

    We analyze when it is possible to compute the singular values and singular vectors of a matrix with high relative accuracy. This means that each computed singular value is guaranteed to have some correct digits, even if the singular values have widely varying magnitudes. This is in contrast to the absolute accuracy provided by conventional backward stable algorithms, which in general only guarantee correct digits in the singular values with large enough magnitudes. It is of interest to compute the tiniest singular values with several correct digits, because in some cases, such as finite element problems and quantum mechanics, it is the smallest singular values that have physical meaning and should be determined accurately by the data. Many recent papers have identified special classes of matrices where high relative accuracy is possible, since it is not possible in general. The perturbation theory and algorithms for these matrix classes have been quite different, motivating us to seek a common perturbation theory and a common algorithm. We provide these in this paper, and show that high relative accuracy is possible in many new cases as well. The briefest way to describe our results is that we can compute the SVD of $G$ to high relative accuracy provided we can accurately factor $G = XDY^T$, where $D$ is diagonal and $X$ and $Y$ are any well-conditioned matrices; furthermore, the LDU factorization frequently does the job. We provide many examples of matrix classes permitting such an LDU decomposition.
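    One-sided Jacobi is a standard kernel in high-relative-accuracy SVD algorithms of this kind. The sketch below is a generic textbook version of that kernel, not the paper's full algorithm for factored forms $G = XDY^T$.

```python
import numpy as np

def one_sided_jacobi_svd(G, tol=1e-14, max_sweeps=30):
    """One-sided (Hestenes) Jacobi SVD: rotate pairs of columns of G until
    all columns are mutually orthogonal; the singular values are then the
    column norms. Assumes G has full column rank."""
    G = np.array(G, dtype=float)
    m, n = G.shape
    V = np.eye(n)
    for _ in range(max_sweeps):
        converged = True
        for i in range(n - 1):
            for j in range(i + 1, n):
                a, b = G[:, i] @ G[:, i], G[:, j] @ G[:, j]
                c = G[:, i] @ G[:, j]
                if abs(c) <= tol * np.sqrt(a * b):
                    continue                    # columns already orthogonal
                converged = False
                zeta = (b - a) / (2.0 * c)
                sgn = 1.0 if zeta >= 0 else -1.0
                t = sgn / (abs(zeta) + np.hypot(1.0, zeta))  # smaller root
                cs = 1.0 / np.hypot(1.0, t)
                sn = cs * t
                gi, gj = G[:, i].copy(), G[:, j].copy()
                G[:, i], G[:, j] = cs * gi - sn * gj, sn * gi + cs * gj
                vi, vj = V[:, i].copy(), V[:, j].copy()
                V[:, i], V[:, j] = cs * vi - sn * vj, sn * vi + cs * vj
        if converged:
            break
    s = np.linalg.norm(G, axis=0)
    return G / s, s, V                          # U, singular values, V
```

    Unlike bidiagonalization-based SVD, Jacobi rotations applied to well-scaled columns tend to preserve small relative perturbations, which is why Jacobi-type steps appear in algorithms targeting high relative accuracy.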

    Learning latent variable models: efficient algorithms and applications

    Learning latent variable models is a fundamental machine learning problem, and the models belonging to this class - which include topic models, hidden Markov models, mixture models and many others - have a variety of real-world applications, like text mining, clustering and time series analysis. For many practitioners, the decades-old Expectation Maximization method (EM) is still the tool of choice, despite its known proneness to local minima and long running times. To overcome these issues, algorithms based on the spectral method of moments have recently been proposed. These techniques recover the parameters of a latent variable model by solving - typically via tensor decomposition - a system of non-linear equations relating the low-order moments of the observable data with the parameters of the model to be learned. Moment-based algorithms are in general faster than EM, as they require a single pass over the data, and have provable guarantees of learning accuracy in polynomial time. Nevertheless, methods of moments have room for improvement: their ability to deal with real-world data is often limited by a lack of robustness to input perturbations, and almost no theory studies their behavior when some of the model assumptions are violated by the input data. Extending the theory of methods of moments to learn latent variable models and providing meaningful applications to real-world contexts is the focus of this thesis.

    Assuming data to be generated by a certain latent variable model, the standard approach of methods of moments consists of two steps: first, finding the equations that relate the moments of the observable data to the model parameters and second, solving these equations to retrieve estimators of the parameters of the model. In Part I of this thesis we will focus on both steps, providing and analyzing novel and improved model-specific moment estimators and techniques to solve the moment equations. In both cases we will introduce theoretical results, providing guarantees on the behavior of the proposed methods, and we will perform experimental comparisons with existing algorithms. In Part II, we will analyze the behavior of methods of moments when the data violates some of the model assumptions made by the user. We will first observe that in this context most of the theoretical infrastructure underlying methods of moments is no longer valid, and consequently we will develop a theoretical foundation for methods of moments in the misspecified setting, presenting efficient methods guaranteed to provide meaningful results even when some of the model assumptions are violated.

    Throughout the thesis, we will apply the developed theoretical results to challenging real-world applications, focusing on two main domains: topic modeling and healthcare analytics. We will extend the existing theory of methods of moments to learn models that are traditionally used for topic modeling - like the single-topic model and Latent Dirichlet Allocation - providing improved learning techniques and comparing them with existing methods, which they outperform in terms of speed and learning accuracy. Furthermore, we will propose applications of latent variable models to the analysis of electronic healthcare records, which, similarly to text mining, are very likely to become massive datasets; we will propose a method to discover recurrent phenotypes in populations of patients and to cluster them into groups with similar clinical profiles - a task where the efficiency properties of methods of moments constitute a competitive advantage over traditional approaches.
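    As a toy illustration of the two-step recipe (derive moment equations, then solve them for the parameters), consider method-of-moments estimation for a Gamma distribution. This is a deliberately simple stand-in, not one of the latent variable models treated in the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.gamma(shape=3.0, scale=2.0, size=100_000)

# Step 1: moment equations. For Gamma(k, theta):
#   E[X] = k * theta,   Var[X] = k * theta^2.
mean, var = data.mean(), data.var()

# Step 2: solve the equations for the parameters.
theta_hat = var / mean            # theta = Var[X] / E[X]
k_hat = mean / theta_hat          # k = E[X] / theta
print(k_hat, theta_hat)           # close to (3.0, 2.0)
```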

    Towards an Efficient Gas Exchange Monitoring with Electrical Impedance Tomography - Optimization and validation of methods to investigate and understand pulmonary blood flow with indicator dilution

    In many patients suffering from severely impaired pulmonary gas exchange, regional lung ventilation and perfusion are mismatched. Highly heterogeneous spatial distributions of lung ventilation and perfusion are observed particularly in patients with acute respiratory failure. These patients must be mechanically ventilated and monitored in the intensive care unit to ensure sufficient gas exchange. In severe lung injury, it is difficult to find an optimal balance between recruiting collapsed regions through high ventilation pressures and volumes while simultaneously protecting the lung from further damage caused by the externally applied pressures. Interest in bedside measurement and visualization of the regional ventilation and perfusion distributions for use in the intensive care unit has grown considerably in recent years, with the aim of enabling lung-protective ventilation and simplifying clinical diagnoses.

    Electrical impedance tomography (EIT) is a non-invasive, radiation-free and highly portable system. It offers high temporal sampling and a functional spatial resolution that make it possible to visualize and monitor dynamic (patho-)physiological processes. Medical research on EIT has mainly concentrated on estimating the spatial distribution of ventilation, and commercially available systems have shown that EIT provides valuable decision support during mechanical ventilation. However, the estimation of pulmonary perfusion with EIT is not yet established; it could be the missing link enabling bedside analysis of pulmonary gas exchange. Although several publications have demonstrated the basic feasibility of indicator-enhanced EIT for estimating the spatial distribution of pulmonary blood flow, these methods must be optimized and validated by comparison against gold-standard lung perfusion monitoring. Moreover, further research is needed to understand which physiological information underlies the EIT perfusion estimate.

    The present work addresses the question of whether, in clinical applications of EIT, spatial information on pulmonary blood flow can be estimated alongside regional ventilation, thereby potentially allowing pulmonary gas exchange to be assessed at the bedside. The spatial distribution of perfusion was estimated by bolus injection of a conductive saline solution as an indicator, tracking the distribution of the indicator during its passage through the pulmonary vasculature. Several dynamic EIT reconstruction methods and perfusion-parameter estimation methods were developed and compared in order to assess pulmonary blood flow robustly. The estimated regional EIT perfusion distributions were validated against gold-standard measures of lung perfusion. A first validation was performed on data from an animal study in which multidetector computed tomography served as the comparative lung perfusion measurement. In addition, a comprehensive preclinical animal study was conducted as part of this work to investigate lung perfusion with indicator-enhanced EIT and positron emission tomography under several different experimental conditions.

    Besides a thorough comparison of methods, the clinical applicability of indicator-enhanced EIT perfusion measurement was examined, above all by analyzing the minimum indicator concentration that permits a robust perfusion estimate while imposing the least burden on the patient. In addition to the experimental validation studies, two in-silico investigations were carried out: first, to assess the sensitivity of EIT to the passage of a conductive indicator through the lungs against a strongly heterogeneous pulmonary background, and second, to examine the physiological influences that contribute to the reconstructed EIT perfusion images, in order to better understand the limitations of the method. The analyses showed that estimating lung perfusion based on indicator-enhanced EIT has great potential for use in clinical practice, as it could be validated against two gold-standard perfusion measurement techniques. In addition, valuable conclusions could be drawn about the physiological influences on the estimated EIT perfusion distributions.
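    A minimal sketch of how first-pass indicator-dilution curves are commonly turned into regional perfusion estimates: maximum-slope and area-under-curve estimators are standard in the EIT perfusion literature, but the array layout and filtering assumptions here are illustrative, not the specific methods developed in the thesis.

```python
import numpy as np

def perfusion_parameters(dZ, fs):
    """Estimate relative regional perfusion from indicator-dilution EIT.

    dZ : (n_frames, n_pixels) impedance changes after a saline bolus,
         baseline-subtracted and ventilation-filtered (e.g., recorded
         during a breath hold). A conductive indicator lowers impedance,
         so the signal is negated to obtain a positive dilution curve.
    fs : frame rate in Hz.
    """
    curve = -dZ                                   # positive first-pass curve
    slope = np.gradient(curve, 1.0 / fs, axis=0)  # temporal derivative
    max_slope = slope.max(axis=0)                 # maximum-slope estimate
    auc = curve.sum(axis=0) / fs                  # area under the dilution curve
    return max_slope, auc
```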

    Improving the forward model for electrical impedance tomography of brain function through rapid generation of subject specific finite element models

    Electrical Impedance Tomography (EIT) is a non-invasive imaging method which allows the internal electrical impedance of any conductive object to be imaged by means of current injection and surface voltage measurements through an array of externally applied electrodes. Successful image generation requires simulating the current injection patterns on either an analytical or a numerical model of the domain under examination, known as the forward model, and using the resulting voltage data in the inverse solution from which images of conductivity changes can be constructed. Recent research strongly indicates that geometric and anatomical conformance of the forward model to the subject under investigation significantly affects the quality of the images. This thesis focuses mainly on EIT of brain function and describes a novel approach for the rapid generation of patient- or subject-specific finite element models for use as the forward model. After an introduction to the topic, methods of generating accurate finite element (FE) models using commercially available Computer-Aided Design (CAD) tools are described, showing that such methods, though effective and successful, are inappropriate for time-critical clinical use. The feasibility of warping or morphing a finite element mesh as a means of reducing the lead time for model generation is then presented and demonstrated. This leads on to a description of methods of acquiring and utilising known system geometry, namely the positions of electrodes and registration landmarks, to construct an accurate surface of the subject, the results of which are successfully validated. The outcome of this procedure is then used to specify boundary conditions for a mesh warping algorithm based on elastic deformation using well-established continuum mechanics procedures. The algorithm is applied to a range of source models to empirically establish optimum values for the parameters defining the problem, so that meshes can be generated which are of acceptable quality in terms of discretization error and which more accurately define the geometry of the target subject. Further validation of the algorithm is performed by comparing boundary voltages and image reconstructions from simulated and laboratory data, demonstrating gains in image artefact reduction and in the localisation of conductivity changes. The processes described in the thesis are evaluated and discussed, and topics for further work and application are described.
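    The thesis drives the warp with an elastic-deformation formulation; as a rough illustration of the general idea, here is a landmark-driven mesh warp using radial basis function interpolation, a simpler stand-in for the continuum mechanics procedure (the function names and the biharmonic kernel choice are assumptions):

```python
import numpy as np

def rbf_warp(nodes, src_landmarks, dst_landmarks, eps=1e-8):
    """Warp mesh nodes so that source landmarks (e.g., electrode positions
    on a generic head mesh) map onto measured subject-specific positions,
    by interpolating the landmark displacements with radial basis functions."""
    def kernel(r):
        return r                     # biharmonic kernel |r|, natural in 3-D

    # Fit RBF weights from the p landmark displacements.
    d = src_landmarks[:, None, :] - src_landmarks[None, :, :]
    K = kernel(np.linalg.norm(d, axis=-1)) + eps * np.eye(len(src_landmarks))
    W = np.linalg.solve(K, dst_landmarks - src_landmarks)   # (p, 3) weights

    # Evaluate the interpolated displacement field at every mesh node.
    d = nodes[:, None, :] - src_landmarks[None, :, :]
    Kn = kernel(np.linalg.norm(d, axis=-1))                 # (n_nodes, p)
    return nodes + Kn @ W
```

    An elastic-deformation warp additionally enforces a mechanical smoothness model over the mesh interior, which helps preserve element quality; the RBF version above only interpolates the boundary displacements.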