
    Long-term future prediction under uncertainty and multi-modality

    Humans have an innate ability to excel at activities that involve predicting complex object dynamics, such as predicting the possible trajectory of a billiard ball after it has been hit or the motion of pedestrians on the road. A key feature that enables humans to perform such tasks is anticipation. There has been continuous research in Computer Vision and Artificial Intelligence to mimic this human ability so that autonomous agents can succeed in real-world scenarios. Recent advances in deep learning and the availability of large-scale datasets have enabled the pursuit of fully autonomous agents with complex decision-making abilities, such as self-driving vehicles or robots. One of the main challenges in deploying these agents in the real world is their ability to perform anticipation tasks with at least human-level efficiency. To advance the field of autonomous systems, particularly self-driving agents, this thesis focuses on the task of future prediction in diverse real-world settings, ranging from deterministic scenarios such as predicting the paths of balls on a billiard table to predicting the future of non-deterministic street scenes. Specifically, we identify certain core challenges for long-term future prediction: long-term prediction, uncertainty, multi-modality, and exact inference. To address these challenges, this thesis makes the following core contributions. Firstly, for accurate long-term predictions, we develop approaches that effectively utilize available observed information in the form of image boundaries in videos or interactions in street scenes. Secondly, as uncertainty increases into the future in non-deterministic scenarios, we leverage Bayesian inference frameworks to capture calibrated distributions of likely future events. Finally, to further improve performance in highly multimodal non-deterministic scenarios such as street scenes, we develop deep generative models based on conditional variational autoencoders as well as normalizing-flow-based exact inference methods. Furthermore, we introduce a novel dataset with dense pedestrian-vehicle interactions to further aid the development of anticipation methods for autonomous driving applications in urban environments.
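The two model families named in this abstract are usually trained on the following objectives. This is a sketch in standard textbook notation, not the thesis's own: $x$ (observed past), $y$ (future), $z$ (latent variable), the densities $p_\theta$, $q_\phi$, $p_Z$ and the invertible map $f_\theta$ are all assumed symbols.

```latex
% Conditional variational autoencoder: a lower bound on the conditional log-likelihood
\log p_\theta(y \mid x) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x, y)}\big[\log p_\theta(y \mid z, x)\big]
  \;-\; \mathrm{KL}\big(q_\phi(z \mid x, y)\,\|\,p_\theta(z \mid x)\big)

% Conditional normalizing flow: exact log-likelihood via the change of variables
\log p_\theta(y \mid x) \;=\;
  \log p_Z\big(f_\theta(y; x)\big)
  \;+\; \log\left|\det \frac{\partial f_\theta(y; x)}{\partial y}\right|
```

The first line is a bound, so inference is approximate. The second is an equality, which is the sense in which flow-based inference is called exact.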

    Bayesian Local Smoothing Modeling and Inference for Pre-surgical FMRI Data.

    There is a growing interest in using fMRI measurements and analyses as tools for pre-surgical planning. For such applications, spatial precision and control over false negatives and false positives are vital, requiring careful design of an image smoothing method and a classification procedure. This dissertation seeks computationally efficient approaches to overcome the limitations of existing methods and address new challenges in pre-surgical fMRI analyses. In the first study, we develop a Bayesian solution for the pre-surgical analysis of a single fMRI brain image. Specifically, we propose a novel spatially adaptive conditionally autoregressive model (CWAS) that adaptively and locally smooths the fMRI data. We introduce a Bayesian decision-theoretic approach that allows control of both false positives and false negatives to identify activated and deactivated brain regions. We benchmark the proposed solution against two existing spatially adaptive smoothing models, using simulation studies and two patients' pre-surgical fMRI datasets. In the second study, we extend the idea of spatially adaptive smoothing to multiple fMRI brain images in order to leverage spatial correlations across multiple images. In particular, we propose three spatially adaptive multivariate conditional autoregressive models that can be considered extensions of the multivariate conditional autoregressive (MCAR) model (Gelfand and Vounatsou, 2003), the CWAS model, and the model of Reich and Hodges (2008), respectively, and one mixed-effects model assuming that all observed fMRI images originate from one common image. We compare the performance of the proposed models with that of the MCAR and CWAS models using simulation studies and two sets of fMRI brain images, acquired from the same patient under either the same paradigm or different paradigms. The last study is motivated by fMRI brain images acquired at two different spatial resolutions from the same patient.
We develop a Bayesian hierarchical model with spatially varying coefficients to retain the spatial precision from the high resolution image while utilizing information from the low resolution image to improve estimation and inference. Comparisons between the proposed model and the CWAS model, which operates at a single spatial resolution, are performed on simulated data and a patient's multi-resolution pre-surgical fMRI data. PhD, Biostatistics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/133339/1/zhuqingl_1.pd

    Temporal Subsampling Diminishes Small Spatial Scales in Recurrent Neural Network Emulators of Geophysical Turbulence

    The immense computational cost of traditional numerical weather and climate models has sparked the development of machine learning (ML) based emulators. Because ML methods benefit from long records of training data, it is common to use datasets that are temporally subsampled relative to the time steps required for the numerical integration of differential equations. Here, we investigate how this often overlooked processing step affects the quality of an emulator's predictions. We implement two ML architectures from a class of methods called reservoir computing: (1) a form of Nonlinear Vector Autoregression (NVAR), and (2) an Echo State Network (ESN). Despite their simplicity, it is well documented that these architectures excel at predicting low dimensional chaotic dynamics. We are therefore motivated to test these architectures in an idealized setting of predicting high dimensional geophysical turbulence as represented by Surface Quasi-Geostrophic dynamics. In all cases, subsampling the training data consistently leads to an increased bias at small spatial scales that resembles numerical diffusion. Interestingly, the NVAR architecture becomes unstable when the temporal resolution is increased, indicating that the polynomial based interactions are insufficient for capturing the detailed nonlinearities of the turbulent flow. The ESN architecture is found to be more robust, suggesting a benefit to the more expensive but more general structure. Spectral errors are reduced by including a penalty on the kinetic energy density spectrum during training, although the subsampling related errors persist. Future work is warranted to understand how the temporal resolution of training data affects other ML architectures.
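The two ingredients discussed in this abstract, temporal subsampling of training snapshots and an echo state update, can be sketched in a few lines. This is a minimal illustration under assumptions, not the paper's implementation: the function names, the leaky-tanh form of the update, and the plain-list matrix layout are all hypothetical choices.

```python
import math


def subsample(series, stride):
    """Keep every `stride`-th snapshot, starting from the first one."""
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    return series[::stride]


def esn_step(state, u, W, W_in, bias, leak=1.0):
    """One leaky echo state update (assumed form):
    r' = (1 - leak) * r + leak * tanh(W r + W_in u + b).
    """
    new_state = []
    for i in range(len(state)):
        pre = bias[i]
        pre += sum(W[i][j] * state[j] for j in range(len(state)))
        pre += sum(W_in[i][k] * u[k] for k in range(len(u)))
        new_state.append((1.0 - leak) * state[i] + leak * math.tanh(pre))
    return new_state


# Usage: train on every third snapshot of a (toy) record.
snapshots = list(range(10))
coarse = subsample(snapshots, 3)  # [0, 3, 6, 9]
```

With a stride of 3 the emulator must learn a map across three integration steps at once, which is the setting whose small-scale bias the abstract examines.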

    Solving Inverse Problems with Piecewise Linear Estimators: From Gaussian Mixture Models to Structured Sparsity

    A general framework for solving image inverse problems is introduced in this paper. The approach is based on Gaussian mixture models, estimated via a computationally efficient MAP-EM algorithm. A dual mathematical interpretation of the proposed framework with structured sparse estimation is described, which shows that the resulting piecewise linear estimate stabilizes the estimation when compared to traditional sparse inverse problem techniques. This interpretation also suggests an effective dictionary motivated initialization for the MAP-EM algorithm. We demonstrate that in a number of image inverse problems, including inpainting, zooming, and deblurring, the same algorithm produces results that are equal to, often significantly better than, or only marginally worse than the best published ones, at a lower computational cost. Comment: 30 pages
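To see why a Gaussian mixture prior yields a piecewise linear estimator, consider the following sketch. The notation is assumed rather than taken from the abstract: a degradation model $y = Uf + w$ with $w \sim \mathcal{N}(0, \sigma^2 I)$, and mixture components $\mathcal{N}(\mu_k, \Sigma_k)$.

```latex
% MAP estimate under the k-th Gaussian component: affine in the observation y
\hat{f}_k \;=\; \mu_k + \Sigma_k U^{\top}\big(U \Sigma_k U^{\top} + \sigma^2 I\big)^{-1}\,(y - U\mu_k)

% Component selection: keep the component with the smallest negative log-posterior
\hat{k} \;=\; \arg\min_k \Big(
  \tfrac{1}{\sigma^2}\,\|U\hat{f}_k - y\|^2
  + (\hat{f}_k - \mu_k)^{\top}\Sigma_k^{-1}(\hat{f}_k - \mu_k)
  + \log\lvert\Sigma_k\rvert \Big),
\qquad \hat{f} \;=\; \hat{f}_{\hat{k}}
```

Each $\hat{f}_k$ is a linear (Wiener-type) filter applied to $y$. The nonlinearity enters only through the choice of $\hat{k}$, so the overall estimate is piecewise linear.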

    Sources of residual autocorrelation in multiband task fMRI and strategies for effective mitigation

    In task fMRI analysis, ordinary least squares (OLS) is typically used to estimate task-induced activation in the brain. Since task fMRI residuals often exhibit temporal autocorrelation, it is common practice to prewhiten the data before OLS to satisfy the assumption of residual independence, which is equivalent to generalized least squares (GLS). While theoretically straightforward, a major challenge in prewhitening in fMRI is accurately estimating the residual autocorrelation at each location of the brain. Assuming a global autocorrelation model, as in several fMRI software programs, may under- or over-whiten particular regions and fail to achieve nominal false positive control across the brain. Faster multiband acquisitions require more sophisticated models to capture autocorrelation, making prewhitening more difficult. These issues are becoming more critical now because of a trend towards subject-level analysis, where prewhitening has a greater impact than in group-average analyses. In this article, we first thoroughly examine the sources of residual autocorrelation in multiband task fMRI. We find that residual autocorrelation varies spatially throughout the cortex and is affected by the task, the acquisition method, modeling choices, and individual differences. Second, we evaluate the ability of different AR-based prewhitening strategies to effectively mitigate autocorrelation and control false positives. We find that allowing the prewhitening filter to vary spatially is the most important factor for successful prewhitening, even more so than increasing AR model order. To overcome the computational challenge associated with spatially variable prewhitening, we developed a computationally efficient R implementation based on parallelization and fast C++ backend code. This implementation is included in the open source R package BayesfMRI. Comment: 26 pages with 1 page of appendix, 11 figures with 1 supplementary figure
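The prewhitening idea in this abstract can be illustrated for a single time series with one regressor and an AR(1) residual model. This is a deliberately simplified sketch and not the BayesfMRI implementation: the function names are hypothetical, the AR order is fixed at 1, and the first observation is dropped rather than rescaled as full GLS would do.

```python
def ols(x, y):
    """Simple linear regression with intercept; returns (intercept, slope)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def lag1_autocorr(res):
    """Lag-1 autocorrelation estimate of a residual series (0.0 if all zero)."""
    den = sum(r * r for r in res)
    if den == 0:
        return 0.0
    return sum(res[t] * res[t - 1] for t in range(1, len(res))) / den


def prewhiten(series, rho):
    """Apply the AR(1) filter s_t - rho * s_{t-1}, dropping the first sample."""
    return [series[t] - rho * series[t - 1] for t in range(1, len(series))]


def prewhitened_fit(x, y):
    """OLS, estimate rho from the residuals, then OLS on the filtered data."""
    a, b = ols(x, y)
    res = [yi - a - b * xi for xi, yi in zip(x, y)]
    rho = lag1_autocorr(res)
    # Filtering turns the intercept column into the constant (1 - rho),
    # so the fitted intercept is rescaled back to the original units.
    a_star, b_w = ols(prewhiten(x, rho), prewhiten(y, rho))
    return a_star / (1.0 - rho), b_w, rho
```

In the spatially varying strategy the abstract favors, `rho` (or a higher-order AR filter) would be estimated separately at each brain location instead of once globally.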

    Restoration of Atmospheric Turbulence Degraded Video using Kurtosis Minimization and Motion Compensation

    In this thesis, the background of atmospheric turbulence degradation in imaging is reviewed and two aspects are highlighted: blurring and geometric distortion. The turbulence blurring parameter is determined by the atmospheric turbulence condition, which is often unknown; therefore, a blur identification technique based on higher-order statistics (HOS) was developed. It was observed that the kurtosis generally increases as an image becomes blurred (smoothed). This observation was interpreted in the frequency domain in terms of phase correlation. Kurtosis-minimization-based blur identification is built upon this observation. It was shown that kurtosis minimization is effective in identifying the blurring parameter directly from the degraded image. Kurtosis minimization is a general method for blur identification and has been tested on a variety of blurs, including Gaussian blur, out-of-focus blur, and motion blur. To compensate for the geometric distortion, earlier work on turbulent motion compensation was extended to deal with situations in which there is camera or object motion. Trajectory smoothing is used to suppress the turbulent motion while preserving the real motion. Though the scintillation effect of atmospheric turbulence is not considered separately, it can be handled in the same way as multiple-frame denoising while motion trajectories are built. Ph.D. Committee Chair: Mersereau, Russell; Committee Co-Chair: Smith, Mark; Committee Member: Lanterman, Aaron; Committee Member: Wang, May; Committee Member: Tannenbaum, Allen; Committee Member: Williams, Dougla
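The statistic at the center of this abstract is the kurtosis of the image intensities. The sketch below computes it under assumptions of my own: the function names are hypothetical, the image is a list of rows of numbers, and the non-excess definition (fourth central moment divided by the squared variance) is used. The thesis may normalize differently.

```python
def kurtosis(values):
    """Non-excess sample kurtosis: fourth central moment over squared variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined for a constant signal")
    return m4 / (m2 * m2)


def image_kurtosis(image):
    """Kurtosis of all pixel intensities in an image given as a list of rows."""
    return kurtosis([pixel for row in image for pixel in row])
```

A blur identification loop in this spirit would restore the degraded image with each candidate blur parameter and keep the candidate whose restored image has the smallest `image_kurtosis`. The restoration step itself is outside this sketch.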