
    Flood dynamics derived from video remote sensing

    Flooding is by far the most pervasive natural hazard, and the human impacts of floods are expected to worsen in the coming decades due to climate change. Hydraulic models are a key tool for understanding flood dynamics and play a pivotal role in unravelling the processes that occur during a flood event, including inundation flow patterns and velocities. In the realm of river basin dynamics, video remote sensing is emerging as a transformative tool that can offer insights into flow dynamics and thus, together with other remotely sensed data, has the potential to be deployed to estimate discharge. Moreover, the integration of video remote sensing data with hydraulic models offers a pivotal opportunity to enhance the predictive capacity of these models. Hydraulic models are traditionally built with accurate terrain, flow and bathymetric data and are often calibrated and validated against observed data to obtain meaningful and actionable model predictions. However, data for accurately calibrating and validating hydraulic models are not always available, leaving the predictive capabilities of some models deployed in flood risk management in question. Recent advances in remote sensing have heralded the availability of vast, high-resolution video datasets. The parallel evolution of computing capabilities, coupled with advances in artificial intelligence, is enabling data to be processed at unprecedented scales and complexities, allowing meaningful insights to be gleaned from datasets that can be integrated with hydraulic models. The aims of the research presented in this thesis were twofold. The first aim was to evaluate and explore the potential applications of video from air- and space-borne platforms to comprehensively calibrate and validate two-dimensional hydraulic models. The second aim was to estimate river discharge using satellite video combined with high-resolution topographic data. In the first of three empirical chapters, non-intrusive image velocimetry techniques were employed to estimate river surface velocities in a rural catchment. For the first time, a 2D hydraulic model was fully calibrated and validated using velocities derived from Unpiloted Aerial Vehicle (UAV) image velocimetry approaches. This highlighted the value of these data in mitigating the limitations associated with traditional data sources used in parameterizing two-dimensional hydraulic models. This finding inspired the subsequent chapter, in which river surface velocities, derived using Large Scale Particle Image Velocimetry (LSPIV), and flood extents, derived using deep neural network-based segmentation, were extracted from satellite video and used to rigorously assess the skill of a two-dimensional hydraulic model. Harnessing the ability of deep neural networks to learn complex features and deliver accurate and contextually informed flood segmentation, this chapter demonstrates the potential value of satellite video for validating two-dimensional hydraulic model simulations. In the final empirical chapter, the convergence of satellite video imagery and high-resolution topographic data bridges the gap between visual observations and quantitative measurements by enabling the direct extraction of velocities from video imagery, which are used to estimate river discharge. Overall, this thesis demonstrates the significant potential of emerging video-based remote sensing datasets and offers approaches for integrating these data into hydraulic modelling and discharge estimation practice.
The incorporation of LSPIV techniques into flood modelling workflows signifies a methodological progression, especially in areas lacking robust data collection infrastructure. Satellite video remote sensing heralds a major step forward in our ability to observe river dynamics in real time, with potentially significant implications for flood modelling science.
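
    As a concrete illustration of the image-velocimetry step referred to above, the sketch below shows the core cross-correlation computation that LSPIV-style methods build on. It assumes two co-registered, georeferenced greyscale frames supplied as NumPy arrays; the window size, grid step, and function names are illustrative choices, not the thesis's actual workflow.

        import numpy as np

        def piv_displacement(win_a, win_b):
            # FFT-based cross-correlation of two interrogation windows; the peak
            # of the correlation gives the integer-pixel displacement of win_b
            # relative to win_a.
            a = win_a - win_a.mean()
            b = win_b - win_b.mean()
            corr = np.fft.irfft2(np.fft.rfft2(a).conj() * np.fft.rfft2(b), s=a.shape)
            peak_y, peak_x = np.unravel_index(np.argmax(corr), corr.shape)
            # unwrap the circular indices so that displacements can be negative
            dy = peak_y if peak_y <= a.shape[0] // 2 else peak_y - a.shape[0]
            dx = peak_x if peak_x <= a.shape[1] // 2 else peak_x - a.shape[1]
            return dy, dx

        def surface_velocity_field(frame_a, frame_b, dt, gsd, win=64, step=32):
            # Coarse surface-velocity field in m/s from two frames taken dt seconds
            # apart; gsd is the ground sampling distance in metres per pixel.
            velocities = {}
            rows, cols = frame_a.shape
            for y in range(0, rows - win, step):
                for x in range(0, cols - win, step):
                    dy, dx = piv_displacement(frame_a[y:y+win, x:x+win],
                                              frame_b[y:y+win, x:x+win])
                    velocities[(y, x)] = (dx * gsd / dt, dy * gsd / dt)
            return velocities

    In practice, LSPIV workflows add frame stabilisation, orthorectification, sub-pixel peak fitting and outlier filtering on top of this core step.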

    Backpropagation Beyond the Gradient

    Automatic differentiation is a key enabler of deep learning: previously, practitioners were limited to models for which they could manually compute derivatives. Now, they can create sophisticated models with almost no restrictions and train them using first-order, i.e. gradient, information. Popular libraries like PyTorch and TensorFlow compute this gradient efficiently, automatically, and conveniently with a single line of code. Under the hood, reverse-mode automatic differentiation, or gradient backpropagation, powers the gradient computation in these libraries. Their entire design centers on gradient backpropagation, and the frameworks are specialized for one specific task: computing the average gradient over a mini-batch. This specialization often complicates the extraction of other information, such as higher-order statistical moments of the gradient or higher-order derivatives like the Hessian. It limits practitioners and researchers to methods that rely on the gradient. Arguably, this hampers the field from exploring the potential of higher-order information, and there is evidence that focusing solely on the gradient has not led to significant recent advances in deep learning optimization. To advance algorithmic research and inspire novel ideas, information beyond the batch-averaged gradient must be made available at the same level of computational efficiency, automation, and convenience. This thesis presents approaches to simplify experimentation with rich information beyond the gradient by making it more readily accessible. We present an implementation of these ideas as an extension to the backpropagation procedure in PyTorch. Using this newly accessible information, we demonstrate possible use cases by (i) showing how it can inform our understanding of neural network training by building a diagnostic tool, and (ii) enabling novel methods to efficiently compute and approximate curvature information. First, we extend gradient backpropagation for sequential feedforward models to Hessian backpropagation, which enables computing approximate per-layer curvature. This perspective unifies recently proposed block-diagonal curvature approximations. Like gradient backpropagation, the computation of these second-order derivatives is modular and therefore simple to automate and extend to new operations. Based on the insight that rich information beyond the gradient can be computed efficiently and at the same time, we extend the backpropagation in PyTorch with the BackPACK library. It provides efficient and convenient access to statistical moments of the gradient and to approximate curvature information, often at a small overhead compared to computing just the gradient. Next, we showcase the utility of such information for better understanding neural network training. We build the Cockpit library, which visualizes what is happening inside the model during training through various instruments that rely on BackPACK’s statistics. We show how Cockpit provides a meaningful statistical summary report that helps the deep learning engineer identify bugs in their machine learning pipeline, guide hyperparameter tuning, and study deep learning phenomena. Finally, we use BackPACK’s extended automatic differentiation functionality to develop ViViT, an approach to efficiently compute curvature information, in particular curvature noise. It uses the low-rank structure of the generalized Gauss-Newton approximation to the Hessian and addresses shortcomings in existing curvature approximations.
By monitoring curvature noise, we demonstrate how ViViT’s information helps in understanding the challenges of making second-order optimization methods work in practice. This work develops new tools to experiment more easily with higher-order information in complex deep learning models. These tools have impacted work on Bayesian applications with Laplace approximations, out-of-distribution generalization, differential privacy, and the design of automatic differentiation systems. They constitute an important step toward developing and establishing more efficient deep learning algorithms.
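
    To make the kind of access described above concrete, here is a minimal sketch using BackPACK's documented extension mechanism on a toy PyTorch model. The model, data, and choice of extensions are illustrative; attribute names follow the library's documentation but may differ between versions.

        import torch
        from torch import nn
        from backpack import backpack, extend
        from backpack.extensions import BatchGrad, Variance, DiagGGNExact

        # Toy model and loss, both "extended" so BackPACK can hook into backpropagation.
        model = extend(nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 3)))
        lossfunc = extend(nn.CrossEntropyLoss())

        X, y = torch.randn(64, 10), torch.randint(0, 3, (64,))
        loss = lossfunc(model(X), y)

        # One backward pass yields the gradient plus richer statistics.
        with backpack(BatchGrad(), Variance(), DiagGGNExact()):
            loss.backward()

        for name, p in model.named_parameters():
            print(name,
                  p.grad.shape,            # the usual mini-batch-averaged gradient
                  p.grad_batch.shape,      # per-sample gradients
                  p.variance.shape,        # gradient variance over the mini-batch
                  p.diag_ggn_exact.shape)  # diagonal of the generalized Gauss-Newton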

    Learning and Control of Dynamical Systems

    Despite the remarkable success of machine learning in various domains in recent years, our understanding of its fundamental limitations remains incomplete. This knowledge gap poses a grand challenge when deploying machine learning methods in critical decision-making tasks, where incorrect decisions can have catastrophic consequences. To effectively utilize these learning-based methods in such contexts, it is crucial to explicitly characterize their performance. Over the years, significant research efforts have been dedicated to learning and control of dynamical systems where the underlying dynamics are unknown or only partially known a priori and must be inferred from collected data. However, much of this classical work has focused on asymptotic guarantees, providing limited insight into the amount of data required to achieve desired control performance while satisfying operational constraints such as safety and stability, especially in the presence of statistical noise. In this thesis, we study the statistical complexity of learning and control of unknown dynamical systems. By utilizing recent advances in statistical learning theory, high-dimensional statistics, and control-theoretic tools, we aim to establish a fundamental understanding of the number of samples required to achieve the desired (i) accuracy in learning the unknown dynamics, (ii) performance in the control of the underlying system, and (iii) satisfaction of operational constraints such as safety and stability. We provide finite-sample guarantees for these objectives and propose efficient learning and control algorithms that achieve the desired performance at these statistical limits in various dynamical systems. Our investigation covers a broad range of dynamical systems, starting from fully observable linear dynamical systems to partially observable linear dynamical systems and, ultimately, nonlinear systems. We deploy our learning and control algorithms in various adaptive control tasks in real-world control systems and demonstrate their strong empirical performance along with their learning, robustness, and stability guarantees. In particular, we implement one of our proposed methods, Fourier Adaptive Learning and Control (FALCON), on an experimental aerodynamic testbed under extreme turbulent flow dynamics in a wind tunnel. The results show that FALCON achieves state-of-the-art stabilization performance and consistently outperforms conventional and other learning-based methods by at least 37%, despite using 8 times less data. The superior performance of FALCON arises from its physically and theoretically accurate modeling of the underlying nonlinear turbulent dynamics, which yields rigorous finite-sample learning and performance guarantees. These findings underscore the importance of characterizing the statistical complexity of learning and control of unknown dynamical systems.
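
    As a minimal, generic illustration of the learning problem described above (not the FALCON method), the sketch below fits a linear model x_{t+1} = A x_t + B u_t from a single noisy trajectory by least squares; the system, noise level, and trajectory length are arbitrary assumptions.

        import numpy as np

        def estimate_linear_dynamics(states, inputs):
            # Least-squares estimate of (A, B) in x_{t+1} = A x_t + B u_t + w_t.
            # states: (T+1, n) visited states, inputs: (T, m) applied controls.
            targets = states[1:]
            regressors = np.hstack([states[:-1], inputs])
            theta, *_ = np.linalg.lstsq(regressors, targets, rcond=None)
            n = states.shape[1]
            return theta[:n].T, theta[n:].T  # A_hat, B_hat

        # Generate a noisy trajectory of a stable 2-state system and identify it.
        rng = np.random.default_rng(0)
        A_true = np.array([[0.9, 0.2], [0.0, 0.8]])
        B_true = np.array([[0.0], [1.0]])
        x = np.zeros(2)
        states, inputs = [x], []
        for _ in range(500):
            u = rng.normal(size=1)
            x = A_true @ x + B_true @ u + 0.01 * rng.normal(size=2)
            states.append(x)
            inputs.append(u)
        A_hat, B_hat = estimate_linear_dynamics(np.array(states), np.array(inputs))
        print(np.linalg.norm(A_hat - A_true), np.linalg.norm(B_hat - B_true))

    Finite-sample analyses of the kind summarized above bound how these estimation errors shrink with the trajectory length and how they propagate into control performance.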

    Applications of Machine Learning for Data-Driven Prevention at the Population Level

    Healthcare costs are systematically rising, and current therapy-focused healthcare systems are not sustainable in the long run. While disease prevention is a viable instrument for reducing costs and suffering, it requires risk modeling to stratify populations, identify high-risk individuals and enable personalized interventions. In current clinical practice, however, systematic risk stratification is limited: on the one hand, for the vast majority of endpoints, no risk models exist; on the other hand, available models focus on predicting a single disease at a time, rendering predictor collection burdensome. At the same time, the density of individual patient data is constantly increasing. Especially complex data modalities, such as -omics measurements or images, may contain systemic information on future health trajectories relevant to multiple endpoints simultaneously. However, to date, these data are inaccessible for risk modeling, as no dedicated methods exist to extract clinically relevant information. This study built on recent advances in machine learning to investigate the applicability of four distinct data modalities not yet leveraged for risk modeling in primary prevention: polygenic risk scores, NMR metabolomics, electronic health records, and retinal fundus photographs. For each data modality, a neural network-based survival model was developed to extract predictive information, scrutinize performance gains over commonly collected covariates, and pinpoint potential clinical utility. Notably, the developed methodology was able to integrate polygenic risk scores for cardiovascular prevention, outperforming existing approaches and identifying subpopulations that benefit. Investigating NMR metabolomics, the developed methodology allowed the prediction of future disease onset for many common diseases at once, indicating potential applicability as a drop-in replacement for commonly collected covariates. Extending the methodology to phenome-wide risk modeling, electronic health records were found to be a general source of predictive information with high systemic relevance for thousands of endpoints. Assessing retinal fundus photographs, the developed methodology identified diseases for which retinal information most impacted health trajectories. In summary, the results demonstrate the capability of neural survival models to integrate complex data modalities for multi-disease risk modeling in primary prevention and illustrate the tremendous potential of machine learning models to disrupt medical practice toward data-driven prevention at population scale.
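
    To illustrate the kind of neural network-based survival model referred to above, here is a minimal DeepSurv-style Cox sketch, not the study's actual architecture: a network maps covariates to a log-risk score and is trained with the negative Cox partial log-likelihood. Feature counts, layer sizes, and the synthetic data are assumptions.

        import torch
        from torch import nn

        class NeuralCox(nn.Module):
            # Maps covariates (e.g. metabolomics features) to a scalar log-risk score.
            def __init__(self, n_features):
                super().__init__()
                self.net = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(),
                                         nn.Linear(64, 1))
            def forward(self, x):
                return self.net(x).squeeze(-1)

        def cox_partial_likelihood_loss(log_risk, time, event):
            # Negative Cox partial log-likelihood (tie handling omitted for brevity).
            # time: follow-up times; event: 1 if the endpoint occurred, 0 if censored.
            order = torch.argsort(time, descending=True)          # longest follow-up first
            log_risk, event = log_risk[order], event[order]
            log_cum_hazard = torch.logcumsumexp(log_risk, dim=0)  # log-sum over the risk set
            return -((log_risk - log_cum_hazard) * event).sum() / event.sum().clamp(min=1)

        # Single training step on synthetic data.
        X = torch.randn(256, 30)
        time = torch.rand(256)
        event = (torch.rand(256) < 0.3).float()
        model = NeuralCox(30)
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        loss = cox_partial_likelihood_loss(model(X), time, event)
        loss.backward()
        opt.step()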

    Quadratic Speedups in Parallel Sampling from Determinantal Distributions

    We study the problem of parallelizing sampling from distributions related to determinants: symmetric, nonsymmetric, and partition-constrained determinantal point processes, as well as planar perfect matchings. For these distributions, the partition function, a.k.a. the count, can be obtained via matrix determinants, a highly parallelizable computation; Csanky proved it is in NC. However, parallel counting does not automatically translate to parallel sampling, as classic reductions between the two are inherently sequential. We show that a nearly quadratic parallel speedup over sequential sampling can be achieved for all the aforementioned distributions. If the distribution is supported on subsets of size $k$ of a ground set, we show how to approximately produce a sample in $\widetilde{O}(k^{\frac{1}{2}+c})$ time with polynomially many processors for any constant $c>0$. In the two special cases of symmetric determinantal point processes and planar perfect matchings, our bound improves to $\widetilde{O}(\sqrt{k})$ and we show how to sample exactly in these cases. As our main technical contribution, we fully characterize the limits of batching for the steps of sampling-to-counting reductions. We observe that only $O(1)$ steps can be batched together if we strive for exact sampling, even in the case of nonsymmetric determinantal point processes. However, we show that for approximate sampling, $\widetilde{\Omega}(k^{\frac{1}{2}-c})$ steps can be batched together, for any entropically independent distribution, which includes all mentioned classes of determinantal point processes. Entropic independence and related notions have been the source of breakthroughs in Markov chain analysis in recent years, so we expect our framework to prove useful for distributions beyond those studied in this work.
    Comment: 33 pages, SPAA 202
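
    For context on why the classic reduction is sequential, below is a minimal sketch of the standard chain-rule (Schur-complement) sampler for a symmetric DPP with marginal kernel K, in which each of the n decisions conditions on all previous ones. This is the sequential baseline being sped up, not the paper's parallel algorithm; the example kernel is an arbitrary construction.

        import numpy as np

        def sample_dpp(K, rng):
            # Chain-rule sampler: item j is included with its conditional marginal K[j, j],
            # then the kernel on the remaining items is updated by a Schur complement.
            # Every step depends on all previous decisions, hence the sequential bottleneck.
            K = np.array(K, dtype=float)
            n = K.shape[0]
            sample = []
            for j in range(n):
                p = K[j, j]
                if rng.random() < p:
                    sample.append(j)
                    pivot = p
                else:
                    pivot = p - 1.0
                if j + 1 < n:
                    K[j+1:, j+1:] -= np.outer(K[j+1:, j], K[j, j+1:]) / pivot
            return sample

        # Example: a valid marginal kernel has eigenvalues in [0, 1].
        rng = np.random.default_rng(0)
        Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
        K = Q @ np.diag(rng.uniform(0.0, 1.0, size=8)) @ Q.T
        print(sample_dpp(K, rng))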

    Measuring the impact of COVID-19 on hospital care pathways

    Care pathways in hospitals around the world reported significant disruption during the recent COVID-19 pandemic, but measuring the actual impact is more problematic. Process mining can be useful for hospital management to measure the conformance of real-life care to what might be considered normal operations. In this study, we aim to demonstrate that process mining can be used to investigate process changes associated with complex disruptive events. We studied perturbations to accident and emergency (A&E) and maternity pathways in a UK public hospital during the COVID-19 pandemic. Coincidentally, the hospital had implemented a Command Centre approach to patient-flow management, affording an opportunity to study both the planned improvement and the disruption due to the pandemic. Our study proposes and demonstrates a method for measuring and investigating the impact of such planned and unplanned disruptions on hospital care pathways. We found that during the pandemic, both A&E and maternity pathways showed measurable reductions in mean length of stay and a measurable drop in the percentage of pathways conforming to normative models. There were no distinctive patterns in the monthly mean values of length of stay or conformance throughout the phases of the installation of the hospital’s new Command Centre approach. Due to a deficit in the available A&E data, the findings for A&E pathways could not be interpreted.
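
    A minimal sketch of the kind of conformance measurement described above, written against pm4py's simplified interface; the file name, the choice of the inductive miner for the normative model, and the token-based fitness metric are assumptions for illustration, and exact function names can differ between pm4py versions.

        import pm4py

        # Event log exported from the hospital information system (file name assumed).
        log = pm4py.read_xes("ae_pathways.xes")

        # Normative model, e.g. discovered from a pre-pandemic baseline period.
        net, initial_marking, final_marking = pm4py.discover_petri_net_inductive(log)

        # Conformance of the observed pathways against the normative model; the result
        # includes the share of perfectly fitting traces and an average trace fitness.
        fitness = pm4py.fitness_token_based_replay(log, net, initial_marking, final_marking)
        print(fitness)

    Repeating the check on monthly slices of the log gives the conformance time series that is compared across pandemic and Command Centre phases.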

    Mathematical and Physical Methods to Construct Approximately Neutral Surfaces

    The magnitude of the diffusivity that characterizes lateral mixing in the ocean is about 10^6 to 10^8 times larger than that of vertical mixing. The lateral direction is along the neutral tangent plane (the same as the direction of the locally referenced potential density surface). However, due to the helical nature of neutral trajectories (the normal vector of the neutral tangent plane is not curl-free), well-defined neutral surfaces do not exist. Well-defined but only approximately neutral surfaces have traditionally been constructed by either (i) defining a three-dimensional density variable whose iso-surfaces (surfaces on which the density variable takes a constant value) describe the lateral direction, or (ii) creating two-dimensional approximately neutral surfaces (ANSs), which are normally more neutral than the iso-surfaces of a three-dimensional density variable. A three-dimensional neutral density variable, called Îł^SCV, is derived here as an improvement on the neutral density Îł^n of Jackett and McDougall (1997). Compared with Îł^n, Îł^SCV is independent of pressure and thus insensitive to the ubiquitous vertical heaving motions of waves and eddies, while having similar neutrality to Îł^n. The material derivatives (the rates of change) of Îł^SCV and Îł^n have also been derived using numerical methods; the material derivative of Îł^SCV is shown to be close to that of Îł^n. Oceanographers have traditionally estimated the quality of an ANS by focusing on the fictitious vertical diffusion caused by lateral diffusion being applied in the wrong direction. This thesis shows that the spurious advection through an ANS is another important consideration that limits the accuracy and usefulness of an ANS. Because of this concern, a two-dimensional approximately neutral surface, the ω_{u·s}-surface, is constructed to minimize the spurious dia-surface advection through the surface. The spurious dia-surface advection through the ω_{u·s}-surface is more than a hundred times smaller than that through the most neutral ANS to date; however, the fictitious diapycnal diffusion on it is larger. Therefore, the ω_{u·s+sÂČ}-surface is created to control both the spurious dia-surface advection and the fictitious diapycnal diffusion on the surface. It is shown that minimizing both the fictitious diffusion and the spurious dia-surface advection is important when using such surfaces in inverse studies. Hence the ω_{u·s+sÂČ}-surface is the best choice of surface for such studies.
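
    For orientation, the two diagnostics discussed above can be written compactly under the standard definitions used in this literature; this is a hedged summary with illustrative symbols, not necessarily the thesis's exact notation. With $\mathbf{s}$ the slope error of an approximately neutral surface relative to the neutral tangent plane, $K$ the lateral (epineutral) diffusivity, and $\mathbf{u}$ the lateral velocity along the surface,

        D^{\mathrm{fict}} = K \, \mathbf{s} \cdot \mathbf{s}      (fictitious diapycnal diffusivity from misdirected lateral diffusion)
        e^{\mathrm{spur}} \approx \mathbf{u} \cdot \mathbf{s}     (spurious dia-surface advection from lateral flow along the surface)

    On this reading, the subscripts in the surface names above indicate the quantities each surface is constructed to minimize.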

    Adapting Distributed Embedded Applications at Runtime

    The availability of third-party apps is among the key success factors for software ecosystems: users benefit from more features and faster innovation, while third-party solution vendors can leverage the platform to create successful offerings. However, this requires a certain decoupling of the engineering activities of the different parties, which has not yet been achieved for distributed control systems. While late and dynamic integration of third-party components would be required, the resulting control systems must provide high reliability with respect to real-time requirements, which leads to integration complexity. Closing this gap would particularly contribute to the vision of software-defined manufacturing, where an ecosystem of modern IT-based control system components could lead to faster innovation owing to their higher level of abstraction and the availability of various frameworks. This thesis therefore addresses the research question: How can we use modern IT technologies to enable independent evolution and easy third-party integration of software components in distributed control systems where deterministic end-to-end reactivity is required, and, in particular, how can we apply distributed changes to such systems consistently and reactively during operation? This thesis describes the challenges and related approaches in detail and points out that existing approaches do not fully address this research question. To close this gap, a formal specification of a runtime platform concept is presented in conjunction with a model-based engineering approach. The engineering approach decouples the engineering steps of component definition, integration, and deployment. The runtime platform supports this approach by isolating the components while still offering predictable end-to-end real-time behavior. Independent evolution of software components is supported through a concept for synchronous reconfiguration during full operation, i.e., dynamic orchestration of components. Time-critical state transfer is also supported and leads to at most bounded quality degradation. Reconfiguration planning is supported by analysis concepts, including simulation of a formally specified system and reconfiguration, and analysis of potential quality degradation with the evolving dataflow graph (EDFG) method. A platform-specific realization of the concepts, the real-time container architecture, is described as a reference implementation. The model and the prototype are evaluated with respect to the feasibility and applicability of the concepts in two case studies. The first case study is a minimalistic distributed control system used in different setups with different component variants and reconfiguration plans to compare the model and the prototype and to gather runtime statistics. The second case study is a smart factory showcase system with more challenging application components and interface technologies. The conclusion is that the concepts are feasible and applicable, even though both the concepts and the prototype still require future work, for example, to reach shorter cycle times.
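
    The synchronous-reconfiguration idea can be illustrated with a deliberately simplified, non-real-time Python sketch (not the real-time container architecture itself): component swaps requested at any time only take effect at a cycle boundary, after the old component's state has been transferred, so every cycle runs against a consistent configuration.

        import time

        class Component:
            # Toy dataflow component: one unit of work per cycle, state can be handed over.
            def step(self, inputs):
                pass
            def export_state(self):
                return {}
            def import_state(self, state):
                pass

        class Orchestrator:
            # Runs components cyclically; replacements become active only between cycles.
            def __init__(self, components, cycle_time_s=0.01):
                self.components = dict(components)
                self.cycle_time_s = cycle_time_s
                self.pending = {}  # component name -> replacement instance

            def request_swap(self, name, new_component):
                self.pending[name] = new_component

            def run_cycle(self, inputs):
                deadline = time.monotonic() + self.cycle_time_s
                for component in self.components.values():
                    component.step(inputs)                 # regular cyclic work
                for name, new in self.pending.items():     # reconfigure at the boundary
                    new.import_state(self.components[name].export_state())
                    self.components[name] = new
                self.pending.clear()
                time.sleep(max(0.0, deadline - time.monotonic()))

    The thesis's contribution lies in doing this with real-time guarantees, distributed consistency, and bounded quality degradation during the state transfer, which this toy sketch does not attempt.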

    Perturbation theory of polynomials and linear operators

    This survey revolves around the question of how the roots of a monic polynomial (resp. the spectral decomposition of a linear operator), whose coefficients depend in a smooth way on parameters, depend on those parameters. The parameter dependence of the polynomials (resp. operators) ranges from real analytic over $C^\infty$ to differentiable of finite order, with often drastically different regularity results for the roots (resp. eigenvalues and eigenvectors). Another interesting point is the difference between the perturbation theory of hyperbolic polynomials (where, by definition, all roots are real) and that of general complex polynomials. The subject, which started with Rellich's work in the 1930s, has enjoyed sustained interest that intensified in the last two decades, bringing some definitive optimal results. Throughout we try to explain the main proof ideas; Rellich's theorem and Bronshtein's theorem on hyperbolic polynomials are presented with full proofs. The survey is written for readers interested in singularity theory but also for those who intend to apply the results in other fields.
    Comment: 65 pages
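
    A standard toy example, not taken from the survey itself but commonly used to illustrate the phenomena it studies: consider the one-parameter families

        p_t(x) = x^2 - t^2   and   q_t(x) = x^2 - t,   t \in \mathbb{R}.

    Both have coefficients that depend real-analytically on $t$. The hyperbolic family $p_t$ admits the real-analytic choice of roots $x = \pm t$, consistent with Rellich's theorem, whereas the roots of $q_t$ are non-real for $t < 0$ and any continuous choice behaves like $\pm\sqrt{|t|}$ near $t = 0$: it is Hölder continuous of order 1/2 but not differentiable, showing how much regularity the roots can lose relative to the coefficients.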
    • 

    corecore