Deep learning for unsupervised domain adaptation in medical imaging: Recent advancements and future perspectives
Deep learning has demonstrated remarkable performance across various tasks in
medical imaging. However, these approaches primarily focus on supervised
learning, assuming that the training and testing data are drawn from the same
distribution. Unfortunately, this assumption may not always hold true in
practice. To address these issues, unsupervised domain adaptation (UDA)
techniques have been developed to transfer knowledge from a labeled domain to a
related but unlabeled domain. In recent years, significant advancements have
been made in UDA, resulting in a wide range of methodologies, including feature
alignment, image translation, self-supervision, and disentangled representation
methods, among others. In this paper, we provide a comprehensive literature
review of recent deep UDA approaches in medical imaging from a technical
perspective. Specifically, we categorize current UDA research in medical
imaging into six groups and further divide them into finer subcategories based
on the different tasks they perform. We also discuss the respective datasets
used in the studies to assess the divergence between the different domains.
Finally, we discuss emerging areas and provide insights and discussions on
future research directions to conclude this survey.
Comment: Under Review
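To make the feature-alignment family of UDA methods concrete, here is a minimal sketch of a linear-kernel Maximum Mean Discrepancy between source and target features. It is illustrative only, not drawn from any specific paper in the survey; the array shapes and the site-shift scenario are invented.

```python
import numpy as np

def linear_mmd(source, target):
    """Squared distance between mean feature vectors of two domains.

    A linear-kernel Maximum Mean Discrepancy: one of the simplest
    feature-alignment objectives; minimizing it pulls the two feature
    distributions' means together.
    """
    diff = source.mean(axis=0) - target.mean(axis=0)
    return float(diff @ diff)

# Toy features: the target domain is shifted by a constant offset,
# mimicking a distribution shift between imaging sites.
rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(128, 16))  # "labeled" source features
tgt = rng.normal(0.5, 1.0, size=(128, 16))  # shifted, unlabeled target
print(linear_mmd(src, tgt) > linear_mmd(src, src))  # prints True
```

In a full UDA pipeline this quantity would be added to the supervised loss so that the encoder learns domain-invariant features; richer variants use characteristic kernels rather than the plain mean difference shown here.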
Soliton Gas: Theory, Numerics and Experiments
The concept of soliton gas was introduced in 1971 by V. Zakharov as an
infinite collection of weakly interacting solitons in the framework of
Korteweg-de Vries (KdV) equation. In this theoretical construction of a diluted
soliton gas, solitons with random parameters are almost non-overlapping. More
recently, the concept has been extended to dense gases in which solitons
strongly and continuously interact. The notion of soliton gas is inherently
associated with integrable wave systems described by nonlinear partial
differential equations like the KdV equation or the one-dimensional nonlinear
Schr\"odinger equation that can be solved using the inverse scattering
transform. Over the last few years, the field of soliton gases has received a
rapidly growing interest from both the theoretical and experimental points of
view. In particular, it has been realized that the soliton gas dynamics
underlies some fundamental nonlinear wave phenomena such as spontaneous
modulation instability and the formation of rogue waves. The recently
discovered deep connections of soliton gas theory with generalized
hydrodynamics have broadened the field and opened new fundamental questions
related to the soliton gas statistics and thermodynamics. We review the main
recent theoretical and experimental results in the field of soliton gas. The
key conceptual tools of the field, such as the inverse scattering transform,
the thermodynamic limit of finite-gap potentials and the Generalized Gibbs
Ensembles are introduced and various open questions and future challenges are
discussed.
Comment: 35 pages, 8 figures
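For reference, the KdV equation in a common normalization, together with its one-soliton solution with amplitude parameter $\kappa$ and phase $x_0$:

```latex
\[
u_t + 6\,u\,u_x + u_{xxx} = 0,
\qquad
u(x,t) = 2\kappa^2\,\operatorname{sech}^2\!\bigl(\kappa\,(x - 4\kappa^2 t - x_0)\bigr).
\]
```

A diluted soliton gas in Zakharov's sense is built by superposing many such solitons with random parameters $(\kappa, x_0)$, almost non-overlapping in space.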
Technical Dimensions of Programming Systems
Programming requires much more than just writing code in a programming language. It is usually done in the context of a stateful environment, by interacting with a system through a graphical user interface. Yet, this wide space of possibilities lacks a common structure for navigation. Work on programming systems fails to form a coherent body of research, making it hard to improve on past work and advance the state of the art.
In computer science, much has been said and done to allow comparison of programming languages, yet no similar theory exists for programming systems; we believe that programming systems deserve a theory too.
We present a framework of technical dimensions which capture the underlying characteristics of programming systems and provide a means for conceptualizing and comparing them.
We identify technical dimensions by examining past influential programming systems and reviewing their design principles, technical capabilities, and styles of user interaction. Technical dimensions capture characteristics that may be studied, compared and advanced independently. This makes it possible to talk about programming systems in a way that can be shared and constructively debated rather than relying solely on personal impressions.
Our framework is derived using a qualitative analysis of past programming systems. We outline two concrete ways of using our framework. First, we show how it can analyze a recently developed novel programming system. Then, we use it to identify an interesting unexplored point in the design space of programming systems.
Much research effort focuses on building programming systems that are easier to use, accessible to non-experts, moldable and/or powerful, but such efforts are disconnected. They are informal, guided by the personal vision of their authors, and can thus be evaluated and compared only on the basis of individual experience using them. By providing foundations for more systematic research, we can help programming systems researchers to stand, at last, on the shoulders of giants.
A Design Science Research Approach to Smart and Collaborative Urban Supply Networks
Urban supply networks are facing increasing demands and challenges and thus constitute a relevant field for research and practical development. Supply chain management holds enormous potential and relevance for society and everyday life, as the flows of goods and information are important economic functions. Supply chain management is a heterogeneous field, however, and its literature base is difficult to manage and navigate. Disruptive digital technologies and the implementation of cross-network information analysis and sharing drive the need for new organisational and technological approaches. Practical issues are manifold and include mega trends such as digital transformation, urbanisation, and environmental awareness.
A promising approach to solving these problems is the realisation of smart and collaborative supply networks. The growth of artificial intelligence in recent years has led to a wide range of applications in a variety of domains. However, the potential of artificial intelligence in supply chain management has not yet been fully exploited. Similarly, value creation increasingly takes place in networked value creation cycles that have become continuously more collaborative, complex, and dynamic as interactions in business processes involving information technologies have become more intense.
Following a design science research approach, this cumulative thesis comprises the development and discussion of four artefacts for the analysis and advancement of smart and collaborative urban supply networks. This thesis aims to highlight the potential of artificial intelligence-based supply networks, to advance data-driven inter-organisational collaboration, and to improve last-mile supply network sustainability. Based on thorough machine learning and systematic literature reviews, reference and system dynamics modelling, simulation, and qualitative empirical research, the artefacts provide a valuable contribution to research and practice.
Limit theorems for non-Markovian and fractional processes
This thesis examines various non-Markovian and fractional processes (rough volatility models, stochastic Volterra equations, Wiener chaos expansions) through the prism of asymptotic analysis.
Stochastic Volterra systems serve as a conducive framework encompassing most rough volatility models used in mathematical finance. In Chapter 2, we provide a unified treatment of pathwise large and moderate deviations principles for a general class of multidimensional stochastic Volterra equations with singular kernels, not necessarily of convolution form. Our methodology is based on the weak convergence approach by Budhiraja, Dupuis and Ellis.
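Schematically, and in generic notation rather than the thesis's exact assumptions, a stochastic Volterra equation of the kind treated in Chapter 2 reads

```latex
\[
X_t = X_0 + \int_0^t K(t,s)\, b(X_s)\,\mathrm{d}s
          + \int_0^t K(t,s)\, \sigma(X_s)\,\mathrm{d}W_s ,
\]
```

where $W$ is a Brownian motion and the kernel $K$ may be singular on the diagonal; the convolution special case $K(t,s) = (t-s)^{H-1/2}$ with $H \in (0, 1/2)$ recovers the rough volatility setting.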
This powerful approach also enables us to investigate the pathwise large deviations of families of white noise functionals characterised by their Wiener chaos expansion.
In Chapter 3, we provide sufficient conditions for the large deviations principle to hold in path space, thereby refreshing a problem left open by Pérez-Abreu (1993). Hinging on analysis on Wiener space, the proof involves describing, controlling and identifying the limit of perturbed multiple stochastic integrals.
In Chapter 4, we come back to mathematical finance via the route of Malliavin calculus. We present explicit small-time formulae for the at-the-money implied volatility, skew and curvature in a large class of models, including rough volatility models and their multi-factor versions. Our general setup encompasses both European options on a stock and VIX options. In particular, we develop a detailed analysis of the two-factor rough Bergomi model.
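In standard notation, with $k$ the log-moneyness and $I(T,k)$ the implied volatility at maturity $T$, the at-the-money level, skew and curvature are the first coefficients of the small-moneyness expansion

```latex
\[
I(T,k) = I(T,0) + \partial_k I(T,0)\,k
       + \tfrac{1}{2}\,\partial_k^2 I(T,0)\,k^2 + o(k^2),
\]
```

and the small-time formulae of Chapter 4 describe the behaviour of these coefficients as $T \to 0$.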
Finally, in Chapter 5, we consider the large-time behaviour of affine stochastic Volterra equations, an under-developed area in the absence of Markovianity.
We leverage a measure-valued Markovian lift introduced by Cuchiero and Teichmann and the associated notion of the generalised Feller property.
This setting allows us to prove the existence of an invariant measure for the lift and hence of a stationary distribution for the affine Volterra process, featuring in the rough Heston model.
Statistical-dynamical analyses and modelling of multi-scale ocean variability
This thesis aims to provide a comprehensive analysis of multi-scale oceanic variabilities using various statistical and dynamical tools and to explore data-driven methods for the correct statistical emulation of the oceans. We considered the classical, wind-driven, double-gyre ocean circulation model in the quasi-geostrophic approximation and obtained its eddy-resolving solutions in terms of the potential vorticity anomaly and geostrophic streamfunction. The reference solutions possess two asymmetric gyres of opposite circulations, separated by a strong meandering eastward jet with rich eddy activity around it, analogous to the Gulf Stream in the North Atlantic and the Kuroshio in the North Pacific.
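As a point of reference, one common single-layer form of such a quasi-geostrophic model can be written schematically as (the actual configuration, forcing $F$ and dissipation $D$ depend on the setup)

```latex
\[
\frac{\partial q}{\partial t} + J(\psi, q) + \beta\,\frac{\partial \psi}{\partial x} = F + D,
\qquad q = \nabla^2 \psi ,
\]
```

where $\psi$ is the geostrophic streamfunction, $q$ the potential vorticity anomaly, $J(\psi, q) = \psi_x q_y - \psi_y q_x$ the advection Jacobian, and $\beta$ the planetary vorticity gradient.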
This thesis is divided into two parts. The first part discusses a novel scale-separation method based on local spatial correlations, called correlation-based decomposition (CBD), and provides a comprehensive analysis of mesoscale eddy forcing. In particular, we analyse the instantaneous and time-lagged interactions between the diagnosed eddy forcing and the evolving large-scale potential vorticity anomaly using novel 'product integral' characteristics. The product integral time series uncover robust causality between two drastically different yet interacting flow quantities, termed 'eddy backscatter'. We also show data-driven augmentation of non-eddy-resolving ocean models by feeding them the eddy fields to restore the missing eddy-driven features, such as the merging western boundary currents, their eastward extension, and the low-frequency variabilities of the gyres.
In the second part, we present a systematic inter-comparison of linear regression (LR), stochastic, and deep-learning methods to build low-cost reduced-order statistical emulators of the oceans. We obtain forecasts on seasonal and centennial timescales and assess them for their skill, cost, and complexity. We find that the multi-level linear stochastic model performs best, followed by the 'hybrid stochastically-augmented deep learning models'. The superiority of these methods underscores the importance of incorporating core dynamics, memory effects, and model errors for robust emulation of multi-scale dynamical systems, such as the oceans.
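As a toy illustration of the simplest emulator class in this comparison, the sketch below fits a least-squares linear one-step emulator to a synthetic trajectory. The dynamics, dimensions, and noise level are invented for illustration; the thesis's emulators target the ocean model's reduced state instead.

```python
import numpy as np

# A toy one-step linear emulator: fit x_{t+1} ≈ x_t A by least squares.
rng = np.random.default_rng(1)
A_true = np.array([[0.99, 0.1], [-0.1, 0.99]])  # stable rotating dynamics
states = [rng.normal(size=2)]
for _ in range(500):
    states.append(A_true @ states[-1] + 0.01 * rng.normal(size=2))
X = np.asarray(states)

# Least-squares fit of the transition matrix from the trajectory.
A_fit, *_ = np.linalg.lstsq(X[:-1], X[1:], rcond=None)
pred = X[:-1] @ A_fit              # one-step-ahead predictions
err = float(np.mean((pred - X[1:]) ** 2))
print("one-step MSE:", err)
```

Rolling such a fitted model forward (optionally with an added stochastic term for the unresolved residual) is the basic recipe behind the stochastic emulators compared above.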
Annals [...].
Pedometrics: innovation in the tropics; Legacy data: how to make it useful?; Advances in soil sensing; Pedometric guidelines for systematic soil surveys. Online event. Coordinated by: Waldir de Carvalho Junior, Helena Saraiva Koenow Pinheiro, Ricardo Simão Diniz Dalmolin.
Understanding Deep Learning Optimization via Benchmarking and Debugging
The central paradigm of machine learning (ML) is the idea that computers can learn the strategies needed to solve a task without being explicitly programmed to do so. The hope is that, given data, computers can recognize underlying patterns and figure out how to perform tasks without extensive human oversight. To achieve this, many machine learning problems are framed as minimizing a loss function, which makes optimization methods a core part of training ML models. Although machine learning, and in particular deep learning, is often perceived as a cutting-edge technology, the underlying optimization algorithms tend to resemble rather simplistic, even archaic methods. Crucially, they rely on extensive human intervention to successfully train modern neural networks. One reason for this tedious, finicky, and lengthy training process lies in our insufficient understanding of optimization methods in the challenging deep learning setting. As a result, training neural nets, to this day, has the reputation of being more of an art form than a science and requires a level of human assistance that runs counter to the core principle of ML. Although hundreds of optimization algorithms for deep learning have been proposed, there is no widely agreed-upon protocol for evaluating their performance. Without a standardized and independent evaluation protocol, it is difficult to reliably demonstrate the usefulness of novel methods.
In this thesis, we present strategies for quantitatively and reproducibly comparing deep learning optimizers in a meaningful way. This protocol considers the unique challenges of deep learning, such as the inherent stochasticity or the crucial distinction between learning and pure optimization. It is formalized and automated in the Python package DeepOBS and allows fairer, faster, and more convincing empirical comparisons of deep learning optimizers. Based on this benchmarking protocol, we compare fifteen popular deep learning optimizers to gain insight into the field's current state. To provide evidence-backed heuristics for choosing among the growing list of optimization methods, we extensively evaluate them with roughly 50,000 training runs. Our benchmark indicates that the comparatively traditional Adam optimizer remains a strong but not dominating contender and that newer methods fail to consistently outperform it. In addition to the optimizer, other causes can impede neural network training, such as inefficient model architectures or hyperparameters. Traditional performance metrics, such as training loss or validation accuracy, can show if a model is learning or not, but not why. To provide this understanding and a glimpse into the black box of neural networks, we developed Cockpit, a debugging tool specifically for deep learning. It combines novel and proven observables into a live monitoring tool for practitioners. Among other findings, Cockpit reveals that well-tuned training runs consistently overshoot the local minimum, at least for significant portions of the training. The use of thorough benchmarking experiments and tailored debugging tools improves our understanding of neural network training. In the absence of theoretical insights, these empirical results and practical tools are essential for guiding practitioners.
More importantly, our results show that there is a need and a clear path for fundamentally different optimization methods to make deep learning more accessible, robust, and resource-efficient.
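The benchmarking idea of a fixed problem, a fixed step budget, and several seeds per method can be sketched in miniature on a noisy quadratic. The SGD and Adam updates below are textbook implementations, not DeepOBS internals, and the toy problem is invented for illustration:

```python
import numpy as np

def run(optimizer, steps=200, seed=0, lr=0.1):
    """Minimize the noisy quadratic f(w) = ||w||^2 / 2 with one optimizer.

    A miniature benchmarking protocol: same problem, same step budget,
    several seeds per method.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(size=10)
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    for t in range(1, steps + 1):
        grad = w + 0.1 * rng.normal(size=w.shape)  # noisy gradient of f
        if optimizer == "sgd":
            w = w - lr * grad
        else:  # "adam": Kingma & Ba's bias-corrected moment estimates
            m = 0.9 * m + 0.1 * grad
            v = 0.999 * v + 0.001 * grad**2
            m_hat = m / (1 - 0.9**t)
            v_hat = v / (1 - 0.999**t)
            w = w - lr * m_hat / (np.sqrt(v_hat) + 1e-8)
    return 0.5 * float(w @ w)

# Average the final loss over several seeds, as fair comparisons require.
for name in ("sgd", "adam"):
    losses = [run(name, seed=s) for s in range(5)]
    print(name, float(np.mean(losses)))
```

A real protocol would additionally tune each method's hyperparameters with the same budget and separate training loss from validation performance, which is precisely the kind of care the thesis's benchmark formalizes.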