968 research outputs found

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 251, ITCS 2023, Complete Volum

    Robust estimation in exponential families: from theory to practice

    Get PDF

    Quality of experience and access network traffic management of HTTP adaptive video streaming

    Get PDF
    The thesis focuses on Quality of Experience (QoE) of HTTP adaptive video streaming (HAS) and traffic management in access networks to improve the QoE of HAS. First, the QoE impact of adaptation parameters and time on layer was investigated with subjective crowdsourcing studies. The results were used to compute a QoE-optimal adaptation strategy for given video and network conditions. This allows video service providers to develop and benchmark improved adaptation logics for HAS. Furthermore, the thesis investigated concepts to monitor video QoE on application and network layer, which can be used by network providers in the QoE-aware traffic management cycle. Moreover, an analytic and simulative performance evaluation of QoE-aware traffic management on a bottleneck link was conducted. Finally, the thesis investigated socially-aware traffic management for HAS via Wi-Fi offloading of mobile HAS flows. A model for the distribution of public Wi-Fi hotspots and a platform for socially-aware traffic management on private home routers was presented. A simulative performance evaluation investigated the impact of Wi-Fi offloading on the QoE and energy consumption of mobile HAS.Die Doktorarbeit beschäftigt sich mit Quality of Experience (QoE) – der subjektiv empfundenen Dienstgüte – von adaptivem HTTP Videostreaming (HAS) und mit Verkehrsmanagement, das in Zugangsnetzwerken eingesetzt werden kann, um die QoE des adaptiven Videostreamings zu verbessern. Zuerst wurde der Einfluss von Adaptionsparameters und der Zeit pro Qualitätsstufe auf die QoE von adaptivem Videostreaming mittels subjektiver Crowdsourcingstudien untersucht. Die Ergebnisse wurden benutzt, um die QoE-optimale Adaptionsstrategie für gegebene Videos und Netzwerkbedingungen zu berechnen. Dies ermöglicht Dienstanbietern von Videostreaming verbesserte Adaptionsstrategien für adaptives Videostreaming zu entwerfen und zu benchmarken. Weiterhin untersuchte die Arbeit Konzepte zum Überwachen von QoE von Videostreaming in der Applikation und im Netzwerk, die von Netzwerkbetreibern im Kreislauf des QoE-bewussten Verkehrsmanagements eingesetzt werden können. Außerdem wurde eine analytische und simulative Leistungsbewertung von QoE-bewusstem Verkehrsmanagement auf einer Engpassverbindung durchgeführt. Schließlich untersuchte diese Arbeit sozialbewusstes Verkehrsmanagement für adaptives Videostreaming mittels WLAN Offloading, also dem Auslagern von mobilen Videoflüssen über WLAN Netzwerke. Es wurde ein Modell für die Verteilung von öffentlichen WLAN Zugangspunkte und eine Plattform für sozialbewusstes Verkehrsmanagement auf privaten, häuslichen WLAN Routern vorgestellt. Abschließend untersuchte eine simulative Leistungsbewertung den Einfluss von WLAN Offloading auf die QoE und den Energieverbrauch von mobilem adaptivem Videostreaming

    Markov field models of molecular kinetics

    Get PDF
    Computer simulations such as molecular dynamics (MD) provide a possible means to understand protein dynamics and mechanisms on an atomistic scale. The resulting simulation data can be analyzed with Markov state models (MSMs), yielding a quantitative kinetic model that, e.g., encodes state populations and transition rates. However, the larger an investigated system, the more data is required to estimate a valid kinetic model. In this work, we show that this scaling problem can be escaped when decomposing a system into smaller ones, leveraging weak couplings between local domains. Our approach, termed independent Markov decomposition (IMD), is a first-order approximation neglecting couplings, i.e., it represents a decomposition of the underlying global dynamics into a set of independent local ones. We demonstrate that for truly independent systems, IMD can reduce the sampling by three orders of magnitude. IMD is applied to two biomolecular systems. First, synaptotagmin-1 is analyzed, a rapid calcium switch from the neurotransmitter release machinery. Within its C2A domain, local conformational switches are identified and modeled with independent MSMs, shedding light on the mechanism of its calcium-mediated activation. Second, the catalytic site of the serine protease TMPRSS2 is analyzed with a local drug-binding model. Equilibrium populations of different drug-binding modes are derived for three inhibitors, mirroring experimentally determined drug efficiencies. IMD is subsequently extended to an end-to-end deep learning framework called iVAMPnets, which learns a domain decomposition from simulation data and simultaneously models the kinetics in the local domains. We finally classify IMD and iVAMPnets as Markov field models (MFM), which we define as a class of models that describe dynamics by decomposing systems into local domains. Overall, this thesis introduces a local approach to Markov modeling that enables to quantitatively assess the kinetics of large macromolecular complexes, opening up possibilities to tackle current and future computational molecular biology questions

    Machine learning for the sustainable energy transition: a data-driven perspective along the value chain from manufacturing to energy conversion

    Get PDF
    According to the special report Global Warming of 1.5 °C of the IPCC, climate action is not only necessary but more than ever urgent. The world is witnessing rising sea levels, heat waves, events of flooding, droughts, and desertification resulting in the loss of lives and damage to livelihoods, especially in countries of the Global South. To mitigate climate change and commit to the Paris agreement, it is of the uttermost importance to reduce greenhouse gas emissions coming from the most emitting sector, namely the energy sector. To this end, large-scale penetration of renewable energy systems into the energy market is crucial for the energy transition toward a sustainable future by replacing fossil fuels and improving access to energy with socio-economic benefits. With the advent of Industry 4.0, Internet of Things technologies have been increasingly applied to the energy sector introducing the concept of smart grid or, more in general, Internet of Energy. These paradigms are steering the energy sector towards more efficient, reliable, flexible, resilient, safe, and sustainable solutions with huge environmental and social potential benefits. To realize these concepts, new information technologies are required, and among the most promising possibilities are Artificial Intelligence and Machine Learning which in many countries have already revolutionized the energy industry. This thesis presents different Machine Learning algorithms and methods for the implementation of new strategies to make renewable energy systems more efficient and reliable. It presents various learning algorithms, highlighting their advantages and limits, and evaluating their application for different tasks in the energy context. In addition, different techniques are presented for the preprocessing and cleaning of time series, nowadays collected by sensor networks mounted on every renewable energy system. With the possibility to install large numbers of sensors that collect vast amounts of time series, it is vital to detect and remove irrelevant, redundant, or noisy features, and alleviate the curse of dimensionality, thus improving the interpretability of predictive models, speeding up their learning process, and enhancing their generalization properties. Therefore, this thesis discussed the importance of dimensionality reduction in sensor networks mounted on renewable energy systems and, to this end, presents two novel unsupervised algorithms. The first approach maps time series in the network domain through visibility graphs and uses a community detection algorithm to identify clusters of similar time series and select representative parameters. This method can group both homogeneous and heterogeneous physical parameters, even when related to different functional areas of a system. The second approach proposes the Combined Predictive Power Score, a method for feature selection with a multivariate formulation that explores multiple sub-sets of expanding variables and identifies the combination of features with the highest predictive power over specified target variables. This method proposes a selection algorithm for the optimal combination of variables that converges to the smallest set of predictors with the highest predictive power. Once the combination of variables is identified, the most relevant parameters in a sensor network can be selected to perform dimensionality reduction. Data-driven methods open the possibility to support strategic decision-making, resulting in a reduction of Operation & Maintenance costs, machine faults, repair stops, and spare parts inventory size. Therefore, this thesis presents two approaches in the context of predictive maintenance to improve the lifetime and efficiency of the equipment, based on anomaly detection algorithms. The first approach proposes an anomaly detection model based on Principal Component Analysis that is robust to false alarms, can isolate anomalous conditions, and can anticipate equipment failures. The second approach has at its core a neural architecture, namely a Graph Convolutional Autoencoder, which models the sensor network as a dynamical functional graph by simultaneously considering the information content of individual sensor measurements (graph node features) and the nonlinear correlations existing between all pairs of sensors (graph edges). The proposed neural architecture can capture hidden anomalies even when the turbine continues to deliver the power requested by the grid and can anticipate equipment failures. Since the model is unsupervised and completely data-driven, this approach can be applied to any wind turbine equipped with a SCADA system. When it comes to renewable energies, the unschedulable uncertainty due to their intermittent nature represents an obstacle to the reliability and stability of energy grids, especially when dealing with large-scale integration. Nevertheless, these challenges can be alleviated if the natural sources or the power output of renewable energy systems can be forecasted accurately, allowing power system operators to plan optimal power management strategies to balance the dispatch between intermittent power generations and the load demand. To this end, this thesis proposes a multi-modal spatio-temporal neural network for multi-horizon wind power forecasting. In particular, the model combines high-resolution Numerical Weather Prediction forecast maps with turbine-level SCADA data and explores how meteorological variables on different spatial scales together with the turbines' internal operating conditions impact wind power forecasts. The world is undergoing a third energy transition with the main goal to tackle global climate change through decarbonization of the energy supply and consumption patterns. This is not only possible thanks to global cooperation and agreements between parties, power generation systems advancements, and Internet of Things and Artificial Intelligence technologies but also necessary to prevent the severe and irreversible consequences of climate change that are threatening life on the planet as we know it. This thesis is intended as a reference for researchers that want to contribute to the sustainable energy transition and are approaching the field of Artificial Intelligence in the context of renewable energy systems

    Behavior quantification as the missing link between fields: Tools for digital psychiatry and their role in the future of neurobiology

    Full text link
    The great behavioral heterogeneity observed between individuals with the same psychiatric disorder and even within one individual over time complicates both clinical practice and biomedical research. However, modern technologies are an exciting opportunity to improve behavioral characterization. Existing psychiatry methods that are qualitative or unscalable, such as patient surveys or clinical interviews, can now be collected at a greater capacity and analyzed to produce new quantitative measures. Furthermore, recent capabilities for continuous collection of passive sensor streams, such as phone GPS or smartwatch accelerometer, open avenues of novel questioning that were previously entirely unrealistic. Their temporally dense nature enables a cohesive study of real-time neural and behavioral signals. To develop comprehensive neurobiological models of psychiatric disease, it will be critical to first develop strong methods for behavioral quantification. There is huge potential in what can theoretically be captured by current technologies, but this in itself presents a large computational challenge -- one that will necessitate new data processing tools, new machine learning techniques, and ultimately a shift in how interdisciplinary work is conducted. In my thesis, I detail research projects that take different perspectives on digital psychiatry, subsequently tying ideas together with a concluding discussion on the future of the field. I also provide software infrastructure where relevant, with extensive documentation. Major contributions include scientific arguments and proof of concept results for daily free-form audio journals as an underappreciated psychiatry research datatype, as well as novel stability theorems and pilot empirical success for a proposed multi-area recurrent neural network architecture.Comment: PhD thesis cop

    Decoding algorithms for surface codes

    Full text link
    Quantum technologies have the potential to solve computationally hard problems that are intractable via classical means. Unfortunately, the unstable nature of quantum information makes it prone to errors. For this reason, quantum error correction is an invaluable tool to make quantum information reliable and enable the ultimate goal of fault-tolerant quantum computing. Surface codes currently stand as the most promising candidates to build error corrected qubits given their two-dimensional architecture, a requirement of only local operations, and high tolerance to quantum noise. Decoding algorithms are an integral component of any error correction scheme, as they are tasked with producing accurate estimates of the errors that affect quantum information, so that it can subsequently be corrected. A critical aspect of decoding algorithms is their speed, since the quantum state will suffer additional errors with the passage of time. This poses a connundrum-like tradeoff, where decoding performance is improved at the expense of complexity and viceversa. In this review, a thorough discussion of state-of-the-art surface code decoding algorithms is provided. The core operation of these methods is described along with existing variants that show promise for improved results. In addition, both the decoding performance, in terms of error correction capability, and decoding complexity, are compared. A review of the existing software tools regarding surface code decoding is also provided.Comment: 54 pages, 31 figure

    If interpretability is the answer, what is the question?

    Get PDF
    Due to the ability to model even complex dependencies, machine learning (ML) can be used to tackle a broad range of (high-stakes) prediction problems. The complexity of the resulting models comes at the cost of transparency, meaning that it is difficult to understand the model by inspecting its parameters. This opacity is considered problematic since it hampers the transfer of knowledge from the model, undermines the agency of individuals affected by algorithmic decisions, and makes it more challenging to expose non-robust or unethical behaviour. To tackle the opacity of ML models, the field of interpretable machine learning (IML) has emerged. The field is motivated by the idea that if we could understand the model's behaviour -- either by making the model itself interpretable or by inspecting post-hoc explanations -- we could also expose unethical and non-robust behaviour, learn about the data generating process, and restore the agency of affected individuals. IML is not only a highly active area of research, but the developed techniques are also widely applied in both industry and the sciences. Despite the popularity of IML, the field faces fundamental criticism, questioning whether IML actually helps in tackling the aforementioned problems of ML and even whether it should be a field of research in the first place: First and foremost, IML is criticised for lacking a clear goal and, thus, a clear definition of what it means for a model to be interpretable. On a similar note, the meaning of existing methods is often unclear, and thus they may be misunderstood or even misused to hide unethical behaviour. Moreover, estimating conditional-sampling-based techniques poses a significant computational challenge. With the contributions included in this thesis, we tackle these three challenges for IML. We join a range of work by arguing that the field struggles to define and evaluate "interpretability" because incoherent interpretation goals are conflated. However, the different goals can be disentangled such that coherent requirements can inform the derivation of the respective target estimands. We demonstrate this with the examples of two interpretation contexts: recourse and scientific inference. To tackle the misinterpretation of IML methods, we suggest deriving formal interpretation rules that link explanations to aspects of the model and data. In our work, we specifically focus on interpreting feature importance. Furthermore, we collect interpretation pitfalls and communicate them to a broader audience. To efficiently estimate conditional-sampling-based interpretation techniques, we propose two methods that leverage the dependence structure in the data to simplify the estimation problems for Conditional Feature Importance (CFI) and SAGE. A causal perspective proved to be vital in tackling the challenges: First, since IML problems such as algorithmic recourse are inherently causal; Second, since causality helps to disentangle the different aspects of model and data and, therefore, to distinguish the insights that different methods provide; And third, algorithms developed for causal structure learning can be leveraged for the efficient estimation of conditional-sampling based IML methods.Aufgrund der Fähigkeit, selbst komplexe Abhängigkeiten zu modellieren, kann maschinelles Lernen (ML) zur Lösung eines breiten Spektrums von anspruchsvollen Vorhersageproblemen eingesetzt werden. Die Komplexität der resultierenden Modelle geht auf Kosten der Interpretierbarkeit, d. h. es ist schwierig, das Modell durch die Untersuchung seiner Parameter zu verstehen. Diese Undurchsichtigkeit wird als problematisch angesehen, da sie den Wissenstransfer aus dem Modell behindert, sie die Handlungsfähigkeit von Personen, die von algorithmischen Entscheidungen betroffen sind, untergräbt und sie es schwieriger macht, nicht robustes oder unethisches Verhalten aufzudecken. Um die Undurchsichtigkeit von ML-Modellen anzugehen, hat sich das Feld des interpretierbaren maschinellen Lernens (IML) entwickelt. Dieses Feld ist von der Idee motiviert, dass wir, wenn wir das Verhalten des Modells verstehen könnten - entweder indem wir das Modell selbst interpretierbar machen oder anhand von post-hoc Erklärungen - auch unethisches und nicht robustes Verhalten aufdecken, über den datengenerierenden Prozess lernen und die Handlungsfähigkeit betroffener Personen wiederherstellen könnten. IML ist nicht nur ein sehr aktiver Forschungsbereich, sondern die entwickelten Techniken werden auch weitgehend in der Industrie und den Wissenschaften angewendet. Trotz der Popularität von IML ist das Feld mit fundamentaler Kritik konfrontiert, die in Frage stellt, ob IML tatsächlich dabei hilft, die oben genannten Probleme von ML anzugehen, und ob es überhaupt ein Forschungsgebiet sein sollte: In erster Linie wird an IML kritisiert, dass es an einem klaren Ziel und damit an einer klaren Definition dessen fehlt, was es für ein Modell bedeutet, interpretierbar zu sein. Weiterhin ist die Bedeutung bestehender Methoden oft unklar, so dass sie missverstanden oder sogar missbraucht werden können, um unethisches Verhalten zu verbergen. Letztlich stellt die Schätzung von auf bedingten Stichproben basierenden Verfahren eine erhebliche rechnerische Herausforderung dar. In dieser Arbeit befassen wir uns mit diesen drei grundlegenden Herausforderungen von IML. Wir schließen uns der Argumentation an, dass es schwierig ist, "Interpretierbarkeit" zu definieren und zu bewerten, weil inkohärente Interpretationsziele miteinander vermengt werden. Die verschiedenen Ziele lassen sich jedoch entflechten, sodass kohärente Anforderungen die Ableitung der jeweiligen Zielgrößen informieren. Wir demonstrieren dies am Beispiel von zwei Interpretationskontexten: algorithmischer Regress und wissenschaftliche Inferenz. Um der Fehlinterpretation von IML-Methoden zu begegnen, schlagen wir vor, formale Interpretationsregeln abzuleiten, die Erklärungen mit Aspekten des Modells und der Daten verknüpfen. In unserer Arbeit konzentrieren wir uns speziell auf die Interpretation von sogenannten Feature Importance Methoden. Darüber hinaus tragen wir wichtige Interpretationsfallen zusammen und kommunizieren sie an ein breiteres Publikum. Zur effizienten Schätzung auf bedingten Stichproben basierender Interpretationstechniken schlagen wir zwei Methoden vor, die die Abhängigkeitsstruktur in den Daten nutzen, um die Schätzprobleme für Conditional Feature Importance (CFI) und SAGE zu vereinfachen. Eine kausale Perspektive erwies sich als entscheidend für die Bewältigung der Herausforderungen: Erstens, weil IML-Probleme wie der algorithmische Regress inhärent kausal sind; zweitens, weil Kausalität hilft, die verschiedenen Aspekte von Modell und Daten zu entflechten und somit die Erkenntnisse, die verschiedene Methoden liefern, zu unterscheiden; und drittens können wir Algorithmen, die für das Lernen kausaler Struktur entwickelt wurden, für die effiziente Schätzung von auf bindingten Verteilungen basierenden IML-Methoden verwenden

    Reinforcement Learning Curricula as Interpolations between Task Distributions

    Get PDF
    In the last decade, the increased availability of powerful computing machinery has led to an increasingly widespread application of machine learning methods. Machine learning has been particularly successful when large models, typically neural networks with an ever-increasing number of parameters, can leverage vast data to make predictions. While reinforcement learning (RL) has been no exception from this development, a distinguishing feature of RL is its well-known exploration-exploitation trade-off, whose optimal solution – while possible to model as a partially observable Markov decision process – evades computation in all but the simplest problems. Consequently, it seems unsurprising that notable demonstrations of reinforcement learning, such as an RL-based Go agent (AlphaGo) by Deepmind beating the professional Go player Lee Sedol, relied both on the availability of massive computing capabilities and specific forms of regularization that facilitate learning. In the case of AlphaGo, this regularization came in the form of self-play, enabling learning by interacting with gradually more proficient opponents. In this thesis, we develop techniques that, similarly to the concept of self-play of AlphaGo, improve the learning performance of RL agents by training on sequences of increasingly complex tasks. These task sequences are typically called curricula and are known to side-step problems such as slow learning or convergence to poor behavior that may occur when directly learning in complicated tasks. The algorithms we develop in this thesis create curricula by minimizing distances or divergences between probability distributions of learning tasks, generating interpolations between an initial distribution of easy learning tasks and a target task distribution. Apart from improving the learning performance of RL agents in experiments, developing methods that realize curricula as interpolations between task distributions results in a nuanced picture of key aspects of successful reinforcement learning curricula. In Chapter 1, we start this thesis by introducing required reinforcement learning notation and then motivating curriculum reinforcement learning from the perspective of continuation methods for non-linear optimization. Similar to curricula for reinforcement learning agents, continuation methods have been used in non-linear optimization to solve challenging optimization problems. This similarity provides an intuition about the effect of the curricula we aim to generate and their limits. In Chapter 2, we transfer the concept of self-paced learning, initially proposed in the supervised learning community, to the problem of RL, showing that an automated curriculum generation for RL agents can be motivated by a regularized RL objective. This regularized RL objective implies generating a curriculum as a sequence of task distributions that trade off the expected agent performance against similarity to a specified distribution of target tasks. This view on curriculum RL contrasts existing approaches, as it motivates curricula via a regularized RL objective instead of generating them from a set of assumptions about an optimal curriculum. In experiments, we show that an approximate implementation of the aforementioned curriculum – that restricts the interpolating task distribution to a Gaussian – results in improved learning performance compared to regular reinforcement learning, matching or surpassing the performance of existing curriculum-based methods. Subsequently, Chapter 3 builds up on the intuition of curricula as sequences of interpolating task distributions established in Chapter 2. Motivated by using more flexible task distribution representations, we show how parametric assumptions play a crucial role in the empirical success of the previous approach and subsequently uncover key ingredients that enable the generation of meaningful curricula without assuming a parametric model of the task distributions. One major ingredient is an explicit notion of task similarity via a distance function of two Markov Decision Processes. We turn towards optimal transport theory, allowing for flexible particle-based representations of the task distributions while properly considering the newly introduced metric structure of the task space. Combined with other improvements to our first method, such as a more aggressive restriction of the curriculum to tasks that are not too hard for the agent, the resulting approach delivers consistently high learning performance in multiple experiments. In the final Chapter 4, we apply the refined method of Chapter 3 to a trajectory-tracking task, in which we task an RL agent to follow a three-dimensional reference trajectory with the tip of an inverted pendulum mounted on a Barrett Whole Arm Manipulator. The access to only positional information results in a partially observable system that, paired with its inherent instability, underactuation, and non-trivial kinematic structure, presents a challenge for modern reinforcement learning algorithms, which we tackle via curricula. The technically infinite-dimensional task space of target trajectories allows us to probe the developed curriculum learning method for flaws that have not surfaced in the rather low-dimensional experiments of the previous chapters. Through an improved optimization scheme that better respects the non-Euclidean structure of target trajectories, we reliably generate curricula of trajectories to be tracked, resulting in faster and more robust learning compared to an RL baseline that does not exploit this form of structured learning. The learned policy matches the performance of an optimal control baseline on the real system, demonstrating the potential of curriculum RL to learn state estimation and control for non-linear tracking tasks jointly. In summary, this thesis introduces a perspective on reinforcement learning curricula as interpolations between task distributions. The methods developed under this perspective enjoy a precise formulation as optimization problems and deliver empirical benefits throughout experiments. Building upon this precise formulation may allow future work to advance the formal understanding of reinforcement learning curricula and, with that, enable the solution of challenging decision-making and control problems with reinforcement learning

    Deep material networks for efficient scale-bridging in thermomechanical simulations of solids

    Get PDF
    We investigate deep material networks (DMN). We lay the mathematical foundation of DMNs and present a novel DMN formulation, which is characterized by a reduced number of degrees of freedom. We present a efficient solution technique for nonlinear DMNs to accelerate complex two-scale simulations with minimal computational effort. A new interpolation technique is presented enabling the consideration of fluctuating microstructure characteristics in macroscopic simulations
    • …
    corecore