801 research outputs found

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Peak Estimation of Time Delay Systems using Occupation Measures

    This work proposes a method to compute the maximum value obtained by a state function along trajectories of a Delay Differential Equation (DDE). An example of this task is finding the maximum number of infected people in an epidemic model with a nonzero incubation period. The variables of this peak estimation problem include the stopping time and the original history (restricted to a class of admissible histories). The original nonconvex DDE peak estimation problem is approximated by an infinite-dimensional Linear Program (LP) in occupation measures, inspired by existing measure-based methods in peak estimation and optimal control. This LP is approximated from above by a sequence of Semidefinite Programs (SDPs) through the moment-Sum of Squares (SOS) hierarchy. Effectiveness of this scheme in providing peak estimates for DDEs is demonstrated with provided examples.
    Comment: 34 pages, 14 figures, 3 tables
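To make the peak estimation task concrete, here is a minimal sketch that simulates one trajectory of a scalar DDE and records the largest value of a state function along it. This is plain forward-Euler sampling, not the occupation-measure LP or the moment-SOS hierarchy described in the abstract: a single simulated trajectory can only give a lower bound on the peak, whereas the SDPs bound it from above. The function name `simulate_dde_peak`, the test equation x'(t) = -x(t - 1), the constant history and the state function p(x) = -x are all assumptions chosen for illustration.

```python
def simulate_dde_peak(f, history, p, tau, t_end, h=1e-3):
    """Forward-Euler simulation of x'(t) = f(x(t), x(t - tau)).

    Returns (peak, time): the largest sampled value of p(x(t)) for
    t in [0, t_end] and the time at which it occurs.
    """
    n_delay = round(tau / h)   # number of steps spanned by the delay
    n_steps = round(t_end / h)
    # Samples of the history on [-tau, 0]; xs[i] is the state at -tau + i*h.
    xs = [history(-tau + i * h) for i in range(n_delay + 1)]
    for _ in range(n_steps):
        x_now = xs[-1]
        x_delayed = xs[-1 - n_delay]   # state one delay earlier
        xs.append(x_now + h * f(x_now, x_delayed))
    # Maximize p over the samples with t >= 0 only.
    best = max(range(n_delay, len(xs)), key=lambda i: p(xs[i]))
    return p(xs[best]), -tau + best * h


# Illustrative problem (assumed, not from the paper):
# x'(t) = -x(t - 1), history x(t) = 1 on [-1, 0], state function p(x) = -x.
peak, t_peak = simulate_dde_peak(
    f=lambda x, x_delayed: -x_delayed,
    history=lambda t: 1.0,
    p=lambda x: -x,
    tau=1.0,
    t_end=10.0,
)
print(peak, t_peak)
```

For this test equation, integrating interval by interval gives x(t) = 1 - t on [0, 1] and x(2) = -1/2, so the sampled peak of -x should land close to 0.5 near t = 2. Any certified upper bound would have to sit at or above that value for every admissible history, not just the one simulated here.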

    Temperament & Character account for brain functional connectivity at rest: A diathesis-stress model of functional dysregulation in psychosis

    The human brain’s resting-state functional connectivity (rsFC) provides stable trait-like measures of differences in the perceptual, cognitive, emotional, and social functioning of individuals. The rsFC of the prefrontal cortex is hypothesized to mediate a person’s rational self-government, as is also measured by personality, so we tested whether its connectivity networks account for vulnerability to psychosis and related personality configurations. Young adults were recruited as outpatients or controls from the same communities around psychiatric clinics. Healthy controls (n = 30) and clinically stable outpatients with bipolar disorder (n = 35) or schizophrenia (n = 27) were diagnosed by structured interviews, and then were assessed with standardized protocols of the Human Connectome Project. Data-driven clustering identified five groups of patients with distinct patterns of rsFC regardless of diagnosis. These groups were distinguished by rsFC networks that regulate specific biopsychosocial aspects of psychosis: sensory hypersensitivity, negative emotional balance, impaired attentional control, avolition, and social mistrust. The rsFC group differences were validated by independent measures of white matter microstructure, personality, and clinical features not used to identify the subjects. We confirmed that each connectivity group was organized by differential collaborative interactions among six prefrontal and eight other automatically-coactivated networks. The temperament and character traits of the members of these groups strongly accounted for the differences in rsFC between groups, indicating that configurations of rsFC are internal representations of personality organization. These representations involve weakly self-regulated emotional drives of fear, irrational desire, and mistrust, which predispose to psychopathology. However, stable outpatients with different diagnoses (bipolar or schizophrenic psychoses) were highly similar in rsFC and personality. This supports a diathesis-stress model in which different complex adaptive systems regulate predisposition (which is similar in stable outpatients despite diagnosis) and stress-induced clinical dysfunction (which differs by diagnosis).

    Real-time simulations of transmon systems with time-dependent Hamiltonian models

    In this thesis we study aspects of Hamiltonian models which can affect the time evolution of transmon systems. We model the time evolution of various systems as a unitary real-time process by numerically solving the time-dependent Schrödinger equation (TDSE). We denote the corresponding computer models as non-ideal gate-based quantum computer (NIGQC) models since transmons are usually used as transmon qubits in superconducting prototype gate-based quantum computers (PGQCs). We first review the ideal gate-based quantum computer (IGQC) model and provide a distinction between the IGQC, PGQCs and the NIGQC models we consider in this thesis. Then, we derive the circuit Hamiltonians which generate the dynamics of fixed-frequency and flux-tunable transmons. Furthermore, we also provide clear and concise derivations of effective Hamiltonians for both types of transmons. We use the circuit and effective Hamiltonians we derived to define two many-particle Hamiltonians, namely a circuit and an associated effective Hamiltonian. The interactions between the different subsystems are modelled as dipole-dipole interactions. Next, we develop two product-formula algorithms which solve the TDSE for the Hamiltonians we defined. Afterwards, we use these algorithms to investigate how various frequently applied assumptions affect the time evolution of transmon systems modelled with the many-particle effective Hamiltonian when a control pulse is applied. Here we also compare the time evolutions generated by the effective and circuit Hamiltonian. We find that the assumptions we investigate can substantially affect the time evolution of the probability amplitudes we model. Next, we investigate how susceptible gate-error quantifiers are to assumptions which make up the NIGQC model. We find that the assumptions we consider clearly affect gate-error quantifiers like the diamond distance and the average infidelity.
    Comment: Dissertation, 203 pages, RWTH Aachen University, 2023. This dissertation includes and extends the results of arXiv:2201.02402 and arXiv:2211.1101

    Reinforcement Learning Curricula as Interpolations between Task Distributions

    In the last decade, the increased availability of powerful computing machinery has led to an increasingly widespread application of machine learning methods. Machine learning has been particularly successful when large models, typically neural networks with an ever-increasing number of parameters, can leverage vast data to make predictions. While reinforcement learning (RL) has been no exception to this development, a distinguishing feature of RL is its well-known exploration-exploitation trade-off, whose optimal solution, while possible to model as a partially observable Markov decision process, evades computation in all but the simplest problems. Consequently, it seems unsurprising that notable demonstrations of reinforcement learning, such as an RL-based Go agent (AlphaGo) by DeepMind beating the professional Go player Lee Sedol, relied both on the availability of massive computing capabilities and specific forms of regularization that facilitate learning. In the case of AlphaGo, this regularization came in the form of self-play, enabling learning by interacting with gradually more proficient opponents. In this thesis, we develop techniques that, similarly to the concept of self-play of AlphaGo, improve the learning performance of RL agents by training on sequences of increasingly complex tasks. These task sequences are typically called curricula and are known to side-step problems such as slow learning or convergence to poor behavior that may occur when directly learning in complicated tasks. The algorithms we develop in this thesis create curricula by minimizing distances or divergences between probability distributions of learning tasks, generating interpolations between an initial distribution of easy learning tasks and a target task distribution. 
Apart from improving the learning performance of RL agents in experiments, developing methods that realize curricula as interpolations between task distributions results in a nuanced picture of key aspects of successful reinforcement learning curricula. In Chapter 1, we start this thesis by introducing required reinforcement learning notation and then motivating curriculum reinforcement learning from the perspective of continuation methods for non-linear optimization. Similar to curricula for reinforcement learning agents, continuation methods have been used in non-linear optimization to solve challenging optimization problems. This similarity provides an intuition about the effect of the curricula we aim to generate and their limits. In Chapter 2, we transfer the concept of self-paced learning, initially proposed in the supervised learning community, to the problem of RL, showing that an automated curriculum generation for RL agents can be motivated by a regularized RL objective. This regularized RL objective implies generating a curriculum as a sequence of task distributions that trade off the expected agent performance against similarity to a specified distribution of target tasks. This view on curriculum RL contrasts existing approaches, as it motivates curricula via a regularized RL objective instead of generating them from a set of assumptions about an optimal curriculum. In experiments, we show that an approximate implementation of the aforementioned curriculum – that restricts the interpolating task distribution to a Gaussian – results in improved learning performance compared to regular reinforcement learning, matching or surpassing the performance of existing curriculum-based methods. Subsequently, Chapter 3 builds up on the intuition of curricula as sequences of interpolating task distributions established in Chapter 2. 
Motivated by using more flexible task distribution representations, we show how parametric assumptions play a crucial role in the empirical success of the previous approach and subsequently uncover key ingredients that enable the generation of meaningful curricula without assuming a parametric model of the task distributions. One major ingredient is an explicit notion of task similarity via a distance function of two Markov Decision Processes. We turn towards optimal transport theory, allowing for flexible particle-based representations of the task distributions while properly considering the newly introduced metric structure of the task space. Combined with other improvements to our first method, such as a more aggressive restriction of the curriculum to tasks that are not too hard for the agent, the resulting approach delivers consistently high learning performance in multiple experiments. In the final Chapter 4, we apply the refined method of Chapter 3 to a trajectory-tracking task, in which we task an RL agent to follow a three-dimensional reference trajectory with the tip of an inverted pendulum mounted on a Barrett Whole Arm Manipulator. The access to only positional information results in a partially observable system that, paired with its inherent instability, underactuation, and non-trivial kinematic structure, presents a challenge for modern reinforcement learning algorithms, which we tackle via curricula. The technically infinite-dimensional task space of target trajectories allows us to probe the developed curriculum learning method for flaws that have not surfaced in the rather low-dimensional experiments of the previous chapters. Through an improved optimization scheme that better respects the non-Euclidean structure of target trajectories, we reliably generate curricula of trajectories to be tracked, resulting in faster and more robust learning compared to an RL baseline that does not exploit this form of structured learning. 
The learned policy matches the performance of an optimal control baseline on the real system, demonstrating the potential of curriculum RL to learn state estimation and control for non-linear tracking tasks jointly. In summary, this thesis introduces a perspective on reinforcement learning curricula as interpolations between task distributions. The methods developed under this perspective enjoy a precise formulation as optimization problems and deliver empirical benefits throughout experiments. Building upon this precise formulation may allow future work to advance the formal understanding of reinforcement learning curricula and, with that, enable the solution of challenging decision-making and control problems with reinforcement learning.

    Temperament & Character account for brain functional connectivity at rest: A diathesis-stress model of functional dysregulation in psychosis

    The online version contains supplementary material available at https://doi.org/10.1038/s41380-023-02039-6
    The human brain’s resting-state functional connectivity (rsFC) provides stable trait-like measures of differences in the perceptual, cognitive, emotional, and social functioning of individuals. The rsFC of the prefrontal cortex is hypothesized to mediate a person’s rational self-government, as is also measured by personality, so we tested whether its connectivity networks account for vulnerability to psychosis and related personality configurations. Young adults were recruited as outpatients or controls from the same communities around psychiatric clinics. Healthy controls (n = 30) and clinically stable outpatients with bipolar disorder (n = 35) or schizophrenia (n = 27) were diagnosed by structured interviews, and then were assessed with standardized protocols of the Human Connectome Project. Data-driven clustering identified five groups of patients with distinct patterns of rsFC regardless of diagnosis. These groups were distinguished by rsFC networks that regulate specific biopsychosocial aspects of psychosis: sensory hypersensitivity, negative emotional balance, impaired attentional control, avolition, and social mistrust. The rsFC group differences were validated by independent measures of white matter microstructure, personality, and clinical features not used to identify the subjects. We confirmed that each connectivity group was organized by differential collaborative interactions among six prefrontal and eight other automatically-coactivated networks. The temperament and character traits of the members of these groups strongly accounted for the differences in rsFC between groups, indicating that configurations of rsFC are internal representations of personality organization. These representations involve weakly self-regulated emotional drives of fear, irrational desire, and mistrust, which predispose to psychopathology. However, stable outpatients with different diagnoses (bipolar or schizophrenic psychoses) were highly similar in rsFC and personality. This supports a diathesis-stress model in which different complex adaptive systems regulate predisposition (which is similar in stable outpatients despite diagnosis) and stress-induced clinical dysfunction (which differs by diagnosis).
    Funding: EU FEDER grants through the Spanish Ministry of Science and Technology: PID2021-125017OB-I00, RTI2018-098983-B-I00, D43 TW011793-06A1. United States Department of Health & Human Services, National Institutes of Health (NIH), USA: R01-MH124060. Psychosis-Risk Outcomes Network: U01 MH12463

    Least-cost distribution network tariff design in theory and practice

    First published online: 31 December 2020
    In this paper a game-theoretical model with self-interest pursuing consumers is introduced in order to assess how to design a least-cost distribution tariff under two constraints that regulators typically face. The first constraint is related to difficulties regarding the implementation of cost-reflective tariffs. In practice, so-called cost-reflective tariffs are only a proxy for the actual cost driver(s) in distribution grids. The second constraint has to do with fairness. There is a fear that active consumers investing in distributed energy resources (DER) might benefit at the expense of passive consumers. We find that both constraints have a significant impact on the least-cost network tariff design, and the results depend on the state of the grid. If most of the grid investments still have to be made, passive and active consumers can both benefit from cost-reflective tariffs, while this is not the case for passive consumers if the costs are mostly sunk.
    Published version of EUI WP RSCAS, 2018/1

    Safety and Liveness of Quantitative Automata

    The safety-liveness dichotomy is a fundamental concept in formal languages which plays a key role in verification. Recently, this dichotomy has been lifted to quantitative properties, which are arbitrary functions from infinite words to partially-ordered domains. We look into harnessing the dichotomy for the specific classes of quantitative properties expressed by quantitative automata. These automata contain finitely many states and rational-valued transition weights, and their common value functions Inf, Sup, LimInf, LimSup, LimInfAvg, LimSupAvg, and DSum map infinite words into the totally-ordered domain of real numbers. In this automata-theoretic setting, we establish a connection between quantitative safety and topological continuity and provide an alternative characterization of quantitative safety and liveness in terms of their boolean counterparts. For all common value functions, we show how the safety closure of a quantitative automaton can be constructed in PTime, and we provide PSpace-complete checks of whether a given quantitative automaton is safe or live, with the exception of LimInfAvg and LimSupAvg automata, for which the safety check is in ExpSpace. Moreover, for deterministic Sup, LimInf, and LimSup automata, we give PTime decompositions into safe and live automata. These decompositions enable the separation of techniques for safety and liveness verification for quantitative specifications.
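As an illustration of the value functions named in the abstract, the sketch below evaluates them on the weight sequence of a single ultimately periodic run, that is, a finite prefix of weights followed by a cycle repeated forever. It uses the standard definitions of these value functions as commonly stated in the quantitative-automata literature. It does not implement the safety closure or the decision procedures of the paper, and it evaluates one run rather than an automaton. The function name `run_values`, the prefix/cycle encoding and the explicit `discount` parameter are assumptions made for this example.

```python
from fractions import Fraction


def run_values(prefix, cycle, discount):
    """Value functions of the infinite weight sequence prefix . cycle^omega.

    prefix and cycle are lists of rational weights (cycle must be non-empty);
    discount is a rational number strictly between 0 and 1, used by DSum only.
    """
    if not cycle:
        raise ValueError("cycle must contain at least one weight")
    if not 0 < discount < 1:
        raise ValueError("discount must lie strictly between 0 and 1")
    everything = list(prefix) + list(cycle)
    # On an ultimately periodic sequence the running averages converge to the
    # cycle mean, so LimInfAvg and LimSupAvg coincide.
    cycle_mean = Fraction(sum(cycle), len(cycle))
    # Discounted sum: finite part over the prefix plus a geometric series
    # over repetitions of the cycle.
    prefix_part = sum(discount**i * w for i, w in enumerate(prefix))
    cycle_part = sum(discount**i * w for i, w in enumerate(cycle))
    dsum = prefix_part + discount**len(prefix) * cycle_part / (1 - discount**len(cycle))
    return {
        "Inf": min(everything),
        "Sup": max(everything),
        "LimInf": min(cycle),   # only weights seen infinitely often matter
        "LimSup": max(cycle),
        "LimInfAvg": cycle_mean,
        "LimSupAvg": cycle_mean,
        "DSum": dsum,
    }


# Weight sequence 3, 1, 2, 1, 2, ... with discount factor 1/2.
values = run_values([Fraction(3)], [Fraction(1), Fraction(2)], Fraction(1, 2))
for name, value in values.items():
    print(name, value)
```

Exact rational arithmetic via `Fraction` mirrors the rational-valued transition weights mentioned in the abstract. For a nondeterministic automaton, the value of a word would additionally require combining the values of all its runs, which this sketch leaves out.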

    Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees

    Actor-critic (AC) methods are widely used in reinforcement learning (RL) and benefit from the flexibility of using any policy gradient method as the actor and value-based method as the critic. The critic is usually trained by minimizing the TD error, an objective that is potentially decorrelated with the true goal of achieving a high reward with the actor. We address this mismatch by designing a joint objective for training the actor and critic in a decision-aware fashion. We use the proposed objective to design a generic AC algorithm that can easily handle any function approximation. We explicitly characterize the conditions under which the resulting algorithm guarantees monotonic policy improvement, regardless of the choice of the policy and critic parameterization. Instantiating the generic algorithm results in an actor that involves maximizing a sequence of surrogate functions (similar to TRPO, PPO) and a critic that involves minimizing a closely connected objective. Using simple bandit examples, we provably establish the benefit of the proposed critic objective over the standard squared error. Finally, we empirically demonstrate the benefit of our decision-aware actor-critic framework on simple RL problems.
    Comment: 44 pages

    Time, movement, urban space. “Places of transit”: an urban asynchrony?

    The aim of the research project is to examine the spatial transformation of “places of transit” for migrants in 21st century Europe. It analyses the relationship between these places and the contemporary city, and how the latter is being transformed, or should be transformed, taking account, or not, of their existence. “Places of transit” are part of a short timeframe, with a normatively limited duration; their existence, however, has an impact on the morphology of the city over a (more or less) long period. Rather than considering them as “out-of-the-way places” or as “high places” (usually in the media), we start from the premise that, like the other elements of the urban jigsaw, they are places that (de)structure, organise and shape the city and the lives of those who live there. Characterised by a specific temporality, one in which exile becomes waiting, these places are “intervals” because they are as much perimeters obeying specific social and spatial dynamics as they are “pockets of time” covering ways of living in this territory that are out of sync or poorly synchronised with the “urban pulse”. The question of time is therefore a key factor in understanding the relationship between these places. Bringing space and time together here amounts to “discovering” what constitutes a mesh, paradoxically not very visible and yet the matrix of urban space, where the nodes are called chronotopes (rhythms), urban continuities (duration) and spatial resistances (memory). This is my line of enquiry. Based on five case studies (Calais, Lampedusa, Lavrio, Lesbos, Amman/Zaatari), this thesis offers a viewpoint on the dynamics that govern the integration, or otherwise, of “places of transit” into a network that extends beyond them both spatially and temporally.