1,013 research outputs found

    LIPIcs, Volume 251, ITCS 2023, Complete Volume


    AI: Limits and Prospects of Artificial Intelligence

    The emergence of artificial intelligence has triggered enthusiasm and promise of boundless opportunities as much as uncertainty about its limits. The contributions to this volume explore the limits of AI, describe the necessary conditions for its functionality, reveal its attendant technical and social problems, and present some existing and potential solutions. At the same time, the contributors highlight the societal and attendant economic hopes and fears, utopias and dystopias that are associated with the current and future development of artificial intelligence

    Essays on noncausal and noninvertible time series

    Over the last two decades, there has been growing interest among economists in nonfundamental univariate processes, generally represented by noncausal and noninvertible time series. These processes have become increasingly popular due to their ability to capture nonlinear dynamics such as volatility clustering, asymmetric cycles, and local explosiveness, all of which are commonly observed in macroeconomics and finance. In particular, the incorporation of both past and future components into noncausal and noninvertible processes makes them attractive options for modeling forward-looking behavior in economic activities. However, the classical techniques used for analyzing time series models are largely limited to their causal and invertible counterparts. This dissertation seeks to contribute to the field by providing testing and estimation tools that are robust to noncausal and noninvertible time series. In the first chapter, "Quantile Autoregression-Based Non-causality Testing", we investigate the statistical properties of empirical conditional quantiles of non-causal processes. Specifically, we show that the quantile autoregression (QAR) estimates for non-causal processes do not remain constant across different quantiles, in contrast to their causal counterparts. Furthermore, we demonstrate that non-causal autoregressive processes admit nonlinear representations for conditional quantiles given past observations. Exploiting these properties, we propose three novel non-causality testing strategies for non-Gaussian processes within the QAR framework. The tests are constructed either by verifying the constancy of the slope coefficients or by applying a misspecification test of the linear QAR model over different quantiles of the process. Some numerical experiments are included to examine the finite sample performance of the testing strategies, where we compare different specification tests for dynamic quantiles with the Kolmogorov-Smirnov constancy test. The new methodology is applied to time series from financial markets to investigate the presence of speculative bubbles. The extension of the approach based on the specification tests to AR processes driven by heteroskedastic innovations is studied through simulations. The performance of QAR estimates of non-causal processes at extreme quantiles is also explored. In the second chapter, "Estimation of Time Series Models Using the Empirical Distribution of Residuals", we introduce a novel estimation technique for general linear time series models, potentially noninvertible and noncausal, that utilizes the empirical cumulative distribution function of residuals. The proposed method relies on the generalized spectral cumulative function to characterize the pairwise dependence of residuals at all lags. Model identification can be achieved by exploiting the information in the joint distribution of residuals under the iid assumption. This method yields consistent estimates of the model parameters without imposing stringent conditions on higher-order moments or any distributional assumptions on the innovations beyond non-Gaussianity. We investigate the asymptotic distribution of the estimates by employing a smoothed cumulative distribution function to approximate the indicator function, given the non-differentiability of the original loss function. Efficiency improvements can be achieved by properly choosing the scaling parameter for the residuals. Finite sample properties are explored through Monte Carlo simulations.
An empirical application to illustrate this methodology is provided by fitting the daily trading volume of Microsoft stock with autoregressive models admitting a noncausal representation. The flexibility of the cumulative distribution function permits the proposed method to be extended to more general dependence structures where innovations are only conditional mean or quantile independent. In the third chapter, "Directional Predictability Tests", joint with Carlos Velasco, we propose new tests of predictability for non-Gaussian sequences that may display general nonlinear dependence in higher-order properties. We test the martingale difference null against parametric alternatives which can introduce linear or nonlinear dependence, as generated by ARMA and all-pass restricted ARMA models, respectively. We also develop tests to check for linear predictability under the white noise null hypothesis parameterized by an all-pass model driven by martingale difference innovations, as well as tests of nonlinear predictability on ARMA residuals. Our Lagrange Multiplier tests are developed from a loss function based on pairwise dependence measures that identify the predictability of levels. We provide asymptotic and finite sample analysis of the properties of the new tests and investigate the predictability of different series of financial returns. This thesis has been possible thanks to the financial support of grant BES-2017-082695 from the Ministerio de Economía, Industria y Competitividad. Doctoral Programme in Economics, Universidad Carlos III de Madrid. Committee: Chair, Miguel Ángel Delgado González; Secretary, Manuel Domínguez Toribio; Member, Majid M. Al Sadoon
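
    As a concrete illustration of the constancy-of-slopes idea behind the first chapter's tests, the following minimal Python sketch fits a quantile autoregression at several quantiles and compares the slope estimates; the simulated AR(1) data, quantile grid, and use of statsmodels are illustrative assumptions, not the authors' test statistics.

    # Hedged sketch: QAR slope constancy check on simulated data (not the thesis's tests).
    import numpy as np
    import statsmodels.api as sm
    from statsmodels.regression.quantile_regression import QuantReg

    rng = np.random.default_rng(0)
    n, phi = 2000, 0.7
    eps = rng.standard_t(df=3, size=n)      # heavy-tailed (non-Gaussian) innovations
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = phi * y[t - 1] + eps[t]      # causal AR(1); slopes should be flat across quantiles

    Y, X = y[1:], sm.add_constant(y[:-1])   # regress y_t on a constant and y_{t-1}
    slopes = {q: QuantReg(Y, X).fit(q=q).params[1] for q in (0.1, 0.25, 0.5, 0.75, 0.9)}
    print(slopes)  # marked variation across quantiles would point towards non-causality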

    Advances and Applications of DSmT for Information Fusion. Collected Works, Volume 5

    This fifth volume on Advances and Applications of DSmT for Information Fusion collects theoretical and applied contributions of researchers working in different fields of applications and in mathematics, and is available in open access. The contributions collected in this volume have either been published or presented in international conferences, seminars, workshops and journals since the dissemination of the fourth volume in 2015, or they are new. The contributions of each part of this volume are ordered chronologically. The first part of this book presents some theoretical advances on DSmT, dealing mainly with modified Proportional Conflict Redistribution (PCR) rules of combination with degree of intersection, coarsening techniques, interval calculus for PCR thanks to set inversion via interval analysis (SIVIA), rough set classifiers, canonical decomposition of dichotomous belief functions, fast PCR fusion, fast inter-criteria analysis with PCR, and improved PCR5 and PCR6 rules preserving the (quasi-)neutrality of (quasi-)vacuous belief assignment in the fusion of sources of evidence, with their Matlab codes. Because more applications of DSmT have emerged in the years since the fourth volume appeared in 2015, the second part of this volume covers selected applications of DSmT, mainly in building change detection, object recognition, quality of data association in tracking, perception in robotics, risk assessment for torrent protection and multi-criteria decision-making, multi-modal image fusion, coarsening techniques, recommender systems, levee characterization and assessment, human heading perception, trust assessment, robotics, biometrics, failure detection, GPS systems, inter-criteria analysis, group decision, human activity recognition, storm prediction, data association for autonomous vehicles, identification of maritime vessels, fusion of support vector machines (SVM), the Silx-Furtif RUST code library for information fusion including PCR rules, and networks for ship classification. Finally, the third part presents contributions related to belief functions in general that have been published or presented since 2015. These contributions concern decision-making under uncertainty, belief approximations, probability transformations, new distances between belief functions, non-classical multi-criteria decision-making problems with belief functions, generalization of Bayes' theorem, image processing, data association, entropy and cross-entropy measures, fuzzy evidence numbers, the negator of a belief mass, human activity recognition, information fusion for breast cancer therapy, imbalanced data classification, and hybrid techniques mixing deep learning with belief functions
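
    As a small, self-contained illustration of the PCR5 combination rule discussed throughout the volume, the sketch below combines two basic belief assignments over a two-element frame; the frame, the masses, and the plain-Python representation are assumptions for illustration, not code from the book.

    # Hedged sketch: PCR5 fusion of two basic belief assignments (illustrative only).
    from itertools import product

    def pcr5(m1, m2):
        """Combine two BBAs (dicts mapping frozenset -> mass) with the PCR5 rule."""
        out = {}
        for (a, ma), (b, mb) in product(m1.items(), m2.items()):
            inter = a & b
            if inter:                       # conjunctive consensus on non-empty intersections
                out[inter] = out.get(inter, 0.0) + ma * mb
            elif ma + mb > 0:               # redistribute the partial conflict back to a and b
                out[a] = out.get(a, 0.0) + ma ** 2 * mb / (ma + mb)
                out[b] = out.get(b, 0.0) + mb ** 2 * ma / (ma + mb)
        return out

    A, B, AB = frozenset({"A"}), frozenset({"B"}), frozenset({"A", "B"})
    print(pcr5({A: 0.6, AB: 0.4}, {B: 0.3, AB: 0.7}))  # masses still sum to 1 after redistribution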

    Uncertainty Quantification in Machine Learning for Engineering Design and Health Prognostics: A Tutorial

    On top of machine learning models, uncertainty quantification (UQ) functions as an essential layer of safety assurance that can lead to more principled decision making by enabling sound risk assessment and management. The improvement in safety and reliability that UQ brings to ML models has the potential to significantly facilitate the broad adoption of ML solutions in high-stakes decision settings, such as healthcare, manufacturing, and aviation, to name a few. In this tutorial, we aim to provide a holistic lens on emerging UQ methods for ML models, with a particular focus on neural networks, and on the applications of these UQ methods in tackling engineering design as well as prognostics and health management problems. Toward this goal, we start with a comprehensive classification of uncertainty types, sources, and causes pertaining to UQ of ML models. Next, we provide a tutorial-style description of several state-of-the-art UQ methods: Gaussian process regression, Bayesian neural networks, neural network ensembles, and deterministic UQ methods, focusing on the spectral-normalized neural Gaussian process. Building on the mathematical formulations, we subsequently examine the soundness of these UQ methods quantitatively and qualitatively (via a toy regression example) to assess their strengths and shortcomings along different dimensions. Then, we review quantitative metrics commonly used to assess the quality of predictive uncertainty in classification and regression problems. Afterward, we discuss the increasingly important role of UQ of ML models in solving challenging problems in engineering design and health prognostics. Two case studies, with source code available on GitHub, are used to demonstrate these UQ methods and compare their performance in the early-stage life prediction of lithium-ion batteries and the remaining useful life prediction of turbofan engines
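
    To make one of the surveyed methods concrete, the sketch below builds a small neural network ensemble on a toy 1-D regression task and uses the spread of member predictions as an uncertainty estimate; the toy data, ensemble size, and scikit-learn models are assumptions, not the tutorial's case studies or released code.

    # Hedged sketch: neural network ensemble UQ on a toy regression problem.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(200, 1))
    y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

    # Ensemble members differ only in their random initialization.
    ensemble = [MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=s).fit(X, y)
                for s in range(5)]

    X_test = np.linspace(-4, 4, 50).reshape(-1, 1)
    preds = np.stack([m.predict(X_test) for m in ensemble])  # shape (members, points)
    mean, std = preds.mean(axis=0), preds.std(axis=0)        # predictive mean and disagreement
    print(mean[:3], std[:3])  # disagreement tends to grow outside the training range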

    Reinforcement learning in large state action spaces

    Reinforcement learning (RL) is a promising framework for training intelligent agents that learn to optimize long-term utility by directly interacting with the environment. Creating RL methods that scale to large state-action spaces is a critical problem for ensuring real-world deployment of RL systems. However, several challenges limit the applicability of RL to large-scale settings. These include difficulties with exploration, low sample efficiency, computational intractability, task constraints such as decentralization, and a lack of guarantees about important properties such as performance, generalization and robustness in potentially unseen scenarios. This thesis is motivated by the goal of bridging the aforementioned gap. We propose several principled algorithms and frameworks for studying and addressing the above challenges in RL. The proposed methods cover a wide range of RL settings (single- and multi-agent systems (MAS) with all the variations in the latter, prediction and control, model-based and model-free methods, value-based and policy-based methods). In this work we present the first results on several different problems, e.g., tensorization of the Bellman equation, which allows exponential gains in sample efficiency (Chapter 4); provable suboptimality arising from structural constraints in MAS (Chapter 3); combinatorial generalization results in cooperative MAS (Chapter 5); generalization results on observation shifts (Chapter 7); and learning deterministic policies in a probabilistic RL framework (Chapter 6). Our algorithms exhibit provably enhanced performance and sample efficiency along with better scalability. Additionally, we shed light on generalization aspects of the agents under different frameworks. These properties have been driven by the use of several advanced tools (e.g., statistical machine learning, state abstraction, variational inference, tensor theory). In summary, the contributions in this thesis significantly advance progress towards making RL agents ready for large-scale, real-world applications
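
    For grounding, the sketch below shows the standard tabular Bellman optimality backup (value iteration) on a small random MDP; this is textbook material included only as context, not the tensorized Bellman formulation or the other algorithms contributed in the thesis.

    # Hedged sketch: plain value iteration on a hypothetical random MDP.
    import numpy as np

    rng = np.random.default_rng(0)
    n_states, n_actions, gamma = 10, 4, 0.95
    P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a] is a distribution over next states
    R = rng.normal(size=(n_states, n_actions))                        # expected immediate rewards

    V = np.zeros(n_states)
    for _ in range(1000):
        Q = R + gamma * P @ V        # Q[s, a] = R[s, a] + gamma * E_{s'}[V(s')]
        V_new = Q.max(axis=1)        # greedy Bellman optimality backup
        if np.max(np.abs(V_new - V)) < 1e-8:
            break
        V = V_new
    print(V)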

    Contributions and applications around low resource deep learning modeling

    Deep learning is the state of the art for several machine learning tasks. Many of these tasks require large amounts of computational resources, which limits their adoption in embedded devices. The main goal of this dissertation is to study methods and algorithms that make it possible to approach problems using deep learning with restricted computational resources. This work also aims at presenting applications of deep learning in industry. The first contribution is a new activation function for deep learning networks: the modulus function. The experiments show that the proposed activation function achieves superior results in computer vision tasks when compared with the alternatives found in the literature. The second contribution is a new strategy to combine pre-trained models using knowledge distillation. The results of this chapter show that it is possible to significantly increase the accuracy of the smallest pre-trained models, allowing high performance at a lower computational cost. The following contribution of this thesis tackles the problem of sales forecasting in the field of logistics. Two end-to-end systems with two different deep learning techniques (sequence-to-sequence models and transformers) are proposed. The results of this chapter conclude that it is possible to build end-to-end systems to predict the sales of multiple individual products, at multiple points of sale and at different times, with a single machine learning model. The proposed model outperforms the alternatives found in the literature. Finally, the last two contributions belong to the speech technology field. The former studies how to build a Keyword Spotting speech recognition system using an efficient version of a convolutional neural network. In this study, the proposed system is able to beat the performance of all the benchmarks found in the literature when tested against the most complex subtasks. The latter study proposes a standalone state-of-the-art text-to-speech model capable of synthesizing intelligible voice in thousands of voice profiles, while generating speech with meaningful and expressive prosody variations. The proposed approach removes the dependency of previous models on an additional voice system, which makes the proposed system more efficient at training and inference time, and enables offline and on-device operations
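
    Two of the contributions highlighted above, the modulus activation and knowledge distillation, can be sketched in a few lines of PyTorch; the layer sizes, temperature, and loss weighting below are illustrative assumptions, not the thesis's architectures or hyperparameters.

    # Hedged sketch: modulus activation and a standard distillation loss (illustrative).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Modulus(nn.Module):
        """Modulus activation: f(x) = |x|."""
        def forward(self, x):
            return torch.abs(x)

    student = nn.Sequential(nn.Linear(32, 64), Modulus(), nn.Linear(64, 10))

    def distillation_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.5):
        """Blend hard-label cross-entropy with soft-target KL divergence at temperature T."""
        soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                        F.softmax(teacher_logits / T, dim=1),
                        reduction="batchmean") * (T * T)
        hard = F.cross_entropy(student_logits, targets)
        return alpha * soft + (1 - alpha) * hard

    x = torch.randn(8, 32)
    teacher_logits = torch.randn(8, 10)                 # stand-in for a larger pre-trained teacher
    targets = torch.randint(0, 10, (8,))
    print(float(distillation_loss(student(x), teacher_logits, targets)))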

    Point Cloud Processing for Environmental Analysis in Autonomous Driving using Deep Learning

    One of the main objectives of leading automotive companies is autonomous self-driving cars. These vehicles need a very precise perception system of their environment, working in every conceivable scenario. Therefore, different kinds of sensor types are in use. Besides cameras, lidar scanners have become very important. Development in this field is significant for future applications and system integration because lidar offers a more accurate depth representation, independent of environmental illumination. Algorithms and machine learning approaches, including Deep Learning and Artificial Intelligence based on raw laser scanner data, are especially important due to the long range and three-dimensional resolution of the measured point clouds. Consequently, a broad field of research with many challenges and unsolved tasks has been established. This thesis aims to address this deficit and contribute highly efficient algorithms for 3D object recognition to the scientific community. It provides a Deep Neural Network with specific layers and a novel loss to safely localize and estimate the orientation of objects from point clouds originating from lidar sensors. First, a single-shot 3D object detector is developed that outputs dense predictions in only one forward pass. Next, this detector is refined by fusing complementary semantic features from cameras and joint probabilistic tracking to stabilize predictions and filter outliers. In the last part, a concept for deployment into an existing test vehicle focuses on the semi-automated generation of a suitable dataset. Subsequently, an evaluation of data from automotive-grade lidar scanners is presented. A Generative Adversarial Network is also developed as an alternative for target-specific artificial data generation. Experiments on the acquired application-specific and benchmark datasets show that the presented methods compete with a variety of state-of-the-art algorithms while being trimmed down to efficiency for use in self-driving cars. Furthermore, they include an extensive set of standard evaluation metrics and results to form a solid baseline for future research
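
    As a minimal illustration of the kind of preprocessing used by lidar-based single-shot detectors such as the one described above, the sketch below projects a point cloud into a bird's-eye-view occupancy grid; the grid extents, cell size, and random point cloud are assumptions, not the thesis's network, dataset, or input representation.

    # Hedged sketch: bird's-eye-view occupancy grid from a lidar point cloud.
    import numpy as np

    def bev_occupancy(points, x_range=(0.0, 80.0), y_range=(-40.0, 40.0), cell=0.2):
        """points: (N, 3) array of x, y, z coordinates -> 2-D occupancy grid."""
        nx = int((x_range[1] - x_range[0]) / cell)
        ny = int((y_range[1] - y_range[0]) / cell)
        grid = np.zeros((nx, ny), dtype=np.uint8)
        ix = ((points[:, 0] - x_range[0]) / cell).astype(int)
        iy = ((points[:, 1] - y_range[0]) / cell).astype(int)
        keep = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)   # drop points outside the grid
        grid[ix[keep], iy[keep]] = 1                           # mark cells containing at least one point
        return grid

    cloud = np.random.default_rng(0).uniform([-10, -50, -2], [90, 50, 3], size=(100000, 3))
    print(bev_occupancy(cloud).sum(), "occupied cells")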

    Scalable Learning of Bayesian Networks Using Feedback Arc Set-Based Heuristics

    Bayesian networks form an important class of probabilistic graphical models. They consist of a structure (a directed acyclic graph) expressing conditional independencies among random variables, as well as parameters (local probability distributions). As such, Bayesian networks are generative models encoding joint probability distributions in a compact form. The main difficulty in learning a Bayesian network comes from the structure itself, owing to the combinatorial nature of the acyclicity property; it is well known, and comes as no surprise, that the structure learning problem is NP-hard in general. Exact algorithms solving this problem exist: dynamic programming and integer linear programming are prime contenders when one seeks to recover the structure of small-to-medium-sized Bayesian networks from data. On the other hand, heuristics such as hill-climbing variants are commonly used when attempting to approximately learn the structure of larger networks with thousands of variables, although these heuristics typically lack theoretical guarantees and their performance in practice may become unreliable when dealing with large-scale learning. This thesis is concerned with the development of scalable methods tackling the Bayesian network structure learning problem while attempting to maintain a level of theoretical control. This was achieved via the use of related combinatorial problems, namely the maximum acyclic subgraph problem and its dual, the minimum feedback arc set problem. Although these problems are NP-hard themselves, they exhibit significantly better tractability in practice. This thesis explores ways to map Bayesian network structure learning into maximum acyclic subgraph instances and to extract approximate solutions for the first problem based on the solutions obtained for the second. Our research suggests that although increased scalability can be achieved this way, maintaining theoretical understanding based on this approach is much more challenging. Furthermore, we found that learning the structure of Bayesian networks based on the maximum acyclic subgraph/minimum feedback arc set may not be the go-to method in general, but we identified a setting - linear structural equation models - in which we could experimentally validate the benefits of this approach, leading to fast and scalable structure recovery with the ability to learn complex structures in a way that is competitive with state-of-the-art baselines
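
    To make the mapping concrete, the sketch below takes a hypothetical matrix of pairwise arc scores, picks a variable ordering with a simple greedy maximum acyclic subgraph heuristic, and keeps only the forward arcs; the scores, the greedy rule, and the positive-score threshold are illustrative assumptions, not the scoring functions or algorithms developed in the thesis.

    # Hedged sketch: ordering-based acyclic subgraph extraction from pairwise arc scores.
    import numpy as np

    def greedy_ordering(W):
        """Repeatedly pick the node with the largest (outgoing - incoming) score mass."""
        remaining = list(range(W.shape[0]))
        order = []
        while remaining:
            idx = np.array(remaining)
            net = W[np.ix_(idx, idx)].sum(axis=1) - W[np.ix_(idx, idx)].sum(axis=0)
            best = remaining[int(np.argmax(net))]
            order.append(best)
            remaining.remove(best)
        return order

    def acyclic_arcs(W, order):
        """Keep arcs i -> j with positive score that go forward in the ordering (hence acyclic)."""
        pos = {v: k for k, v in enumerate(order)}
        return [(i, j) for i in range(W.shape[0]) for j in range(W.shape[0])
                if i != j and W[i, j] > 0 and pos[i] < pos[j]]

    W = np.random.default_rng(0).normal(size=(6, 6))  # hypothetical pairwise arc scores
    order = greedy_ordering(W)
    print(order, acyclic_arcs(W, order))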