20 research outputs found

    Towards Thompson Sampling for Complex Bayesian Reasoning

    Papers III, IV, and VI are not available as part of the dissertation due to copyright. Thompson Sampling (TS) is a state-of-the-art algorithm for bandit problems set in a Bayesian framework. Both the theoretical foundation and the empirical efficiency of TS are well explored for plain bandit problems. However, the Bayesian underpinning of TS means that it could potentially be applied to other, more complex problems beyond the bandit problem, if suitable Bayesian structures can be found. The objective of this thesis is the development and analysis of TS-based schemes for more complex optimization problems, founded on Bayesian reasoning. We address several complex optimization problems where the previous state of the art relies on a relatively myopic perspective on the problem. These include stochastic searching on the line, the Goore game, the knapsack problem, travel time estimation, and equipartitioning. Instead of employing Bayesian reasoning to obtain a solution, existing approaches rely on carefully engineered rules. In brief, we recast each of these optimization problems in a Bayesian framework, introducing dedicated TS-based solution schemes. For all of the addressed problems, the results show that besides being more effective, the TS-based approaches we introduce are also capable of solving more adverse versions of the problems, such as dealing with stochastic liars.
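
For readers unfamiliar with the plain bandit setting that the abstract above takes as its starting point, the following is a minimal sketch of Thompson Sampling for a Bernoulli bandit with Beta(1, 1) priors. It is a textbook illustration, not one of the thesis's schemes; the function name `thompson_sampling`, the arm probabilities, and the seed are assumptions made for this example.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Run Beta-Bernoulli Thompson Sampling (hypothetical helper for illustration).

    true_probs: success probability of each arm (unknown to the learner).
    Returns per-arm pull counts, successes, and failures.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Draw one sample per arm from its Beta posterior (uniform Beta(1, 1) prior).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Play the arm whose sampled success probability is largest.
        arm = max(range(n_arms), key=samples.__getitem__)
        # Observe a Bernoulli reward and update that arm's posterior counts.
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls, successes, failures


if __name__ == "__main__":
    pulls, _, _ = thompson_sampling([0.1, 0.9], n_rounds=2000, seed=1)
    print("pulls per arm:", pulls)
```

Sampling from the posterior, rather than acting on its mean, is what balances exploration and exploitation: arms with few observations have wide posteriors and are still occasionally selected.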

    Three fundamental pillars of decision-centered teamwork

    This thesis introduces a novel paradigm in artificial intelligence: decision-centered teamwork. Decision-centered teamwork is the analysis of agent teams that iteratively take joint decisions toward solving complex problems. Although teams of agents have been used to take decisions in many important domains, such as machine learning, crowdsourcing, forecasting systems, and even board games, a study of a general framework for decision-centered teamwork has never been presented in the literature before. I divide decision-centered teamwork into three fundamental challenges: (i) Agent Selection, which consists of selecting a set of agents from an exponential universe of possible teams; (ii) Aggregation of Opinions, which consists of designing methods to aggregate the opinions of different agents into joint team decisions; (iii) Team Assessment, which consists of designing methods to identify whether a team is failing, allowing a “coordinator” to take remedial procedures. In this thesis, I handle all these challenges. For Agent Selection, I introduce novel models of diversity for teams of voting agents. My models rigorously show that teams made of the best agents are not necessarily optimal, and also clarify in which situations diverse teams should be preferred. In particular, I show that diverse teams get stronger as the number of actions increases, by analyzing how the agents’ probability distribution function over actions changes. This has never been presented before in the ensemble systems literature. I also show that diverse teams have great applicability to design problems, where the objective is to maximize the number of optimal solutions for human selection, combining for the first time social choice with number theory. All of these theoretical models and predictions are verified in real systems, such as Computer Go and architectural design. 
In particular, for architectural design I optimize the design of buildings with agent teams not only for cost and project requirements, but also for energy efficiency, making it an essential domain for sustainability. Concerning Aggregation of Opinions, I evaluate classical ranked voting rules from social choice in Computer Go, only to discover that plurality leads to the best results. This happens because real agents tend to have very noisy rankings. Hence, I create a ranking-by-sampling extraction technique, leading to significantly better results with the Borda voting rule. A similar study is also performed in the social networks domain, in the context of influence maximization. Additionally, I study a novel problem in social networks: I assume only a subgraph of the network is initially known, so influence must be spread while the graph is learned simultaneously. I analyze a linear combination of two greedy algorithms, which outperforms both of them. This domain has great potential for health, as I run experiments in four real-life social networks from the homeless population of Los Angeles, aiming at spreading HIV prevention information. Finally, with regard to Team Assessment, I develop a domain-independent team assessment methodology for teams of voting agents. My method is within a machine learning framework, and learns a prediction model over the voting patterns of a team, instead of learning over the possible states of the problem. The methodology is tested and verified in Computer Go and Ensemble Learning.
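
To make the two voting rules named in the abstract above concrete, here is a minimal sketch of plurality and Borda aggregation over agents' rankings. It illustrates the standard definitions only, not the thesis's ranking-by-sampling technique; the function names, the alphabetical tie-breaking, and the example rankings are assumptions made for this example.

```python
from collections import Counter


def plurality_winner(rankings):
    """Pick the action ranked first by the most agents (ties broken alphabetically)."""
    counts = Counter(ranking[0] for ranking in rankings)
    return max(sorted(counts), key=counts.__getitem__)


def borda_winner(rankings):
    """Pick the action with the highest Borda score (ties broken alphabetically).

    With m actions, an agent gives m - 1 points to its top choice,
    m - 2 to its second choice, and so on down to 0 for its last choice.
    """
    scores = Counter()
    for ranking in rankings:
        m = len(ranking)
        for position, action in enumerate(ranking):
            scores[action] += m - 1 - position
    return max(sorted(scores), key=scores.__getitem__)


if __name__ == "__main__":
    # Seven hypothetical agents ranking three actions, best first.
    rankings = (
        [["a", "b", "c"]] * 3
        + [["b", "c", "a"]] * 2
        + [["c", "b", "a"]] * 2
    )
    print("plurality:", plurality_winner(rankings))
    print("borda:", borda_winner(rankings))
```

In this example plurality selects `a` (three first-place votes against two each for `b` and `c`), while Borda selects `b` (9 points against 6 each for `a` and `c`), because Borda uses the full rankings rather than only the top choice. That dependence on lower ranks is also why noisy rankings can hurt Borda more than plurality.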

    How can humans leverage machine learning? From Medical Data Wrangling to Learning to Defer to Multiple Experts

    International Mention in the doctoral degree. The irruption of the smartphone into everyone’s life and the ease with which we digitise or record any data have led to an explosion in the quantity of data. Smartphones, equipped with advanced cameras and sensors, have empowered individuals to capture moments and contribute to the growing pool of data. This data-rich landscape holds great promise for research, decision-making, and personalized applications. By carefully analyzing and interpreting this wealth of information, valuable insights, patterns, and trends can be uncovered. However, big data is worthless in a vacuum. Its potential value is unlocked only when leveraged to drive decision-making. In recent times we have witnessed the rise of artificial intelligence: the development of computer systems and algorithms capable of perceiving, reasoning, learning, and problem-solving, emulating certain aspects of human cognitive abilities. Nevertheless, our focus tends to be limited, merely skimming the surface of the problem, while the reality is that applying machine learning models to data is usually fraught with difficulties. More specifically, there are two crucial pitfalls frequently neglected in the field of machine learning: the quality of the data and the erroneous assumption that machine learning models operate autonomously. These two issues motivate this thesis, which strives to offer solutions to two major associated challenges: 1) dealing with irregular observations and 2) learning when and whom we should trust. The first challenge originates from our observation that the majority of machine learning research primarily concentrates on handling regular observations, neglecting a crucial technological obstacle encountered in practical big-data scenarios: the aggregation and curation of heterogeneous streams of information. 
Before applying machine learning algorithms, it is crucial to establish robust techniques for handling big data, as this specific aspect presents a notable bottleneck in the creation of robust algorithms. Data wrangling, which encompasses the extraction, integration, and cleaning processes necessary for data analysis, plays a crucial role in this regard. Therefore, the first objective of this thesis is to tackle the frequently disregarded challenge of addressing irregularities within the context of medical data. We will focus on three specific aspects. Firstly, we will tackle the issue of missing data by developing a framework that facilitates the imputation of missing data points using relevant information derived from alternative data sources or past observations. Secondly, we will move beyond the assumption of homogeneous observations, where only one statistical data type (such as Gaussian) is considered, and instead work with heterogeneous observations. This means that different data sources can be represented by various statistical likelihoods, such as Gaussian, Bernoulli, categorical, etc. Lastly, considering the temporal richness of today’s collected data and our focus on medical data, we will develop a novel algorithm capable of capturing and propagating correlations among different data streams over time. All three problems are addressed in our first contribution, which involves the development of a novel method based on Deep Generative Models (DGM) using Variational Autoencoders (VAE). The proposed model, the Sequential Heterogeneous Incomplete VAE (Shi-VAE), enables the aggregation of multiple heterogeneous data streams in a modular manner, taking into consideration the presence of potential missing data. To demonstrate the feasibility of our approach, we present proof-of-concept results obtained from a real database generated through continuous passive monitoring of psychiatric patients. 
Our second challenge relates to the misbelief that machine learning algorithms can perform independently. However, this notion that AI systems can solely account for automated decision-making, especially in critical domains such as healthcare, is far from reality. Our focus now shifts towards a specific scenario where the algorithm has the ability to make predictions independently or alternatively defer the responsibility to a human expert. The purpose of including the human is not just to obtain better performance, but also to obtain more reliable and trustworthy predictions. In reality, however, important decisions are not made by one person but are usually made by an ensemble of human experts. With this in mind, two important questions arise: 1) when should the human or the machine bear responsibility, and 2) among the experts, whom should we trust? To answer the first question, we will employ a recent theory known as Learning to Defer (L2D). In L2D we are not only interested in abstaining from prediction but also in understanding the human’s confidence in making such a prediction, thus deferring only when the human is more likely to be correct. The second question, about whom to defer to among a pool of experts, has not yet been answered in the L2D literature, and this is what our contributions aim to provide. First, we extend the two consistent surrogate losses proposed so far in the L2D literature to the multiple-expert setting. Second, we study the framework’s ability to estimate the probability that a given expert predicts correctly and assess whether the two surrogate losses are confidence calibrated. Finally, we propose a conformal inference technique that chooses a subset of experts to query when the system defers. Ensembling experts based on confidence levels is vital to optimize human-machine collaboration. 
In conclusion, this doctoral thesis has investigated two cases where humans can leverage the power of machine learning: first, as a tool to assist in data wrangling and data understanding problems and second, as a collaborative tool where decision-making can be automated by the machine or delegated to human experts, fostering more transparent and trustworthy solutions. Doctoral Programme in Multimedia and Communications, Universidad Carlos III de Madrid and Universidad Rey Juan Carlos. Chair: Joaquín Míguez Arenas. Secretary: Juan José Murillo Fuentes. Member: Mélanie Natividad Fernández Pradie

    Deep neural mobile networking

    The next generation of mobile networks is set to become increasingly complex, as these struggle to accommodate tremendous data traffic demands generated by ever-more connected devices that have diverse performance requirements in terms of throughput, latency, and reliability. This makes monitoring and managing the multitude of network elements intractable with existing tools and impractical for traditional machine learning algorithms that rely on hand-crafted feature engineering. In this context, embedding machine intelligence into mobile networks becomes necessary, as this enables systematic mining of valuable information from mobile big data and automatically uncovering correlations that would otherwise have been too difficult for human experts to extract. In particular, deep learning based solutions can automatically extract features from raw data, without human expertise. The performance that artificial intelligence (AI) has achieved in other domains draws unprecedented interest from both academia and industry in employing deep learning approaches to address technical challenges in mobile networks. This thesis attacks important problems in the mobile networking area from various perspectives by harnessing recent advances in deep neural networks. As a preamble, we bridge the gap between deep learning and mobile networking by presenting a survey on the crossovers between the two areas. Secondly, we design dedicated deep learning architectures to forecast mobile traffic consumption at city scale. In particular, we tailor our deep neural network models to different mobile traffic data structures (i.e., data originating from urban grids and geospatial point-cloud antenna deployments) to deliver precise prediction. Next, we propose a mobile traffic super resolution (MTSR) technique to achieve coarse-to-fine grain transformations on mobile traffic measurements using generative adversarial network architectures. 
This can provide insightful knowledge to mobile operators about mobile traffic distribution, while effectively reducing the data post-processing overhead. Subsequently, the mobile traffic decomposition (MTD) technique is proposed to break the aggregated mobile traffic measurements into service-level time series, by using a deep learning based framework. With MTD, mobile operators can perform more efficient resource allocation for network slicing (i.e., the logical partitioning of physical infrastructure) and alleviate the privacy concerns that come with the extensive use of deep packet inspection. We then study the robustness of network-specific deep anomaly detectors with a realistic black-box threat model and propose reliable solutions for defending against attacks that seek to subvert existing deep learning based network intrusion detection systems (NIDS). Lastly, based on the results obtained, we identify important research directions that are worth pursuing in the future, including (i) serving deep learning with massive high-quality data, (ii) deep learning for spatio-temporal mobile data mining, (iii) deep learning for geometric mobile data mining, (iv) deep unsupervised learning in mobile networks, and (v) deep reinforcement learning for mobile network control. Overall, this thesis demonstrates that deep learning can underpin powerful tools that address data-driven problems in the mobile networking domain. With such intelligence, future mobile networks can be monitored and managed more effectively, and thus a higher user quality of experience can be guaranteed.

    Artificial intelligence for decision making in energy demand-side response

    This thesis examines the role and application of data-driven Artificial Intelligence (AI) approaches for the energy demand-side response (DR). It follows the point of view of a service provider company/aggregator looking to support its decision-making and operation. Overall, the study identifies data-driven AI methods as an essential tool and a key enabler for DR. The thesis is organised into two parts. It first provides an overview of AI methods utilised for DR applications based on a systematic review of over 160 papers, 40 commercial initiatives, and 21 large-scale projects. The reviewed work is categorised based on the type of AI algorithm(s) employed and the DR application area of the AI methods. The end of the first part of the thesis discusses the advantages and potential limitations of the reviewed AI techniques for different DR tasks and how they compare to traditional approaches. The second part of the thesis centres around designing machine learning algorithms for DR. The undertaken empirical work highlights the importance of data quality for providing fair, robust, and safe AI systems in DR — a high-stakes domain. It furthers the state of the art by providing a structured approach for data preparation and data augmentation in DR to minimise propagating effects in the modelling process. The empirical findings on residential response behaviour show better response behaviour in households with internet access, air-conditioning systems, power-intensive appliances, and lower gas usage. However, some insights raise questions about whether the reported levels of consumers’ engagement in DR schemes translate to actual curtailment behaviour and the individual rationale of customer response to DR signals. The presented approach also proposes a reinforcement learning framework for the decision problem of an aggregator selecting a set of consumers for DR events. 
This approach can support an aggregator in leveraging small-scale flexibility resources by providing an automated end-to-end framework to select the set of consumers for demand curtailment in response to DR signals in a dynamic environment, while taking a long-term view of the selection process.

    Metalearning

    This open access book covers metalearning, one of the fastest-growing areas of research in machine learning. Metalearning studies principled methods to obtain efficient models and solutions by adapting machine learning and data mining processes. This adaptation usually exploits information from past experience on other tasks, and the adaptive processes can involve machine learning approaches. As a related area to metalearning and a hot topic currently, automated machine learning (AutoML) is concerned with automating the machine learning processes. Metalearning and AutoML can help AI learn to control the application of different learning methods and acquire new solutions faster without unnecessary interventions from the user. This book offers a comprehensive and thorough introduction to almost all aspects of metalearning and AutoML, covering the basic concepts and architecture, evaluation, datasets, hyperparameter optimization, ensembles and workflows, and also how this knowledge can be used to select, combine, compose, adapt and configure both algorithms and models to yield faster and better solutions to data mining and data science problems. It can thus help developers to develop systems that can improve themselves through experience. This book is a substantial update of the first edition published in 2009. It includes 18 chapters, more than twice as many as the previous version. This enabled the authors to cover the most relevant topics in more depth and incorporate an overview of recent research in the respective areas. The book will be of interest to researchers and graduate students in the areas of machine learning, data mining, data science and artificial intelligence. Metalearning is the study of principled methods that exploit metaknowledge to obtain efficient models and solutions by adapting machine learning and data mining processes. 
While the variety of machine learning and data mining techniques now available can, in principle, provide good model solutions, a methodology is still needed to guide the search for the most appropriate model in an efficient way. Metalearning provides one such methodology that allows systems to become more effective through experience. This book discusses several approaches to obtaining knowledge concerning the performance of machine learning and data mining algorithms. It shows how this knowledge can be reused to select, combine, compose and adapt both algorithms and models to yield faster, more effective solutions to data mining problems. It can thus help developers improve their algorithms and also develop learning systems that can improve themselves. The book will be of interest to researchers and graduate students in the areas of machine learning, data mining and artificial intelligence.


    WiFi-Based Human Activity Recognition Using Attention-Based BiLSTM

    Recently, significant efforts have been made to explore human activity recognition (HAR) techniques that use information gathered by existing indoor wireless infrastructures through WiFi signals, without requiring the monitored subject to carry a dedicated device. The key intuition is that different activities introduce different multi-paths in WiFi signals and generate different patterns in the time series of channel state information (CSI). In this paper, we propose and evaluate a full pipeline for a CSI-based human activity recognition framework for 12 activities in three different spatial environments using two deep learning models: ABiLSTM and CNN-ABiLSTM. Evaluation experiments have demonstrated that the proposed models outperform state-of-the-art models. The experiments also show that the proposed models can be applied to other environments with different configurations, albeit with some caveats. The proposed ABiLSTM model achieves an overall accuracy of 94.03%, 91.96%, and 92.59% across the three target environments, while the proposed CNN-ABiLSTM model reaches an accuracy of 98.54%, 94.25%, and 95.09% across those same environments.