    Methods for Learning Directed and Undirected Graphical Models

    Probabilistic graphical models provide a general framework for modeling relationships between multiple random variables. The main tool in this framework is a mathematical object called a graph, which visualizes the assertions of conditional independence between the variables. This thesis investigates methods for learning these graphs from observational data. For undirected graphical models, we first propose a new scoring criterion for learning the dependence structure of a Gaussian graphical model. The scoring criterion is derived as an approximation to the often intractable Bayesian marginal likelihood. We prove that the scoring criterion is consistent and demonstrate its applicability to high-dimensional problems when combined with an efficient search algorithm. Second, we present a non-parametric method for learning undirected graphs from continuous data. The method combines a conditional mutual information estimator with a permutation test in order to perform conditional independence testing without assuming any specific parametric distributions for the random variables involved. Pairing this test with a constraint-based structure learning algorithm yields a method that performs well in numerical experiments when the data-generating mechanisms involve non-linearities. For directed graphical models, we propose a new scoring criterion for learning Bayesian network structures from discrete data. The criterion approximates a hard-to-compute quantity called the normalized maximum likelihood. We study the theoretical properties of the score and compare it experimentally to popular alternatives. Experiments show that the proposed criterion provides a robust and safe choice for structure learning and prediction over a wide variety of settings. Finally, as an application of directed graphical models, we derive a closed-form expression for the Bayesian network Fisher kernel. This provides a similarity measure over discrete data vectors that takes into account the dependence structure between the components. We illustrate the similarity measured by this kernel with an example in which we use it to seek sets of observations that are important and representative of the underlying Bayesian network model.

    Automating Large-Scale Simulation Calibration to Real-World Sensor Data

    Many key decisions and design policies are made using sophisticated computer simulations. However, these sophisticated computer simulations have several major problems. The two main issues are 1) gaps between the simulation model and the actual structure, and 2) limitations of the modeling engine's capabilities. This dissertation's goal is to address these simulation deficiencies by presenting a general automated process for tuning simulation inputs such that simulation output matches real-world measured data. The automated process involves the following key components: 1) identify a model that accurately estimates the real-world simulation calibration target from measured sensor data; 2) identify the key real-world measurements that best estimate the simulation calibration target; 3) construct a mapping from the most useful real-world measurements to actual simulation outputs; 4) build fast and effective simulation approximation models that predict simulation output using simulation input; 5) build a relational model that captures inter-variable dependencies between simulation inputs and outputs; and finally 6) use the relational model to estimate the simulation input variables from the mapped sensor data, and use either the simulation model or approximate simulation model to fine-tune input simulation parameter estimates towards the calibration system. The work in this dissertation individually validates and completes five out of the six calibration components with respect to the residential energy domain. Step 1 is satisfied by identifying the best model for predicting next-hour residential electrical consumption, the calibration target. Step 2 is completed by identifying the most important sensors for predicting residential electrical consumption, the real-world measurements.
While step 3 is completed by domain experts, step 4 is addressed by using techniques from the Big Data machine learning domain to build approximations for the EnergyPlus (E+) simulator. Step 5's solution leverages the same Big Data machine learning techniques to build a relational model that describes how the simulator's variables are probabilistically related. Finally, step 6 is partially demonstrated by using the relational model to estimate simulation parameters for E+ simulations with known ground-truth simulation inputs.

    Information metrics for localization and mapping

    Decades of research have made possible the existence of several autonomous systems that successfully and efficiently navigate within a variety of environments under certain conditions. One core technology that has allowed this is simultaneous localization and mapping (SLAM), the process of building a representation of the environment while localizing the robot in it. State-of-the-art solutions to the SLAM problem still rely, however, on heuristic decisions and options set by the user. In this thesis we search for principled solutions to various aspects of the localization and mapping problem with the help of information metrics. One such aspect is scalability. In SLAM, the problem size grows indefinitely as the experiment proceeds, increasing the demand for computational resources. To keep the problem tractable, we develop methods to build an approximation to the original network of constraints of the SLAM problem, reducing its size while maintaining its sparsity. We propose three methods to build the topology of such an approximated network, and two methods to perform the approximation itself. In addition, SLAM is a passive application: it does not drive the robot. The problem of driving the robot with the aim of both accurately localizing the robot and mapping the environment is called active SLAM. In this problem two normally opposing forces drive the robot, one towards new places to discover unknown regions and another to revisit previous configurations to improve localization. Rather than relying on heuristics, we pose the problem as the joint minimization of both map and trajectory estimation uncertainties, and present four different active SLAM approaches based on an entropy-reduction formulation. All methods presented in this thesis have been rigorously validated on both synthetic and real datasets.

    Information metrics for localization and mapping

    Embargo applied from the thesis defense until 12/2019. Postprint (published version).