    Efficient Bayesian inference via Monte Carlo and machine learning algorithms

    International Mention in the doctoral degree. In many fields of science and engineering, we are faced with an inverse problem where we aim to recover an unobserved parameter or variable of interest from a set of observed variables. Bayesian inference is a probabilistic approach for inferring this unknown parameter that has become extremely popular, finding application in myriad problems in fields such as machine learning, signal processing, remote sensing and astronomy. In Bayesian inference, all the information about the parameter is summarized by the posterior distribution. Unfortunately, the study of the posterior distribution requires the computation of complicated integrals that are analytically intractable and need to be approximated. Monte Carlo is a large family of sampling algorithms for performing optimization and numerical integration that has become the main workhorse for carrying out Bayesian inference. The main idea of Monte Carlo is that we can approximate the posterior distribution by a set of samples, obtained by an iterative process that involves sampling from a known distribution. Markov chain Monte Carlo (MCMC) and importance sampling (IS) are two important groups of Monte Carlo algorithms. This thesis focuses on developing and analyzing Monte Carlo algorithms (either MCMC, IS or a combination of both) under the challenging scenarios presented below. In summary, in this thesis we address several important points, enumerated (a)–(f), that currently represent a challenge in Bayesian inference via Monte Carlo. A first challenge that we address is the problematic exploration of the parameter space by off-the-shelf MCMC algorithms when there is (a) multimodality or (b) a highly concentrated posterior. Another challenge that we address is (c) proposal construction in IS. Furthermore, in recent applications we need to deal with (d) expensive posteriors, and/or we need to handle (e) noisy posteriors.
Finally, the Bayesian framework also offers a principled way of comparing competing hypotheses (models) by means of marginal likelihoods. Hence, a task of fundamental importance is (f) marginal likelihood computation. Chapters 2 and 3 deal with (a), (b), and (c). In Chapter 2, we propose a novel population MCMC algorithm called Parallel Metropolis-Hastings Coupler (PMHC). PMHC is well suited to multimodal scenarios since it works with a population of states, instead of a single one, hence allowing information to be shared. PMHC combines independent exploration, by means of parallel Metropolis-Hastings algorithms, with cooperative exploration, by means of a population MCMC technique called Normal Kernel Coupler. In Chapter 3, population MCMC algorithms are combined with IS within the layered adaptive IS (LAIS) framework. The combination of MCMC and IS serves two purposes. First, it provides an automatic proposal construction. Second, it aims at increasing robustness, since the MCMC samples are not used directly to form the sample approximation of the posterior. The use of minibatches of data is proposed to deal with highly concentrated posteriors. Other extensions for reducing the cost with respect to the vanilla LAIS framework, based on recycling and clustering, are discussed and analyzed. Chapters 4, 5 and 6 deal with (c), (d) and (e). The use of nonparametric approximations of the posterior plays an important role in the design of efficient Monte Carlo algorithms. Nonparametric approximations of the posterior can be obtained using machine learning algorithms for nonparametric regression, such as Gaussian Processes and Nearest Neighbors. They can then serve as cheap surrogate models or be used to build efficient proposal distributions.
In Chapter 4, in the context of expensive posteriors, we propose adaptive quadratures of posterior expectations and the marginal likelihood using a sequential algorithm that builds and refines a nonparametric approximation of the posterior. In Chapter 5, we propose Regression-based Adaptive Deep Importance Sampling (RADIS), an adaptive IS algorithm that uses a nonparametric approximation of the posterior as the proposal distribution. We illustrate the proposed algorithms in applications of astronomy and remote sensing. Chapters 4 and 5 consider noiseless posterior evaluations for building the nonparametric approximations. More generally, in Chapter 6 we give an overview and classification of MCMC and IS schemes using surrogates built with noisy evaluations. The motivation here is the study of posteriors that are both costly and noisy. The classification reveals a connection between algorithms that use the posterior approximation as a cheap surrogate and algorithms that use it for building an efficient proposal. We illustrate specific instances of the classified schemes in an application of reinforcement learning. Finally, in Chapter 7 we study noisy IS, namely, IS when the posterior evaluations are noisy, and derive optimal proposal distributions for the different estimators in this setting. Chapter 8 deals with (f). In Chapter 8, we provide an exhaustive review of methods for marginal likelihood computation, with special focus on those based on Monte Carlo. We derive many connections among the methods and compare them in several simulation setups. Finally, in Chapter 9 we summarize the contributions of this thesis and discuss some potential avenues of future research. Doctoral Program in Mathematical Engineering, Universidad Carlos III de Madrid. Committee chair: Valero Laparra Pérez-Muelas. Secretary: Michael Peter Wiper. Member: Omer Deniz Akyildi
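The Monte Carlo idea summarized in the abstract above (approximating the posterior with weighted samples drawn from a known distribution, and estimating the marginal likelihood from the same weights) can be illustrated with a minimal importance sampling sketch. This is not PMHC, LAIS or RADIS; it is plain self-normalized IS on a toy model chosen here purely for illustration: a prior theta ~ N(0, 1), a likelihood y | theta ~ N(theta, 1), and a Gaussian proposal whose standard deviation `proposal_std` is an arbitrary assumption. All function names are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a univariate Gaussian N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_sampling(y, n_samples, proposal_std=2.0, seed=0):
    """Self-normalized IS for the toy model theta ~ N(0, 1), y | theta ~ N(theta, 1).

    Returns (posterior_mean_estimate, marginal_likelihood_estimate).
    """
    rng = random.Random(seed)
    weight_sum = 0.0
    weighted_theta_sum = 0.0
    for _ in range(n_samples):
        # Draw from the known proposal distribution N(0, proposal_std^2).
        theta = rng.gauss(0.0, proposal_std)
        # Unnormalized posterior = prior density times likelihood.
        unnormalized_posterior = normal_pdf(theta, 0.0, 1.0) * normal_pdf(y, theta, 1.0)
        # Importance weight = unnormalized posterior / proposal density.
        weight = unnormalized_posterior / normal_pdf(theta, 0.0, proposal_std)
        weight_sum += weight
        weighted_theta_sum += weight * theta
    # The average weight estimates the marginal likelihood (normalizing constant).
    marginal_likelihood = weight_sum / n_samples
    # The weight-normalized average estimates the posterior mean.
    posterior_mean = weighted_theta_sum / weight_sum
    return posterior_mean, marginal_likelihood


if __name__ == "__main__":
    mean_hat, z_hat = importance_sampling(y=1.0, n_samples=20000)
    print("posterior mean estimate:", mean_hat)
    print("marginal likelihood estimate:", z_hat)
```

For this conjugate toy model the exact answers are available in closed form (the posterior is N(y/2, 1/2) and the marginal likelihood is the N(0, 2) density evaluated at y), which makes it a convenient sanity check before moving to intractable posteriors.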

    Patient No-Show Prediction: A Systematic Literature Review

    Among the most important problems currently faced by health centers are those caused by patients who do not attend their appointments. Among other effects, these no-shows cause loss of revenue for the health centers and lengthen patients' waiting lists. To tackle these problems, several scheduling systems have been developed. Many of them require predicting whether a patient will show up for an appointment. However, obtaining these estimates accurately remains a challenging problem. In this work, a systematic review of the literature on predicting patient no-shows is conducted with the aim of establishing the current state of the art. Following the PRISMA methodology, 50 articles were found and analyzed. Of these articles, 82% were published in the last 10 years, and the most used technique was logistic regression. In addition, there is significant growth in the size of the databases used to build the classifiers. An important finding is that only two studies achieved an accuracy higher than the show rate. Moreover, a single study attained an area under the curve greater than 0.9. These facts indicate the difficulty of this problem and the need for further research.
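The comparison between accuracy and the show rate in the abstract above matters because a trivial classifier that predicts "show" for every appointment already attains an accuracy equal to the show rate. The sketch below illustrates this with synthetic labels; the data, the 20% no-show proportion and the `model_preds` vector are invented for illustration and are not taken from any of the reviewed studies.

```python
def accuracy(y_true, y_pred):
    """Fraction of appointments whose predicted label matches the true label."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def show_rate(y_true):
    """Fraction of appointments that were attended (label 0 = show, 1 = no-show)."""
    return y_true.count(0) / len(y_true)


def always_show_predictions(y_true):
    """Trivial baseline: predict that every patient shows up."""
    return [0] * len(y_true)


# Synthetic example: 100 appointments, 20 of them no-shows (hypothetical data).
labels = [1] * 20 + [0] * 80
baseline_preds = always_show_predictions(labels)

# Hypothetical classifier: catches 5 of the 20 no-shows but raises 10 false alarms.
model_preds = [1] * 5 + [0] * 15 + [1] * 10 + [0] * 70

if __name__ == "__main__":
    print("show rate:", show_rate(labels))
    print("baseline accuracy:", accuracy(labels, baseline_preds))
    print("model accuracy:", accuracy(labels, model_preds))
```

In this synthetic case the hypothetical classifier detects some no-shows yet scores below the trivial baseline on accuracy, which is why accuracy alone can be misleading on imbalanced attendance data.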

    Using WRF to generate high resolution offshore wind climatologies

    Paper presented at the VIII Congreso de la Asociación Española de Climatología, held in Salamanca from 25 to 28 September 2012. Recently, the demand for gridded wind datasets over sea areas has increased due to the ongoing development of offshore wind farms. Currently available reanalysis datasets do not have enough resolution to resolve complex coastlines and coastal topography, which interact with the winds and meteorological systems well into the open sea. Here we present the main characteristics of a high resolution wind climatology that has been produced using the Weather Research and Forecasting (WRF) model to downscale the ERA-INTERIM reanalysis. The simulations were carried out in a domain covering the Mediterranean basin and most of Europe, and thus areas with different wind regimes. The model has been kept close to the driving reanalysis by restarting it daily, as this running mode provided better results than nudging techniques. Results show that WRF is able to produce realistic offshore wind climatologies, probabilistic wind distributions and annual cycles. It also reproduces well-known regional winds remarkably well. This paper is a contribution to the projects CORWES (CGL2010-22158-C02-01), WRF4G (CGL2010-22158-C02-01), EXTREMBLES (CGL2010-21869), C3E (200800050084091), iMar21 (CTM201015009) and MARUCA (E17/08), financed by the Spanish government, and was partially funded by the projects 'MAREN' (Atlantic Area Transnational Programme) and 'CoCoNet' (FP7-OCEAN-2011).

    Amphiphilic Cationic Carbosilane-PEG Dendrimers: Synthesis and Applications in Gene Therapy

    Here we synthesized carbosilane dendrons (generations 1 to 3) and PEG-based dendrons, functionalized at the periphery with NHBoc groups and at the focal point with azide and alkyne moieties, respectively. The coupling of these two types of dendrons via click chemistry led to the formation of new hybrid dendrimers with two distinct moieties, the hydrophobic carbosilane dendron and the hydrophilic PEG-based dendron. The protected dendrimers were transformed into cationic ammonium dendrimers. These unique amphiphilic dendrimers were studied as vectors for gene therapy against HIV in peripheral blood mononuclear cells (PBMC), and their performance was compared with that of a PEG-free carbosilane dendrimer. The presence of the PEG moiety afforded lower toxicity and a weaker interaction between dendrimers and siRNA compared with the homodendrimer analogue. Both features, lower toxicity and lower dendriplex strength, are key properties for the use of these vectors as carriers of nucleic material. Comunidad de Madrid; Ministerio de Sanidad y Consumo; Ministerio de Economía y Empres

    Three-dimensional modelling and the creation of digital synthetic images

    In the Master's Degree in Design and Development of Products and Industrial Installations, we present a series of proposals to be developed as research projects in the course Computer-Aided Design and Manufacturing (Intensification Block I) and in the Master's final project: visualization techniques based on global illumination models, combined with research techniques in design-oriented computer applications. This work has led to teaching innovation through the incorporation of new techniques and methods in courses such as Photorealistic Representation and Computer Animation of Products.
In this paper we present a series of techniques that we have been developing, from the three-dimensional modelling of the product to the generation of digital synthetic images. They allow us to obtain a synthetic image not only of the original product but also of variants with different shapes and appearances, and to show its integration in different environments of use or operation, even before it is manufactured.

    Application of Multimedia Tools to Electronic Drawing and Its Standardization in the New Curricula for the Industrial Technical Engineer Degree

    The purpose of this paper is to share the experience gained in the new degree of Ingeniero Técnico Industrial en Electrónica (2001 curriculum) through the introduction of new ICT tools applied to the courses of the Graphic Expression Area. The specific tools constitute a professional electronics environment and will be used in a practical assignment or final project on electronic drawing.

    Microcredit to fight poverty: an introduction to the basic concepts of microfinance as an alternative instrument for financing development

    According to the International Monetary Fund (IMF) and the World Bank (WB), in 2015 there will be 920 million people in the world living in extreme poverty, that is, surviving on less than $1.25 a day, whereas in 1990 there were 1,800 million. The percentage of people in this situation has fallen from 42% to 15% of the world population over the last 20 years, although the situation varies depending on the region analyzed. In sub-Saharan Africa one in two people survives on less than one dollar a day, and in both sub-Saharan Africa and South Asia almost three in four people subsist on less than two dollars a day. Moreover, as we will see later, it is in the least developed countries where income is most unequally distributed. Departamento de Ingeniería de Sistemas y Automática. Agencia Española de Cooperación Internacional para el Desarrollo (AECID) (project CAP 10-CAP2-1513

    Ambient air pollution and thyroid function in Spanish adults. A nationwide population-based study (Di@bet.es study)

    Background: Recent reports have suggested that air pollution may impact thyroid function, although the evidence is still scarce and inconclusive. In this study we evaluated the association of exposure to air pollutants with thyroid function parameters in a nationwide sample representative of the adult population of Spain. Methods: The Di@bet.es study is a national, cross-sectional, population-based survey conducted in 2008-2010 using random cluster sampling of the Spanish population. The present analyses included 3859 individuals without a previous thyroid disease diagnosis, with negative thyroid peroxidase antibodies (TPO Abs) and with thyroid-stimulating hormone (TSH) levels of 0.1-20 mIU/L. Participants were assigned air pollution concentrations for particulate matter <2.5 µm (PM2.5) and nitrogen dioxide (NO2), corresponding to the health examination year, obtained by means of modeling combined with measurements taken at air quality stations (CHIMERE chemistry-transport model). TSH, free thyroxine (FT4), free triiodothyronine (FT3) and TPO Abs concentrations were analyzed using an electrochemiluminescence immunoassay (Modular Analytics E170 Roche). Results: In multivariate linear regression models, there was a highly significant negative correlation between PM2.5 concentrations and both FT4 (p<0.001) and FT3 levels (p<0.001). In multivariate logistic regression, there was a significant association between PM2.5 concentrations and the odds of presenting high TSH [OR 1.24 (1.01-1.52) p=0.043], low FT4 [OR 1.25 (1.02-1.54) p=0.032] and low FT3 levels [OR 1.48 (1.19-1.84) p<0.001] per IQR increase in PM2.5 (4.86 µg/m³). There was no association between NO2 concentrations and thyroid hormone levels. No significant heterogeneity was seen in the results between men, pre-menopausal women and post-menopausal women.
Conclusions: Exposure to PM2.5 in the general population was associated with mild alterations in thyroid function. Funding: CIBERDEM (Ministerio de Economia, Industria y Competitividad-ISCIII), Ministerio de Sanidad, Servicios Sociales e Igualdad-ISCIII, Instituto de Salud Carlos III (PI17/02136, PI20/01322), Consejeria de Salud y familias (PI-0144-2018), European Regional Development Fund (ERDF) "A way to build Europe". GRM belongs to the regional Nicolas Monardes research program of the Consejeria de Salud (RC-0006-2016; Junta de Andalucia, Spain). CMA is recipient of a "Rio Hortega" research contract (CM19/00186, Instituto de Salud Carlos III). VKDG is recipient of a "Rio Hortega" research contract (CM21/00214, Instituto de Salud Carlos III).
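The odds ratios in the Results of this abstract are expressed per interquartile-range (IQR) increase in PM2.5. As a reading aid, the sketch below shows how such a figure can be rescaled to another exposure increment, assuming (as in a standard logistic regression with a continuous exposure term) that the log-odds are linear in the PM2.5 concentration. The function name is hypothetical and the rescaling is not part of the study's own analysis.

```python
import math


def rescale_odds_ratio(odds_ratio, from_increment, to_increment):
    """Rescale an odds ratio between exposure increments.

    Assumes the log-odds are linear in the exposure, so the log odds ratio
    scales proportionally with the size of the increment.
    """
    log_or_per_unit = math.log(odds_ratio) / from_increment
    return math.exp(log_or_per_unit * to_increment)


if __name__ == "__main__":
    # Reported in the abstract: OR 1.24 for high TSH per IQR increase (4.86 µg/m³).
    or_per_unit = rescale_odds_ratio(1.24, from_increment=4.86, to_increment=1.0)
    print("OR per 1 µg/m³ increase in PM2.5:", or_per_unit)
```

Confidence intervals can be rescaled the same way by applying the function to each bound, under the same linearity assumption.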