16 research outputs found

    Natural disasters, remote sensing, and synthetic controls

    Satellite imagery has been used for decades to study changes on Earth’s surface and to understand the mechanisms that have shaped it as we know it today. Moreover, substantial improvements in computing power and the growth of available data in recent years have boosted interest in this kind of research. Pixel-based composites of large areas are easily accessible today thanks to the Google Earth Engine platform[1]. These are being used to study the evolution of different ecosystems such as forests[2], as well as the frequency of wildfires. Furthermore, technological advances over the last decades have made it possible to precisely monitor variations in extreme weather events[3]. These weather phenomena seem to be growing in number and size due to increasing climate volatility[4]. The consequences of natural hazards have mostly been studied by comparing pre- and post-disaster conditions, or through simple pair-wise comparisons between affected and non-affected areas, which yields inaccurate estimates[5]. We are interested in developing a system that, by means of a synthetic control approach, will enable us to causally evaluate the effects of disturbances over areas of interest using satellite imagery. Resilience is another field of interest for the research community. The decrease in resilience of regions that are recurrently hit by these events might end up making certain places uninhabitable. For example, extreme weather events already take their toll on life expectancy in the US[6]. Hence, large migrations may follow as a result in the long term.
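
The synthetic control idea named in this abstract can be illustrated with a minimal Python sketch. It assumes the classic formulation: non-negative weights over unaffected areas, summing to one, chosen to reproduce the affected area's pre-event series. It is not the authors' system, and the function names, the exponentiated-gradient solver, and its step settings are illustrative assumptions.

```python
import math


def fit_synthetic_control(treated_pre, controls_pre, steps=5000, lr=1.0):
    """Estimate non-negative weights, summing to 1, over the control units.

    treated_pre: pre-event outcomes of the affected unit (e.g. a vegetation index).
    controls_pre: one pre-event series per control unit, same length as treated_pre.
    The solver (exponentiated-gradient descent on the mean squared pre-event
    error) is an assumption made for this sketch.
    """
    n_controls = len(controls_pre)
    n_periods = len(treated_pre)
    weights = [1.0 / n_controls] * n_controls
    for _ in range(steps):
        # Gap between the current synthetic series and the affected unit.
        residuals = [
            sum(w * c[t] for w, c in zip(weights, controls_pre)) - treated_pre[t]
            for t in range(n_periods)
        ]
        # Gradient of the mean squared error with respect to each weight.
        grads = [
            2.0 * sum(r * c[t] for t, r in enumerate(residuals)) / n_periods
            for c in controls_pre
        ]
        # Multiplicative update keeps weights positive; renormalising keeps the sum at 1.
        updated = [w * math.exp(-lr * g) for w, g in zip(weights, grads)]
        total = sum(updated)
        weights = [u / total for u in updated]
    return weights


def estimate_effect(treated_post, controls_post, weights):
    """Observed post-event outcome minus the synthetic (counterfactual) outcome."""
    return [
        y - sum(w * c[t] for w, c in zip(weights, controls_post))
        for t, y in enumerate(treated_post)
    ]
```

The weights are fitted on pre-event data only. The post-event gap between the observed series and the weighted control series is then read as the estimated effect, which is only as credible as the pre-event fit.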

    Measuring Spatial Subdivisions in Urban Mobility with Mobile Phone Data

    Urban population grows constantly. By 2050 two thirds of the world population will reside in urban areas. This growth is faster and more complex than the ability of cities to measure and plan for their sustainability. To understand what makes a city inclusive for all, we define a methodology to identify and characterize spatial subdivisions: areas with over- and under-representation of specific population groups, named hot and cold spots respectively. Using aggregated mobile phone data, we apply this methodology to the city of Barcelona to assess the mobility of three groups of people: women, elders, and tourists. We find that, within the three groups, cold spots have a lower diversity of amenities and services than hot spots. Also, cold spots of women and tourists tend to have lower population income. These insights apply to the floating population of Barcelona, thus augmenting the scope of how inclusiveness can be analyzed in the city. Comment: 10 pages, 10 figures. To be presented at the Data Science for Social Good workshop at The Web Conference 202

    A city of cities: Measuring how 15-minutes urban accessibility shapes human mobility in Barcelona

    As cities expand, human mobility has become a central focus of urban planning and policy making to make cities more inclusive and sustainable. Initiatives such as the "15-minutes city" have been put in place to shift the attention from monocentric city configurations to polycentric structures, increasing the availability and diversity of local urban amenities. Ultimately, these initiatives are expected to increase local walkability and mobility within residential areas. While we know how urban amenities influence human mobility at the city level, little is known about spatial variations in this relationship. Here, we use mobile phone, census, and volunteered geographical data to measure geographic variations in the relationship between origin-destination flows and local urban accessibility in Barcelona. Using a Negative Binomial Geographically Weighted Regression model, we show that, globally, people tend to visit neighborhoods with better access to education and retail. Locally, these and other features change in sign and magnitude across the different neighborhoods of the city in ways that are not explained by administrative boundaries, and that provide deeper insights regarding urban characteristics such as rental prices. In conclusion, our work suggests that the qualities of a 15-minutes city can be measured at scale, delivering actionable insights on the polycentric structure of cities, and how people use and access this structure. Comment: 32 pages, 7 figures

    A constellation of horrors: analysis and visualization of the #Cuéntalo movement

    In this work, we analyze the content and structure of the Twitter trending topic #cuentalo with the purpose of providing a visualization of the movement. A supervised learning methodology is used to train the classifying algorithms with hand-labeled observations. The methodology allows us to classify each tweet according to its role in the movement. Peer Reviewed. Postprint (published version)

    The Camp Nou Stadium as a testbed for city physiology: a modular framework for urban digital twins

    In this paper, the Camp Nou stadium is used as a testbed for City Physiology, a theoretical framework for urban digital twins. With this case study, the modularity and adaptability of the framework, originally intended for city-scale simulations, are tested on a large facility venue. As a proof of concept, several statistical techniques and an agent-based simulation platform are coupled to simulate a crowd in the stadium, and a four-step process is followed to build the case study. Both the conceptual (interdomain) and technical (domain specific) layers of the digital twin are defined and connected in a nonlinear process so that they represent the complexity of the object to be simulated. The result obtained is a strategy to build a digital twin from the domain point of view, paving the way for more complex, more ambitious simulators. This project has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under the IoTwins Project (Grant agreement no. 857191). I. Meta was partially funded by the Agencia Estatal de Investigación-Ministerio de Ciencia Innovación (AEI MICINN) and the European Social Fund (ESF) under the FPI program (scholarship no. PRE2019-090239). Peer Reviewed. Postprint (published version)

    LiverScreen project: study protocol for screening for liver fibrosis in the general population in European countries

    Background: The development of liver cirrhosis is usually an asymptomatic process until late stages when complications occur. The potential reversibility of the disease is dependent on early diagnosis of liver fibrosis and timely targeted treatment. Recently, the use of non-invasive tools has been suggested for screening of liver fibrosis, especially in subjects with risk factors for chronic liver disease. Nevertheless, large population-based studies with cost-effectiveness analyses are still lacking to support the widespread use of such tools. The aim of this study is to investigate whether non-invasive liver stiffness measurement in the general population is useful to identify subjects with asymptomatic, advanced chronic liver disease. Methods: This study aims to include 30,000 subjects from eight European countries. Subjects from the general population aged ≥ 40 years without known liver disease will be invited to participate in the study either through phone calls/letters or through their primary care center. In the first study visit, subjects will undergo bloodwork as well as hepatic fat quantification and liver stiffness measurement (LSM) by vibration-controlled transient elastography. If LSM is ≥ 8 kPa and/or if ALT levels are ≥ 1.5 × the upper limit of normal, subjects will be referred to hospital for further evaluation and consideration of liver biopsy. The primary outcome is the percentage of subjects with LSM ≥ 8 kPa. In addition, a health economic evaluation will be performed to assess the cost-effectiveness and budget impact of such an intervention. The project is funded by the European Commission H2020 program. Discussion: This study comes at an especially important time, as the burden of chronic liver diseases is expected to increase in the coming years. There is consequently an urgent need to change our current approach, from diagnosing the disease late, when the impact of interventions may be limited, to diagnosing the disease earlier, when the patient is asymptomatic and free of complications, and the disease potentially reversible. Ultimately, the LiverScreen study will serve as a basis from which diagnostic pathways can be developed and adapted to the specific socio-economic and healthcare conditions in each country.
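
The referral criterion stated in the Methods (LSM ≥ 8 kPa and/or ALT ≥ 1.5 × the upper limit of normal) can be written as a small Python check. This is a sketch only: the function and parameter names are hypothetical, and the ALT upper limit of normal is passed in by the caller because the abstract does not give a value for it.

```python
LSM_THRESHOLD_KPA = 8.0   # liver stiffness cut-off stated in the abstract
ALT_ULN_MULTIPLE = 1.5    # multiple of the ALT upper limit of normal stated in the abstract


def needs_referral(lsm_kpa, alt, alt_upper_limit_of_normal):
    """Return True if either referral criterion from the protocol is met.

    lsm_kpa: liver stiffness measurement in kPa.
    alt: measured ALT level, in the same units as alt_upper_limit_of_normal.
    alt_upper_limit_of_normal: laboratory-specific value supplied by the caller
        (not given in the abstract).
    """
    high_stiffness = lsm_kpa >= LSM_THRESHOLD_KPA
    high_alt = alt >= ALT_ULN_MULTIPLE * alt_upper_limit_of_normal
    return high_stiffness or high_alt
```

For example, with a purely illustrative upper limit of 40, `needs_referral(6.0, 60, 40)` is true because ALT reaches 1.5 times the limit even though stiffness is below 8 kPa.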

    Relevant statistical applications to real-world data science

    Thesis by compendium of publications. The work presented in this dissertation is a compendium of articles based on three main applications of advanced statistical methodologies to real-world complex datasets. The first application concerns wildfire effects and is divided into two sections. For the first section, we know that the effects of wildfires are heterogeneous. Yet, which areas are more affected by these events remains unclear. Here we present a novel application of the Generalized Synthetic Control (GSC) method that enables the quantification and prediction of vegetation changes due to wildfires through a time-series analysis of in situ and satellite remote sensing data. We apply this method to a set of medium to large wildfires (> 1000 acres) in California over a time-span of two decades (1996–2016). The capacity of this method for estimating counterfactual vegetation characteristics for burned regions is explored, and abrupt system changes are quantified. We find that the GSC method is better at predicting vegetation changes than the more traditional approach of using nearby regions to assess wildfire impacts. With regard to the second section of this first application, we aim to explain the dynamics of wildfire effects on a vegetation index (previously estimated by causal inference through synthetic controls) from available pre-wildfire information (mainly coming from satellites). For this purpose, we use regression models from Functional Data Analysis, where wildfire effects are considered functional responses, depending on the elapsed time after each wildfire, with pre-wildfire data acting as scalar covariates. Our main findings show that vegetation recovery after wildfires is a slow process, affected by many pre-wildfire conditions, among which the richness and diversity of vegetation are some of the best predictors. 
For the second application in this dissertation, we use count data on the arrivals at the Camp Nou stadium, owned and managed by Futbol Club Barcelona (FCB). FCB operates the largest stadium in Europe (with a seating capacity of almost one hundred thousand people) and hosts recurring sports events. Attendance at these events is influenced by multiple conditions (the time and day of the week, the weather, the opponent) and has a palpable effect on city dynamics, e.g., peak demand for related services like public transport and stores. We study fine-grained audience entries at the stadium, broken down by gate and visitor type, in order to gain insights and predict the arrival behavior at future games. We can forecast the timeline of arrivals at gate level 72 hours prior to kickoff, facilitating operational and organizational decision-making by anticipating potential agglomerations and audience behavior. We can also identify patterns for different types of visitors and understand how relevant factors affect their turnout. Lastly, the third application explores the ways in which mobile phone, census, and volunteered geographical data can be used to measure geographic variations in the relationship between origin-destination flows and local urban accessibility in Barcelona. By means of a Negative Binomial Geographically Weighted Regression model we show that, globally, people tend to visit neighborhoods with better access to education facilities and retail. Locally, these and other features differ in sign and magnitude throughout the different city neighborhoods in ways that are not explained by administrative boundaries, providing deeper insights regarding urban characteristics such as rental prices. In conclusion, our work suggests that the qualities of a 15-minute city can be measured at scale, delivering actionable insights on the polycentric structure of cities, and the way people use and access this structure. 
All in all, the work presented in this thesis is a combination of statistics, applied statistics, data science, econometrics and economics, showing distinct ways and applications in which the temporal and spatial dimensions can be treated and used to answer relevant research questions. Doctorate in Statistics and Operations Research (DOCTORAT EN ESTADÍSTICA I INVESTIGACIÓ OPERATIVA, Pla 2012)

    Wildfires vegetation recovery through satellite remote sensing and functional data analysis

    In recent years, wildfires have caused havoc across the world, and their effects are especially aggravated in certain regions due to climate change. Remote sensing has become a powerful tool for monitoring fires, as well as for measuring their effects on vegetation over the following years. We aim to explain the dynamics of wildfires’ effects on a vegetation index (previously estimated by causal inference through synthetic controls) from information available before the wildfire (mainly coming from satellites). For this purpose, we use regression models from Functional Data Analysis, where wildfire effects are considered functional responses, depending on elapsed time after each wildfire, while pre-wildfire information acts as scalar covariates. Our main findings show that vegetation recovery after wildfires is a slow process, affected by many pre-wildfire conditions, among which the richness and diversity of vegetation are among the best predictors of recovery. Serra-Burriel would like to thank the Barcelona Supercomputing Center for the Severo Ochoa Mobility Grant, and Delicado would like to thank the Spanish Ministerio de Ciencia e Innovación for the grant MTM2017-88142-P. Peer Reviewed. Postprint (published version)

    Estimating heterogeneous wildfire effects using synthetic controls and satellite remote sensing

    Wildfires have become one of the biggest natural hazards for environments worldwide. The effects of wildfires are heterogeneous, meaning that the magnitude of their effects depends on many factors such as geographical region, climate and land cover/vegetation type. Yet, which areas are more affected by these events remains unclear. Here we present a novel application of the Generalized Synthetic Control (GSC) method that enables quantification and prediction of vegetation changes due to wildfires through a time-series analysis of in situ and satellite remote sensing data. We apply this method to medium to large wildfires (> 1000 acres) in California over a time-span of two decades (1996–2016). The method's ability to estimate counterfactual vegetation characteristics for burned regions is explored in order to quantify abrupt system changes. We find that the GSC method is better at predicting vegetation changes than the more traditional approach of using nearby regions to assess wildfire impacts. We evaluate the GSC method by comparing its predictions of spectral vegetation indices to observations during pre-wildfire periods and find improvements in correlation coefficient from R² = 0.66 to R² = 0.93 for Normalized Difference Vegetation Index (NDVI), from R² = 0.48 to R² = 0.81 for Normalized Burn Ratio (NBR), and from R² = 0.49 to R² = 0.85 for Normalized Difference Moisture Index (NDMI). Results show greater post-fire changes in NDVI, NBR, and NDMI in regions classified as having a lower Burning Index. We find that, on average, wildfires cause a 25% initial decrease in the vegetation index (NDVI) and a drop of more than 80% in wetness indices (NBR and NDMI) after they occur. The GSC method also reveals that wildfire effects on vegetation can last for more than a decade post-wildfire, and in some cases vegetation never returns to its previous cycles within our study period. 
We also find that the dynamical effects vary across regions and have an impact on seasonal cycles of vegetation in later years. Lastly, we discuss the usefulness of using GSC in remote sensing analyses. F. S.-B. would also like to thank the Barcelona Supercomputing Center for the Severo Ochoa Mobility Grant, and Delicado would like to thank the Spanish Ministerio de Ciencia e Innovación for the grant MTM2017-88142-P, and A. T. P. acknowledges funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement H2020-MSCA-COFUND-2016-754433. Peer Reviewed. Postprint (author's final draft)
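
The three spectral indices named in this abstract share one normalized-difference form. The Python sketch below uses their standard textbook definitions, which the abstract itself does not spell out. The function names are hypothetical, inputs are assumed to be surface reflectances, and the band choice (shorter-wavelength SWIR for NDMI, longer-wavelength SWIR for NBR) follows the usual convention rather than anything stated by the authors.

```python
def normalized_difference(band_a, band_b):
    """Generic (a - b) / (a + b); returns NaN when the denominator is zero."""
    denominator = band_a + band_b
    if denominator == 0:
        return float("nan")
    return (band_a - band_b) / denominator


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: near-infrared vs. red."""
    return normalized_difference(nir, red)


def nbr(nir, swir2):
    """Normalized Burn Ratio: near-infrared vs. longer-wavelength shortwave infrared."""
    return normalized_difference(nir, swir2)


def ndmi(nir, swir1):
    """Normalized Difference Moisture Index: near-infrared vs. shorter-wavelength shortwave infrared."""
    return normalized_difference(nir, swir1)


def relative_change(before, after):
    """Fractional change of an index value, e.g. -0.25 for a 25% decrease."""
    return (after - before) / before
```

Under these definitions, a drop in NDVI from 0.8 to 0.6 gives `relative_change(0.8, 0.6)` of about -0.25, the size of the average initial decrease reported above.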