92 research outputs found

    Mining Mobile Datasets to Enable the Fine-Grained Stochastic Simulation of Ebola Diffusion

    The emergence of Ebola in West Africa is of worldwide public health concern. Successful mitigation of epidemics requires coordinated, well-planned intervention strategies that are specific to the pathogen, transmission modality, population, and available resources. Modeling and simulation in the field of computational epidemiology provides predictions of expected outcomes that are used by public policy planners in setting response strategies. Developing up-to-date models of population structures, daily activities, and movement has proven challenging for developing countries due to limited governmental resources. Recent collaborations (in 2012 and 2014) with telecom providers have given public health researchers access to Big Data needed to build high-fidelity models. Researchers now have access to billions of anonymized, detailed call data records (CDR) of mobile devices for several West African countries. In addition to official census records, these CDR datasets provide insights into the actual population locations, densities, movement, travel patterns, and migration in hard-to-reach areas. These datasets allow for the construction of population, activity, and movement models. For the first time, these models provide computational support of health-related decision making in these developing areas (via simulation-based studies). New models, datasets, and simulation software were produced to assist in mitigating the continuing outbreak of Ebola. Existing models of disease characteristics, propagation, and progression were updated for the current circulating strain of Ebola. The simulation process required the interactions of multi-scale models, including viral loads (at the cellular level), disease progression (at the individual person level), disease propagation (at the workplace and family level), societal changes in migration and travel movements (at the population level), and mitigating interventions (at the abstract governmental policy level).
The predictive results from this system were validated against results from the CDC's high-level predictions.

    Data-Centric Epidemic Forecasting: A Survey

    The COVID-19 pandemic has brought forth the importance of epidemic forecasting for decision makers in multiple domains, ranging from public health to the economy as a whole. Although forecasting epidemic progression is frequently conceptualized as being analogous to weather forecasting, it has some key differences and remains a non-trivial task. The spread of diseases is subject to multiple confounding factors spanning human behavior, pathogen dynamics, weather and environmental conditions. Research interest has been fueled by the increased availability of rich data sources capturing previously unobservable facets and by initiatives from government public health and funding agencies. This has resulted, in particular, in a spate of work on 'data-centered' solutions which have shown potential in enhancing our forecasting capabilities by leveraging non-traditional data sources as well as recent innovations in AI and machine learning. This survey delves into various data-driven methodological and practical advancements and introduces a conceptual framework to navigate through them. First, we enumerate the large number of epidemiological datasets and novel data streams that are relevant to epidemic forecasting, capturing various factors like symptomatic online surveys, retail and commerce, mobility, genomics data and more. Next, we discuss methods and modeling paradigms focusing on the recent data-driven statistical and deep-learning-based methods as well as on the novel class of hybrid models that combine domain knowledge of mechanistic models with the effectiveness and flexibility of statistical approaches. We also discuss experiences and challenges that arise in real-world deployment of these forecasting systems including decision-making informed by forecasts. Finally, we highlight some challenges and open problems found across the forecasting pipeline. Comment: 67 pages, 12 figures

    High-Performance Modelling and Simulation for Big Data Applications

    This open access book was prepared as a Final Publication of the COST Action IC1406 “High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)” project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. When their level of abstraction rises to have a better discernment of the domain at hand, their representation gets increasingly demanding for computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. It is then arguably required to have a seamless interaction of High Performance Computing with Modelling and Simulation in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for its members and distinguished guests to openly discuss novel perspectives and topics of interest for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications.

    Human mobility: Models and applications

    This is the author accepted manuscript. The final version is available from the publisher via the DOI in this record. Recent years have witnessed an explosion of extensive geolocated datasets related to human movement, enabling scientists to quantitatively study individual and collective mobility patterns, and to generate models that can capture and reproduce the spatiotemporal structures and regularities in human trajectories. The study of human mobility is especially important for applications such as estimating migratory flows, traffic forecasting, urban planning, and epidemic modeling. In this survey, we review the approaches developed to reproduce various mobility patterns, with the main focus on recent developments. This review can be used both as an introduction to the fundamental modeling principles of human mobility, and as a collection of technical methods applicable to specific mobility-related problems. The review organizes the subject by differentiating between individual and population mobility and also between short-range and long-range mobility. Throughout the text the description of the theory is intertwined with real-world applications. US Army Research Office

    Human mobility: Models and applications

    Recent years have witnessed an explosion of extensive geolocated datasets related to human movement, enabling scientists to quantitatively study individual and collective mobility patterns, and to generate models that can capture and reproduce the spatiotemporal structures and regularities in human trajectories. The study of human mobility is especially important for applications such as estimating migratory flows, traffic forecasting, urban planning, and epidemic modeling. In this survey, we review the approaches developed to reproduce various mobility patterns, with the main focus on recent developments. This review can be used both as an introduction to the fundamental modeling principles of human mobility, and as a collection of technical methods applicable to specific mobility-related problems. The review organizes the subject by differentiating between individual and population mobility and also between short-range and long-range mobility. Throughout the text the description of the theory is intertwined with real-world applications. Comment: 126 pages, 45+ figures

    Getting the best of both worlds: a framework for combining disaggregate travel survey data and aggregate mobile phone data for trip generation modelling

    Traditional approaches to travel behaviour modelling primarily rely on household travel survey data, which is expensive to collect, resulting in small sample sizes and infrequent updates. Furthermore, such data is prone to reporting errors which can lead to biased parameter estimates and subsequently incorrect predictions. On the other hand, mobile phone call detail records (CDRs), which report the timestamped locations of mobile communication events, have been successfully used in the context of generating travel patterns. However, due to their anonymous nature, such records have not been widely used in developing mathematical models establishing the relationship between the observed travel behaviour and influencing factors such as the attributes of the alternatives and the decision makers. In this paper, we propose a joint modelling framework that utilises the advantages offered by both travel survey data and low-cost CDR data to optimise the prediction capacity of traditional trip generation models. In this regard, we develop a model that jointly explains the reported trips for each individual in the household survey data and ensures that the aggregated zonal trip productions are close to those derived from CDR data. This framework is tested using data from Dhaka, Bangladesh, consisting of household survey data (65,419 persons in 16,750 households), mobile phone CDR data (over 600 million records generated by 6.9 million users), and aggregate census data. The model results show that the proposed framework improves the spatial and temporal transferability of the joint models over the base model which relies on household travel survey data alone. This serves as a proof-of-concept that augmenting travel survey data with mobile phone data holds significant promise for the travel behaviour modelling community, not only by saving the cost of data collection, but also by improving the prediction capability of the models.

    LIPIcs, Volume 277, GIScience 2023, Complete Volume

    LIPIcs, Volume 277, GIScience 2023, Complete Volume

    LU(S)TI in the global South: an empirical analysis of land use and socio-economic transport interaction in Tanzania using mobile network data

    The majority of rural-urban migration is filtered through slums: informally established, unplanned, and unrecognised by the government. Scientists have a minimal understanding of the 200,000 that exist worldwide, let alone insight into the millions of individuals living there. This limited understanding often coincides with a more general absence of data in traditional urban planning approaches, leading to most cities seeing development, positive or otherwise, preceding planning. Wesolowski and Eagle (2010) highlighted the key need to use models of human mobility to help guide effective spatial planning policies. Previous research has shown that thinking about the built environment alone cannot account for individual differences in behaviour, and that we must also consider factors such as socio-economic circumstance and context (which are far more likely to contain explanatory value than the geographies of points of interest, such as home and work locations of individuals alone). However, this remains a very difficult topic to study. Emerging economies are often characterised by institutions struggling to keep even demographic data streams up to date. Combined with ineffective data collection strategies, it is often unrealistic to expect stakeholders to retain an overview of the dynamics of urban systems. This gap causes many issues, but particularly in East Africa: expense and logistics restrict the ability to deploy sensor technologies; fast-changing environments reduce the utility of traditional household and census surveying; and even when raw data exists there are distinct skill gaps for data analysis. To address this, this thesis extends nascent work, and systematically investigates the use of Call Detail Records (CDR) and Mobile Financial Service (MFS) transaction logs to model mobility, demographics, land use and their interplay. The data used was automatically generated as part of day-to-day operations of a major Tanzanian Mobile Network Operator.
As part of this thesis, three empirical analyses are carried out to test the boundaries of inferring activity-based land use, predicting socio-economic levels at the cell tower coverage level, and generating mobility metrics in the form of Origin-Destination matrices and synthetic daily activity plans for the Tanzanian port city of Dar es Salaam. Further, shortcomings of CDR and MFS data, and ways to overcome these, are identified. Empirical chapters form the basis for the identification of factors from the spatial dimension focused on assessing the impact of the built environment, socio-economic circumstance and mobility behaviour, allowing for the extension of traditional land use-transport interaction (LUTI) models through the inclusion of socio-economic characteristics. This culminates in a new empirical LU(S)TI analysis for a sub-Saharan context. The metropolitan area of the port city of Dar es Salaam, Tanzania, is a pertinent case study area as it is facing similar challenges to many other fast-growing metropolitan areas in emerging economies globally.
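
To make the idea of an Origin-Destination matrix derived from CDR data concrete, here is a minimal sketch. It is not the thesis's method: the record layout `(user_id, timestamp, zone)` and the function name `od_matrix` are assumptions made for illustration, and counting every change of zone between consecutive events as a trip is a deliberate simplification.

```python
from collections import Counter


def od_matrix(records):
    """Count zone-to-zone transitions per user from CDR-like events.

    records: iterable of (user_id, timestamp, zone) tuples. This field
    layout is an assumption for illustration, not a real operator schema.
    Returns a Counter mapping (origin_zone, destination_zone) to a count.
    """
    # Group events by user so trajectories are never mixed across users.
    events_by_user = {}
    for user_id, timestamp, zone in records:
        events_by_user.setdefault(user_id, []).append((timestamp, zone))

    counts = Counter()
    for events in events_by_user.values():
        events.sort()  # order each user's events by timestamp
        # Compare each event with the next one; a zone change is one trip.
        for (_, origin), (_, destination) in zip(events, events[1:]):
            if origin != destination:
                counts[(origin, destination)] += 1
    return counts


sample = [
    ("u1", 1, "A"), ("u1", 2, "A"), ("u1", 3, "B"), ("u1", 4, "A"),
    ("u2", 1, "B"), ("u2", 5, "A"),
]
matrix = od_matrix(sample)
```

In practice, CDR events are sparse and noisy, so a real pipeline would likely need filtering (for example, minimum stay durations) before counting transitions.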

    Developing Travel Behaviour Models Using Mobile Phone Data

    Improving the performance and efficiency of transport systems requires sound decision-making supported by data and models. However, conducting travel surveys to facilitate travel behaviour model estimation is an expensive venture. Hence, such surveys are typically infrequent in nature, and cover limited sample sizes. Furthermore, the quality of such data is often affected by reporting errors and changes in the respondents’ behaviour due to awareness of being observed. On the other hand, large and diverse quantities of time-stamped location data are nowadays passively generated as a by-product of technological growth. These passive data sources include Global Positioning System (GPS) traces, mobile phone network records, smart card data and social media data, to name but a few. Among these, mobile phone network records (i.e. call detail records (CDRs) and Global System for Mobile Communications (GSM) data) offer the biggest promise due to the increasing mobile phone penetration rates in both the developed and the developing worlds. Previous studies using mobile phone data have primarily focused on extracting travel patterns and trends rather than establishing mathematical relationships between the observed behaviour and the causal factors to predict the travel behaviour in alternative policy scenarios. This research aims to extend the application of mobile phone data to travel behaviour modelling and policy analysis by augmenting the data with information derived from other sources. This comes along with significant challenges stemming from the anonymous and noisy nature of the data. Consequently, novel data fusion and modelling frameworks have been developed and tested for different modelling scenarios to demonstrate the potential of this emerging low-cost data source. In the context of trip generation, a hybrid modelling framework has been developed to account for the anonymous nature of CDR data.
This involves fusing the CDR and demographic data of a sub-sample of the users to estimate a demographic prediction sub-model based on phone usage variables extracted from the data. The demographic group membership probabilities from this model are then used as class weights in a latent class model for trip generation based on trip rates extracted from the GSM data of the same users. Once estimated, the hybrid model can be applied to probabilistically infer the socio-demographics, and subsequently, the trip generation of a large proportion of the population where only large-scale anonymous CDR data is available as an input. The estimation and validation results using data from Switzerland show that the hybrid model competes well against a typical trip generation model estimated using data with known socio-demographics of the users. The hybrid framework can be applied to other travel behaviour modelling contexts using CDR data (in mode or route choice, for instance). The potential of CDR data to capture rational route choice behaviour for long-distance inter-regional O-D pairs (joined by highly overlapping routes) is demonstrated through data fusion with information on the attributes of the alternatives extracted from multiple external sources. The effect of location discontinuities in CDR data (due to its event-driven nature), and how this impacts the ability to observe the users’ trajectories in a highly overlapping network, is discussed, prompting the development of a route identification algorithm that distinguishes between unique and broad sub-group route choices. The broad choice framework, which was developed in the context of vehicle type choice, is then adapted to accommodate this limitation where unique route choices cannot be observed for some users, and only the broad sub-groups of the possible overlapping routes are identifiable.
The estimation and validation results using data from Senegal show that CDR data can capture rational route choice behaviour, as well as yield reasonable value of travel time estimates. Still relying on data fusion, a novel method based on the mixed logit framework is developed to enable the analysis of departure time choice behaviour using passively collected data (GSM and GPS data) where the challenge is to deal with the lack of information on the desired times of travel. The proposed method relies on data fusion with travel time information extracted from Google Maps in the context of Switzerland. It is unique in the sense that it allows the modeller to understand the sensitivity attached to schedule delay, thus enabling its valuation, despite the passive nature of the data. The model results are in line with the expected travel behaviour, and the schedule delay valuation estimates are reasonable for the study area. Finally, a joint trip generation modelling framework fusing CDR, household travel survey, and census data is developed. The framework adjusts the scaling factors of a traditional trip generation model (based on household travel survey data only) to optimise model performance at both the disaggregate and aggregate levels. The framework is calibrated using data from Bangladesh and the adjusted models are found to have better spatial and temporal transferability. Thus, besides demonstrating the potential of mobile phone data, the thesis makes significant methodological and applied contributions. The use of different datasets provides rich insights that can inform policy measures related to the adoption of big data for transport studies. The research findings are particularly timely for transport agencies and practitioners working in contexts with severe data limitations (especially in developing countries), as well as academics generally interested in exploring the potential of emerging big data sources, both in transport and beyond.
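
The abstract above describes using demographic group membership probabilities as class weights in a latent class model for trip generation. The sketch below illustrates only the prediction step of that idea, as a probability-weighted average of class-specific trip rates. It is a simplified illustration under stated assumptions, not the thesis's estimation procedure: the function name `expected_trips` and the example numbers are hypothetical.

```python
def expected_trips(class_probs, class_trip_rates):
    """Expected trips for one anonymous user under a latent class view.

    class_probs: probability that the user belongs to each demographic
        class (assumed to come from a separate prediction sub-model).
    class_trip_rates: trip rate associated with each class.
    Both inputs are illustrative assumptions, not values from the thesis.
    """
    if len(class_probs) != len(class_trip_rates):
        raise ValueError("one trip rate is required per class")
    if abs(sum(class_probs) - 1.0) > 1e-9:
        raise ValueError("class probabilities must sum to 1")
    # Weight each class-specific rate by the membership probability.
    return sum(p * r for p, r in zip(class_probs, class_trip_rates))


# Hypothetical user: 25% likely in class 0 (2 trips/day),
# 75% likely in class 1 (4 trips/day).
trips = expected_trips([0.25, 0.75], [2.0, 4.0])
```

Because the class membership is only inferred probabilistically, the result is an expectation over classes rather than a prediction for a known demographic group.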