25 research outputs found

    SpNetPrep: An R package using Shiny to facilitate spatial statistics on road networks

    Spatial statistics is an important field of data science with applications in areas as diverse as epidemiology, criminology, seismology, astronomy and econometrics. In particular, spatial statistics has frequently been used to analyze traffic accident datasets with explanatory and preventive objectives. Traditionally, these studies have employed spatial statistics techniques at some level of areal aggregation, usually related to administrative units. However, the last decade has brought an increasing number of works on the spatial incidence and distribution of traffic accidents at the road level by means of the spatial structure known as a linear network. This change seems positive because it could provide deeper and more accurate investigations than previous studies based on areal spatial units. Working at the road level raises some technical difficulties due to the high complexity of these structures, especially in terms of manipulation and rectification. The R Shiny app SpNetPrep, which is available online and via an R package of the same name, aims to provide functionalities that could be useful for a user who is interested in performing a spatial analysis over a road network structure.

    Spatio-temporal methods for the analysis of crime and traffic safety data

    Since physician John Snow analyzed the spatial distribution of cholera cases detected in the 1854 epidemic in London, many disciplines have benefited from the existence of spatio-temporal statistical methods: agriculture, astronomy, biology, epidemiology, geology, hydrology, meteorology, and remote sensing, among others. This thesis focuses on the development and application of spatio-temporal methods in the context of two disciplines: traffic safety analysis and criminology.
In particular, a key objective has been to detect research gaps in the currently available literature. Thus, the investigation of several types of problems that usually arise in these two fields, which require a specific statistical approach, has led to the structuring of this thesis as follows. Firstly, after an introductory chapter, two traffic safety studies in which the use of a linear network structure is fundamental are presented. The first one, in Chapter 2, contains a street-level multivariate analysis of the occurrence of traffic accidents that distinguishes between intersection and non-intersection segments. Next, in Chapter 3, a method is presented and employed for the detection of differential risk "hotspots" along a network. Chapter 4 includes a spatio-temporal analysis of a burglary dataset focused on the phenomenon of near-repetition, which is central to the field of criminology. The classic version of the Knox test is adapted to account for spatio-temporal burglary risk heterogeneity, which provides a more accurate representation of the magnitude of the phenomenon. Specifically, an adjustment is proposed that is suitable in the absence of spatio-temporal variation in both the exposure variable and the covariates. Chapter 5 includes a detailed study of the modifiable areal unit problem (MAUP) in the context of traffic safety analysis. As a novelty compared to previous studies, the scale and zoning of the spatial structures considered are explicitly controlled. Furthermore, the analysis focuses not only on the final consequences in terms of estimation and precision of the models, but also on the alterations that occur in the different variables involved. Chapter 6 is dedicated to the comparison of several methodologies that can be used to analyze how proximity to certain places influences the incidence of an event of interest.
Specifically, this comparison is made to assess the relationship between traffic accidents and the location of educational centers. Chapter 7 focuses on analyzing an issue that has been given great importance in quantitative criminology: the loss of reliability of analyses as a result of the presence of non-geocoded events. It has been estimated that reaching an 85% geocoding success rate is enough to carry out further analysis of the data. In this thesis, this percentage is reestimated by considering some factors and methods that were not taken into account in the initial estimation. It is concluded that reaching an 85% success rate in the geocoding process may not be sufficient under certain conditions. Finally, Chapter 8 includes the description of two R packages that have been developed during this thesis: SpNetPrep, which allows the preprocessing and curation of a linear network, and DRHotNet, which implements the "hotspot" detection procedure described in Chapter 3.
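
For readers unfamiliar with the Knox test mentioned in this abstract, the following is a minimal sketch of its classic, unadjusted form: count the pairs of events that are close in both space and time, then compare that count with the counts obtained after randomly permuting the event times. It is not the adapted version proposed in the thesis. The function names, the thresholds `d_max` and `t_max`, and the toy data are all assumptions made for illustration.

```python
import random
from itertools import combinations
from math import hypot


def knox_statistic(points, times, d_max, t_max):
    """Number of event pairs within d_max in space AND t_max in time."""
    count = 0
    for i, j in combinations(range(len(points)), 2):
        dist = hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
        if dist <= d_max and abs(times[i] - times[j]) <= t_max:
            count += 1
    return count


def knox_test(points, times, d_max, t_max, n_perm=999, seed=0):
    """Monte Carlo Knox test: permute times, keep locations fixed."""
    rng = random.Random(seed)
    observed = knox_statistic(points, times, d_max, t_max)
    shuffled = list(times)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if knox_statistic(points, shuffled, d_max, t_max) >= observed:
            at_least_as_large += 1
    # One-sided pseudo p-value, counting the observed value itself
    p_value = (at_least_as_large + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical toy data: two tight space-time clusters of two events each
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
toy_times = [0, 1, 50, 51]
observed, p_value = knox_test(toy_points, toy_times, d_max=2.0, t_max=2.0, n_perm=99)
```

Permuting only the times keeps the purely spatial and purely temporal patterns intact, so the reference distribution reflects the absence of space-time interaction. This is also why the classic test can be misleading when risk varies over space and time, which is the issue the adapted version in Chapter 4 addresses.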

    Modelling of Biomass Concentration, Multi-Wavelength Absorption and Discrimination Method for Seven Important Marine Microalgae Species

    Due to the possible depletion of fossil fuels in the near future and the necessity of finding new food sources for a growing world population, marine microalgae constitute a very promising alternative resource, which can also contribute to carbon dioxide fixation. Thus, seven species (Chaetoceros calcitrans, Chaetoceros gracilis, Isochrysis galbana, Nannochloropsis gaditana, Dunaliella salina, Tetraselmis suecica, and Tetraselmis chuii) were grown in five serial batch cultures at a bench scale under continuous illumination. The batch cultures were inoculated with an aliquot that was extracted from a larger-scale culture in order to obtain growth data valid for the entire growth cycle with guaranteed reproducibility. Measurements of optical density at several wavelengths and cell counting with a haemocytometer (Neubauer chamber) were performed every one or two days for 22 days in the five batch cultures of each species. Cell growth, the relationship between optical density (OD) and cell concentration, and the effect of wavelength on OD were modeled. The results of this study showed the highest and lowest growth rates for N. gaditana and T. suecica, respectively. Furthermore, a simple and accurate discrimination method based on direct single OD measurements of microalgae culture aliquots was developed and is already available for free on the internet. Funding: grants PRUCV2017-162-001, PRUCV2018-231-001, PRUCV2017-231-001 and PRUCV2018-162-001 from the Universidad Católica de Valencia San Vicente Mártir.

    A spatio-temporal multinomial model of firearm death in Ecuador

    This paper presents a statistical model based on a multinomial distribution with fixed and random effects, the latter being structured and non-structured in space and time. Inference is performed through a Bayesian framework. We are interested in analyzing violent deaths at the level of parroquia in Ecuador. Noting that most of the deaths are linked to firearms, and far fewer to knives, we build a multinomial model to predict the probability of three different types of deaths as a close proxy for a violent death having occurred. We provide a practical and realistic interpretation of the model, placing it in the real crime context of Ecuador.

    Crime Analysis of the Metropolitan Region of Santiago de Chile: A Spatial Panel Data Approach

    The aim of our work is to determine the influence that socio-economic and demographic factors had on crimes that took place during the period 2010–2018 in the communes of the Metropolitan Region of Chile, as well as the existence of possible spatial or temporal effects. We address 12 kinds of crime that we have grouped into two main types: against people and against property. Our interest focuses on crimes against people, using crimes against property as an additional covariate in order to investigate the existence of the broken-windows phenomenon in this context. The model chosen for our analysis is a spatial panel model with fixed effects. The results highlight that covariates such as infant mortality, birth rate, poverty and green areas have a significant influence on crimes against people. Regarding the spatio-temporal covariates, one effect observed is a displacement of crime towards neighbouring communes, which opens a new line of study to discover the causes of this displacement.

    Trends in Incidence and Transmission Patterns of COVID-19 in Valencia, Spain

    Importance: Limited information on the transmission and dynamics of SARS-CoV-2 at the city scale is available. Objective: To describe the local spread of SARS-CoV-2 in Valencia, Spain. Design, Setting, and Participants: This single-center epidemiological cohort study of patients with SARS-CoV-2 was performed at University General Hospital in Valencia (population in the hospital catchment area, 364 000), a tertiary hospital. The study included all consecutive patients with COVID-19 isolated at home from the start of the COVID-19 pandemic on February 19 until August 31, 2020. Exposures: Cases of SARS-CoV-2 infection confirmed by the presence of IgM antibodies or a positive polymerase chain reaction test result on a nasopharyngeal swab were included. Cases in which patients with negative laboratory results met diagnostic and clinical criteria were also included. Main Outcomes and Measures: The primary outcome was the characterization of dissemination patterns and connections among the 20 neighborhoods of Valencia during the outbreak. To recreate the transmission network, the inbound and outbound connections were studied for each region, and the relative risk of infection was estimated. Results: In total, 2646 patients were included in the analysis. The mean (SD) age was 45.3 (22.5) years; 1203 (46%) were male and 1442 (54%) were female (data were missing for 1); and the overall mortality was 3.7%. The incidence of SARS-CoV-2 cases was higher in neighborhoods with higher household income (β2 [for mean income per household] = 0.197; 95% CI, 0.057-0.351) and greater population density (β1 [inhabitants per km²] = 0.228; 95% CI, 0.085-0.387). Correlations with meteorological variables were not statistically significant. Neighborhood 3, where the hospital and testing facility were located, had the most outbound connections (14). A large residential complex close to the city (neighborhood 20) had the fewest connections (0 outbound and 2 inbound).
Five geographically unconnected neighborhoods were of strategic importance in disrupting the transmission network. Conclusions and Relevance: This study of local dissemination of SARS-CoV-2 revealed nonevident transmission patterns between geographically unconnected areas. The results suggest that tailor-made containment measures could reduce transmission and that hospitals, including testing facilities, play a crucial role in disease transmission. Consequently, the local dynamics of SARS-CoV-2 spread might inform the strategic lockdown of specific neighborhoods to stop the contagion and avoid a citywide lockdown. Funding: This study was supported by the Innovation, Universities, Science and Digital Society Council through the Valencia Innovation Agency (AVI); grant 851255 from the European Research Council under the European Union’s Horizon 2020 research and innovation program (Dr Zanin); grant MDM-2017-0711 from the Spanish State Research Agency through the Severo Ochoa and María de Maeztu Program for Centers and Units of Excellence in Research and Development (Dr Zanin); and from the Universitat de Valencia (Drs Iftimi and Lozano).

    Respondent Burden Effects on Item Non-Response and Careless Response Rates: An Analysis of Two Types of Surveys

    Respondent burden refers to the effort required by a respondent to answer a questionnaire. Although this concept was introduced decades ago, few studies have focused on the quantitative detection of such a burden. In this paper, a face-to-face survey and a telephone survey conducted in Valencia (Spain) are analyzed. The presence of burden is studied in terms of both item non-response rates and careless response rates. In particular, two moving-window statistics based on the coefficient of unalikeability and the average longstring index are proposed for characterizing careless responding. Item non-response and careless response rates are modeled for each survey by using mixed-effects models, including respondent-level and question-level covariates and also temporal random effects to assess the existence of respondent burden during the questionnaire. The results suggest that the sociodemographic characteristics of the respondents and the typology of the question impact item non-response and careless response rates. Moreover, the estimates of the temporal random effects indicate that item non-response and careless response rates are time-varying, suggesting the presence of respondent burden. In particular, an increasing trend in item non-response rates in the telephone survey has been found, which supports the hypothesis of the burden. Regarding careless responding, despite the presence of some temporal variation, no clear trend has been identified.
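
As a rough illustration of the two careless-responding measures named in this abstract, the sketch below computes the coefficient of unalikeability (one minus the sum of squared category proportions) and the average longstring index (mean length of runs of identical consecutive answers), and then evaluates either one over a sliding window of consecutive questions. The exact window definition used in the paper is not given in the abstract, so the `moving_window` helper, its `width` parameter and the example answers are assumptions.

```python
from collections import Counter
from itertools import groupby


def unalikeability(responses):
    """Coefficient of unalikeability: 1 - sum of squared category proportions."""
    n = len(responses)
    counts = Counter(responses)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def average_longstring(responses):
    """Mean length of the runs of identical consecutive responses."""
    run_lengths = [len(list(group)) for _, group in groupby(responses)]
    return sum(run_lengths) / len(run_lengths)


def moving_window(responses, width, statistic):
    """Apply `statistic` to every window of `width` consecutive responses."""
    return [
        statistic(responses[start:start + width])
        for start in range(len(responses) - width + 1)
    ]


# Hypothetical answers of one respondent to six Likert-type items
answers = [3, 3, 3, 3, 1, 2]
longstring_profile = moving_window(answers, 4, average_longstring)
unalikeability_profile = moving_window(answers, 4, unalikeability)
```

Low unalikeability and a high average longstring within a window both point to repeated identical answers, so tracking them along the questionnaire gives a position-dependent signal of possible careless responding.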

    A mechanistic bivariate point process model for crime pattern analysis

    The statistical analysis of crime data has gained attention in the last decade. In particular, the availability of spatio-temporal crime data at the event level allows us to model the incidence of crime with high precision. Point process models are the natural tool to study crime patterns. As it is well known that crime events often spread as a contagion process, mechanistic self-exciting models are usually considered in this context. In this paper, we propose a mechanistic bivariate spatio-temporal model for the first-order intensity function of the point processes associated with two crime types. Specifically, the model includes separate estimates of the overall temporal and spatial intensities of crime and a spatio-temporal interaction term for each of the crime types under analysis. Regarding the spatio-temporal term, we model how the occurrence of previous crime events (of either type) influences the intensity of each type of crime under study. We consider a dataset of crime events recorded in Valencia (Spain) during the year 2017 and focus on two crime types for the analysis: property crime and robbery. The results show that there is an association between the recent occurrence of either property crimes or robberies and the intensity of both crime types. Several spatio-temporal monitoring tools are described and discussed as well.

    Aprendizaje de las matemáticas a través del lenguaje de programación R en Educación Secundaria

    Learning to program gives students a great advantage in gaining competencies nowadays. Furthermore, from an educational point of view, programming can help them develop skills such as logical reasoning, structured thinking and even creative thinking. Therefore, the first objective of this work is to review several studies that discuss the multiple benefits that students can obtain from learning to program in secondary education. The second and main purpose of this study is to show the possibility of learning mathematical concepts with the aid of one of the most popular programming languages currently available, R. This is especially interesting because some mathematical contents in the curriculum lend themselves to algorithmic design and to more experimentation than traditional teaching methods allow. This learning methodology was tested with 33 students from Spain aged 14-15 years, who used the R programming language to study polynomial equations. The experience revealed great advantages of the methodology, but also some disadvantages for certain students due to the intrinsic complexity of programming, as shown by the correlational analysis of the survey taken by the participants. In any case, these disadvantages could likely be overcome by a longer implementation of the methodology.