445 research outputs found

    Automatic reconstruction of itineraries from descriptive texts

    This thesis is part of the PERDIDO project, whose objectives are the extraction and reconstruction of itineraries from textual documents. The work was carried out in collaboration between the LIUPPA laboratory of the Université de Pau et des Pays de l'Adour (France), the Advanced Information Systems group (IAAA) of the Universidad de Zaragoza, and the COGIT laboratory of the IGN (France). The objective of the thesis is to design an automatic system that extracts displacements from travel guides or itinerary descriptions and represents them on a map. We propose an approach for the automatic representation of itineraries described in natural language. Our proposal is divided into two main tasks. The first aims to identify and extract, from texts describing itineraries, information such as spatial entities and expressions of displacement or perception. The second task is the reconstruction of the itinerary. Our proposal combines local information extracted through natural language processing with data extracted from external geographic sources (for example, gazetteers). The spatial annotation stage uses an approach that combines part-of-speech tagging with lexico-syntactic patterns (a cascade of transducers) in order to annotate spatial named entities and expressions of displacement and perception. A first contribution to the first task is toponym disambiguation, a problem that is still poorly solved within Named Entity Recognition (NER) and essential in geographic information retrieval. We propose an unsupervised georeferencing algorithm based on a clustering technique that can disambiguate toponyms found in external geographic resources and, at the same time, locate unreferenced toponyms. We propose a generic graph model for the automatic reconstruction of itineraries, in which each node represents a place and each edge represents a path linking two places. The originality of our model is that, in addition to the usual elements (paths and waypoints), it can represent other elements involved in the description of an itinerary, such as visual landmarks. A minimum spanning tree is computed from a weighted graph to obtain an itinerary automatically in the form of a graph. Each edge of the initial graph is weighted using a multi-criteria analysis method that combines qualitative and quantitative criteria. The value of these criteria is determined from information extracted from the text and information from external geographic resources. For example, information generated by natural language processing, such as spatial relations describing an orientation (e.g., heading south), is combined with the geographic coordinates of places found in the resources to determine the value of the "spatial relation" criterion. In addition, starting from the definition of the concept of itinerary and from the information used in language to describe an itinerary, we modeled a spatial information annotation language adapted to the description of displacements, based on the recommendations of the TEI consortium (Text Encoding Initiative). Finally, the different stages of our approach were implemented and evaluated on a multilingual corpus of descriptions of trails and hikes (French, Spanish, Italian).
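The minimum spanning tree step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which algorithm the thesis uses, so Kruskal's algorithm is swapped in here as one standard choice. The place names, the edge weights, and the function name `minimum_spanning_tree` are all hypothetical; in the thesis the weights come from a multi-criteria analysis that is not reproduced here.

```python
from typing import Dict, List, Tuple

# (weight, place_a, place_b); the weight stands in for the multi-criteria score
# described in the abstract (hypothetical values, lower means more plausible).
Edge = Tuple[float, str, str]


def minimum_spanning_tree(places: List[str], edges: List[Edge]) -> List[Edge]:
    """Kruskal's algorithm with a simple union-find structure."""
    parent: Dict[str, str] = {p: p for p in places}

    def find(p: str) -> str:
        # Follow parent links to the root, compressing the path on the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    tree: List[Edge] = []
    for weight, a, b in sorted(edges):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # keep the edge only if it joins two components
            parent[root_a] = root_b
            tree.append((weight, a, b))
    return tree


# Hypothetical candidate places and weighted candidate paths between them.
places = ["Start", "Lake", "Pass", "Hut"]
edges = [
    (1.0, "Start", "Lake"),
    (4.0, "Start", "Pass"),
    (2.0, "Lake", "Pass"),
    (5.0, "Lake", "Hut"),
    (1.5, "Pass", "Hut"),
]
itinerary = minimum_spanning_tree(places, edges)
for weight, a, b in itinerary:
    print(f"{a} - {b} (weight {weight})")
```

The resulting tree connects every place using the lowest total weight, which is why edge weighting matters so much in this kind of reconstruction: a poor weight on one candidate path changes which segments survive.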

    Adolescent Brain Cognitive Development (ABCD) Study Linked External Data (LED): Protocol and practices for geocoding and assignment of environmental data

    Our brain is constantly shaped by our immediate environments, and while some effects are transient, some have long-term consequences. Therefore, it is critical to identify which environmental risks have evident and long-term impact on brain development. To expand our understanding of the environmental context of each child, the Adolescent Brain Cognitive Development (ABCD) Study® incorporates the use of geospatial location data to capture a range of individual, neighborhood, and state level data based on the child's residential location in order to elucidate the physical environmental contexts in which today's youth are growing up. We review the major considerations and types of geocoded information incorporated by the Linked External Data Environmental (LED) workgroup to expand on the built and natural environmental constructs in the existing and future ABCD Study data releases. Understanding the environmental context of each youth furthers the consortium's mission to understand factors that may influence individual differences in brain development, providing the opportunity to inform public policy and health organization guidelines for child and adolescent health.

    Population distribution by selected road network elements - comparison of centroids, geocoded addresses, built-up areas and total areas on the example of Slovak communes

    Two research objectives can be identified in the presented paper. The first one was the development of a point layer, which would abstract from the position of a central point depending on the shape of the territory of the respective spatial unit (commune), and would express the position of a commune as regards the location of the point in the area of the commune built-up area. For such purpose, a geocoding algorithm from Google was used, for which it was possible to prepare a final dot map layer without any terrain layout, as the geocoding algorithm processes only simple text addresses of the relevant spatial units. Such an obtained dot layer was compared with the layer of centroids and the achieved differences were visualised. Another objective was to compare different methods of population distribution interpretation from the selected road network elements at the commune level. Point layers in the form of centroids and geocodes were compared with the spatial population distribution on the basis of the total area and built-up area of a commune. It is more suitable to use geocodes as the holder of statistical information in comparison with commune centroids, in particular in the areas with marked vertical division of the terrain. In assessing population distribution, the obtained values are much closer to the expression of the identical indicator calculated for the built-up area of a commune that we consider most accurate, which is also documented by the average percentage deviations between particular interpretations of population distribution.

    “All the world’s a stage”: A GIS framework for recreating personal time-space from qualitative and quantitative sources

    This article presents a methodological model for the study of the space‐time patterns of everyday life. The framework utilizes a wide range of qualitative and quantitative sources to create two environmental stages, social and built, which place and contextualize the daily mobilities of individuals as they traverse urban environments. Additionally, this study outlines a procedure to fully integrate narrative sources in a GIS. By placing qualitative sources, such as narratives, within a stage‐based GIS, researchers can begin to tell rich spatial stories about the lived experiences of segregation, social interaction, and environmental exposure. The article concludes with a case study utilizing the diary of a postal clerk to outline the wide applicability of this model for space‐time GIS research

    Development of spatial density maps based on geoprocessing web services: application to tuberculosis incidence in Barcelona, Spain

    Background: Health professionals and authorities strive to cope with heterogeneous data, services, and statistical models to support decision making on public health. Sophisticated analysis and distributed processing capabilities over geocoded epidemiological data are seen as driving factors to speed up control and decision making in these health risk situations. In this context, recent Web technologies and standards-based web services deployed on geospatial information infrastructures have rapidly become an efficient way to access, share, process, and visualize geocoded health-related information. Methods: Data used on this study is based on Tuberculosis (TB) cases registered in Barcelona city during 2009. Residential addresses are geocoded and loaded into a spatial database that acts as a backend database. The web-based application architecture and geoprocessing web services are designed according to the Representational State Transfer (REST) principles. These web processing services produce spatial density maps against the backend database. Results: The results are focused on the use of the proposed web-based application to the analysis of TB cases in Barcelona. The application produces spatial density maps to ease the monitoring and decision making process by health professionals. We also include a discussion of how spatial density maps may be useful for health practitioners in such contexts. Conclusions: In this paper, we developed web-based client application and a set of geoprocessing web services to support specific health-spatial requirements. Spatial density maps of TB incidence were generated to help health professionals in analysis and decision-making tasks. The combined use of geographic information tools, map viewers, and geoprocessing services leads to interesting possibilities in handling health data in a spatial manner. In particular, the use of spatial density maps has been effective to identify the most affected areas and its spatial impact. This study is an attempt to demonstrate how web processing services together with web-based mapping capabilities suit the needs of health practitioners in epidemiological analysis scenarios.

    A Spatiotemporal Autoregressive Price Index for the Paris Office Property Market

    This paper applies the spatiotemporal hedonic approach to analysis of office transaction prices in the Paris property market (i.e. central Paris and its inner suburbs). The analysis focuses primarily on the market’s two main business districts (the CBD and the La Defense District). We find that spatial and temporal dependence effects are strongly present in these submarkets. Additionally, we propose a hybrid method for incorporating a temporal regime into the spatiotemporal autoregressive model proposed by Pace, Barry, Clapp and Rodriguez (1998). Regime switching around 1997 (i.e. in the presence of temporal heterogeneity) substantially affects the significance of spatial and temporal dependences. Finally, we build a new price index that incorporates both spatiotemporal dependences and temporal heterogeneity. This index differs strongly from the usual hedonic price index. Keywords: Hedonic Prices; Paris Office Property Market; Spatiotemporal Autoregressive Price Index; Temporal Heterogeneity

    SOCIAL MEDIA FOOTPRINTS OF PUBLIC PERCEPTION ON ENERGY ISSUES IN THE CONTERMINOUS UNITED STATES

    Energy has been at the top of the national and global political agenda along with other concomitant challenges, such as poverty, disaster and climate change. Social perception on various energy issues, such as its availability, development and consumption deeply affect our energy future. This type of information is traditionally collected through structured energy surveys. However, these surveys are often subject to formidable costs and intensive labor, as well as a lack of temporal dimensions. Social media can provide a more cost-effective solution to collect massive amount of data on public opinions in a timely manner that may complement the survey. The purpose of this study is to use machine learning algorithms and social media conversations to characterize the spatiotemporal topics and social perception on different energy in terms of spatial and temporal dimensions. Text analysis algorithms, such as sentiment analysis and topic analysis, were employed to offer insights into the public attitudes and those prominent issues related to energy. The results show that the energy related public perceptions exhibited spatiotemporal dynamics. The study is expected to help inform decision making, formulate national energy policies, and update entrepreneurial energy development decisions

    Semantically-Aware Retrieval of Oceanographic Phenomena Annotated on Satellite Images

    Scientists in the marine domain process satellite images in order to extract information that can be used for monitoring, understanding, and forecasting of marine phenomena, such as turbidity, algal blooms and oil spills. The growing need for effective retrieval of related information has motivated the adoption of semantically aware strategies on satellite images with different spatiotemporal and spectral characteristics. A big issue of these approaches is the lack of coincidence between the information that can be extracted from the visual data and the interpretation that the same data have for a user in a given situation. In this work, we bridge this semantic gap by connecting the quantitative elements of the Earth Observation satellite images with the qualitative information, modelling this knowledge in a marine phenomena ontology and developing a question answering mechanism based on natural language that enables the retrieval of the most appropriate data for each user’s needs. The main objective of the presented methodology is to realize the content-based search of Earth Observation images related to the marine application domain on an application-specific basis that can answer queries such as “Find oil spills that occurred this year in the Adriatic Sea”

    Household wealth proxies for socio-economic inequality policy studies in China

    In China, one percent of the richest population holds more than one-third of the wealth, while the poorest 25% shares no more than two percent of the total. The country’s rapid economic development has resulted in increasing socio-economic disparities, and a rapidly deteriorating environment. This puts the Chinese citizens, especially the most vulnerable and deprived socio-economic status (SES) groups, at high risks of environmental inequality (EI). In most SES-based EI studies conducted in China, household wealth has often been overlooked, though it potentially serves a good economic indicator to capture the socio-economic effect of environmental change in China. Nevertheless, existing SES databases in China are of low spatial resolution and are insufficient to support fine-grained EI studies at the intra-city level in China. The core research challenge is to develop a representative household wealth proxy in high-spatial resolution for China. This study highlights the research gaps and proposes a new household wealth proxy, which integrates both fine-grained data/features such as daytime satellite imagery and easily accessible wealth indicators such as house prices. We also capitalize on everyday economic activity data retrieved from personal mobile phones and online transaction/social platforms in the composition of our wealth proxy to achieve a higher accuracy in estimating household wealth at fine-grained resolution via machine learning. Finally, we summarize the challenges in improving both the quality and the availability of Chinese socio-economic datasets, while protecting personal privacy and information security during the data collection process for household wealth proxy development in China

    A pragmatic guide to geoparsing evaluation

    Abstract: Empirical methods in geoparsing have thus far lacked a standard evaluation framework describing the task, metrics and data used to compare state-of-the-art systems. Evaluation is further made inconsistent, even unrepresentative of real world usage by the lack of distinction between the different types of toponyms, which necessitates new guidelines, a consolidation of metrics and a detailed toponym taxonomy with implications for Named Entity Recognition (NER) and beyond. To address these deficiencies, our manuscript introduces a new framework in three parts. (Part 1) Task Definition: clarified via corpus linguistic analysis proposing a fine-grained Pragmatic Taxonomy of Toponyms. (Part 2) Metrics: discussed and reviewed for a rigorous evaluation including recommendations for NER/Geoparsing practitioners. (Part 3) Evaluation data: shared via a new dataset called GeoWebNews to provide test/train examples and enable immediate use of our contributions. In addition to fine-grained Geotagging and Toponym Resolution (Geocoding), this dataset is also suitable for prototyping and evaluating machine learning NLP models