38 research outputs found

    An automated approach for geocoding tabular itineraries

    Historical itineraries, often accessible as lists or tables describing places visited in sequence, are abundant resources and also important objects of study for humanities scholars. This article advances a novel method for automatically geocoding tabular itineraries, combining approximate string matching with a cost optimization algorithm based on dynamic programming. Experiments with a dataset of historical itineraries, with ground-truth geocoding annotations provided by domain experts and leveraging also the GeoNames gazetteer, attest to the effectiveness of the proposed method. The obtained results show that while approximate string matching can already achieve very low median errors, with many toponyms matching exactly against GeoNames entries, the combination with cost optimization can significantly improve results in terms of the average distance towards the correct disambiguations
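
The combination the abstract describes, candidate matches per toponym plus a dynamic-programming pass that picks one candidate per stop, might look roughly like the sketch below. It is not the authors' implementation. The cost model (a per-candidate mismatch penalty expressed in kilometres, plus the great-circle distance between consecutive stops), the function names, and the sample coordinates are all assumptions made for illustration, and every stop is assumed to have at least one candidate.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def disambiguate(stops):
    """Pick one candidate per stop, minimizing penalties plus travel distance.

    stops[i] is a non-empty list of (name, (lat, lon), penalty_km) candidates
    for the i-th place of the itinerary (hypothetical format).
    Returns the index of the chosen candidate for each stop.
    """
    best = [[cand[2] for cand in stops[0]]]   # best[i][k]: cheapest cost ending at candidate k
    back = [[None] * len(stops[0])]           # back[i][k]: predecessor index at stop i-1
    for i in range(1, len(stops)):
        row, ptr = [], []
        for _name, coord, penalty in stops[i]:
            options = [best[i - 1][j] + haversine_km(prev[1], coord)
                       for j, prev in enumerate(stops[i - 1])]
            j_min = min(range(len(options)), key=options.__getitem__)
            row.append(options[j_min] + penalty)
            ptr.append(j_min)
        best.append(row)
        back.append(ptr)
    # Backtrack from the cheapest final candidate.
    k = min(range(len(best[-1])), key=best[-1].__getitem__)
    path = [k]
    for i in range(len(stops) - 1, 0, -1):
        k = back[i][k]
        path.append(k)
    return path[::-1]

# Illustrative itinerary: the middle toponym is ambiguous (approximate coordinates).
itinerary = [
    [("Lisboa", (38.72, -9.14), 0.0)],
    [("Santarém (Brazil)", (-2.44, -54.71), 0.0),
     ("Santarém (Portugal)", (39.24, -8.69), 0.0)],
    [("Coimbra", (40.21, -8.43), 0.0)],
]
chosen = disambiguate(itinerary)  # selects the Portuguese Santarém (index 1)
```

Expressing the string-mismatch penalty in kilometres is only one way to put both cost terms on the same scale; a real system would need to calibrate that trade-off.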

    Building Blocks for Mapping Services

    Mapping services are ubiquitous on the Internet. These services enjoy a considerable user base. But it is often overlooked that providing a service on a global scale with virtually millions of users has been the playground of an oligopoly: only a select few service providers are able to do so. Unfortunately, the literature on these solutions is more than scarce. This thesis adds a number of building blocks to the literature that explain how to design and implement a number of features

    Automatic reconstruction of itineraries from descriptive texts

    This thesis is part of the PERDIDO project, whose objectives are the extraction and reconstruction of itineraries from textual documents. The work was carried out in collaboration between the LIUPPA laboratory of the Université de Pau et des Pays de l'Adour (France), the Advanced Information Systems group (IAAA) of the Universidad de Zaragoza, and the COGIT laboratory of the IGN (France). The aim of this thesis is to design an automatic system that extracts movements from travel guides or itinerary descriptions and represents them on a map. We propose an approach for the automatic representation of itineraries described in natural language. Our proposal is divided into two main tasks. The first identifies and extracts, from texts describing itineraries, information such as spatial entities and expressions of movement or perception. The second task is the reconstruction of the itinerary. Our proposal combines local information extracted through natural language processing with data extracted from external geographic sources (for example, gazetteers). The spatial information annotation stage uses an approach that combines part-of-speech tagging with lexico-syntactic patterns (a cascade of transducers) in order to annotate spatial named entities and expressions of movement and perception. A first contribution to the first task is toponym disambiguation, a problem that is still poorly solved within named entity recognition (NER) and is essential in geographic information retrieval. 
We propose an unsupervised georeferencing algorithm based on a clustering technique, capable of disambiguating the toponyms found in external geographic resources and, at the same time, locating unreferenced toponyms. We propose a generic graph model for the automatic reconstruction of itineraries, where each node represents a place and each edge represents a path linking two places. The originality of our model is that, in addition to the usual elements (paths and waypoints), it can represent other elements involved in the description of an itinerary, such as visual landmarks. A minimum spanning tree is computed from a weighted graph to automatically obtain an itinerary in the form of a graph. Each edge of the initial graph is weighted by a multi-criteria analysis method that combines qualitative and quantitative criteria. The values of these criteria are determined from information extracted from the text and information from external geographic resources. For example, information generated by natural language processing, such as spatial relations describing an orientation (e.g., heading south), is combined with the geographic coordinates of places found in the resources to determine the value of the "spatial relation" criterion. In addition, based on the definition of the concept of an itinerary and on the information used in language to describe an itinerary, we have modeled a spatial information annotation language suited to the description of movements, relying on the recommendations of the TEI (Text Encoding Initiative) consortium. 
Finally, the different stages of our approach have been implemented and evaluated on a multilingual corpus of descriptions of trails and hikes (French, Spanish, Italian)
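
As a rough illustration of the reconstruction step mentioned in the abstract (a minimum spanning tree computed over a weighted graph of places), the sketch below uses Kruskal's algorithm with a union-find structure. The abstract does not say which spanning-tree algorithm the thesis uses, so Kruskal is a swapped-in choice. The node numbering and edge weights are hypothetical stand-ins for the multi-criteria scores the thesis describes.

```python
def minimum_spanning_tree(num_nodes, edges):
    """Kruskal's algorithm.

    edges is a list of (weight, u, v) tuples with nodes numbered 0..num_nodes-1.
    Returns the edges kept in the spanning tree (or forest, if disconnected).
    """
    parent = list(range(num_nodes))

    def find(x):
        # Find the set representative, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:          # the edge joins two separate components
            parent[root_u] = root_v
            tree.append((weight, u, v))
    return tree

# Hypothetical graph: four places, lower weight = more plausible link.
place_links = [
    (1.0, 0, 1),
    (2.0, 1, 2),
    (5.0, 0, 2),
    (1.5, 2, 3),
    (4.0, 1, 3),
]
itinerary_tree = minimum_spanning_tree(4, place_links)
```

In this toy graph the tree keeps the three cheapest links that do not form a cycle, which chains the four places together.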

    Entity Linking for the Biomedical Domain

    Entity linking is the process of detecting mentions of different concepts in text documents and linking them to canonical entities in a target lexicon. However, one of the biggest issues in entity linking is the ambiguity in entity names. The ambiguity is an issue that many text mining tools have yet to address since different names can represent the same thing and every mention could indicate a different thing. For instance, search engines that rely on heuristic string matches frequently return irrelevant results, because they are unable to satisfactorily resolve ambiguity. Thus, resolving named entity ambiguity is a crucial step in entity linking. To solve the problem of ambiguity, this work proposes a heuristic method for entity recognition and entity linking over the biomedical knowledge graph concerning the semantic similarity of entities in the knowledge graph. Named entity recognition (NER), relation extraction (RE), and relationship linking (RL) make up a conventional entity linking (EL) system pipeline. We have used the accuracy metric in this thesis. Therefore, for each identified relation or entity, the solution comprises identifying the correct one and matching it to its corresponding unique CUI in the knowledge base. Because KBs contain a substantial number of relations and entities, each with only one natural language label, the second phase is directly dependent on the accuracy of the first. The framework developed in this thesis enables the extraction of relations and entities from the text and their mapping to the associated CUI in the UMLS knowledge base. This approach derives a new representation of the knowledge base that lends itself to easy comparison. Our idea to select the best candidates is to build a graph of relations and determine the shortest path distance using a ranking approach. 
We test our suggested approach on two well-known benchmarks in the biomedical field and show that our method exceeds the search engine's top result and provides us with around 4% more accuracy. In general, when it comes to fine-tuning, we notice that entity linking contains subjective characteristics and modifications may be required depending on the task at hand. The performance of the framework is evaluated based on a Python implementation

    Efficient Routing for Disaster Scenarios in Uncertain Networks: A Computational Study of Adaptive Algorithms for the Stochastic Canadian Traveler Problem with Multiple Agents and Destinations

    The primary objective of this research is to develop adaptive online algorithms for solving the Canadian Traveler Problem (CTP), which is a well-studied problem in the literature that has important applications in disaster scenarios. To this end, we propose two novel approaches, namely Maximum Likely Node (MLN) and Maximum Likely Path (MLP), to address the single-agent single-destination variant of the CTP. Our computational experiments demonstrate that the MLN and MLP algorithms together achieve new best-known solutions for 10,715 instances. In the context of disaster scenarios, the CTP can be extended to the multiple-agent multiple-destination variant, which we refer to as MAD-CTP. We propose two approaches, namely MAD-OMT and MAD-HOP, to solve this variant. We evaluate the performance of these algorithms on Delaunay and Euclidean graphs of varying sizes, ranging from 20 nodes with 49 edges to 500 nodes with 1500 edges. Our results demonstrate that MAD-HOP outperforms MAD-OMT by a considerable margin, achieving a replan time of under 9 seconds for all instances. Furthermore, we extend the existing state-of-the-art algorithm, UCT, which was previously shown by Eyerich et al. (2010) to be effective for solving the single-source single-destination variant of the CTP, to address the MAD-CTP problem. We compare the performance of UCT and MAD-HOP on a range of instances, and our results indicate that MAD-HOP offers better performance than UCT on most instances. In addition, UCT exhibited a very high replan time of around 10 minutes. The inferior results of UCT may be attributed to the number of rollouts used in the experiments but increasing the number of rollouts did not conclusively demonstrate whether UCT could outperform MAD-HOP. This may be due to the benefits obtained from using multiple agents, as MAD-HOP appears to benefit to a greater extent than UCT when information is shared among agents

    Personal Wayfinding Assistance

    We are traveling many different routes every day. In familiar environments it is easy for us to find our way. We know our way from bedroom to kitchen, from home to work, from parking place to office, and back home at the end of the working day. We have learned these routes in the past and are now able to find our destination without having to think about it. As soon as we want to find a place beyond the demarcations of our mental map, we need help. In some cases we ask our friends to explain the way to us, in other cases we use a map to find out about the place. Mobile phones are increasingly equipped with wayfinding assistance. These devices are usually at hand because they are handy and small, which enables us to get wayfinding assistance wherever we need it. While the small size of mobile phones makes them handy, it is a disadvantage for displaying maps. Geographic information requires space to be visualized in order to be understandable. Typically, not all information displayed in maps is necessary. For example, footpaths in parks are usually not relevant route options for car drivers. By not displaying irrelevant information, it is possible to compress the map without losing important information. To reduce information purposefully, we need information about the user, the task at hand, and the environment it is embedded in. In this cumulative dissertation, I describe an approach that utilizes the prior knowledge of the user to adapt maps to the limited display options of mobile devices with small displays. I focus on central questions that occur during wayfinding and relate them to the knowledge of the user. This enables the generation of personal and context-specific wayfinding assistance in the form of maps which are optimized for small displays. To achieve personalized assistance, I present algorithmic methods to derive spatial user profiles from trajectory data. 
The individual profiles contain information about the places users regularly visit, as well as the traveled routes between them. By means of these profiles it is possible to generate personalized maps for partially familiar environments. Only the unfamiliar parts of the environment are presented in detail; the familiar parts are highly simplified. This bears great potential to minimize the maps, while at the same time preserving the understandability by including personally meaningful places as references. To ensure the understandability of personalized maps, we have to make sure that the names of the places are adapted to users. In this thesis, we study the naming of places and analyze the potential to automatically select and generate place names. However, personalized maps only work for environments the users are partially familiar with. If users need assistance for unfamiliar environments, they require complete information. In this thesis, I further present approaches to support users in typical situations which can occur during wayfinding. I present solutions to communicate context information and survey knowledge along the route, as well as methods to support self-localization in case orientation is lost

    Proceedings of the 6th Joint ISO-ACL SIGSEM Workshop on Interoperable Semantic Annotation (ISA-6)
