38 research outputs found

An automated approach for geocoding tabular itineraries

Authors: Martins Bruno, Murrieta-Flores Patricia, Santos Rui
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 30/11/2017

Historical itineraries, often accessible as lists or tables describing places visited in sequence, are abundant resources and important objects of study for humanities scholars. This article advances a novel method for automatically geocoding tabular itineraries, combining approximate string matching with a cost optimization algorithm based on dynamic programming. Experiments with a dataset of historical itineraries, using ground-truth geocoding annotations provided by domain experts and the GeoNames gazetteer, attest to the effectiveness of the proposed method. The results show that while approximate string matching alone can already achieve very low median errors, with many toponyms matching exactly against GeoNames entries, combining it with cost optimization can significantly improve the average distance to the correct disambiguations.
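
The abstract above does not give the cost function, so the following is only a minimal sketch of how approximate string matching and dynamic programming could be combined. It assumes a Viterbi-style recurrence in which each stop pays a name-mismatch cost plus the travel distance from the previous stop. The function names, the `name_weight` parameter, the use of `difflib.SequenceMatcher` as the string matcher, and the toy gazetteer are all illustrative assumptions, not the authors' method or GeoNames data.

```python
import math
from difflib import SequenceMatcher


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def name_cost(query, name):
    """0.0 for identical strings, approaching 1.0 for unrelated ones."""
    return 1.0 - SequenceMatcher(None, query.lower(), name.lower()).ratio()


def geocode_itinerary(toponyms, gazetteer, name_weight=1000.0):
    """Pick one gazetteer entry per toponym, minimizing total cost.

    gazetteer: list of (name, lat, lon) tuples (a toy stand-in).
    name_weight: hypothetical factor converting name mismatch into km.
    """
    if not toponyms:
        return []
    n = len(gazetteer)
    # best[j]: cheapest cost of any assignment ending at candidate j.
    best = [name_weight * name_cost(toponyms[0], g[0]) for g in gazetteer]
    back = []  # back[i][j]: best predecessor of candidate j at stop i + 1
    for query in toponyms[1:]:
        new_best, pointers = [], []
        for name, lat, lon in gazetteer:
            def step(k):
                return best[k] + haversine_km(gazetteer[k][1:], (lat, lon))
            prev = min(range(n), key=step)
            new_best.append(step(prev) + name_weight * name_cost(query, name))
            pointers.append(prev)
        best = new_best
        back.append(pointers)
    # Trace the cheapest assignment back from the last stop.
    j = min(range(n), key=best.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [gazetteer[j] for j in path]


# Toy gazetteer with an ambiguous name (approximate coordinates).
toy_gazetteer = [
    ("Preston", 53.76, -2.70),
    ("Lancaster", 54.05, -2.80),   # England
    ("Lancaster", 40.04, -76.31),  # Pennsylvania
    ("Kendal", 54.33, -2.75),
]
for place in geocode_itinerary(["Preston", "Lancastre", "Kendal"], toy_gazetteer):
    print(place)
```

In this sketch the misspelled "Lancastre" matches both Lancaster entries equally well, and the travel-distance term is what resolves the ambiguity. A realistic implementation would shortlist candidates per toponym rather than scanning the whole gazetteer at every stop.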

Lancaster E-Prints

Building Blocks for Mapping Services

Author: Luxen Dennis
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2013

Mapping services are ubiquitous on the Internet and enjoy a considerable user base. It is often overlooked, however, that providing a service on a global scale with virtually millions of users has been the playground of an oligopoly: only a select few service providers are able to do so. Unfortunately, the literature on these solutions is more than scarce. This thesis adds a number of building blocks to the literature that explain how to design and implement a number of features.

KITopen

Automatic reconstruction of itineraries from descriptive texts

Authors: Gaio Mauro, Moncla Ludovic, Nogueras Iso Francisco Javier
Publication venue: Universidad de Zaragoza, Prensas de la Universidad
Publication date: 01/01/2015

This thesis is part of the PERDIDO project, whose objectives are the extraction and reconstruction of itineraries from textual documents. The work was carried out in collaboration between the LIUPPA laboratory of the Université de Pau et des Pays de l'Adour (France), the Advanced Information Systems group (IAAA) of the Universidad de Zaragoza, and the COGIT laboratory of the IGN (France). The aim of the thesis is to design an automatic system that extracts movements from travel guides or itinerary descriptions and represents them on a map. We propose an approach for the automatic representation of itineraries described in natural language. Our proposal is divided into two main tasks. The first aims to identify and extract, from texts describing itineraries, information such as spatial entities and expressions of movement or perception. The second task is the reconstruction of the itinerary. Our proposal combines local information extracted through natural language processing with data extracted from external geographic sources (for example, gazetteers). The spatial annotation stage uses an approach that combines part-of-speech tagging with lexico-syntactic patterns (a cascade of transducers) in order to annotate spatial named entities and expressions of movement and perception. A first contribution to the first task is toponym disambiguation, a problem that is still poorly solved within named entity recognition (NER) and is essential in geographic information retrieval. We present an unsupervised georeferencing algorithm based on a clustering technique that can disambiguate the toponyms found in external geographic resources and, at the same time, locate unreferenced toponyms. We propose a generic graph model for the automatic reconstruction of itineraries, in which each node represents a place and each edge represents a path linking two places. The originality of our model is that, in addition to the usual elements (paths and waypoints), it can represent other elements involved in the description of an itinerary, such as visual landmarks. A minimum spanning tree is computed from a weighted graph to obtain an itinerary automatically in the form of a graph. Each edge of the initial graph is weighted using a multi-criteria analysis method that combines qualitative and quantitative criteria. The values of these criteria are determined from information extracted from the text and information from external geographic resources. For example, information produced by natural language processing, such as spatial relations describing an orientation (e.g., heading south), is combined with the geographic coordinates of places found in the resources to determine the value of the "spatial relation" criterion. In addition, based on the definition of the concept of itinerary and on the information used in language to describe an itinerary, we have modeled a spatial information annotation language suited to describing movements, relying on the recommendations of the TEI (Text Encoding Initiative) consortium. Finally, we implemented and evaluated the different stages of our approach on a multilingual corpus of descriptions of trails and hikes (French, Spanish, Italian).
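
The abstract above mentions computing a minimum spanning tree over a weighted graph of places. As a rough illustration of that single step, here is a minimal sketch using Kruskal's algorithm with a union-find structure. The abstract does not say which spanning-tree algorithm the thesis uses, so Kruskal is an assumption, and the place names and edge weights are invented stand-ins for the multi-criteria scores it describes.

```python
def minimum_spanning_tree(nodes, edges):
    """Kruskal's algorithm.

    nodes: iterable of hashable place identifiers.
    edges: iterable of (weight, place_a, place_b) tuples.
    Returns the chosen edges as (place_a, place_b, weight) tuples.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the component root, halving the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = []
    for weight, a, b in sorted(edges):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # edge joins two components: keep it
            parent[root_a] = root_b
            tree.append((a, b, weight))
    return tree


# Hypothetical places and weights (lower means a more plausible link).
places = ["refuge", "lake", "pass", "summit"]
links = [
    (1.0, "refuge", "lake"),
    (4.0, "refuge", "pass"),
    (1.5, "lake", "pass"),
    (2.0, "pass", "summit"),
    (5.0, "lake", "summit"),
]
for a, b, w in minimum_spanning_tree(places, links):
    print(a, "-", b, w)
```

On this toy input the tree keeps the three cheapest links that connect all four places, which form a simple chain from the refuge to the summit. In general a spanning tree can branch, which fits the abstract's point that the graph may also hold elements such as visual landmarks that are not on the route itself.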

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Universidad de Zaragoza

Entity Linking for the Biomedical Domain

Author: Nassimi Sahar
Publication venue: Hannover : Gottfried Wilhelm Leibniz Universität
Publication date: 10/02/2023

Entity linking is the process of detecting mentions of different concepts in text documents and linking them to canonical entities in a target lexicon. One of the biggest issues in entity linking is ambiguity in entity names, which many text mining tools have yet to address: different names can represent the same thing, and each mention may refer to a different thing. For instance, search engines that rely on heuristic string matches frequently return irrelevant results because they are unable to resolve ambiguity satisfactorily. Resolving named entity ambiguity is thus a crucial step in entity linking. To solve the problem of ambiguity, this work proposes a heuristic method for entity recognition and entity linking over a biomedical knowledge graph, based on the semantic similarity of entities in the knowledge graph. Named entity recognition (NER), relation extraction (RE), and relationship linking (RL) make up a conventional entity linking (EL) system pipeline. We use accuracy as the evaluation metric in this thesis. For each identified relation or entity, the solution therefore comprises identifying the correct one and matching it to its corresponding unique CUI in the knowledge base. Because KBs contain a substantial number of relations and entities, each with only one natural language label, the second phase depends directly on the accuracy of the first. The framework developed in this thesis enables the extraction of relations and entities from text and their mapping to the associated CUI in the UMLS knowledge base. This approach derives a new representation of the knowledge base that lends itself to easy comparison. Our idea for selecting the best candidates is to build a graph of relations and determine the shortest path distance using a ranking approach. We test our approach on two well-known benchmarks in the biomedical field and show that our method exceeds the search engine's top result, improving accuracy by around 4%. In general, when it comes to fine-tuning, we notice that entity linking has subjective characteristics, and modifications may be required depending on the task at hand. The performance of the framework is evaluated based on a Python implementation.

Institutionelles Repositorium der Leibniz Universität Hannover

Efficient Routing for Disaster Scenarios in Uncertain Networks: A Computational Study of Adaptive Algorithms for the Stochastic Canadian Traveler Problem with Multiple Agents and Destinations

Author: Chanchad Neel
Publication venue: ScholarWorks@UARK
Publication date: 01/05/2023

The primary objective of this research is to develop adaptive online algorithms for solving the Canadian Traveler Problem (CTP), a well-studied problem in the literature with important applications in disaster scenarios. To this end, we propose two novel approaches, namely Maximum Likely Node (MLN) and Maximum Likely Path (MLP), to address the single-agent single-destination variant of the CTP. Our computational experiments demonstrate that the MLN and MLP algorithms together achieve new best-known solutions for 10,715 instances. In the context of disaster scenarios, the CTP can be extended to the multiple-agent multiple-destination variant, which we refer to as MAD-CTP. We propose two approaches, namely MAD-OMT and MAD-HOP, to solve this variant. We evaluate the performance of these algorithms on Delaunay and Euclidean graphs of varying sizes, ranging from 20 nodes with 49 edges to 500 nodes with 1500 edges. Our results demonstrate that MAD-HOP outperforms MAD-OMT by a considerable margin, achieving a replan time of under 9 seconds for all instances. Furthermore, we extend the existing state-of-the-art algorithm, UCT, which was previously shown by Eyerich et al. (2010) to be effective for solving the single-source single-destination variant of the CTP, to address the MAD-CTP problem. We compare the performance of UCT and MAD-HOP on a range of instances, and our results indicate that MAD-HOP offers better performance than UCT on most instances. In addition, UCT exhibited a very high replan time of around 10 minutes. The inferior results of UCT may be attributed to the number of rollouts used in the experiments, but increasing the number of rollouts did not conclusively demonstrate whether UCT could outperform MAD-HOP. This may be due to the benefits obtained from using multiple agents, as MAD-HOP appears to benefit to a greater extent than UCT when information is shared among agents.

ScholarWorks@UARK

UARK (University of Arkansas)

Personal Wayfinding Assistance

Author: Schmid Falko
Publication date: 01/01/2008

We travel many different routes every day. In familiar environments it is easy for us to find our way. We know our way from bedroom to kitchen, from home to work, from parking place to office, and back home at the end of the working day. We have learned these routes in the past and are now able to find our destination without having to think about it. As soon as we want to find a place beyond the boundaries of our mental map, we need help. In some cases we ask our friends to explain the way to us; in other cases we use a map to find out about the place. Mobile phones are increasingly equipped with wayfinding assistance. These devices are usually at hand because they are handy and small, which enables us to get wayfinding assistance wherever we need it. While the small size of mobile phones makes them handy, it is a disadvantage for displaying maps. Geographic information requires space to be visualized in order to be understandable. Typically, not all information displayed in maps is necessary. For example, footpaths in parks are usually not relevant route options for car drivers. By not displaying irrelevant information, it is possible to compress the map without losing important information. To reduce information purposefully, we need information about the user, the task at hand, and the environment it is embedded in. In this cumulative dissertation, I describe an approach that utilizes the prior knowledge of the user to adapt maps to the limited display options of mobile devices with small displays. I focus on central questions that occur during wayfinding and relate them to the knowledge of the user. This enables the generation of personal and context-specific wayfinding assistance in the form of maps that are optimized for small displays. To achieve personalized assistance, I present algorithmic methods to derive spatial user profiles from trajectory data. The individual profiles contain information about the places users regularly visit, as well as the routes traveled between them. By means of these profiles it is possible to generate personalized maps for partially familiar environments. Only the unfamiliar parts of the environment are presented in detail; the familiar parts are highly simplified. This has great potential to reduce the size of the maps while preserving their understandability by including personally meaningful places as references. To ensure the understandability of personalized maps, we have to make sure that the names of the places are adapted to users. In this thesis, we study the naming of places and analyze the potential to automatically select and generate place names. However, personalized maps only work for environments the users are partially familiar with. If users need assistance in unfamiliar environments, they require complete information. In this thesis, I further present approaches to support users in typical situations that can occur during wayfinding. I present solutions to communicate context information and survey knowledge along the route, as well as methods to support self-localization in case orientation is lost.

Publikationer från KTH

Digitala Vetenskapliga Arkivet - Academic Archive On-line

E-LIB Dokumentserver - Staats und Universitätsbibliothek Bremen

Proceedings of the 6th Joint ISO-ACL SIGSEM Workshop on Interoperable Semantic Annotation (ISA-6)

Publication venue: 'Blavatnik School of Government, University of Oxford'
Publication date: 01/01/2011

Tilburg University Repository

Retrieving information from heterogeneous freight data sources to answer natural language queries

Author: Seedah Dan Paapanyin Kofi
Publication date: 09/02/2015

The ability to retrieve accurate information from databases without an extensive knowledge of the contents and organization of each database is extremely beneficial to the dissemination and utilization of freight data. The challenges, however, are: 1) correctly identifying only the relevant information and keywords from questions when dealing with multiple sentence structures, and 2) automatically retrieving, preprocessing, and understanding multiple data sources to determine the best answer to a user's query. Current named entity recognition systems can identify entities but require an annotated corpus for training, which does not currently exist in the field of transportation planning. A hybrid approach that combines multiple models to classify specific named entities was therefore proposed as an alternative. The retrieval and classification of freight-related keywords facilitated the process of finding which databases are capable of answering a question. Values in data dictionaries can be queried by mapping keywords to data element fields in various freight databases using ontologies. A number of challenges still arise as a result of different entities sharing the same names, the same entity having multiple names, and differences in classification systems. Dealing with ambiguities is required to accurately determine which database provides the best answer from the list of applicable sources. This dissertation 1) develops an approach to identify and classify keywords from freight-related natural language queries, 2) develops a standardized knowledge representation of freight data sources using an ontology that both computer systems and domain experts can utilize to identify relevant freight data sources, and 3) provides recommendations for addressing ambiguities in freight-related named entities. Finally, the use of knowledge-based expert systems to intelligently sift through data sources to determine which ones provide the best answer to a user's question is proposed.

Civil, Architectural, and Environmental Engineering

Texas ScholarWorks