
    Region-based Dynamic Weighting Probabilistic Geocoding

    Geocoding has been widely used in daily life and scientific research for at least four decades. In scientific research especially, geocoding serves as a generator of spatial data for further analysis, which makes it extremely important that geocoding results be as accurate as possible. Existing global-weighting approaches to geocoding assume spatial stationarity of addressing systems and of the distribution of address data characteristics across space, resulting in heuristics and approaches that apply global parameters to produce geocodes for addresses in all regions. However, different regions in the United States (US) have different values and densities of address attributes, which increases the error of standard algorithms that assume global parameters and calculation weights. Region-based dynamic weighting can be used in probabilistic geocoding approaches to stabilize and reduce incorrect match-probability assignments caused by place-specific naming conventions that vary from region to region across the US. This study tested the spatial accuracy and time efficiency of a region-based dynamic weighting probabilistic geocoding system against a set of manually corrected geocoding results within Los Angeles City. The results show that the region-based dynamic weighting probabilistic method improves the spatial accuracy of geocoding results and has a moderate influence on the time efficiency of the geocoding system.
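    The core idea can be sketched as follows: instead of applying one global agreement weight per address attribute, the matcher looks up a weight table for the candidate's region. This is an illustrative sketch only; the attribute names, regions, and weight values are hypothetical and are not taken from the study.

```python
# Illustrative sketch of region-based dynamic weighting in a
# probabilistic (Fellegi-Sunter-style) address matcher.
# All region names and weight values below are hypothetical.

# Per-region attribute weights: e.g., in a dense urban region the
# street name may carry more discriminating power than the ZIP code.
REGION_WEIGHTS = {
    "los_angeles": {"number": 2.0, "street": 4.5, "zip": 1.5},
    "default":     {"number": 2.0, "street": 3.0, "zip": 2.5},
}

def match_score(query, candidate, region):
    """Sum agreement weights for attributes that match, using the
    weight table of the candidate's region rather than global weights."""
    weights = REGION_WEIGHTS.get(region, REGION_WEIGHTS["default"])
    score = 0.0
    for attr, w in weights.items():
        if query.get(attr) == candidate.get(attr):
            score += w
    return score

q = {"number": "123", "street": "MAIN ST", "zip": "90012"}
c = {"number": "123", "street": "MAIN ST", "zip": "90013"}
print(match_score(q, c, "los_angeles"))  # 6.5: number and street agree under LA weights
```

    The same candidate pair scores differently under the default table (2.0 + 3.0 = 5.0), which is exactly the effect the paper attributes to regional variation in naming conventions.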

    Automating content generation for large-scale virtual learning environments using semantic web services

    The integration of semantic web services with three-dimensional virtual worlds offers many potential avenues for the creation of dynamic, content-rich environments which can be used to entertain, educate, and inform. One such avenue is the fusion of the large volumes of data from Wiki-based sources with virtual representations of historic locations, using semantics to filter and present data to users in effective and personalisable ways. This paper explores the potential for such integration, addressing challenges ranging from accurately transposing virtual world locales to semantically-linked real world data, to integrating diverse ranges of semantic information sources in a user-centric and seamless fashion. A proof-of-concept demonstration using the Rome Reborn model, a detailed 3D representation of Ancient Rome within the Aurelian Walls, shows several advantages that can be gained through the use of existing Wiki and semantic web services to rapidly and automatically annotate content, as well as demonstrating the increasing need for Wiki content to be represented in a semantically-rich form. Such an approach has applications in a range of different contexts, including education, training, and cultural heritage.

    Search improvement within the geospatial web in the context of spatial data infrastructures

    The work developed in this doctoral thesis demonstrates that search in the context of Spatial Data Infrastructures can be improved by applying techniques and best practices from other scientific communities, especially the Web and Semantic Web communities (for example, Linked Data). The use of semantic descriptions and of content-based approaches published by the geospatial community can aid the search for information about geographic phenomena, and the search for geospatial resources in general. The work begins with an analysis of an approach to improving the search for geospatial entities from the perspective of traditional geocoding. The composite geocoding architecture proposed in this work ensures improved geocoding results through the use of multiple geographic information providers. The use of structural design patterns and ontologies in this approach enables an advanced architecture in terms of extensibility, flexibility, and adaptability. In addition, an architecture based on geocoding service selection enables a methodology for georeferencing diverse types of geographic information (for example, addresses or points of interest). Next, two representative applications are presented that require additional semantic characterization of geospatial resources. The approach proposed in this work uses content-based heuristics for sampling a set of geospatial resources. The first part is devoted to the idea of abstracting a geographic phenomenon from its spatial definition.
The research shows that Semantic Web best practices can be reused within a Spatial Data Infrastructure to describe the geospatial services standardized by the Open Geospatial Consortium by means of geo-identifiers (that is, by means of the entities of a geographic ontology). The second part of this chapter details the architecture and components of a geoprocessing service for the automatic identification of orthoimagery offered through a standard map publication service (that is, services that follow the OGC Web Map Service specification). As a result of this work, a method has been proposed for identifying which of the maps offered by a Web Map Service are orthoimages. The work then turns to the analysis of issues related to creating metadata for Web resources in the context of the geographic domain. This work proposes an architecture for the automatic generation of geographic knowledge from Web resources. It was necessary to develop a method for estimating the geographic coverage of Web pages. The proposed heuristics are based on the content published by geographic information providers. The prototype developed is able to generate metadata. The generated model contains the minimum recommended set of elements required by a catalogue that follows the OGC Catalogue Service for the Web specification, the standard recommended by different Spatial Data Infrastructures (for example, the Infrastructure for Spatial Information in the European Community (INSPIRE)). In addition, this study determines some characteristics of the current Geospatial Web. First, it offers some characteristics of the market of providers of geographic-information Web resources.
This study reveals some practices of the geospatial community in producing metadata for Web pages, in particular the lack of geographic metadata. All of the above forms the basis for studying how to support non-expert users in searching for Geospatial Web resources. The search engine dedicated to the Geospatial Web proposed in this work can build on an existing search engine, and it supports exploratory search of geospatial resources discovered on the Web. A precision and recall experiment demonstrated that the prototype developed in this work is at least as good as the remote search engine. A usability study indicates that even non-experts can complete a search task with satisfactory results.

    Communitywide Database Designs for Tracking Innovation Impact: COMETS, STARS and Nanobank

    Data availability is arguably the greatest impediment to advancing the science of science and innovation policy and practice (SciSIPP). This paper describes the contents, methodology and use of the public online COMETS (Connecting Outcome Measures in Entrepreneurship Technology and Science) database spanning all sciences, technologies, and high-tech industries; its parent COMETSandSTARS database which adds more data at organization and individual scientist-inventor-entrepreneur level restricted by vendor licenses to onsite use at NBER and/or UCLA; and their prototype Nanobank covering only nano-scale sciences and technologies. Some or all of these databases include or will include: US patents (granted and applications); NIH, NSF, SBIR, STTR Grants; Thomson Reuters Web of Knowledge; ISI Highly Cited; US doctoral dissertations; IPEDS/HEGIS universities; all firms and other organizations which ever publish in ISI listed journals beginning in 1981, are assigned US patents (from 1975), or are listed on a covered grant; additional nanotechnology firms based on web search. Ticker/CUSIP codes enable linking public firms to the major databases covering them. A major matching/disambiguation effort assigns unique identifiers for an organization or individual so that their appearances are linked within and across the constituent legacy databases. Extensive geographic coding enables analysis at country, region, state, county, or city levels. The databases provide very flexible sources of data for serious research on many issues in the study of organizations in innovation systems in the development and spread of knowledge, and the economics of science. Enabling the study of these topics, among others, COMETS contributes substantially to the science of science and technology.
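    The matching/disambiguation effort described above, which collapses variant spellings of an organization to one identifier, can be illustrated with a toy normalization routine. The suffix list and regular expressions below are hypothetical simplifications; the actual COMETS pipeline is far more elaborate.

```python
# Toy organization-name disambiguation: map variant spellings to a
# shared canonical key. The suffix list here is a hypothetical sample.
import re

def normalize_org(name):
    """Crude canonical key: uppercase, strip punctuation, and drop
    common corporate suffixes so variant spellings collapse together."""
    key = re.sub(r"[^\w\s]", "", name.upper())          # remove punctuation
    key = re.sub(r"\b(INC|CORP|CO|LLC|LTD)\b", "", key)  # drop suffixes
    return " ".join(key.split())                         # squeeze whitespace

records = ["Acme Corp.", "ACME, Inc.", "Beta Labs LLC"]
ids = {}
for r in records:
    ids.setdefault(normalize_org(r), []).append(r)
print(ids)  # {'ACME': ['Acme Corp.', 'ACME, Inc.'], 'BETA LABS': ['Beta Labs LLC']}
```

    Real-world linking across legacy databases additionally needs fuzzy matching, manual review queues, and stable identifiers over time, but the grouping step above is the conceptual core.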

    Transportation Construction Work-Zone Safety Impact on Time-Related Incentive Contracting Projects

    Work-zone safety on highway projects continues to be a national concern, and project safety performance is one of the indicators of project success. Many contractors and State Transportation Agencies believe that expedited construction time under incentive contracting contributes to reducing the safety risk of road users traveling through work zones. However, this belief has never been measured or supported by statistical evidence. Therefore, this research investigates the statistical relationship between time-related incentive road construction projects and the frequency of vehicle crashes in California to understand the impact of time-related incentive provisions on project safety performance. The research team collected incentive and non-incentive project data from the California Department of Transportation. Additionally, vehicle crash data were collected from the California Statewide Integrated Traffic Records System. Using Geographic Information System (GIS) software, the locations of construction projects and of crashes at those locations were then pinpointed on GIS centerline layers. The research team performed statistical analyses to test the relationship between the frequency and characteristics of crashes at incentive project sites and those at non-incentive project sites before, during, and after construction. Finally, the analysis results for both time-related incentive projects and non-incentive projects were summarized to provide project planners and managers with a better understanding of the impact of time-related incentive contracting on project safety performance.

    A workflow for geocoding South African addresses

    Many industries have long used Geographical Information Systems (GIS) for spatial analysis. In many parts of the world, however, GIS has gained less popularity because of inaccurate geocoding methods and a lack of data standardization. Commercial services can also be expensive, so smaller businesses have been reluctant to make a financial commitment to spatial analytics. This thesis discusses the challenges specific to South Africa as well as the challenges inherent in bad address data. The main goal of this research is to highlight the potential error rates of geocoded user-captured address data and to provide a workflow that can be followed to reduce the error rate without intensive manual data cleansing. We developed a six-step workflow and software package to prepare address data for spatial analysis and determine the potential error rate. We used three methods of geocoding: a gazetteer postal code file, a free web API, and an international commercial product. To protect the privacy of the clients and the businesses, addresses were aggregated with precision to a postcode or suburb centroid. Geocoding results were analysed before and after each step. Two businesses were analysed: a mid-large scale business with a large structured client address database, and a small private business with a 20-year-old unstructured client address database. The companies are from two completely different industries, the larger being in the financial industry and the smaller an independent magazine in publishing.
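    The privacy-preserving aggregation step described above, snapping each geocoded client address to its postcode centroid before analysis, can be sketched minimally. The postcodes and centroid coordinates below are invented for illustration.

```python
# Minimal sketch of aggregating client addresses to postcode centroids
# for privacy. Centroid coordinates here are invented, not real values.
POSTCODE_CENTROIDS = {
    "8001": (-33.9249, 18.4241),   # hypothetical centroid, Cape Town area
    "2000": (-26.2041, 28.0473),   # hypothetical centroid, Johannesburg area
}

def aggregate_to_centroid(records):
    """Replace each record's coordinates with its postcode centroid
    and return the number of records per postcode."""
    counts = {}
    for rec in records:
        pc = rec["postcode"]
        rec["lat"], rec["lon"] = POSTCODE_CENTROIDS[pc]
        counts[pc] = counts.get(pc, 0) + 1
    return counts

clients = [{"postcode": "8001"}, {"postcode": "8001"}, {"postcode": "2000"}]
print(aggregate_to_centroid(clients))  # {'8001': 2, '2000': 1}
```

    Because every client in a postcode shares one point, individual addresses cannot be recovered from the aggregated output, while per-postcode counts remain usable for spatial analysis.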

    Railroads and the Making of Modern America -- Tools for Spatio-Temporal Correlation, Analysis, and Visualization

    This project aims to integrate large-scale data sources from the Digging into Data repositories with other types of relevant data on the railroad system, already assembled by the project directors. Our project seeks to develop useful tools for spatio-temporal visualization of these data and the relationships among them. Our interdisciplinary team includes computer science, history, and geography researchers. Because the railroad "system" and its spatio-temporal configuration appeared differently from locality to locality and region to region, we need to adjust how we "locate" and "see" the system. By applying data mining and pattern recognition techniques, software systems can be created that dynamically redefine the way spatial data are represented. Utilizing analysis processes common in computer science, we propose to develop a software framework that allows these embedded concepts to be visualized and further studied.

    An effective and efficient approach for manually improving geocoded data

    Background: The process of geocoding produces output coordinates of varying degrees of quality. Previous studies have revealed that simply excluding records with low-quality geocodes from analysis can introduce significant bias, but depending on the number and severity of the inaccuracies, their inclusion may also lead to bias. Little quantitative research has been presented on the cost and/or effectiveness of correcting geocodes through manual interactive processes, so the most cost-effective methods for improving geocoded data are unclear. The present work investigates the time and effort required to correct geocodes contained in five health-related datasets that represent examples of data commonly used in Health GIS.

    Results: Geocode correction was attempted on five health-related datasets containing a total of 22,317 records. The complete processing of these data took 11.4 weeks (427 hours), averaging 69 seconds of processing time per record. Overall, the geocodes associated with 12,280 records (55%) were successfully improved, taking 95 seconds of processing time per corrected record on average across all five datasets. Geocode correction improved the overall match rate (the number of successful matches out of the total attempted) from 79.3% to 95%. The spatial shift between the locations of originally matched geocodes and their corrected counterparts averaged 9.9 km per corrected record. After geocode correction, the numbers of city-accuracy and USPS ZIP code-accuracy geocodes were reduced from 10,959 and 1,031 to 6,284 and 200, respectively, while the number of building-centroid-accuracy geocodes increased from 0 to 2,261.

    Conclusion: The results indicate that manual geocode correction using a web-based interactive approach is a feasible and cost-effective method for improving the quality of geocoded data. The level of effort required varies depending on the type of data geocoded. These results can be used to choose between data improvement options (e.g., manual intervention, pseudocoding/geo-imputation, field GPS readings).
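    The reported effort figures are internally consistent, as a quick back-of-the-envelope check shows; the counts below are taken directly from the abstract.

```python
# Sanity check of the effort figures reported in the abstract.
total_records = 22317   # records across the five datasets
total_hours = 427       # total processing time
corrected = 12280       # records whose geocodes were improved

sec_per_record = total_hours * 3600 / total_records
correction_rate = corrected / total_records
print(round(sec_per_record))         # 69 seconds per record, as reported
print(round(correction_rate * 100))  # 55 percent corrected, as reported
```
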

    Historical Places In Malacca (Enhancement Of Maps Manipulation Capability Through The Website) Using MySQL

    Interest in the field of GIS has been growing at a fast pace in recent years. Much research has been done, yielding tremendous and beneficial results for the field. However, most GIS applications have been developed using a vendor's own proprietary database, which can cause many problems. Geographic Information Systems, also known as GIS, are all about gathering data, building layers upon layers of that data, and then displaying them on a computer screen. The aim and objective of the study presented in this paper is to use MySQL to develop a GIS application, thus demonstrating MySQL's ability to support GIS-based data, in other words, spatial data. While the main objective of the study and of the system developed is to use MySQL to manage spatial data, the other integral objectives of this project are to provide better features and higher-quality spatial data for users, and to enhance the capability of manipulating the maps provided through the system. The methodology used in developing this project follows the RAD methodology, which involves the stages of Requirements Planning, User Design, Construction, and Implementation. These stages are discussed further in Chapter 3, Methodology and Project Work. In conclusion, the research shows that MySQL is able to support the development of a GIS-based application through the new release of its database, which includes spatial data management capability.