2,338 research outputs found

    Improving an Open Source Geocoding Server

    Get PDF
    A common problem in geocoding is that the postal addresses as requested by the user differ from the addresses as described in the database. The online, open source geocoder called Nominatim is one of the most used geocoders nowadays. However, this geocoder lacks the interactivity that most of the online geocoders already offer. The Nominatim geocoder provides no feedback to the user while typing addresses. Also, the geocoder cannot deal with any misspelling errors introduced by the user in the requested address. This thesis is about extending the functionality of the Nominatim geocoder to provide fuzzy search and autocomplete features. In this work I propose a new index and search strategy for the OpenStreetMap reference dataset. Also, I extend the search algorithm to geocode new address types such as street intersections. Both the original Nominatim geocoder and the proposed solution are compared using metrics such as the precision of the results, match rate and keystrokes saved by the autocomplete feature. The test addresses used in this work are a subset selected among the Swedish addresses available in the OpenStreetMap data set. The results show that the proposed geocoder performs better when compared to the original Nominatim geocoder. In the proposed geocoder, the users get address suggestions as they type, adding interactivity to the original geocoder. Also, the proposed geocoder is able to find the right address in the presence of errors in the user query with a match rate of 98%.The demand of geospatial information is increasing during the last years. There are more and more mobile applications and services that require from the users to enter some information about where they are, or the address of the place they want to find for example. The systems that convert postal addresses or place descriptions into coordinates are called geocoders. How good or bad a geocoder is not only depends on the information the geocoder contains, but also on how easy is for the users to find the desired addresses. There are many well-known web sites that we use in our everyday life to find the location of an address. For example sites like Google Maps, Bing Maps or Yahoo Maps are accessed by millions of users every day to use such services. Among the main features of the mentioned geocoders are the ability to predict the address the user is writing in the search box, and sometimes even to correct any misspellings introduced by the user. To make it more complicated, the predictions and error corrections these systems perform are done in real time. The owners of these address search engines usually impose some restrictions on the number of addresses a user is allowed to search monthly, above which the user needs to pay a fee in order to keep using the system. This limit is usually high enough for the end user, but it might not be enough for the software developers that want to use geospatial data in their products. There is a free alternative to the address search engines mentioned above called Nominatim. Nominatim is an open source project whose purpose is to search addresses among the OpenStreetMap dataset. OpenStreetMap is a collaborative project that tries to map places in the real world into coordinates. The main drawback of Nominatim is that the usability is not as good as the competitors. Nominatim is unable to find addresses that are not correctly spelled, neither predicts the user needs. In order for this address search engine to be among the most used the prediction and error correction features need to be added. In this thesis work I extend the search algorithms of Nominatim to add the functionality mentioned above. The address search engine proposed in this thesis offers a free and open source alternative to users and systems that require access to geospatial data without restrictions

    Querying Schemas With Access Restrictions

    Full text link
    We study verification of systems whose transitions consist of accesses to a Web-based data-source. An access is a lookup on a relation within a relational database, fixing values for a set of positions in the relation. For example, a transition can represent access to a Web form, where the user is restricted to filling in values for a particular set of fields. We look at verifying properties of a schema describing the possible accesses of such a system. We present a language where one can describe the properties of an access path, and also specify additional restrictions on accesses that are enforced by the schema. Our main property language, AccLTL, is based on a first-order extension of linear-time temporal logic, interpreting access paths as sequences of relational structures. We also present a lower-level automaton model, Aautomata, which AccLTL specifications can compile into. We show that AccLTL and A-automata can express static analysis problems related to "querying with limited access patterns" that have been studied in the database literature in the past, such as whether an access is relevant to answering a query, and whether two queries are equivalent in the accessible data they can return. We prove decidability and complexity results for several restrictions and variants of AccLTL, and explain which properties of paths can be expressed in each restriction.Comment: VLDB201

    \u3cem\u3eGRASP News\u3c/em\u3e, Volume 6, Number 1

    Get PDF
    A report of the General Robotics and Active Sensory Perception (GRASP) Laboratory, edited by Gregory Long and Alok Gupta

    Duality between invariant spaces for max-plus linear discrete event systems

    Full text link
    We extend the notions of conditioned and controlled invariant spaces to linear dynamical systems over the max-plus or tropical semiring. We establish a duality theorem relating both notions, which we use to construct dynamic observers. These are useful in situations in which some of the system coefficients may vary within certain intervals. The results are illustrated by an application to a manufacturing system.Comment: 22 pages, 3 figures (6 eps files

    Sparse Proteomics Analysis - A compressed sensing-based approach for feature selection and classification of high-dimensional proteomics mass spectrometry data

    Get PDF
    Background: High-throughput proteomics techniques, such as mass spectrometry (MS)-based approaches, produce very high-dimensional data-sets. In a clinical setting one is often interested in how mass spectra differ between patients of different classes, for example spectra from healthy patients vs. spectra from patients having a particular disease. Machine learning algorithms are needed to (a) identify these discriminating features and (b) classify unknown spectra based on this feature set. Since the acquired data is usually noisy, the algorithms should be robust against noise and outliers, while the identified feature set should be as small as possible. Results: We present a new algorithm, Sparse Proteomics Analysis (SPA), based on the theory of compressed sensing that allows us to identify a minimal discriminating set of features from mass spectrometry data-sets. We show (1) how our method performs on artificial and real-world data-sets, (2) that its performance is competitive with standard (and widely used) algorithms for analyzing proteomics data, and (3) that it is robust against random and systematic noise. We further demonstrate the applicability of our algorithm to two previously published clinical data-sets

    Software Engineering with Incomplete Information

    Get PDF
    Information may be the common currency of the universe, the stuff of creation. As the physicist John Wheeler claimed, we get ``it from bit''. Measuring information, however, is a hard problem. Knowing the meaning of information is a hard problem. Directing the movement of information is a hard problem. This hardness comes when our information about information is incomplete. Yet we need to offer decision making guidance, to the computer or developer, when facing this incompleteness. This work addresses this insufficiency within the universe of software engineering. This thesis addresses the first problem by demonstrating that obtaining the relative magnitude of information flow is computationally less expensive than an exact measurement. We propose ranked information flow, or RIF, where different flows are ordered according to their FlowForward, a new measure designed for ease of ordering. To demonstrate the utility of FlowForward, we introduce information contour maps: heatmapped callgraphs of information flow within software. These maps serve multiple engineering uses, such as security and refactoring. By mixing a type system with RIF, we address the problem of meaning. Information security is a common concern in software engineering. We present OaST, the world's first gradual security type system that replaces dynamic monitoring with information theoretic risk assessment. OaST now contextualises FlowForward within a formally verified framework: secure program components communicate over insecure channels ranked by how much information flows through them. This context helps the developer interpret the flows and enables security policy discovery, adaptation and refactoring. Finally, we introduce safestrings, a type-based system for controlling how the information embedded within a string moves through a program. This takes a structural approach, whereby a string subtype is a more precise, information limited, subset of string, ie a string that contains an email address, rather than anything else

    The 7th Conference of PhD Students in Computer Science

    Get PDF
    • …
    corecore