18,661 research outputs found

    Noise-tolerant approximate blocking for dynamic real-time entity resolution

    Get PDF
    Entity resolution is the process of identifying records in one or multiple data sources that represent the same real-world entity. This process needs to deal with noisy data that contain for example wrong pronunciation or spelling errors. Many real world applications require rapid responses for entity queries on dynamic datasets. This brings challenges to existing approaches which are mainly aimed at the batch matching of records in static data. Locality sensitive hashing (LSH) is an approximate blocking approach that hashes objects within a certain distance into the same block with high probability. How to make approximate blocking approaches scalable to large datasets and effective for entity resolution in real-time remains an open question. Targeting this problem, we propose a noise-tolerant approximate blocking approach to index records based on their distance ranges using LSH and sorting trees within large sized hash blocks. Experiments conducted on both synthetic and real-world datasets show the effectiveness of the proposed approach

    Towards a Scalable Dynamic Spatial Database System

    Get PDF
    With the rise of GPS-enabled smartphones and other similar mobile devices, massive amounts of location data are available. However, no scalable solutions for soft real-time spatial queries on large sets of moving objects have yet emerged. In this paper we explore and measure the limits of actual algorithms and implementations regarding different application scenarios. And finally we propose a novel distributed architecture to solve the scalability issues.Comment: (2012

    Dynamic deployment of context-aware access control policies for constrained security devices

    Get PDF
    Securing the access to a server, guaranteeing a certain level of protection over an encrypted communication channel, executing particular counter measures when attacks are detected are examples of security requirements. Such requirements are identi ed based on organizational purposes and expectations in terms of resource access and availability and also on system vulnerabilities and threats. All these requirements belong to the so-called security policy. Deploying the policy means enforcing, i.e., con guring, those security components and mechanisms so that the system behavior be nally the one speci ed by the policy. The deployment issue becomes more di cult as the growing organizational requirements and expectations generally leave behind the integration of new security functionalities in the information system: the information system will not always embed the necessary security functionalities for the proper deployment of contextual security requirements. To overcome this issue, our solution is based on a central entity approach which takes in charge unmanaged contextual requirements and dynamically redeploys the policy when context changes are detected by this central entity. We also present an improvement over the OrBAC (Organization-Based Access Control) model. Up to now, a controller based on a contextual OrBAC policy is passive, in the sense that it assumes policy evaluation triggered by access requests. Therefore, it does not allow reasoning about policy state evolution when actions occur. The modi cations introduced by our work overcome this limitation and provide a proactive version of the model by integrating concepts from action speci cation languages

    Principles and Concepts of Agent-Based Modelling for Developing Geospatial Simulations

    Get PDF
    The aim of this paper is to outline fundamental concepts and principles of the Agent-Based Modelling (ABM) paradigm, with particular reference to the development of geospatial simulations. The paper begins with a brief definition of modelling, followed by a classification of model types, and a comment regarding a shift (in certain circumstances) towards modelling systems at the individual-level. In particular, automata approaches (e.g. Cellular Automata, CA, and ABM) have been particularly popular, with ABM moving to the fore. A definition of agents and agent-based models is given; identifying their advantages and disadvantages, especially in relation to geospatial modelling. The potential use of agent-based models is discussed, and how-to instructions for developing an agent-based model are provided. Types of simulation / modelling systems available for ABM are defined, supplemented with criteria to consider before choosing a particular system for a modelling endeavour. Information pertaining to a selection of simulation / modelling systems (Swarm, MASON, Repast, StarLogo, NetLogo, OBEUS, AgentSheets and AnyLogic) is provided, categorised by their licensing policy (open source, shareware / freeware and proprietary systems). The evaluation (i.e. verification, calibration, validation and analysis) of agent-based models and their output is examined, and noteworthy applications are discussed.Geographical Information Systems (GIS) are a particularly useful medium for representing model input and output of a geospatial nature. However, GIS are not well suited to dynamic modelling (e.g. ABM). In particular, problems of representing time and change within GIS are highlighted. Consequently, this paper explores the opportunity of linking (through coupling or integration / embedding) a GIS with a simulation / modelling system purposely built, and therefore better suited to supporting the requirements of ABM. This paper concludes with a synthesis of the discussion that has proceeded. The aim of this paper is to outline fundamental concepts and principles of the Agent-Based Modelling (ABM) paradigm, with particular reference to the development of geospatial simulations. The paper begins with a brief definition of modelling, followed by a classification of model types, and a comment regarding a shift (in certain circumstances) towards modelling systems at the individual-level. In particular, automata approaches (e.g. Cellular Automata, CA, and ABM) have been particularly popular, with ABM moving to the fore. A definition of agents and agent-based models is given; identifying their advantages and disadvantages, especially in relation to geospatial modelling. The potential use of agent-based models is discussed, and how-to instructions for developing an agent-based model are provided. Types of simulation / modelling systems available for ABM are defined, supplemented with criteria to consider before choosing a particular system for a modelling endeavour. Information pertaining to a selection of simulation / modelling systems (Swarm, MASON, Repast, StarLogo, NetLogo, OBEUS, AgentSheets and AnyLogic) is provided, categorised by their licensing policy (open source, shareware / freeware and proprietary systems). The evaluation (i.e. verification, calibration, validation and analysis) of agent-based models and their output is examined, and noteworthy applications are discussed.Geographical Information Systems (GIS) are a particularly useful medium for representing model input and output of a geospatial nature. However, GIS are not well suited to dynamic modelling (e.g. ABM). In particular, problems of representing time and change within GIS are highlighted. Consequently, this paper explores the opportunity of linking (through coupling or integration / embedding) a GIS with a simulation / modelling system purposely built, and therefore better suited to supporting the requirements of ABM. This paper concludes with a synthesis of the discussion that has proceeded

    Rational physical agent reasoning beyond logic

    No full text
    The paper addresses the problem of defining a theoretical physical agent framework that satisfies practical requirements of programmability by non-programmer engineers and at the same time permitting fast realtime operation of agents on digital computer networks. The objective of the new framework is to enable the satisfaction of performance requirements on autonomous vehicles and robots in space exploration, deep underwater exploration, defense reconnaissance, automated manufacturing and household automation

    Dynamic sorted neighborhood indexing for real-time entity resolution

    Get PDF
    Real-time Entity Resolution (ER) is the process of matching query records in subsecond time with records in a database that represent the same real-world entity. Indexing techniques are generally used to efficiently extract a set of candidate records from the database that are similar to a query record, and that are to be compared with the query record in more detail. The sorted neighborhood indexing method, which sorts a database and compares records within a sliding window, has been successfully used for ER of large static databases. However, because it is based on static sorted arrays and is designed for batch ER that resolves all records in a database rather than resolving those relating to a single query record, this technique is not suitable for real-time ER on dynamic databases that are constantly updated. We propose a tree-based technique that facilitates dynamic indexing based on the sorted neighborhood method, which can be used for real-time ER, and investigate both static and adaptive window approaches. We propose an approach to reduce query matching times by precalculating the similarities between attribute values stored in neighboring tree nodes. We also propose a multitree solution where different sorting keys are used to reduce the effects of errors and variations in attribute values on matching quality by building several distinct index trees. We experimentally evaluate our proposed techniques on large real datasets, as well as on synthetic data with different data quality characteristics. Our results show that as the index grows, no appreciable increase occurs in both record insertion and query times, and that using multiple trees gives noticeable improvements on matching quality with only a small increase in query time. Compared to earlier indexing techniques for real-time ER, our approach achieves significantly reduced indexing and query matching times while maintaining high matching accuracy
    • …
    corecore