    SLIM : Scalable Linkage of Mobility Data

    We present a scalable solution to link entities across mobility datasets using their spatio-temporal information. This is a fundamental problem in many applications such as linking user identities for security, understanding privacy limitations of location based services, or producing a unified dataset from multiple sources for urban planning. Such integrated datasets are also essential for service providers to optimise their services and improve business intelligence. In this paper, we first propose a mobility based representation and similarity computation for entities. An efficient matching process is then developed to identify the final linked pairs, with an automated mechanism to decide when to stop the linkage. We scale the process with a locality-sensitive hashing (LSH) based approach that significantly reduces candidate pairs for matching. To realize the effectiveness and efficiency of our techniques in practice, we introduce an algorithm called SLIM. In the experimental evaluation, SLIM outperforms the two existing state-of-the-art approaches in terms of precision and recall. Moreover, the LSH-based approach brings two to four orders of magnitude speedup

    The DIGMAP geo-temporal web gazetteer service

    This paper presents the DIGMAP geo-temporal Web gazetteer service, a system providing access to names of places, historical periods, and associated geo-temporal information. Within the DIGMAP project, this gazetteer serves as the unified repository of geographic and temporal information, assisting in the recognition and disambiguation of geo-temporal expressions over text, as well as in resource searching and indexing. We describe the data integration methodology, the handling of temporal information and some of the applications that use the gazetteer. Initial evaluation results show that the proposed system can adequately support several tasks related to geo-temporal information extraction and retrieval

    The potential for linking cohort participants to official criminal records: a pilot study using the Avon Longitudinal Study of Parents and Children (ALSPAC)

    Introduction: Linking longitudinal cohort resources with police-recorded records of criminal activity has the potential to inform public health style approaches to policing, and may reduce potential sources of bias from self-reported criminal data collected by cohort studies. A pilot linkage of police records to the Avon Longitudinal Study of Parents and Children (ALSPAC) allows us to consider the acceptability of this linkage, its utility as a data resource, differences in self-reported crime according to consent status for data linkage, and the appropriate governance mechanism to support such a linkage. Methods: We carried out a pilot study linking data from the ALSPAC birth cohort to Ministry of Justice (MoJ) records on criminal cautions and convictions. This pilot was conducted on a fully anonymous basis, meaning we cannot link the identified records to any participant or the wider information within the dataset. Using ALSPAC data, we used summary statistics to investigate differences in socio-economic background and self-reported criminal activity by consent status for crime linkage. We used MoJ records to identify the geographic and temporal concentration of criminality in the ALSPAC cohort. Results: We found that the linkage appears acceptable to participants (4% of the sample opted out), levels of criminal caution and conviction are high enough to support research, and that the majority of crimes occurred in Avon & Somerset (the policing area local to ALSPAC). Those who did not respond to consent requests had higher levels of self-reported criminal behaviour compared to participants who provided explicit consent. Conclusions: These findings suggest that data linkage in ALSPAC provides opportunities to study criminal behaviour and that linked individual-level records could provide robust research in the area. Our findings also suggest the potential for bias when only including participants who have explicitly consented to data linkage, highlighting the limitations of opt-in consent strategies

    Publishing Linked Data - There is no One-Size-Fits-All Formula

    Publishing Linked Data is a process that involves several design decisions and technologies. Although some initial guidelines have been already provided by Linked Data publishers, these are still far from covering all the steps that are necessary (from data source selection to publication) or giving enough details about all these steps, technologies, intermediate products, etc. Furthermore, given the variety of data sources from which Linked Data can be generated, we believe that it is possible to have a single and unified method for publishing Linked Data, but we should rely on different techniques, technologies and tools for particular datasets of a given domain. In this paper we present a general method for publishing Linked Data and the application of the method to cover different sources from different domains

    Smart City Development with Urban Transfer Learning

    Nowadays, the smart city development levels of different cities are still unbalanced. For a large number of cities which just started development, the governments will face a critical cold-start problem: 'how to develop a new smart city service with limited data?'. To address this problem, transfer learning can be leveraged to accelerate the smart city development, which we term the urban transfer learning paradigm. This article investigates the common process of urban transfer learning, aiming to provide city planners and relevant practitioners with guidelines on how to apply this novel learning paradigm. Our guidelines include common transfer strategies to take, general steps to follow, and case studies in public safety, transportation management, etc. We also summarize a few research opportunities and expect this article can attract more researchers to study urban transfer learning

    Developing an open data portal for the ESA climate change initiative

    We introduce the rationale for, and architecture of, the European Space Agency Climate Change Initiative (CCI) Open Data Portal (http://cci.esa.int/data/). The Open Data Portal hosts a set of richly diverse datasets – 13 “Essential Climate Variables” – from the CCI programme in a consistent and harmonised form and provides a single point of access for the (>100 TB) data for broad dissemination to an international user community. These data have been produced by a range of different institutions and vary across both scientific and spatio-temporal characteristics. This heterogeneity of the data, together with the range of services to be supported, presented significant technical challenges. An iterative development methodology was key to tackling these challenges: the system developed exploits a workflow which takes data that conforms to the CCI data specification, ingests it into a managed archive and uses both manual and automatically generated metadata to support data discovery, browse, and delivery services. It utilises both Earth System Grid Federation (ESGF) data nodes and the Open Geospatial Consortium Catalogue Service for the Web (OGC-CSW) interface, serving data into both the ESGF and the Global Earth Observation System of Systems (GEOSS). A key part of the system is a new vocabulary server, populated with CCI-specific terms and relationships, which integrates OGC-CSW and ESGF search services together, developed as part of a dialogue between domain scientists and linked data specialists. These services have enabled the development of a unified user interface for graphical search and visualisation – the CCI Open Data Portal Web Presence