Search CORE

4,628 research outputs found

Entity Resolution and Data Fusion: An Integrated Approach

Author: Domenico Beneventano
Giovanni Simonini
Luca Gagliardelli
Sonia Bergamaschi
Publication venue: country:ITA
Publication date: 01/01/2019

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Secure data sharing and processing in heterogeneous clouds

Author: Kubo Baldur
Reimair Florian
Reiter Andreas
Suzic Bojan
Venturi Daniele
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

The extensive cloud adoption among the European Public Sector Players empowered them to own and operate a range of cloud infrastructures. These deployments vary in both size and capabilities, as well as in the range of employed technologies and processes. The public sector, however, lacks the necessary technology to enable effective, interoperable and secure integration of a multitude of its computing clouds and services. In this work we focus on the federation of private clouds and the approaches that enable secure data sharing and processing among the collaborating infrastructures and services of public entities. We investigate the aspects of access control, data and security policy languages, as well as cryptographic approaches that enable fine-grained security and data processing in semi-trusted environments. We identify the main challenges and frame the future work that serves as an enabler of interoperability among heterogeneous infrastructures and services. Our goal is to enable both security and legal conformance as well as to facilitate transparency, privacy and effectivity of private cloud federations for the public sector needs. © 2015 The Authors

Elsevier - Publisher Connector

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Archivio della ricerca- Università di Roma La Sapienza

Incremental Entity Blocking over Heterogeneous Streaming Data

Author: Araújo Tiago Brasileiro
da Nóbrega Thiago Pereira
Nummenmaa Jyrki
Pires Carlos Eduardo Santos
Stefanidis Kostas
Publication venue: 'MDPI AG'
Publication date: 01/12/2022

Web systems have become a valuable source of semi-structured and streaming data. In this sense, Entity Resolution (ER) has become a key solution for integrating multiple data sources or identifying similarities between data items, namely entities. To avoid the quadratic costs of the ER task and improve efficiency, blocking techniques are usually applied. Beyond the traditional challenges faced by ER and, consequently, by the blocking techniques, there are also challenges related to streaming data, incremental processing, and noisy data. To address them, we propose a schema-agnostic blocking technique capable of handling noisy and streaming data incrementally through a distributed computational infrastructure. To the best of our knowledge, there is a lack of blocking techniques that address these challenges simultaneously. This work proposes two strategies (attribute selection and top-n neighborhood entities) to minimize resource consumption and improve blocking efficiency. Moreover, this work presents a noise-tolerant algorithm, which minimizes the impact of noisy data (e.g., typos and misspellings) on blocking effectiveness. In our experimental evaluation, we use real-world pairs of data sources, including a case study that involves data from Twitter and Google News. The proposed technique achieves better results regarding effectiveness and efficiency compared to the state-of-the-art technique (metablocking). More precisely, the application of the two strategies over the proposed technique alone improves efficiency by 56%, on average.
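As a rough illustration of what schema-agnostic, incremental blocking means, the sketch below uses plain token blocking: every token of every attribute value is a blocking key, and each arriving entity is compared only against earlier entities that share a key. This is not the technique proposed in the paper; it has no noise tolerance, attribute selection, or top-n pruning, and the class name, method name, and sample records are hypothetical.

```python
from collections import defaultdict


class IncrementalTokenBlocker:
    """Toy schema-agnostic blocker (hypothetical): each token is a blocking key."""

    def __init__(self):
        # token -> ids of the entities seen so far that contain it
        self.blocks = defaultdict(set)

    def add(self, entity_id, record):
        """Index one incoming entity and return the ids it shares a block with."""
        tokens = {
            tok
            for value in record.values()  # attribute names are ignored on purpose
            for tok in str(value).lower().split()
        }
        candidates = set()
        for tok in tokens:
            candidates |= self.blocks[tok]   # earlier entities with the same token
            self.blocks[tok].add(entity_id)  # then index the new entity
        return candidates


# Made-up records from two sources with different schemas.
blocker = IncrementalTokenBlocker()
first = blocker.add("a1", {"name": "Apple iPhone 12", "brand": "Apple"})
second = blocker.add("b1", {"title": "iphone 12 case", "maker": "Acme"})
third = blocker.add("b2", {"title": "Galaxy S21", "maker": "Samsung"})
```

Only the second record yields a candidate pair here, so the quadratic all-pairs comparison is avoided; a typo such as "iphnoe" would break the match, which is the kind of noise the abstract says its algorithm is designed to tolerate.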

Directory of Open Access Journals

Trepo - Institutional Repository of Tampere University

The mediated data integration (MeDInt) : An approach to the integration of database and legacy systems

Author: Mukviboonchai Suvimol
Publication venue: Edith Cowan University, Research Online, Perth, Western Australia
Publication date: 01/01/2003

The information required for decision making by executives in organizations is normally scattered across disparate data sources including databases and legacy systems. To gain a competitive advantage, it is extremely important for executives to be able to obtain one unique view of information in an accurate and timely manner. To do this, it is necessary to interoperate multiple data sources, which differ structurally and semantically. Particular problems occur when applying traditional integration approaches, for example, the global schema needs to be recreated when the component schema has been modified. This research investigates the following heterogeneities between heterogeneous data sources: Data Model Heterogeneities, Schematic Heterogeneities and Semantic Heterogeneities. The problems of existing integration approaches are reviewed and solved by introducing and designing a new integration approach to logically interoperate heterogeneous data sources and to resolve three previously classified heterogeneities. The research attempts to reduce the complexity of the integration process by maximising the degree of automation. Mediation and wrapping techniques are employed in this research. The Mediated Data Integration (MeDInt) architecture has been introduced to integrate heterogeneous data sources. Three major elements, the MeDInt Mediator, wrappers, and the Mediated Data Model (MDM) play important roles in the integration of heterogeneous data sources. The MeDInt Mediator acts as an intermediate layer transforming queries to sub-queries, resolving conflicts, and consolidating conflict-resolved results. Wrappers serve as translators between the MeDInt Mediator and data sources. Both the mediator and wrappers are well-supported by MDM, a semantically-rich data model which can describe or represent heterogeneous data schematically and semantically. Some organisational information systems have been tested and evaluated using the MeDInt architecture.
The results have addressed all the research questions regarding the interoperability of heterogeneous data sources. In addition, the results also confirm that the MeDInt architecture is able to provide integration that is transparent to users and that the schema evolution does not affect the integration.
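The mediator-and-wrapper division of labour described in this abstract can be pictured with a minimal sketch. It is a generic illustration of the pattern, not the MeDInt implementation: the class names, the field mappings, and the duplicate-dropping "conflict resolution" are all assumptions made for the example.

```python
class DictWrapper:
    """Hypothetical wrapper: translates a source's local field names to mediated ones."""

    def __init__(self, rows, field_map):
        self.rows = rows
        self.field_map = field_map  # local field name -> mediated field name

    def query(self, field, value):
        # Translate every row into the mediated vocabulary, then filter.
        translated = [
            {self.field_map[k]: v for k, v in row.items() if k in self.field_map}
            for row in self.rows
        ]
        return [row for row in translated if row.get(field) == value]


class Mediator:
    """Hypothetical mediator: fans a query out to the wrappers and merges the answers."""

    def __init__(self, wrappers):
        self.wrappers = wrappers

    def query(self, field, value):
        merged = []
        for wrapper in self.wrappers:
            for row in wrapper.query(field, value):
                if row not in merged:  # naive consolidation: drop exact duplicates
                    merged.append(row)
        return merged


# Two made-up sources that describe the same data with different schemas.
hr = DictWrapper(
    [{"emp_name": "Ann", "dept": "R&D"}],
    {"emp_name": "name", "dept": "department"},
)
legacy = DictWrapper(
    [{"NAME": "Ann", "DIV": "R&D"}, {"NAME": "Bob", "DIV": "Ops"}],
    {"NAME": "name", "DIV": "department"},
)
mediator = Mediator([hr, legacy])
result = mediator.query("department", "R&D")
```

Because each wrapper owns its own field mapping, a change to one source's schema only touches that wrapper's mapping, which is the sense in which schema evolution need not disturb the integrated view.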

Research Online @ ECU

Embodied-driven design : a framework to configure body representation & mapping (main text)

Author: Saraiji MHD Yamen
サライジムハマドヤメン
Publication venue: Keio University Graduate School of Media Design

KeiO Academic Resource Archive

Internet of Things Strategic Research Roadmap

Author: Bassi Alessandro
Doody Pat
Eisenhauer Markus
Friess Peter
Guillemin Patrick
Gusmeroli Sergio
Harrison Mark
Jubert Ignacio Soler
Mazura Margaretha
Sundmaeker Harald
Vermesan Ovidiu
Publication date: 01/01/2011

Internet of Things (IoT) is an integrated part of Future Internet including existing and evolving Internet and network developments and could be conceptually defined as a dynamic global network infrastructure with self configuring capabilities based on standard and interoperable communication protocols where physical and virtual “things” have identities, physical attributes, and virtual personalities, use intelligent interfaces, and are seamlessly integrated into the information network

SINTEF Open


Multimedia delivery in the future internet

Author: Aggoun A
Amon P
Arbel I
Chernilov A
Cosmas J
Garcia G
Jari A
Keller S
Kontopoulos C
Lamy-Bergot C
Leon A
Mattavelli M
Mauthe A
Mota T
Naumann M
Navarro A
Negru O
Pinto F
Shao B
Timmerer C
Tsekleves E
Zahariadis T
Publication venue: 'Society for Leukocyte Biology'
Publication date: 01/01/2008

The term “Networked Media” implies that all kinds of media including text, image, 3D graphics, audio and video are produced, distributed, shared, managed and consumed on-line through various networks, like the Internet, Fiber, WiFi, WiMAX, GPRS, 3G and so on, in a convergent manner [1]. This white paper is the contribution of the Media Delivery Platform (MDP) cluster and aims to cover the Networked challenges of the Networked Media in the transition to the Future of the Internet. The Internet has evolved and changed the way we work and live. End users of the Internet have been confronted with a bewildering range of media, services and applications and of technological innovations concerning media formats, wireless networks, terminal types and capabilities. And there is little evidence that the pace of this innovation is slowing. Today, over one billion users access the Internet on a regular basis, more than 100 million users have downloaded at least one (multi)media file and over 47 million of them do so regularly, searching in more than 160 Exabytes of content. In the near future these numbers are expected to rise exponentially. It is expected that the Internet content will be increased by at least a factor of 6, rising to more than 990 Exabytes before 2012, fuelled mainly by the users themselves. Moreover, it is envisaged that in a near- to mid-term future, the Internet will provide the means to share and distribute (new) multimedia content and services with superior quality and striking flexibility, in a trusted and personalized way, improving citizens’ quality of life, working conditions, edutainment and safety.
In this evolving environment, new transport protocols, new multimedia encoding schemes, cross-layer in the network adaptation, machine-to-machine communication (including RFIDs), rich 3D content as well as community networks and the use of peer-to-peer (P2P) overlays are expected to generate new models of interaction and cooperation, and be able to support enhanced perceived quality-of-experience (PQoE) and innovative applications “on the move”, like virtual collaboration environments, personalised services/media, virtual sport groups, on-line gaming, edutainment. In this context, the interaction with content combined with interactive/multimedia search capabilities across distributed repositories, opportunistic P2P networks and the dynamic adaptation to the characteristics of diverse mobile terminals are expected to contribute towards such a vision. Based on work that has taken place in a number of EC co-funded projects, in Framework Program 6 (FP6) and Framework Program 7 (FP7), a group of experts and technology visionaries have voluntarily contributed in this white paper aiming to describe the status, the state of the art, the challenges and the way ahead in the area of Content Aware media delivery platforms.

Brunel University Research Archive

Doctor of Philosophy

Author: South Brett Ray
Publication venue: University of Utah
Publication date: 01/05/2014

Manual annotation of clinical texts is often used as a method of generating reference standards that provide data for training and evaluation of Natural Language Processing (NLP) systems. Manually annotating clinical texts is time consuming, expensive, and requires considerable cognitive effort on the part of human reviewers. Furthermore, reference standards must be generated in ways that produce consistent and reliable data but must also be valid in order to adequately evaluate the performance of those systems. The amount of labeled data necessary varies depending on the level of analysis, the complexity of the clinical use case, and the methods that will be used to develop automated machine systems for information extraction and classification. Evaluating methods that potentially reduce cost, manual human workload, introduce task efficiencies, and reduce the amount of labeled data necessary to train NLP tools for specific clinical use cases are active areas of research inquiry in the clinical NLP domain. This dissertation integrates a mixed methods approach using methodologies from cognitive science and artificial intelligence with manual annotation of clinical texts. Aim 1 of this dissertation identifies factors that affect manual annotation of clinical texts. These factors are further explored by evaluating approaches that may introduce efficiencies into manual review tasks applied to two different NLP development areas - semantic annotation of clinical concepts and identification of information representing Protected Health Information (PHI) as defined by HIPAA. Both experiments integrate different priming mechanisms using noninteractive and machine-assisted methods. The main hypothesis for this research is that integrating pre-annotation or other machine-assisted methods within manual annotation workflows will improve efficiency of manual annotation tasks without diminishing the quality of generated reference standards.

The University of Utah: J. Willard Marriott Digital Library

Dynamic Assembly for System Adaptability, Dependability, and Assurance

Author: Luqi
Publication venue: Naval Postgraduate School
Publication date: 01/12/2002

(DASASA) Project. Author-contributed print item

Calhoun, Institutional Archive of the Naval Postgraduate School