PRIVAS - automatic anonymization of databases

Author: Berón Mario
Henriques Pedro Rangel
Miguel Joana
Pereira Maria João
Publication venue: 'IADIS - International Association for the Development of the Information Society'
Publication date: 01/01/2019

Currently, given the technological evolution, data and information are increasingly valuable in the most diverse areas and for the most varied purposes. Although the information and knowledge discovered through the exploration and use of data can be very valuable in many applications, people are increasingly concerned about the other side, that is, the privacy threats that these processes bring. The Privas system, described in this paper, aids the Data Publisher in pre-processing the database before publishing. To that end, a DSL is used to describe the database schema, identify the sensitive data, and specify the desired privacy level. A Privas processor then interprets the DSL program to automatically transform the repository schema. The automation of the anonymization process is the main contribution and novelty of this work.

Crossref

Biblioteca Digital do IPB

Data access and integration in the ISPIDER proteomics grid

Author: C.A. Goble
E.M. Zdobnov
J. Smith
L.M. Haas
M. Antonioletti
M. Maibaum
P. Buneman
P. McBrien
R.G.G. Cattell
S. Bowers
S. Durinck
S.B. Davidson
T.M. Oinn
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

Grid computing has great potential for supporting the integration of complex, fast-changing biological data repositories to enable distributed data analysis. One scenario where Grid computing has such potential is provided by proteomics resources, which are rapidly being developed with the emergence of affordable, reliable methods to study the proteome. The protein identifications arising from these methods derive from multiple repositories, which need to be integrated to enable uniform access to them. A number of technologies exist that enable these resources to be accessed in a Grid environment, but the independent development of these resources means that significant data integration challenges, such as heterogeneity and schema evolution, have to be met. This paper presents an architecture that supports the combined use of Grid data access (OGSA-DAI), Grid distributed querying (OGSA-DQP) and data integration (AutoMed) software tools to support distributed data analysis. We discuss the application of this architecture to the integration of several autonomous proteomics data resources.

CiteSeerX

Crossref

Birkbeck Institutional Research Online

The University of Manchester - Institutional Repository

Using schema transformation pathways for data lineage tracing

Author: A. Woodruff
C. Faloutsos
H. Fan
J. Albert
L. Zamboulis
M. Boyd
P. Buneman
P. McBrien
P.A. Bernstein
Y. Cui
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

With the increasing amount and diversity of information available on the Internet, there has been a huge growth in information systems that need to integrate data from distributed, heterogeneous data sources. Tracing the lineage of the integrated data is one of the problems being addressed in data warehousing research. This paper presents a data lineage tracing approach based on schema transformation pathways. Our approach is not limited to one specific data model or query language, and would be useful in any data transformation/integration framework based on sequences of primitive schema transformations.

CiteSeerX

Crossref

Birkbeck Institutional Research Online

Architecture and quality in data warehouses - An extended repository approach.

Author: Jarke M.
Jeusfeld M.A.
Quix C.
Vassiliadis P.

Research Papers in Economics

An Ontology Based Method to Solve Query Identifier Heterogeneity in Post-Genomic Clinical Trials

Author: Anguita Sanchez Alberto
Crespo del Arco Jose
Martín Martín Luis
Tsiknakis Manolis
Publication venue: Facultad de Informática (UPM)
Publication date: 01/01/2008

The increasing amount of information available for biomedical research has led to issues related to knowledge discovery in large collections of data. Moreover, Information Retrieval techniques must consider heterogeneities present in databases that initially belong to different domains, e.g. clinical and genetic data. One of the goals of the ACGT European project is to provide seamless and homogeneous access to integrated databases. In this work, we describe an approach to overcome heterogeneities in identifiers inside queries. We present an ontology classifying the most common identifier semantic heterogeneities, and a service that makes use of it to cope with the problem using the described approach. Finally, we illustrate the solution by analysing a set of real queries.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM