29 research outputs found

    Taking advantages of ontology and contexts to determine similarity of data

    Data integration is the process of unifying data that share some common semantics but originate from unrelated sources. In our work we consider these sources to be autonomous, heterogeneous and physically distributed. These three characteristics make the integration task more difficult, as there are several aspects to bear in mind. In this work we focus on only one of these aspects, semantic heterogeneity, which deals with the meaning of the concepts within the information sources. As each source contains a specific vocabulary according to its understanding of the world, terms denoting the same meaning can be very difficult to find. In this paper we briefly explain our method to find similarities using ontologies and contexts. We propose some improvements to the similarity functions in order to take advantage of the information the ontologies provide. Track: I - Workshop de Ingeniería de Software y Base de Datos. Red de Universidades con Carreras en Informática (RedUNCI)

    An ontology approach to data integration

    The term Federated Databases refers to the data integration of distributed, autonomous and heterogeneous databases. However, a federation can also include information systems, not only databases. When integrating data, several issues must be addressed. Here, we focus on the problem of heterogeneity, more specifically on semantic heterogeneity, that is, problems related to semantically equivalent concepts or semantically related/unrelated concepts. To address this problem, we apply the idea of ontologies as a tool for data integration. In this paper, we explain this concept and briefly describe a method for constructing an ontology by using a hybrid ontology approach. Facultad de Informática

    Succinct Data Structures in the Realm of GIS

    Presented at the 4th XoveTIC Conference, A Coruña, Spain, 7–8 October 2021. [Abstract] Geographic Information Systems (GIS) have spread all over our technological environment in the last decade. The inclusion of GPS technologies in everyday portable devices along with the creation of massive shareable geographical data banks has boosted the rise of geoinformatics. Despite the technological maturity of this field, there are still relevant research challenges concerning efficient information storage and representation. One of the most powerful techniques to tackle these issues is designing new Succinct Data Structures (SDS). These structures are defined by three main characteristics: they use a compact representation of the data, they have self-index properties and, as a consequence, they do not need decompression to process the enclosed information. Thus, SDS are not only capable of storing geographical data using as little space as possible, but they can also solve queries efficiently without any previous decompression. This work introduces how SDS can be successfully applied in the GIS context through several novel approaches and practical use cases. This work is partially funded by the CITIC research center funded by Xunta/FEDER-UE 2014-2020 Program, ED431G 2019/01. MICINN(PGE/ERDF) [EXTRA-Compact: PID2020-114635RB-I00]. Xunta de Galicia; ED431G 2019/0

    Lossless compression of industrial time series with direct access

    [EN] The new opportunities generated by the data-driven economy in the manufacturing industry have led many companies to opt for it. However, the size of the time series data that need to be captured brings high storage costs. Moreover, these costs, which are constantly growing, begin to have an impact on the profitability of companies. Thus, in this scenario, the need arises to develop techniques that yield reduced representations of the time series. In this paper, we present a lossless compression method for industrial time series that allows efficient access. That is, our aim goes beyond pure compression, where the usual way to access the data requires a complete decompression of the dataset before processing it. Instead, our method allows decompressing portions of the dataset and, moreover, querying the compressed data directly. Thus, the proposed method combines the efficient access typical of lossy methods with lossless compression. For the A Coruna team: This work was supported by CITIC, which, as a Research Center accredited by the Galician University System, is funded by "Conselleria de Cultura, Educacion e Universidade from Xunta de Galicia", 80% supported through ERDF Funds, ERDF Operational Programme Galicia 2014-2020, and the remaining 20% by "Secretaria Xeral de Universidades" (Grant ED431G 2019/01), Xunta de Galicia/FEDER-UE under Grants [IG240.2020.1.185; IN852A 2018/14] and Ministerio de Ciencia, Innovacion under Grants [TIN2016-78011-C4-1-R; RTC-2017-5908-7]. For the Basque team: Ministerio de Ciencia, Innovacion y Universidades under Grant [FEDER/TIN2016-78011-C4-2-R] and the Basque Government under Grant No. [IT1330-19]. Funding for open access charge: Universidade da Coruna/CISUG
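
The idea of decompressing only a portion of a losslessly compressed series can be illustrated with a generic block-wise scheme. This is a minimal sketch and not the method of the paper above: it assumes integer samples that fit in 64 bits, delta-encodes each fixed-size block independently, and compresses each block with zlib. The names `compress_series`, `get_value` and the `block_size` parameter are hypothetical. Unlike the paper's method, this sketch does not query the compressed data directly; it only limits decompression to one block per access.

```python
import struct
import zlib
from typing import List, Sequence


def compress_series(values: Sequence[int], block_size: int = 1024) -> List[bytes]:
    """Compress a series as independent blocks (assumed scheme, 64-bit ints)."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        # Delta-encode inside the block only, so each block is self-contained.
        deltas = [chunk[0]] + [b - a for a, b in zip(chunk, chunk[1:])]
        raw = struct.pack(f"<{len(deltas)}q", *deltas)
        blocks.append(zlib.compress(raw))
    return blocks


def get_value(blocks: Sequence[bytes], index: int, block_size: int = 1024) -> int:
    """Return the sample at `index`, decompressing only the block that holds it."""
    block_no, offset = divmod(index, block_size)
    raw = zlib.decompress(blocks[block_no])
    deltas = struct.unpack(f"<{len(raw) // 8}q", raw)
    # Undo the delta encoding up to the requested position.
    return sum(deltas[:offset + 1])


series = [100, 101, 103, 103, 98, 97, 120, 121, 121]
blocks = compress_series(series, block_size=4)
print(get_value(blocks, 6, block_size=4))  # touches only the second block
```

The block size trades compression ratio against access cost: larger blocks usually compress better, but each random access then decompresses more data.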

    Spatial selection of sparse pivots for similarity search in metric spaces

    Similarity search is a fundamental operation for applications that deal with unstructured data sources. In this paper we propose a new pivot-based method for similarity search, called Sparse Spatial Selection (SSS). The main characteristic of this method is that it guarantees a good pivot selection more efficiently than previously proposed methods. In addition, SSS adapts itself to the dimensionality of the metric space we are working with, without the need to specify the number of pivots in advance. Furthermore, SSS is dynamic, that is, it can support object insertions in the database efficiently; it can work with both continuous and discrete distance functions; and it is suitable for secondary memory storage. In this work we provide experimental results that confirm the advantages of the method with several vector and metric spaces. We also show that the efficiency of our proposal is similar to that of other existing ones over vector spaces, although it is better over general metric spaces. Facultad de Informática
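
A pivot-based index of this kind can be sketched as follows. The abstract does not spell out the selection rule, so this sketch assumes a distance-threshold criterion: an object becomes a pivot when it lies at least `alpha * max_distance` away from every pivot chosen so far. The parameter names `alpha` and `max_distance`, and the helper functions, are assumptions made for illustration rather than the authors' implementation. The range search uses the standard triangle-inequality filter of pivot-based methods.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
Distance = Callable[[T, T], float]


def select_pivots(objects: Sequence[T], distance: Distance,
                  max_distance: float, alpha: float = 0.4) -> List[T]:
    """Pick pivots that are pairwise at least alpha * max_distance apart."""
    pivots: List[T] = []
    for obj in objects:
        # With no pivots yet, all() is True, so the first object is a pivot.
        if all(distance(obj, p) >= alpha * max_distance for p in pivots):
            pivots.append(obj)
    return pivots


def build_table(objects: Sequence[T], pivots: Sequence[T],
                distance: Distance) -> List[List[float]]:
    """Precompute the distance from every object to every pivot."""
    return [[distance(o, p) for p in pivots] for o in objects]


def range_search(query: T, radius: float, objects: Sequence[T],
                 pivots: Sequence[T], table: Sequence[Sequence[float]],
                 distance: Distance) -> List[T]:
    """Return objects within `radius` of `query`, pruning with the pivots."""
    q_dists = [distance(query, p) for p in pivots]
    results = []
    for obj, row in zip(objects, table):
        # Triangle inequality: |d(q,p) - d(o,p)| > r implies d(q,o) > r.
        if any(abs(qd - od) > radius for qd, od in zip(q_dists, row)):
            continue
        if distance(query, obj) <= radius:
            results.append(obj)
    return results


points = list(range(11))                 # toy 1-D metric space
dist = lambda a, b: float(abs(a - b))
pivots = select_pivots(points, dist, max_distance=10.0, alpha=0.5)
table = build_table(points, pivots, dist)
print(pivots)                                            # [0, 5, 10]
print(range_search(7, 1.0, points, pivots, table, dist)) # [6, 7, 8]
```

Because the number of pivots is not fixed in advance, it falls out of the threshold and the data, and a newly inserted object can be tested against the existing pivots with the same rule.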

    Sistema de consulta vía Web para el Instituto Andaluz de Patrimonio Histórico [Web query system for the Instituto Andaluz de Patrimonio Histórico]

    This article presents a Web query system that is being developed for the Instituto Andaluz de Patrimonio Histórico. The system uses an ontology-based architecture for querying Digital Libraries over the Web. This architecture gives the query application physical and logical independence from the database. The user interface uses the Bounded Natural Language (BNL) strategy, which ensures ease of use. CYCIT TEL99-0335-C04-02; CYCIT TIC2000-1673-C06- 0

    Compresión de índices para bases de datos textuales [Index compression for text databases]

    While in traditional databases indexes take up less space than the indexed data set, in text databases the index generally takes up more space than the text, possibly requiring 4 to 20 times its size. One alternative for reducing the space taken by the index is to seek a compact representation of it that preserves the navigation facilities over the structure. In large text collections, however, even the compressed index is usually too large to reside in main memory. In these cases, the number of secondary-memory accesses performed during the search process is a critical factor in the performance of the index. In this work we are interested in the design of compressed, secondary-memory indexes for text searching, a topic of growing interest in the database community. Track: Bases de datos y minería de datos. Red de Universidades con Carreras en Informática (RedUNCI)
