10 research outputs found

    M-Grid: Similarity Searching in Grids

    The problem of similarity searching is nowadays attracting a lot of attention, because upcoming applications process complex data and the traditional exact match searching is not sufficient. There are efficient solutions, but they are tailored for the needs of specific data domains. General solutions, based on the metric space abstraction, are extensible, but they are designed to operate on a single computer only. Therefore, their scalability is limited and they cannot adapt to different performance requirements. In this paper, we propose a distributed access structure which is fully dynamic and exploits a Grid infrastructure. We study properties of this structure in numerous experiments. Besides, the performance tuning is analyzed with respect to user-specific requirements which include the maximum response time and the number of queries executed concurrently.

    Quality Management in Big Data

    No full text
    Due to the importance of quality issues in Big Data, Big Data quality management has attracted significant research attention on how to measure, improve and manage the quality for Big Data. This special issue in the Journal of Informatics thus tends to address the quality problems in Big Data as well as promote further research for Big Data quality. Our editorial describes the state-of-the-art research challenges in the Big Data quality research, and highlights the contributions of each paper accepted in this special issue.

    Adaptive Approximate Similarity Searching through Metric Social Networks

    No full text
    Reproduction of all or part of this work is permitted for educational or research use on condition that this copyright notice is included in any copy. Publications in the FI MU Report Series are in general accessible via WWW

    Similarity search: the metric space approach

    No full text

    EuDML Assessment and Evaluation — Final Report

    No full text
    This evaluation report’s findings are that the project was delivered on schedule, meeting almost all the defined evaluation criteria, while almost all its performance parameters are found to be at or above their expected values. The content-providers and user feedback on the EuDML Release (version 1.4) show that the system is currently stable, functional and useful. Most of the suggested improvements made by users while answering the surveys were translated and tracked as bugs through the Mantis system (the EuDML bug-tracking system). Many of them have been attended to and the result is the current Release (version 2). Few work areas are left to be refined, mostly documenting some services or processes. This report ends with three recommendations:
    1. Two new global performance (public effectiveness) parameters should be defined and monitored:
       • the ratio between the items freely available to the public and the total number of searchable items in EuDML (currently >87%);
       • the ratio between full-text indexed items and the total number of searchable items in EuDML (currently 59%).
       Thus, the overall aim of EuDML would be, between adding new collections, to achieve a value of 1 for both of these parameters.
    2. Emphasis should be put on documenting clearly and concisely EuDML’s functionalities for content-providers, Scientific Advisory Board and different categories of users (e.g. a Help/FAQ section should be created).
    3. More robust steps should be taken by the partners to ensure the future growth and sustainability of EuDML.

    Multivariate visualization methods

    No full text