210 research outputs found

    Fault Tolerant Resource Allocation for Query Processing in Grid Environments

    In this paper, we propose a new algorithm for fault-tolerant resource allocation for query processing in grid environments. It consists of an initial resource allocation algorithm followed by a fault-tolerance protocol. The proposed fault-tolerance protocol is based on the passive replication of stateful operators in queries. We provide theoretical analyses of the proposed algorithms and support these analyses with simulations.

    GeoLoc: Robust Resource Allocation Method for Query Optimization in Data Grid Systems

    Resource allocation (RA) is one of the key stages of distributed query processing in the Data Grid environment. Over the last decade, a number of works have addressed different aspects of the problem. We believe that those studies paid too little attention to two important aspects: the definition of the allocation space and the criterion for determining the degree of parallelism. In this paper, we propose an RA method that extends existing solutions on these two points and addresses the problem under the specific conditions of the large-scale, heterogeneous environment of Data Grids. First, we propose to use the geographical proximity of nodes to data sources to define the Allocation Space (AS). Second, we present the principle of execution time parity between scan and join (build and probe) operations, which is used to determine the degree of parallelism and to generate load-balanced query execution plans. We conducted an experiment that showed the superiority of our GeoLoc method in terms of response time over the RA method chosen for comparison. The present study also provides a brief description of existing methods and a qualitative comparison of them with the proposed method.
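As a rough illustration of the first idea, an allocation space could be built by keeping only the nodes that lie within some great-circle distance of a data source. This is a minimal sketch and not the GeoLoc algorithm itself: the node names, the coordinates, the `radius_km` threshold and the choice of the haversine formula are all assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def allocation_space(nodes, source, radius_km):
    """Return the names of nodes within radius_km of the source, nearest first."""
    near = [(haversine_km(pos, source), name) for name, pos in nodes.items()]
    return [name for dist, name in sorted(near) if dist <= radius_km]


# Hypothetical grid nodes and data source (approximate city coordinates).
nodes = {
    "toulouse": (43.60, 1.44),
    "paris": (48.86, 2.35),
    "tokyo": (35.68, 139.69),
}
source = (43.61, 3.88)
space = allocation_space(nodes, source, radius_km=1000)
```

With these made-up values, the distant node is excluded and the remaining nodes are ordered by proximity to the source.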

    Grid resource discovery based on web services

    The size of grid systems has increased substantially in the last decades. Resource discovery is a fundamental task in grid systems: it searches for and locates the resources needed by a given process. Various approaches to this problem have been proposed in the literature. Grid resource discovery using web services is an important approach, and it has produced many tools that have become de facto standards of today's grid resource management. In this paper, we present a survey of recent grid resource discovery studies based on web services. We provide a synthesis, analysis and evaluation of these studies by classification. We also give a comparative study of the different classes proposed.

    Resource allocation for query processing in grid systems: A survey

    Grid systems are very useful platforms for distributed databases, especially when the scale of data sources and user requests is very high. However, the main characteristics of grid systems, such as dynamicity, large size and heterogeneity, bring new problems to the query processing domain, such as resource discovery and resource allocation. In this paper, we provide a survey of resource allocation methods for query processing in data grid systems. We classify existing studies according to their approach to the resource allocation problem. We provide a synthesis of the studies and propose evaluations and comparisons for the different classes of studies. ©2012 CRL Publishing Ltd

    Robust Query Optimization Methods With Respect to Estimation Errors: A Survey

    The quality of a query execution plan chosen by a Cost-Based Optimizer (CBO) depends greatly on the estimation accuracy of input parameter values. Many research results have been produced on improving the estimation accuracy, but they do not work for every situation. Therefore, "robust query optimization" was introduced, in an effort to minimize the sub-optimality risk by accepting the fact that estimates could be inaccurate. In this survey, we aim to provide an overview of robust query optimization methods by classifying them into different categories, explaining the essential ideas, listing their advantages and limitations, and comparing them against multiple criteria.

    Using Metadata to Help the Integration of Several Multi-Sources Set of Updates

    Today, spatial data are increasingly available on the web, and users can update their datasets more easily. The user is supplied with different sets of updates from diverse sources, each containing updates acquired in different ways, with different quality and at different times. A military mission is a particular context in which data and updates can come from different sources. The actors are distributed across different sites, and each of them can be both a producer and a user of the data. Each actor has its own dataset and can update it in several ways, but must regularly supply its evolutions to the other actors in order to guarantee the success of the mission. Each actor therefore receives many heterogeneous sets of updates, which can differ in quality and can contain errors arising from the way they were acquired, and must integrate them into its own dataset according to its needs. Not all evolutions are necessarily of interest to the user, and conversely a single set of updates may not cover all of the user's needs. These heterogeneous sets of updates may also be concurrent with one another and with the user's dataset. In this context, how can a user efficiently update a spatial dataset with evolutions that are not necessarily pertinent and are probably concurrent? This is the essential question to answer in order to improve the updating of spatial data with sets of evolutions coming from multiple sites. In this paper, we study the main problem that arises when integrating concurrent and heterogeneous updates, and we propose a process that helps the user integrate multi-source updates into a dataset efficiently. This process comprises several steps: first, we classify the evolutions to remove the heterogeneity; second, we take the user's needs into account and exclude non-pertinent data; third, we check for concurrency among all the updates; and finally, we reconcile the data if a conflict is detected. This process uses metadata to choose the “best” evolution to integrate into the dataset. The metadata are structured in accordance with the ISO 19115 standard.
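The four steps can be pictured with a very simplified sketch. It is not the authors' process: the record fields (`feature_id`, `theme`, `date`, `accuracy_m`, `source`) are hypothetical and are not ISO 19115 element names, classification is reduced to grouping updates by feature, and the reconciliation rule (latest date, then best positional accuracy) is only one assumed way of choosing the "best" evolution from metadata.

```python
from collections import defaultdict


def integrate(updates, wanted_themes):
    """Return one chosen update per feature, keyed by feature id."""
    # Steps 1-2: group updates by feature, keeping only pertinent themes.
    by_feature = defaultdict(list)
    for u in updates:
        if u["theme"] in wanted_themes:
            by_feature[u["feature_id"]].append(u)
    # Steps 3-4: several updates on one feature are treated as a conflict,
    # reconciled by preferring the latest date, then the smallest error.
    chosen = {}
    for fid, group in by_feature.items():
        chosen[fid] = max(group, key=lambda u: (u["date"], -u["accuracy_m"]))
    return chosen


# Hypothetical updates received from two sources, A and B.
updates = [
    {"feature_id": "road-1", "theme": "roads", "date": "2008-03-01",
     "accuracy_m": 5.0, "source": "A"},
    {"feature_id": "road-1", "theme": "roads", "date": "2008-04-15",
     "accuracy_m": 10.0, "source": "B"},
    {"feature_id": "lake-7", "theme": "hydro", "date": "2008-04-01",
     "accuracy_m": 2.0, "source": "A"},
]
result = integrate(updates, wanted_themes={"roads"})
```

Here the update on `lake-7` is dropped as not pertinent, and the two concurrent updates on `road-1` are reconciled in favour of the more recent one. Dates are ISO 8601 strings, so plain string comparison orders them chronologically.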

    Resource Allocation for Query Optimization in Data Grid Systems: Static Load Balancing Strategies

    Resource allocation is one of the principal stages of relational query processing in data grid systems. Static allocation methods allocate nodes to relational operations during query compilation. Existing heuristics do not take into account multi-query environments, where some nodes may become overloaded because they are allocated to too many concurrent queries. Dynamic resource allocation mechanisms have been developed to modify the physical plan during query execution: when a node is detected to be overloaded, some of its operations are migrated. However, if resource contention is too heavy in the initial execution plan, the operation migration cost may be very high. In this paper, we propose two load balancing strategies adopted during the static resource allocation phase, so that the workload is balanced from the beginning, the operation migration cost is decreased during query execution, and the average response time is therefore reduced.
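To make the idea of balancing load at allocation time concrete, here is a minimal sketch of a generic greedy heuristic: each operation is placed on the node that is currently least loaded, counting the load already placed there by other queries. It is not one of the two strategies proposed in the paper, and the operation names, cost estimates and node names are hypothetical.

```python
import heapq


def allocate(operations, node_loads):
    """Assign each operation to the currently least-loaded node.

    operations: list of (name, estimated_cost) pairs.
    node_loads: dict mapping node -> load already placed by other queries.
    Returns (assignment dict, final loads dict).
    """
    heap = [(load, node) for node, load in node_loads.items()]
    heapq.heapify(heap)
    assignment = {}
    # Place the most expensive operations first.
    for name, cost in sorted(operations, key=lambda op: -op[1]):
        load, node = heapq.heappop(heap)
        assignment[name] = node
        heapq.heappush(heap, (load + cost, node))
    return assignment, {node: load for load, node in heap}


# Hypothetical query operations and pre-existing node loads.
ops = [("scan_R", 4), ("scan_S", 3), ("join_RS", 6)]
loads = {"n1": 5, "n2": 0, "n3": 2}
assignment, final = allocate(ops, loads)
```

Starting from the existing loads rather than from zero is what distinguishes this from a single-query heuristic: a node that is already busy with other queries receives less new work.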

    Modified Independent Component Analysis for Initializing Non-negative Matrix Factorization : An approach to Hyperspectral Image Unmixing (ECMS 2013)

    Hyperspectral unmixing consists of identifying, from mixed pixel spectra, a set of pure constituent spectra (endmembers) in a scene and a set of abundance fractions for each pixel. Most linear blind source separation (BSS) techniques are based on Independent Component Analysis (ICA) or Non-Negative Matrix Factorization (NMF). Using only one of these techniques does not resolve the unmixing problem because of, respectively, the statistical dependence between the abundance fractions of the different constituents and the non-uniqueness of the NMF results. To overcome this issue, we propose an unsupervised unmixing approach called ModifICA-NMF (which stands for modified version of ICA followed by NMF). Consider the ideal case of a hyperspectral image combining (M-1) statistically independent source images, and an Mth image depending on them due to the sum-to-one constraint. Our modified ICA first estimates these (M-1) sources and associated mixing coefficients, then derives the remaining source and coefficients, while it also removes the BSS scale indeterminacy. In real conditions, the above (M-1) sources may be somewhat dependent. Our modified ICA method then only yields approximate data. These are then used as the initial values of an NMF method, which refines them. Our tests show that this joint ModifICA-NMF approach significantly outperforms the considered classical methods.
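For reference, the setting described above can be written with the standard linear mixing model. The notation is assumed here and is not taken from the paper: $x_p$ is the observed spectrum of pixel $p$, $a_m$ the spectrum of endmember $m$, $s_{m,p}$ the abundance of endmember $m$ in pixel $p$, and $X$, $A$, $S$ the corresponding data, endmember and abundance matrices.

```latex
% Linear mixing model for pixel p with M endmembers,
% with nonnegative abundances that sum to one.
x_p = \sum_{m=1}^{M} s_{m,p}\, a_m ,
\qquad s_{m,p} \ge 0 ,
\qquad \sum_{m=1}^{M} s_{m,p} = 1 .

% The sum-to-one constraint makes the M-th abundance a function of the others,
% which is why the M sources cannot all be statistically independent.
s_{M,p} = 1 - \sum_{m=1}^{M-1} s_{m,p} .

% NMF seeks nonnegative factors of the data matrix (bands x pixels).
X \approx A S , \qquad A \ge 0 , \quad S \ge 0 .
```

The factorization is not unique in general: for any invertible matrix $D$ such that $AD \ge 0$ and $D^{-1}S \ge 0$, the pair $(AD, D^{-1}S)$ fits the data equally well, which is why the quality of the NMF result depends on its initialization.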

    Modelling Heteregeneous and Distributed Spatial Datasets in an Update Context

    Publication pages: CD-ROM. Updating distributed geographic data still poses many problems, mainly because of the specific characteristics of such data (the spatial component and topology, for example). We propose a metadata model to help manage different actors located at several sites who handle heterogeneous data that are regularly updated. This model is based on ISO 19115, the metadata standard for geographic information.