
    Distributed Management of Massive Data: an Efficient Fine-Grain Data Access Scheme

    This paper addresses the problem of efficiently storing and accessing massive data blocks in a large-scale distributed environment, while providing efficient fine-grain access to data subsets. This issue is crucial for applications in the fields of databases, data mining and multimedia. We propose a data sharing service based on distributed, RAM-based storage of data, while leveraging a DHT-based, natively parallel metadata management scheme. As opposed to the most commonly used grid storage infrastructures, which provide mechanisms for explicit data localization and transfer, we provide a transparent access model, where data are accessed through global identifiers. Our proposal has been validated through a prototype implementation whose preliminary evaluation provides promising results.
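
The access model described in this abstract can be illustrated with a minimal sketch. It is not the authors' prototype: the class `ToyBlobStore`, the constant `CHUNK_SIZE`, the round-robin chunk placement and the SHA-256 bucketing are all assumptions made for illustration. It only shows the general idea of reading a byte range through a global identifier while chunk locations are looked up in hash-partitioned metadata.

```python
import hashlib

CHUNK_SIZE = 4  # bytes per chunk; tiny on purpose so the example is easy to trace


class ToyBlobStore:
    """Hypothetical in-memory stand-in for RAM-based data providers
    plus hash-partitioned (DHT-like) metadata buckets."""

    def __init__(self, n_nodes=3):
        self.data_nodes = [dict() for _ in range(n_nodes)]      # chunk payloads
        self.metadata_nodes = [dict() for _ in range(n_nodes)]  # chunk key -> data node

    def _bucket(self, key):
        # Hash the chunk key to pick the metadata bucket responsible for it.
        digest = hashlib.sha256(key.encode()).digest()
        return int.from_bytes(digest[:8], "big") % len(self.metadata_nodes)

    def write(self, blob_id, payload):
        # Split the blob into fixed-size chunks and spread them round-robin.
        for start in range(0, len(payload), CHUNK_SIZE):
            chunk_no = start // CHUNK_SIZE
            key = f"{blob_id}:{chunk_no}"
            node = chunk_no % len(self.data_nodes)
            self.data_nodes[node][key] = payload[start:start + CHUNK_SIZE]
            self.metadata_nodes[self._bucket(key)][key] = node

    def read(self, blob_id, offset, size):
        # Fine-grain read: fetch only the chunks that overlap the requested range.
        if size <= 0:
            return b""
        first = offset // CHUNK_SIZE
        last = (offset + size - 1) // CHUNK_SIZE
        parts = []
        for chunk_no in range(first, last + 1):
            key = f"{blob_id}:{chunk_no}"
            node = self.metadata_nodes[self._bucket(key)][key]
            parts.append(self.data_nodes[node][key])
        joined = b"".join(parts)
        skip = offset - first * CHUNK_SIZE
        return joined[skip:skip + size]


store = ToyBlobStore()
store.write("blob-1", b"abcdefghijklmnop")
subset = store.read("blob-1", offset=5, size=6)
```

The caller names only the global identifier and a byte range; which node holds each chunk stays hidden behind the metadata lookup, which is the transparency the abstract contrasts with explicit data localization.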

    Long term nitrogen budget modelling in a small agricultural watershed: hydrological control assessment of nitrogen losses with semi-distributed (SWAT) and distributed (TNT2) models

    Nitrogen exports from catchments are known to be highly variable because the nitrogen cycle in a watershed is controlled by factors such as land use, farm management practices, climate, soil type and hydrological setting. Our aim is to study the relative importance of the processes controlling nitrogen losses at catchment scale in the long term, using a modelling approach constrained by a long-term record of observations. The study area is a 330 ha catchment, 95% of it under intensive agriculture, in a hilly, shallow-soil context in south-west France. Historical field rotation and river nitrogen load data have been collected for a 20-year period. Two process-based, spatially distributed models were chosen to simulate nitrogen transfer and transformation in the whole catchment. The first is the fully distributed TNT2 model, developed and validated in a different context (farming systems in north-western France). The second is the widely used, semi-distributed SWAT model, recognized as realistic in many studies of nitrogen transfer in rivers. This comparative approach was used to evaluate the effect of different modelling approaches on the identification of controlling factors, and the ability of both models to simulate alternative scenarios. Discharge, especially during storm flow, is well simulated by the curve number approach and the semi-distributed hydrological parameter description used in SWAT, while the Topmodel-derived approach used in TNT2 tends to underestimate some peak discharges. Nitrogen dynamics simulations are considered acceptable for both models over a long time period, and using both models reveals their respective capabilities and limits. TNT2 has greater potential for testing the impact of complex agricultural scenarios because its description of management practices and its simulation of crop response to management options are more detailed. It permits the assessment of spatial interactions and spatially targeted management, such as the establishment of grass or tree strips. SWAT can then be used to scale up change scenarios from TNT2 small-catchment results to large catchments.
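
For readers unfamiliar with the curve number approach mentioned in this abstract, the sketch below gives the textbook SCS curve number runoff equation in metric units. It is an illustration only: the function name `curve_number_runoff` and the default initial abstraction ratio of 0.2 are assumptions, and SWAT's own implementation additionally adjusts the retention parameter over time, which is not modelled here.

```python
def curve_number_runoff(precip_mm, curve_number, ia_ratio=0.2):
    """Event runoff depth (mm) from the textbook SCS curve number equation.

    precip_mm    -- rainfall depth for the event, in mm
    curve_number -- dimensionless CN, in the range (0, 100]
    ia_ratio     -- initial abstraction as a fraction of retention (assumed 0.2)
    """
    if not 0 < curve_number <= 100:
        raise ValueError("curve_number must be in (0, 100]")
    # Potential maximum retention S in mm (metric form of S = 1000/CN - 10 inches).
    retention = 25400.0 / curve_number - 254.0
    # Initial abstraction: rainfall absorbed before any runoff starts.
    initial_abstraction = ia_ratio * retention
    if precip_mm <= initial_abstraction:
        return 0.0
    excess = precip_mm - initial_abstraction
    return excess ** 2 / (excess + retention)


runoff = curve_number_runoff(precip_mm=50.0, curve_number=80)
```

With a curve number of 80, a 50 mm event yields roughly 13.8 mm of runoff, and rainfall below the initial abstraction (12.7 mm here) yields none.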

    Building Near-Real-Time Processing Pipelines with the Spark-MPI Platform

    Advances in detectors and computational technologies provide new opportunities for applied research and the fundamental sciences. Concurrently, dramatic increases in the three Vs (Volume, Velocity, and Variety) of experimental data and in the scale of computational tasks have produced demand for new real-time processing systems at experimental facilities. Recently, this demand was addressed by the Spark-MPI approach, which connects the Spark data-intensive platform with the MPI high-performance framework. In contrast with existing data management and analytics systems, Spark introduced a new middleware based on resilient distributed datasets (RDDs), which decoupled various data sources from high-level processing algorithms. The RDD middleware significantly advanced the scope of data-intensive applications, ranging from SQL queries to machine learning to graph processing. Spark-MPI further extended the Spark ecosystem with MPI applications using the Process Management Interface. The paper explores this integrated platform within the context of online ptychographic and tomographic reconstruction pipelines.
    Comment: New York Scientific Data Summit, August 6-9, 201

    On the Design of Clean-Slate Network Control and Management Plane

    We provide a design for a clean-slate control and management plane for data networks using the abstraction of the 4D architecture, utilizing and extending 4D’s concept of a logically centralized Decision plane that is responsible for managing network-wide resources. In this paper, we investigate a scalable protocol and a dynamically adaptable algorithm for assigning Data plane devices to a physically distributed Decision plane; together they enable a network to operate with minimal configuration and human intervention while providing optimal convergence and robustness against failures. Our work is especially relevant in the context of ISPs and large, geographically dispersed enterprise networks. We also provide an extensive evaluation of our algorithm using real-world and artificially generated ISP topologies, along with an experimental evaluation using the ns-2 simulator.

    Improving Assumption based Distributed Belief Revision

    Belief revision is a critical issue in real-world DAI applications. A Multi-Agent System not only has to cope with the intrinsic incompleteness and the constant change of the available knowledge (as in the case of its stand-alone counterparts), but also has to deal with possible conflicts between the agents’ perspectives. Each semi-autonomous agent, designed as a combination of a problem solver and an assumption-based truth maintenance system (ATMS), was enriched with improved capabilities: a distributed context management facility allowing the user to dynamically focus on the more pertinent contexts, and a distributed belief revision algorithm with two levels of consistency. The contributions of this work include: (i) a concise representation of the shared external facts; (ii) a simple and innovative methodology to achieve distributed context management; and (iii) a reduced inter-agent data exchange format. The different levels of consistency adopted were based on the relevance of the data under consideration: higher relevance data (detected inconsistencies) was granted global consistency, while less relevant data (system facts) was assigned local consistency. These abilities are fully supported by the ATMS standard functionalities.