    A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

    Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology. This would help evaluate their applicability for solving similar problems. This taxonomy also provides a "gap analysis" of this area through which researchers can potentially identify new issues for investigation. Finally, we hope that the proposed taxonomy and mapping also help to provide an easy way for new practitioners to understand this complex area of research. Comment: 46 pages, 16 figures, Technical Report

    Any Data, Any Time, Anywhere: Global Data Access for Science

    Data access is key to science driven by distributed high-throughput computing (DHTC), an essential technology for many major research projects such as High Energy Physics (HEP) experiments. However, achieving efficient data access becomes quite difficult when many independent storage sites are involved because users are burdened with learning the intricacies of accessing each system and keeping careful track of data location. We present an alternate approach: the Any Data, Any Time, Anywhere infrastructure. Combining several existing software products, AAA presents a global, unified view of storage systems - a "data federation," a global filesystem for software delivery, and a workflow management system. We describe how one HEP experiment, the Compact Muon Solenoid (CMS), is using the AAA infrastructure, and we report some simple performance metrics. Comment: 9 pages, 6 figures, submitted to 2nd IEEE/ACM International Symposium on Big Data Computing (BDC) 201

    Information logistics: A production-line approach to information services

    Logistics can be defined as the process of strategically managing the acquisition, movement, and storage of materials, parts, and finished inventory (and the related information flow) through the organization and its marketing channels in a cost-effective manner. It is concerned with delivering the right product to the right customer in the right place at the right time. The logistics function is composed of inventory management, facilities management, communications, unitization, transportation, materials management, and production scheduling. The relationship between logistics and information systems is clear. Systems such as Electronic Data Interchange (EDI), Point of Sale (POS) systems, and Just in Time (JIT) inventory management systems are important elements in the management of product development and delivery. With improved access to market demand figures, logisticians can decrease inventory sizes and better service customer demand. However, without accurate, timely information, little, if any, of this would be feasible in today's global markets. Information systems specialists can learn from logisticians. In a manner similar to logistics management, information logistics is concerned with the delivery of the right data, to the right customer, at the right time. As such, information systems are integral components of the information logistics system charged with providing customers with accurate, timely, cost-effective, and useful information.
    Information logistics is a management style and is composed of elements similar to those associated with the traditional logistics activity: inventory management (data resource management), facilities management (distributed, centralized and decentralized information systems), communications (participative design and joint application development methodologies), unitization (input/output system design, i.e., packaging or formatting of the information), transportation (voice, data, image, and video communication systems), materials management (data acquisition, e.g., EDI, POS, external data bases, data entry) and production scheduling (job, staff, and project scheduling).

    Signal Processing for Caching Networks and Non-volatile Memories

    The recent information explosion has created a pressing need for faster and more reliable data storage and transmission schemes. This thesis focuses on two systems: caching networks and non-volatile storage systems. It proposes network protocols to improve the efficiency of information delivery, as well as signal processing schemes to reduce errors at the physical layer. This thesis first investigates caching and delivery strategies for content delivery networks. Caching has been investigated as a useful technique to reduce the network burden by prefetching some contents during off-peak hours. Coded caching [1], proposed by Maddah-Ali and Niesen, is the foundation of our algorithms; it has been shown to be a useful technique that can reduce peak traffic rates by encoding transmissions so that different users can extract different information from the same packet. Content delivery networks store information distributed across multiple servers, so as to balance the load and avoid unrecoverable losses in case of node or disk failures. On one hand, distributed storage limits the capability of combining content from different servers into a single message, causing performance losses in coded caching schemes. On the other hand, the inherent redundancy existing in distributed storage systems can be used to improve the performance of those schemes through parallelism. This thesis proposes a scheme combining distributed storage of the content in multiple servers and an efficient coded caching algorithm for delivery to the users. This scheme is shown to reduce the peak transmission rate below that of state-of-the-art algorithms.
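
The coded-caching idea mentioned in this abstract, where a single packet lets different users extract different information, can be illustrated with a minimal sketch. This is the standard two-user, two-file XOR illustration, not the scheme proposed in the thesis; the file contents, the cache layout, and every name used (`xor_bytes`, `split_halves`, `cache_user1`, `cache_user2`) are assumptions made for the example.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def split_halves(data: bytes) -> tuple[bytes, bytes]:
    """Split a file into two equal halves (even length assumed)."""
    if len(data) % 2 != 0:
        raise ValueError("file length must be even")
    mid = len(data) // 2
    return data[:mid], data[mid:]


# Two hypothetical files of equal size held by the server.
file_a = b"AAAAaaaa"
file_b = b"BBBBbbbb"
a1, a2 = split_halves(file_a)
b1, b2 = split_halves(file_b)

# Placement phase (off-peak): each user prefetches one half of every file.
cache_user1 = {"a1": a1, "b1": b1}
cache_user2 = {"a2": a2, "b2": b2}

# Delivery phase (peak): user 1 requests file A, user 2 requests file B.
# One coded packet of half-file size is broadcast to both users.
coded_packet = xor_bytes(a2, b1)

# User 1 cancels b1 from the packet to obtain the missing half a2.
recovered_a = cache_user1["a1"] + xor_bytes(coded_packet, cache_user1["b1"])
# User 2 cancels a2 from the packet to obtain the missing half b1.
recovered_b = xor_bytes(coded_packet, cache_user2["a2"]) + cache_user2["b2"]
```

Under these assumptions the server broadcasts one half-file packet, whereas sending the two missing halves separately would take two such packets.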

    WSN and RFID integration to support intelligent monitoring in smart buildings using hybrid intelligent decision support systems

    The real-time monitoring of environment context-aware activities is becoming a standard in service delivery in a wide range of domains (child and elderly care and supervision, logistics, circulation, and others). The safety of people, goods and premises depends on the prompt reaction to potential hazards identified at an early stage to engage appropriate control actions. This requires capturing real-time data to process locally at the device level or communicate to backend systems for real-time decision making. This research examines the wireless sensor network and radio frequency identification technology integration in smart homes to support advanced safety systems deployed upstream to safety and emergency response. These systems are based on the use of hybrid intelligent decision support systems configured in a multi-distributed architecture enabled by the wireless communication of detection and tracking data to support intelligent real-time monitoring in smart buildings. This paper first introduces the concept of wireless sensor network and radio frequency identification technology integration, showing the various options for the task distribution between radio frequency identification and hybrid intelligent decision support systems. This integration is then illustrated in a multi-distributed system architecture to identify motion and control access in a smart building using a room capacity model for occupancy and evacuation, access rights and a navigation map automatically generated by the system. The solution shown in the case study is based on a virtual layout of the smart building which is implemented using the capabilities of the building information model and hybrid intelligent decision support system. The Saudi High Education Ministry and Brunel University (UK

    EDOS Initiatives to Decrease Latency of NRT Data for LANCE

    NASA's EOS Data and Operations System (EDOS) is the primary supplier of NRT (near real-time) data to the NASA NRT user community known as the Land, Atmosphere NRT Capability for EOS (LANCE). EDOS provides NRT data for various instruments on the EOS missions Terra, Aqua, Aura, as well as for the NOAA missions Suomi NPP and NOAA-20. This poster gives an overview of the EDOS multi-mission system with emphasis on the NRT products distributed for LANCE elements: AIRS, MISR, MLS, MODIS, MOPITT, OMPS, OMI and VIIRS. Remote EDOS high-rate data capture systems are deployed at NASA ground stations, where they provide data-driven capture of high-rate science for EOS missions. The remote EDOS components transfer the science data via high-rate WANs to the centralized EDOS Level-zero processing systems located at Goddard Space Flight Center. EDOS produces session-based data sets especially for LANCE NRT use from a single ground station contact; this data is sent to dual LANCE destinations as part of the standard redundancy requirement for LANCE elements. EDOS has implemented various latency improvements with the ultimate goal of having EDOS processing of NRT data keep up with the spacecraft data downlink. EDOS enhancements have included implementation of priority-based QoS, expanded network architecture to include open networks, and use of a delay-tolerant protocol. EDOS has streamlined its systems and infrastructure to minimize latency for NRT data delivery for LANCE. EDOS begins to transfer the NRT data to the LANCE elements within minutes of the end of the contact session, with an average packet latency from instrument observation to Level 0 product delivery to each LANCE element of just over one hour.