73 research outputs found

    ASAP Top-k Query Processing in Unstructured P2P Systems

    Top-k query processing techniques are useful in unstructured peer-to-peer (P2P) systems to avoid overwhelming users with too many results. However, existing approaches suffer from long waiting times, because top-k results are returned only when all queried peers have finished processing the query; query response time is therefore dominated by the slowest queried peer. In this paper, we address this user waiting-time problem. To do so, we revisit top-k query processing in P2P systems by introducing two novel notions in addition to response time: the stabilization time and the cumulative quality gap. Using these notions, we formally define the as-soon-as-possible (ASAP) top-k processing problem. Then, we propose a family of algorithms, called ASAP, to deal with this problem. We validated our solution through implementation and extensive experimentation. The results show that ASAP significantly outperforms baseline algorithms by returning the final top-k results to users much sooner.

    As-Soon-As-Possible Top-k Query Processing in P2P Systems

    Top-k query processing techniques provide two main advantages for unstructured peer-to-peer (P2P) systems. First, they avoid overwhelming users with too many results. Second, they significantly reduce network resource consumption. However, existing approaches suffer from long waiting times, because top-k results are returned only when all queried peers have finished processing the query; query response time is therefore dominated by the slowest queried peer. In this paper, we address this user waiting-time problem. To do so, we revisit top-k query processing in P2P systems by introducing two novel notions in addition to response time: the stabilization time and the cumulative quality gap. Using these notions, we formally define the as-soon-as-possible (ASAP) top-k processing problem. Then, we propose a family of algorithms, called ASAP, to deal with this problem. We validate our solution through implementation and extensive experimentation. The results show that ASAP significantly outperforms baseline algorithms by returning the final top-k results to users much sooner.
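
    Neither abstract defines the two notions formally. As a concreteness aid, here is a minimal sketch of how they could be measured for a stream of scored results, assuming quality is the sum of the current top-k scores; the function name and this quality measure are illustrative assumptions, not the paper's definitions.

        def asap_metrics(events, k):
            """Given (arrival_time, score) pairs for results streaming in from
            the queried peers, compute the stabilization time (when the running
            top-k first equals the final top-k) and the cumulative quality gap
            (area between final and running top-k quality over time).
            Hypothetical formalization, not the paper's."""
            events = sorted(events)                      # order results by arrival time
            final = sorted((s for _, s in events), reverse=True)[:k]
            final_quality = sum(final)
            seen, stabilization, gap, prev_t = [], None, 0.0, events[0][0]
            for t, score in events:
                running = sorted(seen, reverse=True)[:k] # current top-k before this arrival
                gap += (final_quality - sum(running)) * (t - prev_t)
                prev_t = t
                seen.append(score)
                if stabilization is None and sorted(seen, reverse=True)[:k] == final:
                    stabilization = t                    # running top-k is now final
            return stabilization, gap

        # toy run: the slowest peer answers at t=10 and holds back the second-best result
        print(asap_metrics([(1, 0.9), (2, 0.5), (10, 0.8)], k=2))  # -> (10, 3.2), up to float rounding

    Under this reading, a fast algorithm minimizes both numbers: it stabilizes early and keeps the running top-k close to the final one in the meantime.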

    The evolution of business analytics: based on case study research

    While business analytics is becoming more significant and more widely used by companies across a growing range of industries, for many the concept remains elusive. The field of business analytics is highly generic and fragmented, leaving managers confused and ultimately inhibited from making valuable decisions. This paper presents an evolutionary depiction of business analytics, using real-world case studies to provide a distinct overview of where the phenomenon came from, where it currently stands, and where it is heading. The paper draws on eight case studies representing three eras: yesterday (1950s to 1990s), today (2000s to 2020s), and tomorrow (2030s to 2050s). Through cross-case analysis we have identified patterns that serve as the foundation for a discussion of future developments in business analytics. Based on our findings, we argue that the automation of business processes will most likely continue to increase. AI is expanding into numerous areas, each specializing in a complex task previously reserved for professionals; however, the patterns indicate that new occupations linked to artificial intelligence will most probably be created. For the training of intelligent systems, data will most likely be in greater demand than ever. The growing volume of data will likely strain current data infrastructures, creating the need for stronger networks and systems that can process, store, and manage large amounts of varied data types in real time while maintaining high security. Furthermore, although data-privacy concerns have become more significant in recent years, the case study research indicates that they have not limited corporations' access to data. On the contrary, corporations, people, and devices will most likely become even more connected than ever before.

    A peer to peer approach to large scale information monitoring

    Issued as final report. National Science Foundation (U.S.).

    The ViP2P Platform: XML Views in P2P

    The growing volume of XML data sources on the Web or produced by enterprises, organizations, etc. raises many performance challenges for data management applications. In this work, we are concerned with the distributed, peer-to-peer management of large corpora of XML documents, based on distributed hash table (DHT) overlay networks. We present ViP2P (Views in Peer-to-Peer), a distributed platform for sharing XML documents based on a structured P2P network infrastructure (DHT). At the core of ViP2P stand distributed materialized XML views, defined by arbitrary XML queries, filled in with data published anywhere in the network, and exploited to efficiently answer queries issued by any network peer. ViP2P allows user queries to be evaluated over XML documents published by peers in two modes. First, in a long-running subscription mode, a query can be registered in the system and receive answers incrementally when and if published data matches the query. Second, queries can also be asked in an ad-hoc, snapshot mode, where results are required immediately and must be computed based on the results of other long-running subscription queries. ViP2P innovates over similar DHT-based XML sharing platforms by using a very expressive structured XML query language. This expressivity leads to a very flexible distribution of XML content in the ViP2P network and to efficient snapshot query execution. ViP2P has been tested in real deployments of hundreds of computers. We present the platform architecture and its internal algorithms, and demonstrate the platform's efficiency and scalability through a set of experiments. Our experimental results exceed those of similar competitor systems by orders of magnitude in terms of data volume, network size, and data dissemination throughput. Comment: RR-7812 (2011).
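
    The abstract does not give the view-matching machinery itself. The toy sketch below conveys only the two query modes over a registry of materialized views, reducing the tree-pattern matching ViP2P actually performs to simple path-prefix containment and the DHT to a local dictionary; all identifiers are hypothetical.

        from collections import defaultdict

        class ViewRegistry:
            """Toy stand-in for ViP2P's distributed materialized views: views
            are linear paths instead of tree patterns, the DHT is a dict."""
            def __init__(self):
                self.views = defaultdict(list)       # view path -> matched items

            def register(self, path):
                self.views[path]                     # subscription mode: long-running view

            def publish(self, path, item):
                # data published anywhere is routed to every view it matches
                for view in self.views:
                    if path.startswith(view):
                        self.views[view].append((path, item))

            def snapshot(self, query):
                # snapshot mode: answer immediately from materialized views
                for view, items in self.views.items():
                    if query.startswith(view):       # view subsumes the query
                        return [i for p, i in items if p.startswith(query)]
                return None                          # no view can answer this query

        reg = ViewRegistry()
        reg.register("/bib/book")
        reg.publish("/bib/book/title", "XML in P2P")
        print(reg.snapshot("/bib/book/title"))       # -> ['XML in P2P']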

    Peer to peer multidimensional overlays: Approximating complex structures

    Peer-to-peer overlay networks have proven to be a good support for storing and retrieving data in a fully decentralized way. A sound approach is to structure them so that they reflect the structure of the application: peers represent objects of the application, so that neighbours in the peer-to-peer network are objects with similar characteristics from the application's point of view. Such structured peer-to-peer overlay networks provide natural support for range queries. While some complex structures, such as a Voronoï tessellation where each peer is associated with a cell in the space, are clearly relevant for structuring the objects, the cost of computing and maintaining these structures is usually extremely high for dimensions larger than 2. We argue that an approximation of a complex structure is enough to provide native support for range queries. This stems from the fact that neighbours are important, while the exact space partitioning associated with a given peer is not as crucial. In this paper we present the design, analysis and evaluation of RayNet, a loosely structured Voronoï-based overlay network. RayNet organizes peers in an approximation of a Voronoï tessellation in a fully decentralized way. It relies on a Monte-Carlo algorithm to estimate the size of a cell and on an epidemic protocol to discover neighbours. To ensure efficient (polylogarithmic) routing, RayNet is inspired by Kleinberg's small-world model, where each peer is connected to close neighbours (its approximate Voronoï neighbours in RayNet) and to shortcuts (long-range neighbours) implemented using an existing Kleinberg-like peer sampling protocol.
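
    A minimal sketch of the Monte-Carlo cell-size estimate the abstract mentions: sample random points in the space and count the fraction for which a given peer is the nearest, which approximates the relative volume of its Voronoï cell without ever computing the tessellation. The unit-cube space, coordinates, and names are assumptions for illustration.

        import random

        def estimate_cell_volume(peer, others, dim=3, samples=10000):
            """Monte-Carlo estimate of the relative volume of `peer`'s Voronoi
            cell: the fraction of uniform random points closer to `peer` than
            to any other known peer."""
            def dist2(a, b):
                return sum((x - y) ** 2 for x, y in zip(a, b))
            hits = 0
            for _ in range(samples):
                p = [random.random() for _ in range(dim)]  # uniform point in the unit cube
                if dist2(p, peer) < min(dist2(p, o) for o in others):
                    hits += 1
            return hits / samples

        peers = [[random.random() for _ in range(3)] for _ in range(8)]
        print(estimate_cell_volume(peers[0], peers[1:]))   # ~1/8 on average for 8 peers

    The estimate needs only distances to known neighbours, which is why it composes well with an epidemic neighbour-discovery protocol.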

    Application Layer Architectures for Disaster Response Systems

    Traditional disaster response methods face several issues, such as limited situational awareness, lack of interoperability, and reliance on voice-oriented communications. Disaster response systems (DRSs) aim to address these issues and assist responders by providing a wide range of services. Since the network infrastructure in a disaster area may become non-operational, mobile ad-hoc networks (MANETs) are the only alternative for providing connectivity and other network services. Because of the dynamic nature of MANETs, the applications and services provided by DRSs should be based on distributed architectures; these distributed applications and services form overlays on top of MANETs. This thesis aims to improve three main aspects of DRSs: interoperability, automation, and prioritization. Interoperability enables communication and collaboration between different rescue teams, which improves the efficiency of rescue operations and avoids potential interference between teams. Automation allows responders to focus more on their tasks by minimizing the human intervention required in DRSs; it also allows machines to operate in areas where humans cannot because of safety issues. Prioritization ensures that emergency services (e.g. firefighter communications) in DRSs have higher priority for receiving resources (e.g. network services) than non-emergency services (e.g. news reporters' communications); prioritizing vital services in a disaster area can save lives. This thesis proposes application layer architectures that enable three important services in DRSs and contribute to the improvement of the three aforementioned aspects: overlay interconnection, service discovery, and differentiated quality of service (QoS). The overlay interconnection architecture provides a distributed and scalable mechanism to interconnect end-user application overlays and gateway overlays in MANETs. The service discovery architecture is a distributed, directory-based service discovery mechanism built on the standard Domain Name System (DNS) protocol. Lastly, a differentiated QoS architecture is presented that provides admission control and policy enforcement functions based on a given prioritization scheme. For each of the provided services, a motivating scenario is presented, requirements are derived, and related work is evaluated with respect to these requirements. Furthermore, performance evaluations are provided for each of the proposed architectures. For the overlay interconnection architecture, a prototype is presented along with performance measurements; the results show that our architecture achieves acceptable request-response delays and network load overhead. For the service discovery architecture, extensive simulations have been run to evaluate the performance of our architecture and to compare it with the Internet Engineering Task Force (IETF) directory-less service discovery proposal based on Multicast DNS; the results show that our architecture generates less overall network load and ensures successful discovery with higher probability. Finally, for the differentiated QoS architecture, simulation results show that our architecture not only enables differentiated QoS but also improves overall QoS in terms of the number of successful overlay flows.
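
    The thesis's admission-control and policy-enforcement functions are not detailed in the abstract. The following toy sketch only illustrates the prioritization idea behind a differentiated QoS architecture: flows request bandwidth against a capacity budget, and higher-priority (emergency) flows may preempt lower-priority ones. The class, the preemption policy, and all names are assumptions.

        class AdmissionController:
            """Toy admission control: each flow requests bandwidth with a
            priority; when capacity runs out, higher-priority flows preempt
            lower-priority ones. Hypothetical policy, not the thesis's."""
            def __init__(self, capacity):
                self.capacity = capacity
                self.flows = []                      # (priority, bandwidth, name)

            def used(self):
                return sum(bw for _, bw, _ in self.flows)

            def admit(self, name, bandwidth, priority):
                while self.used() + bandwidth > self.capacity:
                    if not self.flows:
                        return False                 # request exceeds total capacity
                    victim = min(self.flows)         # lowest-priority admitted flow
                    if victim[0] >= priority:
                        return False                 # nothing lower to preempt: reject
                    self.flows.remove(victim)        # preempt the victim flow
                self.flows.append((priority, bandwidth, name))
                return True

        ac = AdmissionController(capacity=10)
        ac.admit("news reporters", 8, priority=1)
        print(ac.admit("firefighters", 5, priority=9))   # True: preempts the reporters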

    HEC: Collaborative Research: SAM^2 Toolkit: Scalable and Adaptive Metadata Management for High-End Computing

    The increasing demand for exabyte-scale storage capacity by high-end computing applications requires a higher level of scalability and dependability than current file and storage systems provide. The proposal deals with file systems research for metadata management of scalable cluster-based parallel and distributed file storage systems in the HEC environment. It aims to develop a scalable and adaptive metadata management (SAM2) toolkit to extend the features of, and fully leverage the peak performance promised by, state-of-the-art cluster-based parallel and distributed file storage systems used by the high performance computing community. There is a large body of research on scaling data movement and management; however, the need to scale the metadata side of cluster-based file systems and I/O has been underestimated. An understanding of the characteristics of metadata traffic, together with the application of proper load-balancing, caching, prefetching, and grouping mechanisms to metadata management, will lead to high scalability. It is anticipated that by appropriately plugging the scalable and adaptive metadata management components into state-of-the-art cluster-based parallel and distributed file storage systems, one could potentially increase the performance of applications and file systems, and help translate the high peak performance promised by such systems into real application performance improvements. The project involves the following components: 1. Develop multi-variable forecasting models to analyze and predict file metadata access patterns. 2. Develop scalable and adaptive file name mapping schemes using the duplicative Bloom filter array technique to enforce load balance and increase scalability. 3. Develop decentralized, locality-aware metadata grouping schemes to facilitate bulk metadata operations such as prefetching. 4. Develop an adaptive cache coherence protocol using a distributed shared object model for client-side and server-side metadata caching. 5. Prototype the SAM2 components in the state-of-the-art parallel virtual file system PVFS2 and a distributed storage data caching system, set up an experimental framework for a DOE CMS Tier 2 site at the University of Nebraska-Lincoln, and conduct benchmark, evaluation, and validation studies.
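
    The proposal does not specify the duplicative Bloom filter array beyond its name. As a sketch of the general idea, the snippet below keeps one Bloom filter per metadata server so a client can guess which server holds a file's metadata without a central directory; filter sizes, the hash scheme, and the placement rule are assumptions for illustration.

        import hashlib

        class BloomFilter:
            """Basic Bloom filter: k hash positions per key, no deletions."""
            def __init__(self, bits=1024, hashes=3):
                self.bits, self.hashes, self.array = bits, hashes, bytearray(bits)

            def _positions(self, key):
                for i in range(self.hashes):
                    h = hashlib.sha256(f"{i}:{key}".encode()).digest()
                    yield int.from_bytes(h[:8], "big") % self.bits

            def add(self, key):
                for pos in self._positions(key):
                    self.array[pos] = 1

            def __contains__(self, key):
                # may yield false positives, never false negatives
                return all(self.array[pos] for pos in self._positions(key))

        # one filter per metadata server; lookups probe the filters
        # instead of asking a central directory where a path lives
        servers = [BloomFilter() for _ in range(4)]
        path = "/home/a/data.txt"
        servers[hash(path) % 4].add(path)
        print([i for i, s in enumerate(servers) if path in s])

    A lookup that matches several filters (false positives) falls back to probing those servers, which is why filter sizing is a load-balancing concern.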