11,281 research outputs found

    Stochastic Query Covering for Fast Approximate Document Retrieval

    Get PDF
    We design algorithms that, given a collection of documents and a distribution over user queries, return a small subset of the document collection in such a way that we can efficiently provide high-quality answers to user queries using only the selected subset. This approach has applications when space is a constraint or when the query-processing time increases significantly with the size of the collection. We study our algorithms through the lens of stochastic analysis and prove that even though they use only a small fraction of the entire collection, they can provide answers to most user queries, achieving a performance close to the optimal. To complement our theoretical findings, we experimentally show the versatility of our approach by considering two important cases in the context of Web search. In the first case, we favor the retrieval of documents that are relevant to the query, whereas in the second case we aim for document diversification. Both the theoretical and the experimental analysis provide strong evidence of the potential value of query covering in diverse application scenarios

    Caching in Multidimensional Databases

    Get PDF
    One utilisation of multidimensional databases is the field of On-line Analytical Processing (OLAP). The applications in this area are designed to make the analysis of shared multidimensional information fast [9]. On one hand, speed can be achieved by specially devised data structures and algorithms. On the other hand, the analytical process is cyclic. In other words, the user of the OLAP application runs his or her queries one after the other. The output of the last query may be there (at least partly) in one of the previous results. Therefore caching also plays an important role in the operation of these systems. However, caching itself may not be enough to ensure acceptable performance. Size does matter: The more memory is available, the more we gain by loading and keeping information in there. Oftentimes, the cache size is fixed. This limits the performance of the multidimensional database, as well, unless we compress the data in order to move a greater proportion of them into the memory. Caching combined with proper compression methods promise further performance improvements. In this paper, we investigate how caching influences the speed of OLAP systems. Different physical representations (multidimensional and table) are evaluated. For the thorough comparison, models are proposed. We draw conclusions based on these models, and the conclusions are verified with empirical data.Comment: 14 pages, 5 figures, 8 tables. Paper presented at the Fifth Conference of PhD Students in Computer Science, Szeged, Hungary, 27 - 30 June 2006. For further details, please refer to http://www.inf.u-szeged.hu/~szepkuti/papers.html#cachin

    Cooperative Caching for Multimedia Streaming in Overlay Networks

    Get PDF
    Traditional data caching, such as web caching, only focuses on how to boost the hit rate of requested objects in caches, and therefore, how to reduce the initial delay for object retrieval. However, for multimedia objects, not only reducing the delay of object retrieval, but also provisioning reasonably stable network bandwidth to clients, while the fetching of the cached objects goes on, is important as well. In this paper, we propose our cooperative caching scheme for a multimedia delivery scenario, supporting a large number of peers over peer-to-peer overlay networks. In order to facilitate multimedia streaming and downloading service from servers, our caching scheme (1) determines the appropriate availability of cached stream segments in a cache community, (2) determines the appropriate peer for cache replacement, and (3) performs bandwidth-aware and availability-aware cache replacement. By doing so, it achieves (1) small delay of stream retrieval, (2) stable bandwidth provisioning during retrieval session, and (3) load balancing of clients' requests among peers

    Fog-enabled Edge Learning for Cognitive Content-Centric Networking in 5G

    Full text link
    By caching content at network edges close to the users, the content-centric networking (CCN) has been considered to enforce efficient content retrieval and distribution in the fifth generation (5G) networks. Due to the volume, velocity, and variety of data generated by various 5G users, an urgent and strategic issue is how to elevate the cognitive ability of the CCN to realize context-awareness, timely response, and traffic offloading for 5G applications. In this article, we envision that the fundamental work of designing a cognitive CCN (C-CCN) for the upcoming 5G is exploiting the fog computing to associatively learn and control the states of edge devices (such as phones, vehicles, and base stations) and in-network resources (computing, networking, and caching). Moreover, we propose a fog-enabled edge learning (FEL) framework for C-CCN in 5G, which can aggregate the idle computing resources of the neighbouring edge devices into virtual fogs to afford the heavy delay-sensitive learning tasks. By leveraging artificial intelligence (AI) to jointly processing sensed environmental data, dealing with the massive content statistics, and enforcing the mobility control at network edges, the FEL makes it possible for mobile users to cognitively share their data over the C-CCN in 5G. To validate the feasibility of proposed framework, we design two FEL-advanced cognitive services for C-CCN in 5G: 1) personalized network acceleration, 2) enhanced mobility management. Simultaneously, we present the simulations to show the FEL's efficiency on serving for the mobile users' delay-sensitive content retrieval and distribution in 5G.Comment: Submitted to IEEE Communications Magzine, under review, Feb. 09, 201

    A Literature Survey of Cooperative Caching in Content Distribution Networks

    Full text link
    Content distribution networks (CDNs) which serve to deliver web objects (e.g., documents, applications, music and video, etc.) have seen tremendous growth since its emergence. To minimize the retrieving delay experienced by a user with a request for a web object, caching strategies are often applied - contents are replicated at edges of the network which is closer to the user such that the network distance between the user and the object is reduced. In this literature survey, evolution of caching is studied. A recent research paper [15] in the field of large-scale caching for CDN was chosen to be the anchor paper which serves as a guide to the topic. Research studies after and relevant to the anchor paper are also analyzed to better evaluate the statements and results of the anchor paper and more importantly, to obtain an unbiased view of the large scale collaborate caching systems as a whole.Comment: 5 pages, 5 figure

    Query Load Balancing by Caching Search Results in Peer-to-Peer Information Retrieval Networks

    Get PDF
    For peer-to-peer web search engines it is important to keep the delay between receiving a query and providing search results within an acceptable range for the end user. How to achieve this remains an open challenge. One way to reduce delays is by caching search results for queries and allowing peers to access each others cache. In this paper we explore the limitations of search result caching in large-scale peer-to-peer information retrieval networks by simulating such networks with increasing levels of realism. We find that cache hit ratios of at least thirty-three percent are attainable

    Poor Man's Content Centric Networking (with TCP)

    Get PDF
    A number of different architectures have been proposed in support of data-oriented or information-centric networking. Besides a similar visions, they share the need for designing a new networking architecture. We present an incrementally deployable approach to content-centric networking based upon TCP. Content-aware senders cooperate with probabilistically operating routers for scalable content delivery (to unmodified clients), effectively supporting opportunistic caching for time-shifted access as well as de-facto synchronous multicast delivery. Our approach is application protocol-independent and provides support beyond HTTP caching or managed CDNs. We present our protocol design along with a Linux-based implementation and some initial feasibility checks
    • 

    corecore