11,281 research outputs found
Stochastic Query Covering for Fast Approximate Document Retrieval
We design algorithms that, given a collection of documents and a distribution over user queries, return a
small subset of the document collection in such a way that we can efficiently provide high-quality answers
to user queries using only the selected subset. This approach has applications when space is a constraint
or when the query-processing time increases significantly with the size of the collection. We study our
algorithms through the lens of stochastic analysis and prove that even though they use only a small fraction
of the entire collection, they can provide answers to most user queries, achieving a performance close to the
optimal. To complement our theoretical findings, we experimentally show the versatility of our approach
by considering two important cases in the context of Web search. In the first case, we favor the retrieval of
documents that are relevant to the query, whereas in the second case we aim for document diversification.
Both the theoretical and the experimental analysis provide strong evidence of the potential value of query
covering in diverse application scenarios.
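The abstract above does not spell out the selection algorithm, so the following is only a minimal sketch of the general idea, assuming a standard greedy maximum-coverage heuristic rather than the authors' actual method. The names `greedy_query_cover`, `covered_mass`, `doc_queries`, and `query_prob` are hypothetical: each document is mapped to the set of queries it answers well, and each query carries its probability under the query distribution.

```python
from typing import Dict, List, Set


def greedy_query_cover(doc_queries: Dict[str, Set[str]],
                       query_prob: Dict[str, float],
                       budget: int) -> List[str]:
    """Greedily pick up to `budget` documents that maximise the
    probability mass of the queries they cover (assumed heuristic)."""
    covered: Set[str] = set()
    chosen: List[str] = []
    for _ in range(budget):
        best_doc, best_gain = None, 0.0
        # Sorted iteration makes tie-breaking deterministic.
        for doc in sorted(doc_queries):
            if doc in chosen:
                continue
            # Marginal gain: probability of queries not yet covered.
            gain = sum(query_prob.get(q, 0.0)
                       for q in doc_queries[doc] - covered)
            if gain > best_gain:
                best_doc, best_gain = doc, gain
        if best_doc is None:  # no remaining document adds coverage
            break
        chosen.append(best_doc)
        covered |= doc_queries[best_doc]
    return chosen


def covered_mass(chosen: List[str],
                 doc_queries: Dict[str, Set[str]],
                 query_prob: Dict[str, float]) -> float:
    """Total probability of queries answered by the chosen subset."""
    covered: Set[str] = set()
    for doc in chosen:
        covered |= doc_queries[doc]
    return sum(query_prob.get(q, 0.0) for q in covered)


if __name__ == "__main__":
    docs = {"d1": {"q1", "q2"}, "d2": {"q2", "q3"}, "d3": {"q3"}}
    probs = {"q1": 0.5, "q2": 0.3, "q3": 0.2}
    subset = greedy_query_cover(docs, probs, budget=2)
    print(subset, covered_mass(subset, docs, probs))
```

The sketch treats coverage as binary (a document either answers a query or it does not). A relevance-weighted or diversity-aware objective, as the two Web search cases suggest, would need a different gain function.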
Caching in Multidimensional Databases
One utilisation of multidimensional databases is the field of On-line
Analytical Processing (OLAP). The applications in this area are designed to
make the analysis of shared multidimensional information fast [9]. On one hand,
speed can be achieved by specially devised data structures and algorithms. On
the other hand, the analytical process is cyclic. In other words, the user of
the OLAP application runs his or her queries one after the other. The output of
the latest query may already be present (at least partly) in one of the previous results.
Therefore caching also plays an important role in the operation of these
systems. However, caching itself may not be enough to ensure acceptable
performance. Size does matter: The more memory is available, the more we gain
by loading and keeping information in there. Oftentimes, the cache size is
fixed. This limits the performance of the multidimensional database, as well,
unless we compress the data in order to move a greater proportion of it into
memory. Caching combined with proper compression methods promises further
performance improvements. In this paper, we investigate how caching influences
the speed of OLAP systems. Different physical representations (multidimensional
and table) are evaluated. For the thorough comparison, models are proposed. We
draw conclusions based on these models, and the conclusions are verified with
empirical data. Comment: 14 pages, 5 figures, 8 tables. Paper presented at the Fifth
Conference of PhD Students in Computer Science, Szeged, Hungary, 27 - 30 June
2006. For further details, please refer to
http://www.inf.u-szeged.hu/~szepkuti/papers.html#cachin
Cooperative Caching for Multimedia Streaming in Overlay Networks
Traditional data caching, such as web caching, focuses only on boosting the hit rate of requested objects in caches and thereby reducing the initial delay of object retrieval. For multimedia objects, however, it is important not only to reduce the delay of object retrieval but also to provision reasonably stable network bandwidth to clients while the cached objects are being fetched. In this paper, we propose a cooperative caching scheme for a multimedia delivery scenario that supports a large number of peers over peer-to-peer overlay networks. To facilitate multimedia streaming and downloading services from servers, our caching scheme (1) determines the appropriate availability of cached stream segments in a cache community, (2) determines the appropriate peer for cache replacement, and (3) performs bandwidth-aware and availability-aware cache replacement. By doing so, it achieves (1) a small delay of stream retrieval, (2) stable bandwidth provisioning during the retrieval session, and (3) load balancing of clients' requests among peers.
Fog-enabled Edge Learning for Cognitive Content-Centric Networking in 5G
By caching content at network edges close to the users, content-centric
networking (CCN) has been considered as a means to enforce efficient content retrieval and
distribution in the fifth generation (5G) networks. Due to the volume,
velocity, and variety of data generated by various 5G users, an urgent and
strategic issue is how to elevate the cognitive ability of the CCN to realize
context-awareness, timely response, and traffic offloading for 5G applications.
In this article, we envision that the fundamental work of designing a cognitive
CCN (C-CCN) for the upcoming 5G is exploiting the fog computing to
associatively learn and control the states of edge devices (such as phones,
vehicles, and base stations) and in-network resources (computing, networking,
and caching). Moreover, we propose a fog-enabled edge learning (FEL) framework
for C-CCN in 5G, which can aggregate the idle computing resources of the
neighbouring edge devices into virtual fogs to afford the heavy delay-sensitive
learning tasks. By leveraging artificial intelligence (AI) to jointly
process sensed environmental data, deal with the massive content
statistics, and enforce mobility control at network edges, the FEL makes
it possible for mobile users to cognitively share their data over the C-CCN in
5G. To validate the feasibility of the proposed framework, we design two
FEL-advanced cognitive services for C-CCN in 5G: 1) personalized network
acceleration and 2) enhanced mobility management. We also present
simulations to show the FEL's efficiency in serving mobile users'
delay-sensitive content retrieval and distribution in 5G. Comment: Submitted to IEEE Communications Magazine, under review, Feb. 09, 201
A Literature Survey of Cooperative Caching in Content Distribution Networks
Content distribution networks (CDNs), which serve to deliver web objects
(e.g., documents, applications, music, and video), have seen tremendous
growth since their emergence. To minimize the retrieval delay experienced by a
user requesting a web object, caching strategies are often applied:
content is replicated at the edges of the network, closer to the user,
so that the network distance between the user and the object is reduced. In
this literature survey, the evolution of caching is studied. A recent research
paper [15] in the field of large-scale caching for CDNs was chosen as the
anchor paper, which serves as a guide to the topic. Research studies published after and
relevant to the anchor paper are also analyzed to better evaluate the
statements and results of the anchor paper and, more importantly, to obtain an
unbiased view of large-scale collaborative caching systems as a whole. Comment: 5 pages, 5 figures
Query Load Balancing by Caching Search Results in Peer-to-Peer Information Retrieval Networks
For peer-to-peer web search engines it is important to keep the delay between receiving a query and providing search results within an acceptable range for the end user. How to achieve this remains an open challenge. One way to reduce delays is by caching search results for queries and allowing peers to access each other's caches. In this paper we explore the limitations of search result caching in large-scale peer-to-peer information retrieval networks by simulating such networks with increasing levels of realism. We find that cache hit ratios of at least thirty-three percent are attainable.
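As a rough illustration of what a cache hit ratio measures, the sketch below replays a query stream against a single search-result cache. It is not the paper's simulation: it assumes one local cache with least-recently-used (LRU) eviction rather than a network of cooperating peers, and the names `LRUResultCache`, `lookup`, and `hit_ratio` are hypothetical.

```python
from collections import OrderedDict
from typing import Callable, List


class LRUResultCache:
    """Search-result cache with LRU eviction (an assumed policy)."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._store: "OrderedDict[str, List[str]]" = OrderedDict()
        self.hits = 0
        self.lookups = 0

    def lookup(self, query: str,
               compute: Callable[[str], List[str]]) -> List[str]:
        """Return cached results for `query`, computing them on a miss."""
        self.lookups += 1
        if query in self._store:
            self.hits += 1
            self._store.move_to_end(query)  # mark as most recently used
            return self._store[query]
        results = compute(query)
        self._store[query] = results
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict least recently used
        return results

    def hit_ratio(self) -> float:
        """Fraction of lookups served from the cache."""
        return self.hits / self.lookups if self.lookups else 0.0


if __name__ == "__main__":
    cache = LRUResultCache(capacity=2)
    # Stand-in for a real search back end.
    search = lambda q: [f"result for {q}"]
    for q in ["a", "b", "a", "c", "b", "a"]:
        cache.lookup(q, search)
    print(f"hit ratio: {cache.hit_ratio():.2f}")
```

With peers sharing caches, a lookup would also consult remote caches before recomputing. The attainable ratio then depends on the query distribution and network topology, which this sketch does not model.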
Poor Man's Content Centric Networking (with TCP)
A number of different architectures have been proposed in support of data-oriented or information-centric networking. Besides similar visions, they share the need for designing a new networking architecture. We present an incrementally deployable approach to content-centric networking based upon TCP. Content-aware senders cooperate with probabilistically operating routers for scalable content delivery (to unmodified clients), effectively supporting opportunistic caching for time-shifted access as well as de-facto synchronous multicast delivery. Our approach is application protocol-independent and provides support beyond HTTP caching or managed CDNs. We present our protocol design along with a Linux-based implementation and some initial feasibility checks.