509 research outputs found

An Optimal Trade-off between Content Freshness and Refresh Cost

Author: Cho
Cohen
Jie Mi
Lee
Notess
Ross
Wessels
Yibei Ling
Publication venue: Applied Probability Trust
Publication date: 02/08/2010

Caching is an effective mechanism for reducing bandwidth usage and alleviating server load. However, caching entails a compromise between content freshness and refresh cost. Refreshing too often yields high content freshness at a greater cost in system resources. Conversely, refreshing too rarely saves resources but degrades content freshness. To address this freshness-cost problem, we formulate refresh scheduling with a generic cost model and use that model to determine an optimal refresh frequency that gives the best trade-off between refresh cost and content freshness. We prove the existence and uniqueness of an optimal refresh frequency under the assumptions that content updates arrive as a Poisson process and that the age-related cost increases monotonically with decreasing freshness. In addition, we provide an analytic comparison of system performance under fixed refresh scheduling and random refresh scheduling, showing that, with the same average refresh frequency, the two refresh schedules are mathematically equivalent in terms of the long-run average cost.

arXiv.org e-Print Archive

Crossref

Monitoring the dynamic web to respond to continuous queries

Author: Krithi Ramamritham
Sandeep Pandey
Soumen Chakrabarti
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2004

Crossref

A Novel Cooperation and Competition Strategy Among Multi-Agent Crawlers

Author: Du Yajun
Wang Min
Xu Yong
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 07/02/2017

Multi-agent theory, used for communication and collaboration among focused crawlers, has been shown to improve the precision of returned results significantly. In this paper, we propose a new multi-agent organizational structure for focused crawlers, in which the agents are divided into three categories: F-Agent (Facilitator-Agent), As-Agent (Assistance-Agent) and C-Agent (Crawler-Agent). Each agent handles its own responsibilities and cooperates with the others to complete a common web-crawling task. In the proposed multi-agent architecture for focused crawlers, we concentrate on the collaborative process among the agents. To control cooperation among agents, we propose a negotiation protocol based on the contract net protocol and implement the collaboration model of the multi-agent focused crawlers with JADE. Finally, comparative experiments show that our focused crawlers achieve higher precision and efficiency than crawlers using breadth-first, best-first, and other algorithms.

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

UbiCrawler: a scalable fully distributed web crawler

Author: Codenotti Bruno
Publication venue: -

We present the design and implementation of UbiCrawler, a scalable distributed web crawler, and we analyze its performance. The main features of UbiCrawler are platform independence, fault tolerance, a very effective assignment function for partitioning the domain to crawl, and, more generally, the complete decentralization of every task.

PUblication MAnagement

Inter Process Communication and Prioritization to Enable Desktop Advertisement Mechanism

Author: Gupta Shubham
Mittal Varun
Publication venue: AIS Electronic Library (AISeL)
Publication date: 01/05/2009

This research paper introduces a new concept of a desktop advertising mechanism that is synchronized with the running processes and the data on the user's side. The proposed approach is based on inter-process communication, scheduling, prioritization, desktop crawling and system calls. The proposed process fetches the status and data of running processes, then queries the remote ad server for relevant information and displays the advertisements, fetched by keyword, on the user's side.

AIS Electronic Library (AISeL)

Generalized probabilistic flooding in unstructured peer-to-peer networks

Author: Gaeta Rossano
Sereno Matteo
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2011

Institutional Research Information System University of Turin