26,134 research outputs found
Exploiting Social Annotation for Automatic Resource Discovery
Information integration applications, such as mediators or mashups, that
require access to information resources currently rely on users manually
discovering and integrating them in the application. Manual resource discovery
is a slow process, requiring the user to sift through results obtained via
keyword-based search. Although search methods have advanced to include evidence
from document contents, its metadata and the contents and link structure of the
referring pages, they still do not adequately cover information sources --
often called ``the hidden Web''-- that dynamically generate documents in
response to a query. The recently popular social bookmarking sites, which allow
users to annotate and share metadata about various information sources, provide
rich evidence for resource discovery. In this paper, we describe a
probabilistic model of the user annotation process in a social bookmarking
system del.icio.us. We then use the model to automatically find resources
relevant to a particular information domain. Our experimental results on data
obtained from \emph{del.icio.us} show this approach as a promising method for
helping automate the resource discovery task.Comment: 6 pages, submitted to AAAI07 workshop on Information Integration on
the We
Research Directions, Challenges and Issues in Opinion Mining
Rapid growth of Internet and availability of user reviews on the web for any product has provided a need for an effective system to analyze the web reviews. Such reviews are useful to some extent, promising both the customers and product manufacturers. For any popular product, the number of reviews can be in hundreds or even thousands. This creates difficulty for a customer to analyze them and make important decisions on whether to purchase the product or to not. Mining such product reviews or opinions is termed as opinion mining which is broadly classified into two main categories namely facts and opinions. Though there are several approaches for opinion mining, there remains a challenge to decide on the recommendation provided by the system. In this paper, we analyze the basics of opinion mining, challenges, pros & cons of past opinion mining systems and provide some directions for the future research work, focusing on the challenges and issues
A Brief History of Web Crawlers
Web crawlers visit internet applications, collect data, and learn about new
web pages from visited pages. Web crawlers have a long and interesting history.
Early web crawlers collected statistics about the web. In addition to
collecting statistics about the web and indexing the applications for search
engines, modern crawlers can be used to perform accessibility and vulnerability
checks on the application. Quick expansion of the web, and the complexity added
to web applications have made the process of crawling a very challenging one.
Throughout the history of web crawling many researchers and industrial groups
addressed different issues and challenges that web crawlers face. Different
solutions have been proposed to reduce the time and cost of crawling.
Performing an exhaustive crawl is a challenging question. Additionally
capturing the model of a modern web application and extracting data from it
automatically is another open question. What follows is a brief history of
different technique and algorithms used from the early days of crawling up to
the recent days. We introduce criteria to evaluate the relative performance of
web crawlers. Based on these criteria we plot the evolution of web crawlers and
compare their performanc
A Review on Web Crawling System for Web Databases
As deep web develops at a quick pace, there has been expanded enthusiasm for strategies that assistance effectively find deep-web interfaces. Nonetheless, because of the extensive volume of web assets and the dynamic idea of deep web, accomplishing wide scope and high effectiveness is a testing issue. In this task propose a three-stage framework, for proficient reaping deep web interfaces. In the principal stage, web crawler performs website based scanning for focus pages with the assistance of web search tools, abstaining from going by a substantial number of pages. In this paper we have made an overview on how web crawler functions and what are the approaches accessible in existing framework from various scientists
- …