2,055 research outputs found
Perspectives for Electronic Books in the World Wide Web Age
While the World Wide Web (WWW or Web) is steadily expanding, electronic books (e-books) remain a niche market. In this article, it is first postulated that specialized contents and device independence can make Web-based e-books compete with paper prints; and that adaptive features that can be implemented by client-side computing are relevant for e-books, while more complex forms of adaptation requiring server-side computations are not. Then, enhancements of the WWW standards (specifically of XML, XHTML, of the style-sheet languages CSS and XSL, and of the linking language XLink) are proposed for a better support of client-side adaptation and device independent content modeling. Finally, advanced browsing functionalities desirable for e-books as well as their implementation in the WWW context are described
Mining Missing Hyperlinks from Human Navigation Traces: A Case Study of Wikipedia
Hyperlinks are an essential feature of the World Wide Web. They are
especially important for online encyclopedias such as Wikipedia: an article can
often only be understood in the context of related articles, and hyperlinks
make it easy to explore this context. But important links are often missing,
and several methods have been proposed to alleviate this problem by learning a
linking model based on the structure of the existing links. Here we propose a
novel approach to identifying missing links in Wikipedia. We build on the fact
that the ultimate purpose of Wikipedia links is to aid navigation. Rather than
merely suggesting new links that are in tune with the structure of existing
links, our method finds missing links that would immediately enhance
Wikipedia's navigability. We leverage data sets of navigation paths collected
through a Wikipedia-based human-computation game in which users must find a
short path from a start to a target article by only clicking links encountered
along the way. We harness human navigational traces to identify a set of
candidates for missing links and then rank these candidates. Experiments show
that our procedure identifies missing links of high quality
An effective, low-cost measure of semantic relatedness obtained from Wikipedia links
This paper describes a new technique for obtaining measures of semantic relatedness. Like other recent approaches, it uses Wikipedia to provide structured world knowledge about the terms of interest. Out approach is unique in that it does so using the hyperlink structure of Wikipedia rather than its category hierarchy or textual content. Evaluation with manually defined measures of semantic relatedness reveals this to be an effective compromise between the ease of computation of the former approach and the accuracy of the latter
University of Twente @ TREC 2009: Indexing half a billion web pages
This report presents results for the TREC 2009 adhoc task, the diversity task, and the relevance feedback task. We present ideas for unsupervised tuning of search system, an approach for spam removal, and the use of categories and query log information for diversifying search results
Web Page Retrieval by Combining Evidence
The participation of the REINA Research Group in WebCLEF 2005 focused in the monolingual mixed task. Queries or topics are of two types: named and home pages. For both, we first perform a search by thematic contents; for the same query, we do a search in several elements of information from every page (title, some meta tags, anchor text) and then we combine the results. For queries about home pages, we try to detect using a method based in some keywords and their patterns of use. After, a re-rank of the results of the thematic contents retrieval is performed, based on Page-Rank and Centrality coeficients
Parallel Strands: A Preliminary Investigation into Mining the Web for Bilingual Text
Parallel corpora are a valuable resource for machine translation, but at
present their availability and utility is limited by genre- and
domain-specificity, licensing restrictions, and the basic difficulty of
locating parallel texts in all but the most dominant of the world's languages.
A parallel corpus resource not yet explored is the World Wide Web, which hosts
an abundance of pages in parallel translation, offering a potential solution to
some of these problems and unique opportunities of its own. This paper presents
the necessary first step in that exploration: a method for automatically
finding parallel translated documents on the Web. The technique is conceptually
simple, fully language independent, and scalable, and preliminary evaluation
results indicate that the method may be accurate enough to apply without human
intervention.Comment: LaTeX2e, 11 pages, 7 eps figures; uses psfig, llncs.cls, theapa.sty.
An Appendix at http://umiacs.umd.edu/~resnik/amta98/amta98_appendix.html
contains test dat
Educational framework based on cumulative vocabularies, conceptual networks and Wikipedia linkage
We propose a new educational framework based on guidedexploration in small-world networks relying on hyperlinknetwork of the Wikipedia online encyclopedia(http://www.wikipedia.org) in which hyperlinks betweenarticles define conceptual relationships. Educationalmaterial is presented to student with cumulativeconceptual networks based on hyperlink network of theWikipedia connecting concepts of vocabulary aboutcurrent learning topic. Personalization of educationalmaterial is carried out by alternating the distribution ofenabled hyperlinks connecting concepts belonging tocurrent vocabulary according to requirements of learningobjective, learning context and learner’s knowledge.Besides developing a computational method to manageeducational material with conceptual networks and toexplore the shortest paths between concepts of vocabulary(especially highest-ranking hyperlinked concepts andstrongly rising hyperlinked concepts), we have alsoexperimentally estimated properties of conceptualnetworks generated based on hyperlink network of theWikipedia between concepts retrieved from EnglishVocabulary Profile for cumulatively growing vocabulariescorresponding to six language ability levels.Peer reviewe
- …