Interoperability between Multimedia Collections for Content and Metadata-Based Searching
Artiste is a European project developing a cross-collection search system for art galleries and museums. It combines image content retrieval with text-based retrieval and uses RDF mappings to integrate diverse databases. The test sites of the Louvre, Victoria and Albert Museum, Uffizi Gallery and National Gallery London provide their own database schema for existing metadata, avoiding the need for migration to a common schema. The system will accept a query based on one museum's fields and convert it, through an RDF mapping, into a form suitable for querying the other collections. The nature of some of the image processing algorithms means that the system can be slow for some computations, so the system is session-based to allow the user to return to the results later. The system has been built within a J2EE/EJB framework, using the JBoss Enterprise Application Server.
An Analysis of Optimal Link Bombs
We analyze the phenomenon of collusion for the purpose of boosting the
pagerank of a node in an interlinked environment. We investigate the optimal
attack pattern for a group of nodes (attackers) attempting to improve the
ranking of a specific node (the victim). We consider attacks where the
attackers can only manipulate their own outgoing links. We show that the
optimal attacks in this scenario are uncoordinated, i.e., the attackers link
directly to the victim and to no one else; attacking nodes do not link to each other. We
also discuss optimal attack patterns for a group that wants to hide itself by
not pointing directly to the victim. In these disguised attacks, the attackers
link to nodes a given number of hops away from the victim. We show that an optimal disguised
attack exists and how it can be computed. The optimal disguised attack also
allows us to find optimal link farm configurations. A link farm can be
considered a special case of our approach: the target page of the link farm is
the victim and the other nodes in the link farm are the attackers for the
purpose of improving the rank of the victim. The target page can however
control its own outgoing links for the purpose of improving its own rank, which
can be modeled as an optimal disguised attack of 1-hop on itself. Our results
are unique in the literature as we show optimality not only in the pagerank
score, but also in the rank based on the pagerank score. We further validate
our results with experiments on a variety of random graph models.
Comment: Full Version of a version which appeared in AIRweb 200
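The claim that uncoordinated attacks are optimal can be illustrated on a toy graph. The following is a minimal sketch, not the authors' method or experiments: the five-node graph, the damping factor of 0.85, and the function name `pagerank` are all assumptions chosen for illustration. It compares the victim's PageRank when two attackers link only to the victim against the case where they also link to each other.

```python
def pagerank(n, edges, damping=0.85, iters=200):
    """Plain power-iteration PageRank on nodes 0..n-1 (illustrative sketch)."""
    out = {u: [] for u in range(n)}
    for u, v in edges:
        out[u].append(v)
    rank = [1.0 / n] * n
    for _ in range(iters):
        # Every node receives the uniform teleportation share.
        nxt = [(1.0 - damping) / n] * n
        for u in range(n):
            if out[u]:
                # Split the damped rank evenly over outgoing links.
                share = damping * rank[u] / len(out[u])
                for v in out[u]:
                    nxt[v] += share
            else:
                # Dangling node: spread its damped rank over all nodes.
                share = damping * rank[u] / n
                for v in range(n):
                    nxt[v] += share
        rank = nxt
    return rank


# Hypothetical toy graph: node 0 is the victim, nodes 1 and 2 are attackers,
# nodes 3 and 4 form a cycle with the victim.
base = [(0, 3), (3, 4), (4, 0)]

# Uncoordinated attack: each attacker links only to the victim.
direct = pagerank(5, base + [(1, 0), (2, 0)])

# Attackers additionally link to each other.
mutual = pagerank(5, base + [(1, 0), (1, 2), (2, 0), (2, 1)])

print("victim rank, direct links only:", round(direct[0], 4))
print("victim rank, attackers interlinked:", round(mutual[0], 4))
```

On this toy graph the victim scores higher when the attackers do not link to each other, because interlinked attackers retain part of the rank they would otherwise pass on. This is only a single example and does not establish optimality in general.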
The Best Trail Algorithm for Assisted Navigation of Web Sites
We present an algorithm called the Best Trail Algorithm, which helps solve
the hypertext navigation problem by automating the construction of memex-like
trails through the corpus. The algorithm performs a probabilistic best-first
expansion of a set of navigation trees to find relevant and compact trails. We
describe the implementation of the algorithm, scoring methods for trails,
filtering algorithms and a new metric called potential gain, which
measures the potential of a page for future navigation opportunities.
Comment: 11 pages, 11 figures
On the evolution of hyperlinking
Across time, the hyperlink object has supported different applications and studies. This is one perspective on the evolution of the hyperlinking concept, its context and related behaviors. Through a spectrum of hyperlinking applications and practices, the article contrasts the status quo with its related, broader, conceptual roots; it also bridges to some theorized and prototyped hyperlink variations, namely "stigmergic hyperlinks", to make the case that the ubiquity of some objects and certain usage patterns can obfuscate opportunities to (re)think them. In trying to contribute an answer to "what has the common hyperlink (such an apparently simple object) done to society, and what has society done to it?", the article identifies situations that have become so embedded in the daily routine that it is now hard to think of hyperlinking alternatives.
Mining Missing Hyperlinks from Human Navigation Traces: A Case Study of Wikipedia
Hyperlinks are an essential feature of the World Wide Web. They are
especially important for online encyclopedias such as Wikipedia: an article can
often only be understood in the context of related articles, and hyperlinks
make it easy to explore this context. But important links are often missing,
and several methods have been proposed to alleviate this problem by learning a
linking model based on the structure of the existing links. Here we propose a
novel approach to identifying missing links in Wikipedia. We build on the fact
that the ultimate purpose of Wikipedia links is to aid navigation. Rather than
merely suggesting new links that are in tune with the structure of existing
links, our method finds missing links that would immediately enhance
Wikipedia's navigability. We leverage data sets of navigation paths collected
through a Wikipedia-based human-computation game in which users must find a
short path from a start to a target article by only clicking links encountered
along the way. We harness human navigational traces to identify a set of
candidates for missing links and then rank these candidates. Experiments show
that our procedure identifies missing links of high quality.
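The idea of harvesting candidate links from navigation paths can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the procedure from the paper: the function name `candidate_links`, the toy article names, and the ranking by raw path count are all hypothetical. Each page visited on the way to a target becomes a candidate source for a link to that target if no such link already exists.

```python
from collections import Counter


def candidate_links(paths, links):
    """Count (page, target) pairs seen on paths where no direct link exists.

    paths: list of navigation paths, each a list of article names ending
           at the target article.
    links: set of existing (source, destination) links.
    Returns candidates sorted by how many paths support them.
    """
    counts = Counter()
    for path in paths:
        target = path[-1]
        # Count each page at most once per path.
        for page in set(path[:-1]):
            if page != target and (page, target) not in links:
                counts[(page, target)] += 1
    return counts.most_common()


# Hypothetical toy data: three game paths that all end at "Cat".
paths = [
    ["Dog", "Animal", "Cat"],
    ["Dog", "Pet", "Cat"],
    ["Wolf", "Animal", "Cat"],
]
links = {
    ("Dog", "Animal"), ("Animal", "Cat"),
    ("Dog", "Pet"), ("Pet", "Cat"),
    ("Wolf", "Animal"),
}

ranked = candidate_links(paths, links)
print(ranked)
```

Here the pair ("Dog", "Cat") is ranked first because two paths pass through "Dog" on the way to "Cat" without a direct link. A real system would likely use a richer ranking signal than raw counts.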
Forming Within-site Topical Information Space to Facilitate Online Free-Choice Learning
Locating specific and structured information on the World Wide Web (WWW) is becoming increasingly difficult because of the rapid growth of the Web and the distributed nature of information. Although existing search engines do a good job of ranking web pages based on topical relevance, they provide limited assistance for free-choice learners to leverage the nonlinear nature of information spaces for knowledge acquisition. We hypothesize that free-choice learners would benefit more from structured topical information spaces than from a list of individual pages across multiple websites. We conceptualize a within-site topical information space as a sphere formed by linked pages centering on a web page. In this paper, we investigate techniques and heuristics to form the space. In particular, we propose a hybrid method that relies not only on content-based characteristics and user queries, but also on a site's global structure. Experimental results show that consideration of website topology provides a good improvement in page relevance estimation, indicating the clustering tendency of relevant pages.