DART: The Distributed Agent-Based Retrieval Toolkit

Abstract

Search engine technology is evolving from keyword-based indexing and classification of web resources to more sophisticated techniques that take into account the meaning and context of textual information and its usage. In reply to a query, commercial search engines return a large number of results, most of them useless or only partially related to the request. The subsequent refinement is left to the user, who downloads and examines as many pages as possible and simply ignores whatever lies beyond the first few result pages. Furthermore, architectures based on centralized indexes allow commercial search engines to control how online information is advertised, in contrast to P2P architectures, which focus on user requirements and involve the end user in the maintenance and operation of the search engine. To address these needs, new search engines should focus on three key aspects: semantics, geo-referencing, and collaboration/distribution. Semantic analysis increases the relevance of results. Geo-referencing of catalogued resources allows contextualisation based on the user's position. Collaboration distributes storage, processing, and trust across a world-wide network of nodes running on users’ computers, removing bottlenecks and central points of failure. In this paper, we describe the studies, concepts, and solutions developed in the DART project to introduce these three key features into a novel search engine architecture.
