How to Search the Internet Archive Without Indexing It
Significant parts of our cultural heritage have been produced on the web during
the last decades. While easy accessibility to the current web is a good baseline,
optimal access to the past web faces several challenges. These include dealing
with large-scale web archive collections and the lack of usage logs that contain
the implicit human feedback most relevant for today's web search. In this paper, we
propose an entity-oriented search system to support retrieval and analytics on
the Internet Archive. We use Bing to retrieve a ranked list of results from the
current web. In addition, we link retrieved results to the WayBack Machine,
thus allowing keyword search on the Internet Archive without processing and
indexing its raw archived content. Our search system complements existing web
archive search tools through a user-friendly interface, which comes close to
the functionalities of modern web search engines (e.g., keyword search, query
auto-completion and related query suggestion), and has the added benefit of
taking user feedback on the current web into account for web archive
search. Through extensive experiments, we conduct quantitative and qualitative
analyses in order to provide insights that enable further research on and
practical applications of web archives.
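The linking step described in this abstract can be illustrated with a minimal sketch. It assumes the public Wayback Machine address pattern `https://web.archive.org/web/<timestamp>/<url>`, which redirects to the capture closest to the 14-digit timestamp. The function names `wayback_url` and `link_results` are hypothetical, and the ranked list is supplied by hand here rather than fetched from a search engine; this is not the authors' implementation.

```python
from datetime import datetime

# Assumed public address pattern of the Wayback Machine.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(url: str, when: datetime) -> str:
    """Build an address pointing at the capture of `url` closest to `when`."""
    timestamp = when.strftime("%Y%m%d%H%M%S")  # 14-digit YYYYMMDDhhmmss
    return f"{WAYBACK_PREFIX}/{timestamp}/{url}"


def link_results(ranked_urls, when):
    """Pair each live-web result (in rank order) with its archive address."""
    return [
        (rank, url, wayback_url(url, when))
        for rank, url in enumerate(ranked_urls, start=1)
    ]


if __name__ == "__main__":
    # Hypothetical ranked list standing in for search engine output.
    results = ["http://example.com/", "http://example.org/news"]
    for rank, live, archived in link_results(results, datetime(2010, 1, 2)):
        print(rank, live, archived)
```

Because the ranking comes from the live web, no archived content has to be parsed or indexed; only the address of each result is rewritten.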
RECIPE SUGGESTION TOOL
ABSTRACT: There is currently a great need for a tool to search cooking recipes based on ingredients. Current search engines do not provide this feature. Most of the recipe search results in current websites are not efficiently clustered based on relevance or categories, resulting in a user getting lost in the huge search results presented. Clustering in information retrieval is used for higher efficiency and better presentation of information to the user. Clustering puts similar documents in the same cluster. If a document is relevant to a query, then the documents in the same cluster are also relevant. The goal of this project is to implement clustering on recipes. The user can search for recipes based on ingredient
Challenges and opportunities of context-aware information access
Ubiquitous computing environments embedding a wide range of pervasive computing technologies provide a challenging and exciting new domain for information access. Individuals working in these environments are increasingly permanently connected to rich information resources. An appealing opportunity of these environments is the potential to deliver useful information to individuals either from their previous information experiences or external sources. This information should enrich their life experiences or make them more effective in their endeavours. Information access in ubiquitous computing environments can be made "context-aware" by exploiting the wide range of context data available describing the environment, the searcher and the information itself. Realizing such a vision of reliable, timely and appropriate identification and delivery of information in this way poses numerous challenges. A central theme in achieving context-aware information access is the combination of information retrieval with multiple dimensions of available context data. Potential context data sources include the user's current task, inputs from environmental and biometric sensors associated with the user's current context, previous contexts, and document context, which can be exploited using a variety of technologies to create new and exciting possibilities for information access.
Query Expansion for Survey Question Retrieval in the Social Sciences
In recent years, the importance of research data and the need to archive and
to share it in the scientific community have increased enormously. This
introduces a whole new set of challenges for digital libraries. In the social
sciences typical research data sets consist of surveys and questionnaires. In
this paper we focus on the use case of social science survey question reuse and
on mechanisms to support users in the query formulation for data sets. We
describe and evaluate thesaurus- and co-occurrence-based approaches for query
expansion to improve retrieval quality in digital libraries and research data
archives. The challenge here is to translate the information need and the
underlying sociological phenomena into proper queries. As we show, retrieval
quality can be improved by adding related terms to the queries. In a direct
comparison, automatically expanded queries using extracted co-occurring terms
can provide better results than queries manually reformulated by a domain
expert and better results than a keyword-based BM25 baseline. Comment: to appear in the Proceedings of the 19th International Conference on Theory
and Practice of Digital Libraries 2015 (TPDL 2015)
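The co-occurrence-based expansion mentioned in this abstract can be sketched as follows. The abstract does not say how co-occurring terms are weighted or selected, so the raw document-level counts, the `per_term` cutoff, and the function names `build_cooccurrence` and `expand_query` are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(documents):
    """Count how often two distinct terms appear in the same document."""
    counts = Counter()
    for doc in documents:
        terms = sorted(set(doc.lower().split()))
        for a, b in combinations(terms, 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1  # store both directions for easy lookup
    return counts


def expand_query(query, cooccurrence, per_term=2):
    """Append up to `per_term` most frequent co-occurring terms per query term."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        related = [
            (other, n)
            for (t, other), n in cooccurrence.items()
            if t == term and other not in expanded
        ]
        # Highest count first; ties broken alphabetically for determinism.
        related.sort(key=lambda pair: (-pair[1], pair[0]))
        expanded.extend(other for other, _ in related[:per_term])
    return expanded


if __name__ == "__main__":
    # Hypothetical survey-question snippets, not data from the paper.
    questions = [
        "trust government",
        "trust parliament government",
        "income education",
    ]
    cooc = build_cooccurrence(questions)
    print(expand_query("trust", cooc, per_term=1))
```

The expanded term list would then be passed to the retrieval model in place of the original query.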
Supporting aspect-based video browsing - analysis of a user study
In this paper, we present a novel video search interface based on the concept of aspect browsing. The proposed strategy is to assist the user in exploratory video search by actively suggesting new query terms and video shots. Our approach has the potential to narrow the "Semantic Gap" by allowing users to explore the data collection. First, we describe a clustering technique to identify potential aspects of a search. Then, we use the results to propose suggestions to the user to help them in their search task. Finally, we analyse this approach by exploiting the log files and the feedback from a user study.
Why People Search for Images using Web Search Engines
What are the intents or goals behind human interactions with image search
engines? Knowing why people search for images is of major concern to Web image
search engines because user satisfaction may vary as intent varies. Previous
analyses of image search behavior have mostly been query-based, focusing on
what images people search for, rather than intent-based, that is, why people
search for images. To date, there is no thorough investigation of how different
image search intents affect users' search behavior.
In this paper, we address the following questions: (1) Why do people search
for images in text-based Web image search systems? (2) How does image search
behavior change with user intent? (3) Can we predict user intent effectively
from interactions during the early stages of a search session? To this end, we
conduct both a lab-based user study and a commercial search log analysis.
We show that user intents in image search can be grouped into three classes:
Explore/Learn, Entertain, and Locate/Acquire. Our lab-based user study reveals
different user behavior patterns under these three intents, such as first click
time, query reformulation, dwell time and mouse movement on the result page.
Based on user interaction features during the early stages of an image search
session, that is, before mouse scroll, we develop an intent classifier that is
able to achieve promising results for classifying intents into our three intent
classes. Given that all features can be obtained online and unobtrusively, the
predicted intents can provide guidance for choosing ranking methods immediately
after scrolling.