4,055 research outputs found
Why People Search for Images using Web Search Engines
What are the intents or goals behind human interactions with image search
engines? Knowing why people search for images is of major concern to Web image
search engines because user satisfaction may vary as intent varies. Previous
analyses of image search behavior have mostly been query-based, focusing on
what images people search for, rather than intent-based, that is, why people
search for images. To date, there is no thorough investigation of how different
image search intents affect users' search behavior.
In this paper, we address the following questions: (1) Why do people search
for images in text-based Web image search systems? (2) How does image search
behavior change with user intent? (3) Can we predict user intent effectively
from interactions during the early stages of a search session? To this end, we
conduct both a lab-based user study and a commercial search log analysis.
We show that user intents in image search can be grouped into three classes:
Explore/Learn, Entertain, and Locate/Acquire. Our lab-based user study reveals
different user behavior patterns under these three intents, such as first click
time, query reformulation, dwell time and mouse movement on the result page.
Based on user interaction features during the early stages of an image search
session, that is, before mouse scroll, we develop an intent classifier that is
able to achieve promising results for classifying intents into our three intent
classes. Given that all features can be obtained online and unobtrusively, the
predicted intents can provide guidance for choosing ranking methods immediately
after scrolling.
What Users Ask a Search Engine: Analyzing One Billion Russian Question Queries
We analyze the question queries submitted to a large commercial web search engine to get insights about what people ask, and to better tailor the search results to the users’ needs. Based on a dataset of about one billion question queries submitted during the year 2012, we investigate askers’ querying behavior with the support of automatic query categorization. While the importance of question queries is likely to increase, at present they only make up 3–4% of the total search traffic. Since questions are such a small part of the query stream and are more likely to be unique than shorter queries, clickthrough information is typically rather sparse. Thus, query categorization methods based on the categories of clicked web documents do not work well for questions. As an alternative, we propose a robust question query classification method that uses the labeled questions from a large community question answering platform (CQA) as a training set. The resulting classifier is then transferred to the web search questions. Even though questions on CQA platforms tend to be different from web search questions, our categorization method proves competitive with strong baselines with respect to classification accuracy. To show the scalability of our proposed method, we apply the classifiers to about one billion question queries and discuss the trade-offs between performance and accuracy that different classification models offer. Our findings reveal what people ask a search engine and also how this contrasts with behavior on a CQA platform.
Of course we share! Testing Assumptions about Social Tagging Systems
Social tagging systems have established themselves as an important part of
today's web and have attracted the interest of our research community in a
variety of investigations. The overall vision of our community is that simply
through interactions with the system, i.e., through tagging and sharing of
resources, users would contribute to building useful semantic structures as
well as resource indexes using uncontrolled vocabulary not only due to the
easy-to-use mechanics. Consequently, a variety of assumptions about social
tagging systems have emerged, yet testing them has been difficult due to the
absence of suitable data. In this work we thoroughly investigate three
available assumptions - e.g., is a tagging system really social? - by examining
live log data gathered from the real-world public social tagging system
BibSonomy. Our empirical results indicate that while some of these assumptions
hold to a certain extent, other assumptions need to be reflected upon and viewed
in a very critical light. Our observations have implications for the design of
future search and other algorithms to better reflect the actual user behavior.
Contextualised Browsing in a Digital Library's Living Lab
Contextualisation has proven to be effective in tailoring search
results towards the users' information need. While this is true for a basic
query search, the usage of contextual session information during exploratory
search especially on the level of browsing has so far been underexposed in
research. In this paper, we present two approaches that contextualise browsing
on the level of structured metadata in a Digital Library (DL): (1) one variant
is based on document similarity and (2) one variant utilises implicit session
information, such as queries and different document metadata encountered during
the user's session. We evaluate our approaches in a living lab environment
using a DL in the social sciences and compare our contextualisation approaches
against a non-contextualised approach. For a period of more than three months
we analysed 47,444 unique retrieval sessions that contain search activities on
the level of browsing. Our results show that a contextualisation of browsing
significantly outperforms our baseline in terms of the position of the first
clicked item in the result set. The mean rank of the first clicked document
(measured as mean first relevant - MFR) was 4.52 using a non-contextualised
ranking compared to 3.04 when re-ranking the result lists based on similarity
to the previously viewed document. Furthermore, we observed that both
contextual approaches show a noticeably higher click-through rate. A
contextualisation based on document similarity leads to almost twice as many
document views compared to the non-contextualised ranking.
Comment: 10 pages, 2 figures, paper accepted at JCDL 201
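The mean first relevant (MFR) figure quoted in the abstract above is the average rank of the first clicked document per session. A minimal sketch of that computation follows, assuming each session is logged as a list of 1-based result ranks in click order and that sessions without clicks are skipped. The function name `mean_first_relevant` and the input layout are illustrative assumptions, not taken from the paper.

```python
def mean_first_relevant(sessions):
    """Average rank of the first clicked result over all sessions with a click.

    `sessions` is assumed to be an iterable of lists; each list holds the
    1-based ranks of clicked results in the order they were clicked.
    Sessions with no clicks are ignored (an assumption of this sketch).
    """
    first_ranks = [clicks[0] for clicks in sessions if clicks]
    if not first_ranks:
        raise ValueError("no session contains a click")
    return sum(first_ranks) / len(first_ranks)


if __name__ == "__main__":
    # Hypothetical click logs: three sessions with clicks, one without.
    logs = [[3, 1], [5], [], [1, 2]]
    print(mean_first_relevant(logs))
```

A lower value means users found something worth clicking nearer the top of the list, which is the sense in which 3.04 improves on 4.52.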
An Intent Taxonomy of Legal Case Retrieval
Legal case retrieval is a special Information Retrieval (IR) task focusing on
legal case documents. Depending on the downstream tasks of the retrieved case
documents, users' information needs in legal case retrieval could be
significantly different from those in Web search and traditional ad-hoc
retrieval tasks. While there are several studies that retrieve legal cases
based on text similarity, the underlying search intents of legal retrieval
users, as shown in this paper, are more complicated than that yet mostly
unexplored. To this end, we present a novel hierarchical intent taxonomy of
legal case retrieval. It consists of five intent types categorized by three
criteria, i.e., search for Particular Case(s), Characterization, Penalty,
Procedure, and Interest. The taxonomy was constructed transparently and
evaluated extensively through interviews, editorial user studies, and query log
analysis. Through a laboratory user study, we reveal significant differences in
user behavior and satisfaction under different search intents in legal case
retrieval. Furthermore, we apply the proposed taxonomy to various downstream
legal retrieval tasks, e.g., result ranking and satisfaction prediction, and
demonstrate its effectiveness. Our work provides important insights into the
understanding of user intents in legal case retrieval and potentially leads to
better retrieval techniques in the legal domain, such as intent-aware ranking
strategies and evaluation methodologies.
Comment: 28 pages, work in process
DOBBS: Towards a Comprehensive Dataset to Study the Browsing Behavior of Online Users
The investigation of the browsing behavior of users provides useful
information to optimize web site design, web browser design, search engine
offerings, and online advertisement. This has been a topic of active research
since the Web started and a large body of work exists. However, new online
services as well as advances in Web and mobile technologies clearly changed the
meaning behind "browsing the Web" and require a fresh look at the problem and
research, specifically with respect to whether the models in use are still
appropriate. Platforms such as YouTube, Netflix or last.fm have started to
replace the traditional media channels (cinema, television, radio) and media
distribution formats (CD, DVD, Blu-ray). Social networks (e.g., Facebook) and
platforms for browser games attracted whole new, particularly less tech-savvy
audiences. Furthermore, advances in mobile technologies and devices made
browsing "on-the-move" the norm and changed user behavior, since mobile
browsing is often influenced by the user's location and context in
the physical world. Commonly used datasets, such as web server access logs or
search engine transaction logs, are inherently not capable of capturing the
browsing behavior of users in all these facets. DOBBS (DERI Online Behavior
Study) is an effort to create such a dataset in a non-intrusive, completely
anonymous and privacy-preserving way. To this end, DOBBS provides a browser
add-on that users can install, which keeps track of their browsing behavior
(e.g., how much time they spend on the Web, how long they stay on a website,
how often they visit a website, how they use their browser, etc.). In this
paper, we outline the motivation behind DOBBS, describe the add-on and captured
data in detail, and present some first results to highlight the strengths of
DOBBS.