Segmenting and labeling query sequences in a multidatabase environment
Authors
A.C. Acar
D. Liu
J. Cardiff
L.R. Rabiner
M.-S. Chen
Q. Yao
R. Cooley
R. Kindermann
Publication date
1 January 2011
Publisher
Springer Science and Business Media LLC
Abstract
When gathering information from multiple independent data sources, users generally pose a sequence of queries to each source and then combine (union) or cross-reference (join) the results to obtain the information they need. The process also involves a fair amount of trial and error, with queries refined iteratively according to the results of earlier queries in the sequence. To an outside observer, the aim of such a sequence of queries may not be immediately obvious. We investigate the problem of isolating and characterizing subsequences that represent coherent information retrieval goals within a sequence of queries sent by a user to different data sources over a period of time. The problem has two sub-problems: segmenting the sequence into subsequences, each representing a discrete goal; and labeling each query in these subsequences according to how it contributes to the goal. We propose a method in which a discriminative probabilistic model (a Conditional Random Field) is trained with pre-labeled sequences. We have tested the accuracy with which such a model can infer labels and segmentation on novel sequences. Results show that the approach is very accurate (> 95% accuracy) when there are no spurious queries in the sequence, and moderately accurate even in the presence of substantial noise (∼70% accuracy when 15% of the queries in the sequence are spurious). © 2011 Springer-Verlag
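The two sub-problems named in the abstract, segmentation and labeling, can be folded into one per-query tagging task, which is the kind of output a sequence model such as a Conditional Random Field produces. The sketch below is only an illustration of that idea, not the authors' method: the BIO-style tag scheme ("B-" opens a goal, "I-" continues it, "O" marks a spurious query), the role names, and the function `segments_from_tags` are all assumptions introduced here.

```python
from typing import List, Tuple

Segment = Tuple[int, int, List[str]]  # (first index, last index, roles)


def segments_from_tags(tags: List[str]) -> List[Segment]:
    """Recover goal segments from per-query tags.

    Hypothetical tag scheme (an assumption, not taken from the paper):
      "B-<role>"  first query of a new goal
      "I-<role>"  later query of the current goal
      "O"         spurious query that belongs to no goal
    """
    segments: List[Segment] = []
    start, end, roles = None, None, []
    for i, tag in enumerate(tags):
        if tag == "O":
            continue  # spurious queries are skipped and do not close a segment
        prefix, role = tag.split("-", 1)
        # A "B-" tag (or an "I-" tag with no open segment) starts a new goal.
        if prefix == "B" or start is None:
            if start is not None:
                segments.append((start, end, roles))
            start, roles = i, []
        end = i
        roles.append(role)
    if start is not None:
        segments.append((start, end, roles))
    return segments


# Hypothetical tags for six queries; the role names are illustrative only.
tags = ["B-initial", "I-refine", "O", "I-join", "B-initial", "I-union"]
for first, last, roles in segments_from_tags(tags):
    print(first, last, roles)
```

With this encoding a single predicted tag per query carries both the segment boundary and the query's role, so one trained tagger would cover both sub-problems. The sketch assumes goals are not interleaved; handling interleaved goals would need a richer encoding.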
Available Versions
Crossref
Last time updated on 05/06/2019
Bilkent University Institutional Repository
oai:repository.bilkent.edu.tr:...
Last time updated on 12/11/2016
OpenMETU (Middle East Technical University)
oai:https://open.metu.edu.tr:1...
Last time updated on 02/12/2021