Concept-based Interactive Query Expansion Support Tool (CIQUEST)
This report describes a three-year project (2000-03) undertaken in the Information Studies
Department at The University of Sheffield and funded by Resource, The Council for
Museums, Archives and Libraries. The overall aim of the research was to provide user
support for query formulation and reformulation in searching large-scale textual resources
including those of the World Wide Web. More specifically the objectives were: to investigate
and evaluate methods for the automatic generation and organisation of concepts derived from
retrieved document sets, based on statistical methods for term weighting; and to conduct
user-based evaluations on the understanding, presentation and retrieval effectiveness of
concept structures in selecting candidate terms for interactive query expansion.
The TREC test collection formed the basis for the seven evaluative experiments conducted in
the course of the project. These formed four distinct phases in the project plan. In the first
phase, a series of experiments was conducted to investigate further techniques for concept
derivation and hierarchical organisation and structure. The second phase was concerned with
user-based validation of the concept structures. The results of phases 1 and 2 informed the
design of the test system, and the user interface was developed in phase 3. The final phase
entailed a user-based summative evaluation of the CiQuest system.
The main findings demonstrate that concept hierarchies can effectively be generated from
sets of retrieved documents and displayed to searchers in a meaningful way. The approach
provides the searcher with an overview of the contents of the retrieved documents, which in
turn facilitates the viewing of documents and selection of the most relevant ones. Concept
hierarchies are a good source of terms for query expansion and can improve precision. The
extraction of descriptive phrases as an alternative source of terms was also effective. With
respect to presentation, cascading menus were easy to browse for selecting terms and for
viewing documents. In conclusion, the project dissemination programme and future work are
outlined.
On-line new event detection and clustering using the concepts of the cover coefficient-based clustering methodology
In this study, we use the concepts of the cover coefficient-based clustering
methodology (C³M) for on-line new event detection and event clustering. The
main idea of the study is to use the seed selection process of the C³M algorithm
for the purpose of detecting new events. Since C³M works in a retrospective
manner, we modify the algorithm to work in an on-line environment.
Furthermore, in order to prevent producing oversized event clusters, and to give
equal chance to all documents to be the seed of a new event, we employ the
window size concept. Since we desire to control the number of seed documents,
we introduce a threshold concept to the event clustering algorithm. We also use
the threshold concept, with a slight modification, in on-line event detection. In
the experiments we use the TDT1 corpus, which is also used in the original topic
detection and tracking study. In event clustering and event detection, we use both
binary and weighted versions of the TDT1 corpus. With the binary implementation,
we obtain better results. When we compare our on-line event detection results to
the results of the UMASS approach, we obtain better performance in terms of false
alarm rates.
Vural, Ahmet. M.S.
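The abstract above combines three ideas: a document becomes the seed of a new event when it is not close enough to existing seeds, a threshold controls how many seeds are created, and a window limits which earlier material a new document is compared against. The following is only a rough illustrative sketch of that general pattern, not the C³M-based method itself. It assumes binary term sets, substitutes Jaccard overlap for the cover coefficient, and treats the window as a fixed number of most recent seeds; the function names, the `threshold` and `window_size` parameters, and the sample documents are all hypothetical.

```python
from collections import deque


def jaccard(a, b):
    """Overlap of two binary term sets (a stand-in, NOT the cover coefficient)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def detect_new_events(docs, threshold=0.2, window_size=3):
    """Single-pass new event detection over documents in arrival order.

    docs: iterable of token lists.
    Returns one (is_new, seed_index) pair per document: is_new is True when
    the document starts a new event, and seed_index is the index of the seed
    document it is assigned to (its own index when it is a new seed).
    """
    # Assumption: the window keeps only the most recent seeds, so older
    # events stop attracting documents and clusters cannot grow indefinitely.
    window = deque(maxlen=window_size)
    decisions = []
    for i, tokens in enumerate(docs):
        terms = set(tokens)
        best_sim, best_seed = 0.0, None
        for seed_index, seed_terms in window:
            sim = jaccard(terms, seed_terms)
            if sim > best_sim:
                best_sim, best_seed = sim, seed_index
        if best_seed is None or best_sim < threshold:
            # Not similar enough to any seed in the window: new event.
            window.append((i, terms))
            decisions.append((True, i))
        else:
            decisions.append((False, best_seed))
    return decisions


# Hypothetical sample stream of tokenised documents.
sample_docs = [
    ["quake", "japan"],
    ["quake", "japan", "tsunami"],
    ["election", "vote"],
    ["quake", "japan"],
]
wide = detect_new_events(sample_docs, threshold=0.2, window_size=3)
narrow = detect_new_events(sample_docs, threshold=0.2, window_size=1)
```

In this sketch a higher `threshold` yields more seeds (more documents are declared new events), while a smaller `window_size` makes older seeds expire sooner, so a late document on an old topic may be flagged as new again.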
Recent Experiments with INQUERY
this paper focuses on relevant differences to the previously published algorithms.
1 Description of Ad-Hoc Experiment