Using the Annotated Bibliography as a Resource for Indicative Summarization

Author: Kan Min-Yen
Klavans Judith L.
McKeown Kathleen R.
Publication date: 01/01/2002

We report on a language resource consisting of 2000 annotated bibliography entries, which is being analyzed as part of our research on indicative document summarization. We show how annotated bibliographies cover certain aspects of summarization that have not been well covered by other summary corpora, and explain why they constitute an important form to study for information retrieval. We detail our methodology for collecting the corpus and give an overview of the document feature markup that we introduced to facilitate summary analysis. We present the characteristics of the corpus and show its use in finding the distribution of the types of information included in indicative summaries and their relative ordering within the summaries. Comment: 8 pages, 3 figures

arXiv.org e-Print Archive

CiteSeerX

Columbia University Academic Commons

An MPEG-7 scheme for semantic content modelling and filtering of digital video

Author: A. Vakali
A. Vetro
B.L. Tseng
C. Okoli
C.S. Goldfarb
F. Golshani
F. Kretz
G. Rowe
H. Kosch
H.W. Agius
Harry Agius
J. Hunter
J. Magalhães
J.F. Allen
L. Al-Safadi
L. Wenyin
M. Davis
M. Echiffre
M. Eirinaki
M.C. Angelides
M.R. Naphande
Marios C. Angelides
N. Adami
P. Correia
P. Salembier
P.M. Fonseca
R. Zhao
S. Adali
S.R. Newcomb
S.W. Ambler
T. Meyer-Boudnik
U. Westermann
Y.F. Day
É Germain
Publication venue: Springer Science and Business Media LLC
Publication date: 10/02/2006

Part 5 of the MPEG-7 standard specifies Multimedia Description Schemes (MDS), that is, the format that multimedia content models should conform to in order to ensure interoperability across multiple platforms and applications. However, the standard does not specify how the content or the associated model may be filtered. This paper proposes an MPEG-7 scheme that can be deployed for digital video content modelling and filtering. The proposed scheme, COSMOS-7, produces rich and multi-faceted semantic content models and supports a content-based filtering approach that analyses only content relating directly to the preferred content requirements of the user. We present details of the scheme, the front-end systems used for content modelling and filtering, and experiences with a number of users.

Crossref

Brunel University Research Archive

Interactive information retrieval

Author: Allan
Barry
Bates
Beaulieu
Belkin
Bhavnani
Blair
Borgman
Brajnik
Broder
Buyukkokten
Byström
Campbell
Case
Chen
Cove
Crestani
Crouch
Downie
Dumais
Eastman
Efthimiadis
Ellis
Fidel
Ford
Foster
Fox
Hansen
Harper
Hearst
Heinström
Hill
Ingwersen
Jansen
Jones
Kang
Kelly
Kim
Konstan
Kruschwitz
Kuhlthau
Legg
Lin
Lorigo
Lynch
López-Ostenero
Maña-López
Niemi
Norman
Over
Pirkola
Pu
Radev
Reid
Riedl
Rieh
Robertson
Rosenfeld
Roussinov
Ruthven
Savolainen
Shipman
Shneiderman
Sihvonen
Slone
Smeaton
Spink
Spärck Jones
Sweeney
Tombros
Toms
Topi
Vakkari
van der Eijk
Vechtomova
Voorhees
White
Wiesman
Wu
Xie
Publication venue: Wiley
Publication date: 01/11/2008

Crossref

University of Strathclyde Institutional Repository

A Survey on Retrieval of Mathematical Knowledge

Author: A Asperti
A Kohlhase
AM Youssef
AS Youssef
BR Miller
D Delahaye
F Guidi
F Rabe
G Bancerek
I Normann
M Adeel
M Líška
M-Q Nghiem
ME Altamimi
O Caprotti
P Baumgartner
P Cairns
P Libbrecht
Q Zhang
R Miner
R Zanibbi
S Kamali
T Gauthier
Y Haralambous
Publication date: 01/01/2015

We present a short survey of the literature on indexing and retrieval of mathematical knowledge, with pointers to 72 papers and tentative taxonomies of both retrieval problems and recurring techniques. Comment: CICM 2015, 20 pages

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Automatic vs Manual Provenance Abstractions: Mind the Gap

Author: Alper Pinar
Belhajjame Khalid
Goble Carole A.
Publication date: 21/05/2016

In recent years the need to simplify or to hide sensitive information in provenance has given rise to research on provenance abstraction. In the context of scientific workflows, existing research provides techniques to semi-automatically create abstractions of a given workflow description, which are in turn used as filters over the workflow's provenance traces. An alternative approach that is commonly adopted by scientists is to build workflows with abstractions embedded into the workflow's design, such as using sub-workflows. This paper reports on the comparison of manual versus semi-automated approaches in a context where result abstractions are used to filter report-worthy results of computational scientific analyses. Specifically, we take a real-world workflow containing user-created design abstractions and compare these with abstractions created by the ZOOM UserViews and Workflow Summaries systems. Our comparison shows that semi-automatic and manual approaches largely overlap from a process perspective, whereas there is a dramatic mismatch in terms of the data artefacts retained in an abstracted account of derivation. We discuss reasons and suggest future research directions. Comment: Preprint accepted to the 2016 workshop on the Theory and Applications of Provenance, TAPP 201

arXiv.org e-Print Archive

The University of Manchester - Institutional Repository

Learning to merge search results for efficient Distributed Information Retrieval

Author: Hiemstra Djoerd
Tjin-Kam-Jet Kien-Tsoi T.E.
Publication venue: Radboud University
Publication date: 01/01/2010

Merging search results from different servers is a major problem in Distributed Information Retrieval. We used Regression-SVM and Ranking-SVM to learn a function that merges results based on information that is readily available, i.e. the ranks, titles, summaries and URLs contained in the results pages. By not downloading additional information, such as the full document, we decrease bandwidth usage. CORI and Round Robin merging were used as our baselines; surprisingly, our results show that the SVM methods do not improve over those baselines.
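
As a rough illustration of the Round Robin baseline named in this abstract, the sketch below interleaves ranked lists from several servers by rank position. It is not the authors' implementation: the function name `round_robin_merge`, the use of URLs as result identifiers, and the decision to drop repeated URLs are all assumptions made for the example.

```python
from itertools import zip_longest


def round_robin_merge(result_lists):
    """Interleave ranked lists: rank 1 of every server, then rank 2, and so on.

    Assumption for this sketch: a URL returned by several servers is kept
    only at its first (highest) position in the merged list.
    """
    merged, seen = [], set()
    # zip_longest pads shorter lists with None so longer lists are not cut off.
    for tier in zip_longest(*result_lists):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged
```

For example, `round_robin_merge([["a", "b", "c"], ["d", "a"], ["e"]])` takes `a`, `d`, `e` from the first rank, then `b` (the repeated `a` is skipped), then `c`.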

CiteSeerX

Radboud Repository

University of Twente Research Information

A geo-temporal information extraction service for processing descriptive metadata in digital libraries

Author: Borbinha José
Manguinhas H.
Martins Bruno
Siabato Vaca Willington Libardo
Publication venue: E.T.S.I. en Topografía, Geodesia y Cartografía (UPM)
Publication date: 01/01/2009

In the context of digital map libraries, resources are usually described by metadata records that define the relevant subject, location, time-span, format and keywords. With regard to locations and time-spans, metadata records are often incomplete, or they provide information in a way that is not machine-understandable (e.g. textual descriptions). This paper presents techniques for extracting geotemporal information from text, using relatively simple text mining methods that leverage a Web gazetteer service. The idea is to go from human-made geotemporal referencing (i.e. using place and period names in textual expressions) to geo-spatial coordinates and time-spans. A prototype system implementing the proposed methods is described in detail. Experimental results demonstrate the efficiency and accuracy of the proposed approaches.
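
The general idea of mapping place and period mentions to coordinates and time-spans can be sketched as follows. This is only a toy under stated assumptions, not the prototype the abstract describes: the in-memory `GAZETTEER` dictionary stands in for the Web gazetteer service, its coordinates are approximate illustrative values, and time-spans are taken simply as the earliest and latest four-digit year found in the text.

```python
import re

# Hypothetical stand-in for a gazetteer service: place name -> (lat, lon).
GAZETTEER = {"Lisbon": (38.72, -9.14), "Madrid": (40.42, -3.70)}


def extract_geotemporal(text, gazetteer=GAZETTEER):
    """Return (places, span) found in a free-text metadata description.

    places: dict of matched gazetteer names and their coordinates.
    span:   (first_year, last_year), or None when no year is mentioned.
    """
    places = {
        name: coords
        for name, coords in gazetteer.items()
        if re.search(r"\b" + re.escape(name) + r"\b", text)
    }
    # Simplification: only plain four-digit years from 1000 to 2099 are read.
    years = [int(y) for y in re.findall(r"\b(1[0-9]{3}|20[0-9]{2})\b", text)]
    span = (min(years), max(years)) if years else None
    return places, span
```

Calling `extract_geotemporal("Map of Lisbon, surveyed 1755-1780")` matches the place name `Lisbon` and yields the span `(1755, 1780)`.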

Archivo Digital UPM

A Progressive Clustering Algorithm to Group the XML Data by Structural and Semantic Similarity

Author: Nayak Richi
Tran Tien
Publication venue: World Scientific Pub Co Pte Lt
Publication date: 01/01/2007

Since XML became popular for data representation and exchange over the Web, the distribution of XML documents has rapidly increased. It has become a challenge for researchers to turn these documents into a more useful information utility. In this paper, we introduce a novel clustering algorithm, PCXSS, that groups heterogeneous XML documents according to their similar structural and semantic representations. We develop a global criterion function, CPSim, that progressively measures the similarity between an XML document and existing clusters, avoiding the need to compute the similarity between two individual documents. The experimental analysis shows the method to be fast and accurate.
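
The progressive strategy described here, comparing each incoming document with existing clusters rather than with every other document, can be sketched as below. This is not PCXSS or CPSim: the representation of a document as a set of element paths, the Jaccard similarity, the `threshold` value, and the function names are all assumptions chosen to keep the example small.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    return len(a & b) / len(a | b) if (a or b) else 1.0


def progressive_cluster(docs, threshold=0.5):
    """Assign each document to the most similar existing cluster in one pass.

    docs: iterable of (name, set_of_element_paths) pairs (assumed format).
    A document joins the best cluster if similarity >= threshold; otherwise
    it starts a new cluster. No document-to-document comparison is made.
    """
    clusters = []  # each cluster: {"paths": set, "members": list}
    for name, paths in docs:
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = jaccard(paths, cluster["paths"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(name)
            best["paths"] |= paths  # the cluster summary grows with each member
        else:
            clusters.append({"paths": set(paths), "members": [name]})
    return [cluster["members"] for cluster in clusters]
```

With this design each document costs one comparison per existing cluster, so the work grows with the number of clusters rather than with the number of document pairs.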

CiteSeerX

Queensland University of Technology ePrints Archive