1,095 research outputs found

Assessed Relevance and Stylistic Variation

Author: Karlgren Jussi
Publication date: 01/01/1996

Texts exhibit considerable stylistic variation. This paper reports an experiment in which a large corpus of documents is analyzed using various simple stylistic metrics. A subset of the corpus has previously been assessed as relevant for answering given information retrieval queries. The experiment shows that this subset differs significantly from the rest of the corpus in terms of the stylistic metrics studied.

CiteSeerX

Crossref

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Stylistic Variation in an Information Retrieval Experiment

Author: Karlgren Jussi
Publication date: 01/01/1996

Texts exhibit considerable stylistic variation. This paper reports an experiment in which a corpus of documents (N = 75 000) is analyzed using various simple stylistic metrics. A subset (n = 1000) of the corpus has previously been assessed as relevant for answering given information retrieval queries. The experiment shows that this subset differs significantly from the rest of the corpus in terms of the stylistic metrics studied. Comment: Proceedings of NEMLAP-
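The abstract above does not name the stylistic metrics or the significance test used, so the following is only a minimal sketch of the general idea: compute a few surface metrics per document, then compare the relevant subset with the rest of the corpus. The three metrics, the choice of Welch's t statistic, and the function names `stylistic_metrics` and `welch_t` are all assumptions made for illustration.

```python
import math
import re
from statistics import mean, variance


def stylistic_metrics(text):
    """Return a few simple surface metrics for one non-empty document.

    The metric set is an assumption; the abstract does not list the ones used.
    """
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "avg_word_length": mean(len(w) for w in words),
        "avg_sentence_length": len(words) / len(sentences),
        "type_token_ratio": len({w.lower() for w in words}) / len(words),
    }


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two samples with possibly unequal variances.

    Each sample needs at least two values and non-zero total variance.
    """
    var_a, var_b = variance(sample_a), variance(sample_b)
    return (mean(sample_a) - mean(sample_b)) / math.sqrt(
        var_a / len(sample_a) + var_b / len(sample_b)
    )
```

In use, one would collect a given metric (for example `avg_sentence_length`) for every document in the relevant subset and for every remaining document, then pass the two lists to `welch_t`; a large absolute value suggests the two groups differ on that metric.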

arXiv.org e-Print Archive

CiteSeerX

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Using term clouds to represent segment-level semantic content of podcasts

Author: Besser Jana
de Rijke Maarten
Fuller Marguerite
Jones Gareth J.F.
Larson Martha
Newman Eamonn
Tsagkias Manos
Publication date: 01/01/2008

Spoken audio, like any time-continuous medium, is notoriously difficult to browse or skim without the support of an interface that provides semantically annotated jump points to signal to the user where to listen in. Creation of time-aligned metadata by human annotators is prohibitively expensive, motivating the investigation of representations of segment-level semantic content based on transcripts generated by automatic speech recognition (ASR). This paper examines the feasibility of using term clouds to provide users with a structured representation of the semantic content of podcast episodes. Podcast episodes are visualized as a series of sub-episode segments, each represented by a term cloud derived from an ASR transcript. The quality of segment-level term clouds is measured quantitatively, and their utility is investigated in a small-scale user study based on human-labeled segment boundaries. Since the segment-level clouds generated from ASR transcripts prove useful, we examine an adaptation of text tiling techniques to speech in order to generate segments as part of a completely automated indexing and structuring system for browsing spoken audio. Results demonstrate that the segments generated are comparable with human-selected segment boundaries.
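To make the idea of a segment-level term cloud concrete, here is a minimal sketch that picks the highest-weighted terms for each transcript segment. The abstract does not say how terms are weighted, so the tf-idf weighting, the tiny stopword list, and the names `tokenize` and `term_clouds` are assumptions for illustration only.

```python
import math
import re
from collections import Counter

# A deliberately small stopword list, assumed for the example.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that", "we"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def term_clouds(segments, top_k=5):
    """Return, for each segment, its top_k (term, weight) pairs.

    Weight is term frequency times (log(n / df) + 1), where df is the number
    of segments containing the term. This weighting is an assumption.
    """
    counts = [Counter(tokenize(seg)) for seg in segments]
    doc_freq = Counter(term for c in counts for term in c)
    n = len(segments)
    clouds = []
    for c in counts:
        weighted = [(t, tf * (math.log(n / doc_freq[t]) + 1)) for t, tf in c.items()]
        # Sort by descending weight, then alphabetically for stable output.
        weighted.sort(key=lambda pair: (-pair[1], pair[0]))
        clouds.append(weighted[:top_k])
    return clouds
```

Each returned list could then be rendered with font size proportional to weight. With ASR transcripts, recognition errors would flow straight into the clouds, which is one reason the abstract measures cloud quality separately.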

Irish Universities

DCU Online Research Access Service

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Bridging the Gap Between Retrieval and Summarization

Author: Lennox Connor
Publication venue: University of New Hampshire Scholars' Repository
Publication date: 01/05/2022

Information Retrieval is, at its core, a field focused on providing information to users to fulfill an information need. One of the most common use cases of Information Retrieval is document-level retrieval, which seeks to provide the user with a collection of documents that addresses their needs. In contrast, single document retrieval seeks to provide the user with a single document containing all the required information. We seek to extend single document retrieval to single document generation, in which we use multiple source documents to create a new document that directly addresses the information need.

UNH Scholars' Repository

Utilizing sub-topical structure of documents for information retrieval

Author: Ganguly Debasis
Jones Gareth J.F.
Leveling Johannes
Publication date: 28/10/2011

Text segmentation in natural language processing typically refers to the process of decomposing a document into constituent subtopics. Our work centers on the application of text segmentation techniques within information retrieval (IR) tasks. Examples include scoring a document by combining the retrieval scores of its constituent segments, exploiting the proximity of query terms in documents for ad hoc search, and question answering (QA), where retrieved passages from multiple documents are aggregated and presented as a single document to a searcher. Feedback in ad hoc IR tasks is shown to benefit from the use of extracted sentences instead of terms from the pseudo-relevant documents for query expansion. Retrieval effectiveness for the patent prior art search task is enhanced by applying text segmentation to the patent queries. Another aspect of our work involves augmenting text segmentation techniques to produce segments that are more readable, with fewer unresolved anaphora. This is particularly useful for QA and snippet generation tasks, where the objective is both to aggregate relevant and novel information from multiple documents satisfying the user's information need and to ensure that the automatically generated content presented to the user is easily readable without reference to the original source document.
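One of the uses mentioned above, scoring a document by combining the retrieval scores of its segments, can be sketched as follows. The abstract does not specify the segment scoring function or the combination rule, so the length-normalized term-count score, the default of taking the maximum, and the names `segment_score` and `document_score` are assumptions for illustration.

```python
from statistics import mean


def segment_score(query_terms, segment):
    """Length-normalized count of query-term occurrences in one segment.

    A stand-in for a real retrieval score such as BM25 (assumed, not sourced).
    """
    tokens = segment.lower().split()
    return sum(tokens.count(t) for t in query_terms) / max(len(tokens), 1)


def document_score(query_terms, segments, combine=max):
    """Combine per-segment scores into one document score.

    `combine` is any function over a list of floats, e.g. max or mean.
    """
    scores = [segment_score(query_terms, s) for s in segments]
    return combine(scores) if scores else 0.0


if __name__ == "__main__":
    segs = ["patent search prior art", "weather today sunny"]
    query = ["patent", "art"]
    print(document_score(query, segs))                # best-segment score
    print(document_score(query, segs, combine=mean))  # average over segments
```

Taking the maximum rewards a document with one highly relevant passage, while the mean favors documents that are on topic throughout; which is preferable depends on the task.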

CiteSeerX

Irish Universities

DCU Online Research Access Service

Retrieval through explanation: an abductive inference approach to relevance feedback

Author: Lalmas M.
Ruthven I.
van Rijsbergen C.J.
Publication date: 01/01/1999

Relevance feedback techniques are designed to automatically improve a system's representation of a query by using documents the user has marked as relevant. However, traditional relevance feedback models suffer from a number of limitations that restrict their potential in supporting information seeking. One of the major limitations is that relevance feedback does not incorporate behavioural aspects of information seeking: how and why users assess relevance. We propose that relevance feedback should be viewed as a process of explanation, and we demonstrate how this limitation can be overcome by a theory of relevance feedback based on abductive inference.
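For context on the traditional setting the abstract contrasts itself with, the following is a minimal sketch of a classic Rocchio-style query update using only relevant documents. It is not the abductive approach the paper proposes; the sparse-dictionary representation, the default `alpha` and `beta` values, and the function name `rocchio` are assumptions for illustration.

```python
from collections import Counter


def rocchio(query, relevant_docs, alpha=1.0, beta=0.75):
    """Move a query vector toward the centroid of relevant documents.

    `query` and each document are dicts mapping term -> weight.
    Returns a new dict; the inputs are left unchanged.
    """
    updated = Counter({term: alpha * weight for term, weight in query.items()})
    for doc in relevant_docs:
        for term, weight in doc.items():
            # Each relevant document contributes an equal share of the centroid.
            updated[term] += beta * weight / len(relevant_docs)
    return dict(updated)
```

The update reweights and expands the query purely from term statistics, with no account of why the user judged a document relevant, which is the kind of limitation the abstract points to.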

CiteSeerX

University of Strathclyde Institutional Repository