Search CORE

22,392 research outputs found

Applying digital content management to support localisation

Author: Jones Gareth J.F.
Lawless Séamus
O'Connor Alexander
Wade Vincent
Zhou Dong
Publication venue: Localisation Research Centre
Publication date: 01/10/2009
Field of study

The retrieval and presentation of digital content such as that on the World Wide Web (WWW) is a substantial area of research. While recent years have seen huge expansion in the size of web-based archives that can be searched efficiently by commercial search engines, the presentation of potentially relevant content is still limited to ranked document lists represented by simple text snippets or image keyframe surrogates. There is expanding interest in techniques to personalise the presentation of content to improve the richness and effectiveness of the user experience. One of the most significant challenges to achieving this is the increasingly multilingual nature of this data, and the need to provide suitably localised responses to users based on this content. The Digital Content Management (DCM) track of the Centre for Next Generation Localisation (CNGL) is seeking to develop technologies to support advanced personalised access and presentation of information by combining elements from the existing research areas of Adaptive Hypermedia and Information Retrieval. The combination of these technologies is intended to produce significant improvements in the way users access information. We review key features of these technologies and introduce early ideas for how these technologies can support localisation and localised content before concluding with some impressions of future directions in DCM

Topic Similarity Networks: Visual Analytics for Large Document Sets

Author: Maiya Arun S.
Rolfe Robert M.
Publication venue
Publication date: 26/09/2014
Field of study

We investigate ways in which to improve the interpretability of LDA topic models by better analyzing and visualizing their outputs. We focus on examining what we refer to as topic similarity networks: graphs in which nodes represent latent topics in text collections and links represent similarity among topics. We describe efficient and effective approaches to both building and labeling such networks. Visualizations of topic models based on these networks are shown to be a powerful means of exploring, characterizing, and summarizing large collections of unstructured text documents. They help to "tease out" non-obvious connections among different sets of documents and provide insights into how topics form larger themes. We demonstrate the efficacy and practicality of these approaches through two case studies: 1) NSF grants for basic research spanning a 14 year period and 2) the entire English portion of Wikipedia.Comment: 9 pages; 2014 IEEE International Conference on Big Data (IEEE BigData 2014

arXiv.org e-Print Archive

Extending the 5S Framework of Digital Libraries to support Complex Objects, Superimposed Information, and Content-Based Image Retrieval Services

Author: Archer David
Delcambre Lois
Fox Edward
Goncalves Marcos
Kozievitch Nadia
Leidig Jonathan
Murthy Uma
Torres Ricardo
Yang Seungwon
Publication venue
Publication date: 01/01/2010
Field of study

Advanced services in digital libraries (DLs) have been developed and widely used to address the required capabilities of an assortment of systems as DLs expand into diverse application domains. These systems may require support for images (e.g., Content-Based Image Retrieval), Complex (information) Objects, and use of content at fine grain (e.g., Superimposed Information). Due to the lack of consensus on precise theoretical definitions for those services, implementation efforts often involve ad hoc development, leading to duplication and interoperability problems. This article presents a methodology to address those problems by extending a precisely specified minimal digital library (in the 5S framework) with formal definitions of aforementioned services. The theoretical extensions of digital library functionality presented here are reinforced with practical case studies as well as scenarios for the individual and integrative use of services to balance theory and practice. This methodology has implications that other advanced services can be continuously integrated into our current extended framework whenever they are identified. The theoretical definitions and case study we present may impact future development efforts and a wide range of digital library researchers, designers, and developers

Computer Science Technical Reports @Virginia Tech

From Words to Understanding

Author: Karlgren Jussi
Sahlgren Magnus
Publication venue: CSLI Publications
Publication date: 01/01/2001
Field of study

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Discovery-led refinement in e-discovery investigations: sensemaking, cognitive ergonomics and system design.

Author: A Strauss
Ann Blandford
F Luthans
G Klein
HK Klein
J Baron
JB Thomas
JC Johnson
Simon Attfield
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010
Field of study

Given the very large numbers of documents involved in e-discovery investigations, lawyers face a considerable challenge of collaborative sensemaking. We report findings from three workplace studies which looked at different aspects of how this challenge was met. From a sociotechnical perspective, the studies aimed to understand how investigators collectively and individually worked with information to support sensemaking and decision making. Here, we focus on discovery-led refinement; specifically, how engaging with the materials of the investigations led to discoveries that supported refinement of the problems and new strategies for addressing them. These refinements were essential for tractability. We begin with observations which show how new lines of enquiry were recursively embedded. We then analyse the conceptual structure of a line of enquiry and consider how reflecting this in e-discovery support systems might support scalability and group collaboration. We then focus on the individual activity of manual document review where refinement corresponded with the inductive identification of classes of irrelevant and relevant documents within a collection. Our observations point to the effects of priming on dealing with these efficiently and to issues of cognitive ergonomics at the human–computer interface. We use these observations to introduce visualisations that might enable reviewers to deal with such refinements more efficiently