Applying digital content management to support localisation
The retrieval and presentation of digital content such as that on the World Wide Web (WWW) is a substantial area of research. While recent years have seen huge expansion in the size of web-based archives that can be searched efficiently by commercial search engines, the presentation of potentially relevant content is still limited to ranked document lists represented by simple text snippets or image keyframe surrogates. There is expanding interest in techniques to personalise the presentation of content to improve the richness and effectiveness of the user experience. One of the most significant challenges to achieving this is the increasingly multilingual nature of this data, and the need to provide suitably localised responses to users based on this content. The Digital Content Management (DCM) track of the Centre for Next Generation Localisation (CNGL) is seeking to develop technologies to support advanced personalised access and presentation of information by combining elements from the existing research areas of Adaptive Hypermedia and Information Retrieval. The combination of these technologies is intended to produce significant improvements in the way users access information. We review key features of these technologies and introduce early ideas for how these technologies can support localisation and localised content before concluding with some impressions of future directions in DCM.
Cultural Heritage & Built Environment Scoping Report
This report presents the findings of a scoping study that explores engagement between a
heritage institution and its local community. The report addresses this topic by considering the
opportunities and limitations of urban screens to form new audiences for heritage institutions;
specifically through a case study of the BBC Big Screens. Literature suggests that urban screens
have the potential to form new types of audiences for heritage institutions yet processes for
achieving this are rarely described. This report proposes that understanding these processes
may help address issues of measuring engagement associated with urban screens and
contribute to assessing the value of urban screens for communities and heritage institutions.
Key themes of participation, site and value are explored through a literature review. These
themes are then used to structure the analysis and discussion of the case study. Further
questions for future study are described.
Doctor of Philosophy dissertation
Electronic Health Records (EHRs) provide a wealth of information for secondary uses. Methods are developed to improve the usefulness of free text query and text processing, and to demonstrate the advantages of using these methods for clinical research, specifically cohort identification and enhancement. Cohort identification is a critical early step in clinical research. Problems may arise when too few patients are identified, or the cohort consists of a nonrepresentative sample. Methods of improving query formation through query expansion are described. Inclusion of free text search in addition to structured data search is investigated to determine the incremental improvement of adding unstructured text search over structured data search alone. Query expansion using topic- and synonym-based expansion improved information retrieval performance. An ensemble method was not successful. The addition of free text search compared to structured data search alone demonstrated increased cohort size in all cases, with dramatic increases in some. Representation of patients in subpopulations that may have been underrepresented otherwise is also shown. We demonstrate clinical impact by showing that a serious clinical condition, scleroderma renal crisis, can be predicted by adding free text search. A novel information extraction algorithm is developed and evaluated (Regular Expression Discovery for Extraction, or REDEx) for cohort enrichment. The REDEx algorithm is demonstrated to accurately extract information from free text clinical narratives. Temporal expressions as well as bodyweight-related measures are extracted. Additional patients and additional measurement occurrences are identified using these extracted values that were not identifiable through structured data alone. The REDEx algorithm transfers the burden of machine learning training from annotators to domain experts.
We developed automated query expansion methods that greatly improve the performance of keyword-based information retrieval. We also developed NLP methods for unstructured data and demonstrate that cohort size can be greatly increased, a more complete population can be identified, and important clinical conditions can be detected that are often missed otherwise. We found that a much more complete representation of patients can be obtained. We also developed a novel machine learning algorithm for information extraction, REDEx, that efficiently extracts clinical values from unstructured clinical text, adding information and observations beyond what is available in structured data alone.
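The regular-expression-based extraction idea behind REDEx can be illustrated with a minimal sketch. The pattern, function name, and sample notes below are invented for illustration; they are not the dissertation's actual patterns or data:

```python
import re

# Hypothetical sketch of regex-based extraction of bodyweight measures
# from free-text clinical notes (illustrative; not the actual REDEx output,
# which discovers such patterns automatically from annotated examples).
WEIGHT_PATTERN = re.compile(
    r"(?:weight|wt)\s*[:=]?\s*(\d+(?:\.\d+)?)\s*(kg|lbs?)",
    re.IGNORECASE,
)

def extract_weights(note: str) -> list[tuple[float, str]]:
    """Return (value, unit) pairs for bodyweight mentions in a note."""
    return [(float(v), u.lower()) for v, u in WEIGHT_PATTERN.findall(note)]

notes = [
    "Pt seen today. Weight: 82.5 kg, BP 120/80.",
    "wt 180 lbs, stable since last visit.",
]
for note in notes:
    print(extract_weights(note))
```

Values extracted this way can then be joined back to the structured record, which is how additional measurement occurrences beyond the structured data become visible.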
Mnews: A Study of Multilingual News Search Interfaces
With the global expansion of the Internet and the World Wide Web, users are becoming increasingly diverse, particularly in terms of languages. In fact, the number of polyglot Web users across the globe has increased dramatically.
However, even such multilingual users often continue to suffer from unbalanced and fragmented news information, as traditional news access systems seldom allow users to simultaneously search for and/or compare news in different languages, even though prior research results have shown that multilingual users make significant use of each of their languages when searching for information online.
Relatively little human-centered research has been conducted to better understand and support multilingual user abilities and preferences. In particular, in the fields of cross-language and multilingual search, the majority of research has focused primarily on improving retrieval and translation accuracy, while paying comparably less attention to multilingual user interaction aspects.
The research presented in this thesis provides the first large-scale investigations of multilingual news consumption and querying/search result selection behaviors, as well as a detailed comparative analysis of polyglots' preferences and behaviors with respect to different multilingual news search interfaces on desktop and mobile platforms. Through four phases of user studies, including surveys, interviews, and task-based user studies using crowdsourcing and laboratory experiments, this thesis presents the first human-centered studies in multilingual news access, aiming to drive the development of personalized multilingual news access systems that better support each individual user.
Know Where to Go: Make LLM a Relevant, Responsible, and Trustworthy Searcher
The advent of Large Language Models (LLMs) has shown the potential to improve
relevance and provide direct answers in web searches. However, challenges arise
in validating the reliability of generated results and the credibility of
contributing sources, due to the limitations of traditional information
retrieval algorithms and the LLM hallucination problem. Aiming to create a
"PageRank" for the LLM era, we strive to transform LLM into a relevant,
responsible, and trustworthy searcher. We propose a novel generative retrieval
framework leveraging the knowledge of LLMs to foster a direct link between
queries and online sources. This framework consists of three core modules:
Generator, Validator, and Optimizer, each focusing on generating trustworthy
online sources, verifying source reliability, and refining unreliable sources,
respectively. Extensive experiments and evaluations highlight our method's
superior relevance, responsibility, and trustfulness against various SOTA
methods.
Comment: 14 pages, 4 figures, under peer review
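The three-module design described in the abstract can be sketched as a simple control loop. Everything below (the class, the function names, the retry logic) is an assumed structure for illustration, not the paper's implementation:

```python
# Hypothetical sketch of a Generator/Validator/Optimizer pipeline for
# generative retrieval (assumed structure; not the authors' code).
from dataclasses import dataclass

@dataclass
class Source:
    url: str
    reachable: bool  # stand-in for a real reliability check

def generator(query: str) -> list[Source]:
    # In the paper, an LLM proposes candidate online sources for the query;
    # here we return canned candidates for illustration.
    return [Source("https://example.com/good", True),
            Source("https://example.com/dead", False)]

def validator(source: Source) -> bool:
    # Verify source reliability (e.g., fetch the URL, check content support).
    return source.reachable

def optimizer(query: str, source: Source) -> Source:
    # Refine an unreliable source, e.g., by re-prompting the LLM.
    return Source(source.url.replace("/dead", "/fixed"), True)

def search(query: str) -> list[Source]:
    results = []
    for src in generator(query):
        if not validator(src):
            src = optimizer(query, src)
        if validator(src):
            results.append(src)
    return results

print([s.url for s in search("who invented PageRank?")])
# → ['https://example.com/good', 'https://example.com/fixed']
```

The point of the loop is that unreliable candidates are not discarded but refined and re-validated, which is how the framework aims to keep answers both relevant and grounded.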
eCPD Programme - Enhanced Learning.
This collection of papers (edited by Kevin Donovan) has been produced by the Association for Learning Technology (ALT) for LSIS. The papers are based on the summaries used by presenters during workshops at the 2009 launch of the eCPD Programme.
Data-driven prototyping via natural-language-based GUI retrieval
Rapid GUI prototyping has evolved into a widely applied technique in the early stages of software development to facilitate the clarification and refinement of requirements. High-fidelity GUI prototyping in particular has been shown to enable productive discussions with customers and mitigate potential misunderstandings; however, these benefits come at the cost of development that is expensive, time-consuming, and reliant on experience. In this work, we present RaWi, a data-driven GUI prototyping approach that retrieves GUIs for reuse from a large-scale, semi-automatically created GUI repository for mobile apps on the basis of Natural Language (NL) searches, facilitating GUI prototyping and improving its productivity by leveraging the vast GUI prototyping knowledge embodied in the repository. Retrieved GUIs can be directly reused and adapted in the graphical editor of RaWi. Moreover, we present a comprehensive evaluation methodology that enables (i) the systematic evaluation of NL-based GUI ranking methods through a novel high-quality gold standard, with an in-depth evaluation of traditional IR and state-of-the-art BERT-based models for GUI ranking, and (ii) the assessment of GUI prototyping productivity through an extensive user study in a practical GUI prototyping environment.
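A traditional-IR baseline for NL-based GUI ranking can be sketched as TF-IDF cosine similarity over the text content of GUI screens. The tiny corpus and screen names below are invented; RaWi's repository and ranking models are far larger and more sophisticated:

```python
import math
from collections import Counter

# Minimal TF-IDF cosine ranking of GUI screens by a natural-language query
# (illustrative traditional-IR baseline over an invented toy corpus).
guis = {
    "login_screen": "username password sign in forgot password",
    "checkout_screen": "cart total pay now credit card shipping address",
    "profile_screen": "edit profile avatar username email settings",
}

n_docs = len(guis)
df = Counter(t for text in guis.values() for t in set(text.split()))

def vectorize(text: str) -> dict[str, float]:
    """TF-IDF vector restricted to the GUI corpus vocabulary."""
    tf = Counter(t for t in text.split() if t in df)
    return {t: c * math.log(n_docs / df[t]) for t, c in tf.items()}

def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank(query: str) -> list[tuple[str, float]]:
    q = vectorize(query)
    scores = ((name, cosine(q, vectorize(text))) for name, text in guis.items())
    return sorted(scores, key=lambda x: -x[1])

print(rank("sign in with username and password")[0][0])  # → login_screen
```

BERT-based rankers replace the sparse vectors with learned dense embeddings, but the retrieve-and-rank shape of the pipeline stays the same.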
Understanding the use of Virtual Reality in Marketing: a text mining-based review
The current study intends to highlight the most relevant studies in simulated realities with special attention to VR and marketing, showing how studies have evolved over time and discussing the findings. A text-mining approach using a Bayesian statistical topic model called latent Dirichlet allocation is employed to conduct a comprehensive analysis of 150 articles from 115 journals, all indexed in Web of Science. The findings reveal seven relevant topics, as well as the number of articles published over time, the authors most cited in VR papers and the leading journals in each topic. The article also provides theoretical and practical implications and suggestions for further research.
Evaluating Generative Ad Hoc Information Retrieval
Recent advances in large language models have enabled the development of
viable generative information retrieval systems. A generative retrieval system
returns a grounded generated text in response to an information need instead of
the traditional document ranking. Quantifying the utility of these types of
responses is essential for evaluating generative retrieval systems. As the
established evaluation methodology for ranking-based ad hoc retrieval may seem
unsuitable for generative retrieval, new approaches for reliable, repeatable,
and reproducible experimentation are required. In this paper, we survey the
relevant information retrieval and natural language processing literature,
identify search tasks and system architectures in generative retrieval, develop
a corresponding user model, and study its operationalization. This theoretical
analysis provides a foundation and new insights for the evaluation of
generative ad hoc retrieval systems.
Comment: 14 pages, 5 figures, 1 table