Evaluating the SiteStory Transactional Web Archive with the ApacheBench Tool
PDF of a PowerPoint presentation from TPDL 2013: 17th International Conference on Theory and Practice of Digital Libraries, Valletta, Malta, September 22-26, 2013. Also available on Slideshare.
Music Video Redundancy and Half-Life in YouTube
PDF of a PowerPoint presentation from TPDL 2011: 15th International Conference on Theory and Practice of Digital Libraries, in Berlin, Germany, September 25-29, 2011. Also available on Slideshare.
A Complete Year of User Retrieval Sessions in a Social Sciences Academic Search Engine
In this paper, we present an open data set extracted from the transaction log
of the social sciences academic search engine sowiport. The data set includes a
filtered set of 484,449 retrieval sessions which have been carried out by
sowiport users in the period from April 2014 to April 2015. We propose a
description of interactions performed by the academic search engine users that
can be used in different applications such as result ranking improvement, user
modeling, query reformulation analysis, and search pattern recognition.
Comment: 6 pages, 2 figures, accepted short paper at the 21st International
Conference on Theory and Practice of Digital Libraries (TPDL 2017)
Towards Better Understanding Researcher Strategies in Cross-Lingual Event Analytics
With an increasing amount of information on globally important events, there
is a growing demand for efficient analytics of multilingual event-centric
information. Such analytics is particularly challenging due to the large amount
of content, the event dynamics and the language barrier. Although memory
institutions increasingly collect event-centric Web content in different
languages, very little is known about the strategies of researchers who conduct
analytics of such content. In this paper we present researchers' strategies for
the content, method and feature selection in the context of cross-lingual
event-centric analytics observed in two case studies on multilingual Wikipedia.
We discuss the influence factors for these strategies, the findings enabled by
the adopted methods along with the current limitations and provide
recommendations for services supporting researchers in cross-lingual
event-centric analytics.
Comment: In Proceedings of the International Conference on Theory and Practice
of Digital Libraries 201
Improving Retrieval Results with discipline-specific Query Expansion
Choosing the right terms to describe an information need is becoming more
difficult as the amount of available information increases.
Search-Term-Recommendation (STR) systems can help to overcome these problems.
This paper evaluates the benefits that may be gained from the use of STRs in
Query Expansion (QE). We create 17 STRs, 16 based on specific disciplines and
one giving general recommendations, and compare the retrieval performance of
these STRs. The main findings are: (1) QE with specific STRs leads to
significantly better results than QE with a general STR, (2) QE with specific
STRs selected by a heuristic mechanism of topic classification leads to better
results than the general STR, however (3) selecting the best matching specific
STR in an automatic way is a major challenge of this process.
Comment: 6 pages; to be published in Proceedings of Theory and Practice of
Digital Libraries 2012 (TPDL 2012)
Integrating Research Data Management into Geographical Information Systems
Ocean modelling requires the production of high-fidelity computational meshes
upon which to solve the equations of motion. The production of such meshes by
hand is often infeasible, considering the complexity of the bathymetry and
coastlines. The use of Geographical Information Systems (GIS) is therefore a
key component to discretising the region of interest and producing a mesh
appropriate to resolve the dynamics. However, all data associated with the
production of a mesh must be provided in order to contribute to the overall
recomputability of the subsequent simulation. This work presents the
integration of research data management in QMesh, a tool for generating meshes
using GIS. The tool uses the PyRDM library to provide a quick and easy way for
scientists to publish meshes, and all data required to regenerate them, to
persistent online repositories. These repositories are assigned unique
identifiers to enable proper citation of the meshes in journal articles.
Comment: Accepted, camera-ready version. To appear in the Proceedings of the
5th International Workshop on Semantic Digital Archives
(http://sda2015.dke-research.de/), held in Poznań, Poland on 18 September
2015 as part of the 19th International Conference on Theory and Practice of
Digital Libraries (http://tpdl2015.info/)
Query Expansion for Survey Question Retrieval in the Social Sciences
In recent years, the importance of research data and the need to archive and
to share it in the scientific community have increased enormously. This
introduces a whole new set of challenges for digital libraries. In the social
sciences typical research data sets consist of surveys and questionnaires. In
this paper we focus on the use case of social science survey question reuse and
on mechanisms to support users in the query formulation for data sets. We
describe and evaluate thesaurus- and co-occurrence-based approaches for query
expansion to improve retrieval quality in digital libraries and research data
archives. The challenge here is to translate the information need and the
underlying sociological phenomena into proper queries. As we show, retrieval
quality can be improved by adding related terms to the queries. In a direct
comparison, automatically expanded queries using extracted co-occurring terms
can provide better results than queries manually reformulated by a domain
expert and better results than a keyword-based BM25 baseline.
Comment: to appear in Proceedings of 19th International Conference on Theory
and Practice of Digital Libraries 2015 (TPDL 2015)
On the Change in Archivability of Websites Over Time
As web technologies evolve, web archivists work to keep up so that our
digital history is preserved. Recent advances in web technologies have
introduced client-side executed scripts that load data without a referential
identifier or that require user interaction (e.g., content loading when the
page has scrolled). These advances have made automating methods for capturing
web pages more difficult. Because of the evolving schemes of publishing web
pages along with the progressive capability of web preservation tools, the
archivability of pages on the web has varied over time. In this paper we show
that the archivability of a web page can be deduced from the type of page being
archived, which aligns with that page's accessibility with respect to dynamic
content. We show concrete examples of when these technologies were introduced
by referencing mementos of pages that have persisted through a long evolution
of available technologies. Identifying the reasons these web pages could not
be archived in the past, with respect to accessibility, serves as a
guide for ensuring that content that has longevity is published using good
practice methods that make it available for preservation.
Comment: 12 pages, 8 figures, Theory and Practice of Digital Libraries (TPDL)
2013, Valletta, Malta
Towards Serendipitous Research Paper Recommender Using Tweets and Diversification
23rd International Conference on Theory and Practice of Digital Libraries, TPDL 2019, Oslo, Norway, September 9-12, 2019. Part of the Lecture Notes in Computer Science book series (LNCS, volume 11799), also part of the Information Systems and Applications, incl. Internet/Web, and HCI book sub series (LNISA, volume 11799).
In this paper, we examine whether a user’s tweets can help to generate more serendipitous recommendations. In addition, we investigate whether the use of diversification applied to a list of recommended items further improves serendipity. To this end, we conduct an experiment with n = 22 subjects. The result of our experiment shows that the subject’s tweets did not improve serendipity, but diversification results in more serendipitous recommendations.