983 research outputs found

    Evaluating the SiteStory Transactional Web Archive with the ApacheBench Tool

    PDF of a PowerPoint presentation from TPDL 2013: 17th International Conference on Theory and Practice of Digital Libraries, Valletta, Malta, September 22-26, 2013. Also available on Slideshare. https://digitalcommons.odu.edu/computerscience_presentations/1012/thumbnail.jp

    Music Video Redundancy and Half-Life in YouTube

    PDF of a PowerPoint presentation from TPDL 2011: 15th International Conference on Theory and Practice of Digital Libraries, Berlin, Germany, September 25-29, 2011. Also available on Slideshare. https://digitalcommons.odu.edu/computerscience_presentations/1019/thumbnail.jp

    A Complete Year of User Retrieval Sessions in a Social Sciences Academic Search Engine

    In this paper, we present an open data set extracted from the transaction log of the social sciences academic search engine sowiport. The data set includes a filtered set of 484,449 retrieval sessions carried out by sowiport users between April 2014 and April 2015. We propose a description of the interactions performed by users of the academic search engine that can be used in applications such as result ranking improvement, user modeling, query reformulation analysis, and search pattern recognition.
    Comment: 6 pages, 2 figures, accepted short paper at the 21st International Conference on Theory and Practice of Digital Libraries (TPDL 2017)

    Towards Better Understanding Researcher Strategies in Cross-Lingual Event Analytics

    With an increasing amount of information on globally important events, there is a growing demand for efficient analytics of multilingual event-centric information. Such analytics is particularly challenging due to the large amount of content, the event dynamics and the language barrier. Although memory institutions increasingly collect event-centric Web content in different languages, very little is known about the strategies of researchers who conduct analytics of such content. In this paper we present researchers' strategies for the content, method and feature selection in the context of cross-lingual event-centric analytics observed in two case studies on multilingual Wikipedia. We discuss the influence factors for these strategies, the findings enabled by the adopted methods along with the current limitations and provide recommendations for services supporting researchers in cross-lingual event-centric analytics.
    Comment: In Proceedings of the International Conference on Theory and Practice of Digital Libraries 201

    Improving Retrieval Results with discipline-specific Query Expansion

    Choosing the right terms to describe an information need is becoming more difficult as the amount of available information increases. Search-Term-Recommendation (STR) systems can help to overcome these problems. This paper evaluates the benefits that may be gained from the use of STRs in Query Expansion (QE). We create 17 STRs, 16 based on specific disciplines and one giving general recommendations, and compare the retrieval performance of these STRs. The main findings are: (1) QE with specific STRs leads to significantly better results than QE with a general STR; (2) QE with specific STRs selected by a heuristic mechanism of topic classification leads to better results than the general STR; however, (3) selecting the best matching specific STR automatically is a major challenge of this process.
    Comment: 6 pages; to be published in Proceedings of Theory and Practice of Digital Libraries 2012 (TPDL 2012)

    Integrating Research Data Management into Geographical Information Systems

    Ocean modelling requires the production of high-fidelity computational meshes upon which to solve the equations of motion. The production of such meshes by hand is often infeasible, considering the complexity of the bathymetry and coastlines. The use of Geographical Information Systems (GIS) is therefore a key component to discretising the region of interest and producing a mesh appropriate to resolve the dynamics. However, all data associated with the production of a mesh must be provided in order to contribute to the overall recomputability of the subsequent simulation. This work presents the integration of research data management in QMesh, a tool for generating meshes using GIS. The tool uses the PyRDM library to provide a quick and easy way for scientists to publish meshes, and all data required to regenerate them, to persistent online repositories. These repositories are assigned unique identifiers to enable proper citation of the meshes in journal articles.
    Comment: Accepted, camera-ready version. To appear in the Proceedings of the 5th International Workshop on Semantic Digital Archives (http://sda2015.dke-research.de/), held in Poznań, Poland on 18 September 2015 as part of the 19th International Conference on Theory and Practice of Digital Libraries (http://tpdl2015.info/)

    Query Expansion for Survey Question Retrieval in the Social Sciences

    In recent years, the importance of research data and the need to archive and to share it in the scientific community have increased enormously. This introduces a whole new set of challenges for digital libraries. In the social sciences, typical research data sets consist of surveys and questionnaires. In this paper we focus on the use case of social science survey question reuse and on mechanisms to support users in the query formulation for data sets. We describe and evaluate thesaurus- and co-occurrence-based approaches for query expansion to improve retrieval quality in digital libraries and research data archives. The challenge here is to translate the information need and the underlying sociological phenomena into proper queries. We show that retrieval quality can be improved by adding related terms to the queries. In a direct comparison, queries expanded automatically with extracted co-occurring terms can provide better results than queries manually reformulated by a domain expert, and better results than a keyword-based BM25 baseline.
    Comment: to appear in Proceedings of 19th International Conference on Theory and Practice of Digital Libraries 2015 (TPDL 2015)

    On the Change in Archivability of Websites Over Time

    As web technologies evolve, web archivists work to keep up so that our digital history is preserved. Recent advances in web technologies have introduced client-side executed scripts that load data without a referential identifier or that require user interaction (e.g., content loading when the page has scrolled). These advances have made automating methods for capturing web pages more difficult. Because of the evolving schemes of publishing web pages along with the progressive capability of web preservation tools, the archivability of pages on the web has varied over time. In this paper we show that the archivability of a web page can be deduced from the type of page being archived, which aligns with that page's accessibility with respect to dynamic content. We show concrete examples of when these technologies were introduced by referencing mementos of pages that have persisted through a long evolution of available technologies. Identifying the reasons why these web pages could not be archived in the past, with respect to accessibility, serves as a guide for ensuring that content that has longevity is published using good practice methods that make it available for preservation.
    Comment: 12 pages, 8 figures, Theory and Practice of Digital Libraries (TPDL) 2013, Valletta, Malta

    Towards Serendipitous Research Paper Recommender Using Tweets and Diversification

    23rd International Conference on Theory and Practice of Digital Libraries, TPDL 2019, Oslo, Norway, September 9-12, 2019. Part of the Lecture Notes in Computer Science book series (LNCS, volume 11799), also part of the Information Systems and Applications, incl. Internet/Web, and HCI book sub series (LNISA, volume 11799).
    In this paper, we examine whether a user’s tweets can help to generate more serendipitous recommendations. In addition, we investigate whether diversification applied to a list of recommended items further improves serendipity. To this end, we conduct an experiment with n = 22 subjects. The result of our experiment shows that the subject’s tweets did not improve serendipity, but diversification results in more serendipitous recommendations.