226 research outputs found
A Complete Year of User Retrieval Sessions in a Social Sciences Academic Search Engine
In this paper, we present an open data set extracted from the transaction log
of the social sciences academic search engine sowiport. The data set includes a
filtered set of 484,449 retrieval sessions which have been carried out by
sowiport users in the period from April 2014 to April 2015. We propose a
description of interactions performed by the academic search engine users that
can be used in different applications such as result ranking improvement, user
modeling, query reformulation analysis, and search pattern recognition.
Comment: 6 pages, 2 figures, accepted short paper at the 21st International
Conference on Theory and Practice of Digital Libraries (TPDL 2017)
Geological influence on the formation of Samar natural bridge and collapse valley of Ravna River from the NE Kučaj Mountains (Carpatho-Balkanides, eastern Serbia)
The paper describes the Samar natural bridge and the collapse valley of the Ravna River in eastern Serbia and proposes an interpretation of their origin and development in relation to lithological and tectonic conditions, karst processes, and petrological analyses. We present the geological setting, the detailed morphology, and a hypothesis on the genesis of these karst landforms. The relationship between surface karst development and geology is widely acknowledged. The major contribution of the paper is a framework for considering how recrystallization of limestone can affect the weathering potential of karst landforms, together with the introduction of the term "collapse valley". Finally, the study shows that the weathering potential of the Samar natural bridge is reduced as a result of the diagenetic changes these limestones underwent.
Retrievability in an Integrated Retrieval System: An Extended Study
Retrievability measures the influence a retrieval system has on the access to
information in a given collection of items. This measure can support an
evaluation of the search system from which insights can be drawn. In this
paper, we investigate the retrievability in an integrated search system
consisting of items from various categories, particularly focussing on
datasets, publications, and variables in a real-life Digital Library
(DL). The traditional metrics, that is, the Lorenz curve and Gini coefficient,
are employed to visualize the diversity in retrievability scores of the
three retrievable document types (specifically datasets, publications,
and variables). Our results show a significant popularity bias with certain
items being retrieved more often than others. Particularly, it has been shown
that certain datasets are more likely to be retrieved than other datasets in
the same category. In contrast, the retrievability scores of items from the
variable or publication category are more evenly distributed. We have observed
that the distribution of document retrievability is more diverse for datasets
as compared to publications and variables.
Comment: To appear in International Journal on Digital Libraries (IJDL). arXiv
admin note: substantial text overlap with arXiv:2205.0093
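The Lorenz curve and Gini coefficient mentioned in this abstract can be illustrated with a minimal sketch. It assumes each item already has a non-negative retrievability score; the function names `gini` and `lorenz_points` are hypothetical, and nothing here reproduces the paper's own implementation or data.

```python
from typing import List, Sequence, Tuple


def gini(scores: Sequence[float]) -> float:
    """Gini coefficient of non-negative scores.

    0.0 means every item is equally retrievable; values close to 1.0
    mean retrievability is concentrated in a few items.
    """
    values = sorted(scores)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula on ascending-sorted values x_1..x_n:
    # G = sum((2i - n - 1) * x_i) / (n * sum(x))
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(values, start=1))
    return weighted / (n * total)


def lorenz_points(scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Points (share of items, cumulative share of score) of the Lorenz curve."""
    values = sorted(scores)
    n = len(values)
    total = sum(values)
    points = [(0.0, 0.0)]
    if n == 0 or total == 0:
        return points
    running = 0.0
    for i, x in enumerate(values, start=1):
        running += x
        points.append((i / n, running / total))
    return points
```

Computing `gini` separately for each document type (datasets, publications, variables) would give one number per category that can be compared directly; a higher value indicates a stronger bias towards a few frequently retrieved items.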
Contextualised Browsing in a Digital Library's Living Lab
Contextualisation has proven to be effective in tailoring search
results towards the users' information need. While this is true for a basic
query search, the usage of contextual session information during exploratory
search especially on the level of browsing has so far been underexposed in
research. In this paper, we present two approaches that contextualise browsing
on the level of structured metadata in a Digital Library (DL): (1) one variant
is based on document similarity and (2) one variant utilises implicit session
information, such as queries and different document metadata encountered during
the user's session. We evaluate our approaches in a living lab environment
using a DL in the social sciences and compare our contextualisation approaches
against a non-contextualised approach. For a period of more than three months
we analysed 47,444 unique retrieval sessions that contain search activities on
the level of browsing. Our results show that a contextualisation of browsing
significantly outperforms our baseline in terms of the position of the first
clicked item in the result set. The mean rank of the first clicked document
(measured as mean first relevant - MFR) was 4.52 using a non-contextualised
ranking compared to 3.04 when re-ranking the result lists based on similarity
to the previously viewed document. Furthermore, we observed that both
contextual approaches show a noticeably higher click-through rate. A
contextualisation based on document similarity leads to almost twice as many
document views compared to the non-contextualised ranking.
Comment: 10 pages, 2 figures, paper accepted at JCDL 201
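The mean first relevant (MFR) measure reported in this abstract can be sketched as the average rank of the first clicked document per session. The sketch assumes ranks are 1-based and that sessions without any click are skipped; both are assumptions for illustration, and the function name `mean_first_relevant` is hypothetical.

```python
from typing import Iterable, Optional


def mean_first_relevant(first_click_ranks: Iterable[Optional[int]]) -> float:
    """Average 1-based rank of the first clicked document across sessions.

    Each element is the rank of the first click in one session, or None
    if the session had no click (such sessions are ignored here).
    """
    ranks = [rank for rank in first_click_ranks if rank is not None]
    if not ranks:
        raise ValueError("no session contains a click")
    return sum(ranks) / len(ranks)


# Toy data, not taken from the paper: four sessions, one without a click.
baseline = mean_first_relevant([1, 3, None, 5])
```

A lower MFR means that users found a document worth clicking nearer the top of the list, which is the sense in which 3.04 improves on 4.52.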
Semi-automatic subject indexing for integrating external semantic content within a medical collaboration platform
With 21 million article citations, PubMed is one of the most extensive information systems in the field of medicine. Using a uniform terminology (Medical Subject Headings, MeSH) to index PubMed content makes it easier to navigate such large data collections. Although a controlled vocabulary offers numerous advantages over free-text search in information retrieval, users often find it difficult to map an information need onto the terminology used. This work develops system support that automates the mapping process by automatically indexing text-based content using a controlled vocabulary. Using a uniform terminology in this way allows PubMed content to be integrated consistently.
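Mapping free text onto a controlled vocabulary can be illustrated with a simple dictionary lookup. This is only a sketch under stated assumptions: the tiny vocabulary, its synonym lists, and the function name `suggest_terms` are made up for illustration, and the abstract does not say which matching technique the described system uses.

```python
import re
from typing import Dict, List


def _normalize(text: str) -> str:
    """Lower-case the text and reduce it to space-separated word tokens."""
    return " ".join(re.findall(r"\w+", text.lower()))


def suggest_terms(text: str, vocabulary: Dict[str, List[str]]) -> List[str]:
    """Suggest controlled headings whose surface forms occur in the text.

    `vocabulary` maps a heading to the surface forms (synonyms) that should
    trigger it. Headings are returned most frequent first, ties broken
    alphabetically; a user would confirm or reject each suggestion.
    """
    normalized = _normalize(text)
    hits: Dict[str, int] = {}
    for heading, forms in vocabulary.items():
        count = 0
        for form in forms:
            pattern = _normalize(form)
            if pattern:
                # Whole-word match on the normalized text.
                count += len(re.findall(rf"\b{re.escape(pattern)}\b", normalized))
        if count:
            hits[heading] = count
    return sorted(hits, key=lambda heading: (-hits[heading], heading))


# Toy vocabulary for illustration only; not an extract of MeSH.
toy_vocabulary = {
    "Myocardial Infarction": ["myocardial infarction", "heart attack"],
    "Hypertension": ["hypertension", "high blood pressure"],
}
suggestions = suggest_terms(
    "Patients with high blood pressure have a raised risk of heart attack; "
    "a second heart attack is common.",
    toy_vocabulary,
)
```

Returning ranked suggestions rather than assigning headings directly keeps the process semi-automatic: the system proposes terms and a person makes the final choice.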
- …