    Learning from Digital Library Evaluations

    In this paper we analyse evaluation studies of the Europeana digital library from its launch in 2009 until today. Using Saracevic’s digital library evaluation framework, the studies are categorised by their constructs, contexts, criteria, and methodologies. Concentrating on studies that evaluate Europeana services or single components, we show gaps in the evaluation of certain Europeana aspects. Finally, we derive strategies for building an evaluation archive that serves as memory and supports comparisons. Peer reviewed.

    Evaluating the strategic plans of public libraries : an inspection-based approach

    For public libraries, as with most organisations, effective strategic planning is critical to longevity, facilitating cohesive and coordinated responses to the ever-present and ever-changing political, economic, social, and technological (PEST) forces which shape and influence direction. However, strategic planning is widely recognised as a challenging activity, which can be both time-consuming and unproductive, and there exists limited guidance regarding how to evaluate documented and disseminated strategic plans, particularly within the not-for-profit sector. In response, this research proposes and tests an inspection-based approach to the evaluation of strategic plans, based upon a rubric specifying the key attributes of each of the core components of a plan, combined with an appropriate assessment scale. The rubric provides a method to identify and assess the completeness of a strategic plan, extending to qualitative assessment of communication aspects such as specification and terminology, and synergistic aspects such as cohesion and integration. The method is successfully trialled across the devolved Scottish public library sector, with the strategic plans of 28 of the 32 regional networks evaluated. 17 of 28 plans (61%) were found to be incomplete and/or to contain contradictory or uncoordinated components. It is recommended that Scottish public libraries improve not only the completeness of plans, but also their precision, specificity, explicitness, coordination, and consistency, and overall mapping to library services. Recommendations are made for further widespread application of the rubric.

    In Homage of Change

    Creating sparks: comparing search results using discriminatory search term word co-occurrence to facilitate serendipity in the enterprise.

    Categories or tags in faceted search interfaces that are representative of an information item rarely convey unexpected or non-obvious associated concepts buried within search results. No prior research has been identified which assesses the usefulness of discriminative search term word co-occurrence to generate facets that act as catalysts for insightful and serendipitous encounters during exploratory search. In this study, 53 scientists from two organisations interacted with semi-interactive stimuli, 74% of whom expressed a large/moderate desire to use such techniques within their workplace. Preferences were shown for certain algorithms and colour coding. Insightful and serendipitous encounters were identified. These techniques appear to offer a significant improvement over existing approaches used within the study organisations, providing further evidence that insightful and serendipitous encounters can be facilitated in the search user interface. This research has implications for organisational learning, knowledge discovery and exploratory search interface design.
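
    The abstract does not specify which co-occurrence algorithms were compared, so the following is only a minimal sketch of the general idea: rank words by how much more often they appear in documents matching the search term than in the rest of the collection. The function name `discriminative_cooccurrences`, the whitespace tokenisation, and the add-one smoothing are all assumptions made for illustration.

```python
from collections import Counter


def discriminative_cooccurrences(documents, query_term, top_n=3):
    """Rank words by how much more often they occur in documents that
    contain `query_term` than in the remaining documents.

    Hypothetical helper for illustration; not the method used in the study.
    """
    query_term = query_term.lower()
    tokenised = [set(doc.lower().split()) for doc in documents]
    hits = [words for words in tokenised if query_term in words]
    rest = [words for words in tokenised if query_term not in words]

    # Document frequencies inside and outside the result set.
    hit_df = Counter(w for words in hits for w in words if w != query_term)
    rest_df = Counter(w for words in rest for w in words)

    def score(word):
        # Add-one smoothing (an assumption) keeps the ratio finite for
        # words that never occur outside the result set.
        p_hit = (hit_df[word] + 1) / (len(hits) + 2)
        p_rest = (rest_df[word] + 1) / (len(rest) + 2)
        return p_hit / p_rest

    # Highest ratio first; ties are broken alphabetically for determinism.
    ranked = sorted(hit_df, key=lambda w: (-score(w), w))
    return ranked[:top_n]
```

    Words that are frequent everywhere score close to 1 and sink in the ranking, while words concentrated in the result set rise to the top and could be offered as candidate facets.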

    Exploiting Query’s Temporal Patterns for Query Autocompletion

    Query autocompletion (QAC) is a common interactive feature of web search engines. It aims to help users formulate queries and avoid spelling mistakes by presenting them with a list of query completions as soon as they start typing in the search box. Existing QAC models mostly rank the query completions by their past popularity as recorded in query logs. For some queries, popularity exhibits relatively stable or periodic behavior, while others may experience a sudden rise in popularity. Current time-sensitive QAC models focus on either periodicity or recency and are unable to respond swiftly to such a sudden rise, resulting in suboptimal QAC performance. In this paper, we propose a hybrid QAC model that considers two temporal patterns of a query’s popularity, namely periodicity and burst trend. In detail, we first employ the Discrete Fourier Transform (DFT) to identify the periodicity of a query’s popularity, from which we forecast its future popularity. Then the burst trend of the query’s popularity is detected and incorporated into the hybrid model together with its cyclic behavior. Extensive experiments on a large, real-world query log dataset indicate that modeling the temporal patterns of query popularity in the form of its periodicity and its burst trend can significantly improve the effectiveness of ranking query completions.
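
    The two ingredients named above can be sketched as follows, under loudly stated assumptions: the DFT step picks the strongest non-constant frequency bin of a query's popularity series, the periodic forecast averages past values at the same phase, and the burst component is replaced by a simple linear extrapolation of recent change. The function names, the `window` and `weight` parameters, and the linear combination are hypothetical; the abstract does not give the paper's actual formulas.

```python
import cmath
import math


def dft_magnitudes(series):
    """Return |X_k| for k = 0..n-1 using a naive O(n^2) DFT."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # removes the constant (k = 0) term
    mags = []
    for k in range(n):
        coeff = sum(
            centered[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        mags.append(abs(coeff))
    return mags


def dominant_period(series):
    """Estimate the dominant period (in time steps); needs len(series) >= 2."""
    n = len(series)
    mags = dft_magnitudes(series)
    # For a real-valued signal only bins 1..n//2 carry distinct information.
    k_best = max(range(1, n // 2 + 1), key=lambda k: mags[k])
    return n / k_best


def periodic_forecast(series, period):
    """Forecast the next value as the mean of past values at the same phase."""
    period = int(round(period))
    n = len(series)
    same_phase = [series[t] for t in range(n % period, n, period)]
    return sum(same_phase) / len(same_phase)


def recent_trend_forecast(series, window=3):
    """Stand-in for burst detection (an assumption): extrapolate one step
    ahead from the average change over the last `window` steps."""
    recent = series[-(window + 1):]
    if len(recent) < 2:
        return float(series[-1])
    slope = (recent[-1] - recent[0]) / (len(recent) - 1)
    return max(0.0, series[-1] + slope)


def hybrid_score(series, weight=0.5):
    """Blend the periodic and recent-trend forecasts (hypothetical weighting)."""
    period = dominant_period(series)
    return (weight * periodic_forecast(series, period)
            + (1 - weight) * recent_trend_forecast(series))
```

    Candidate completions could then be ranked by `hybrid_score` computed over each candidate's popularity time series; how the weight should be chosen is left open here.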

    Resource discovery in heterogeneous digital content environments

    The concept of 'resource discovery' is central to our understanding of how users explore, navigate, locate and retrieve information resources. This submission for a PhD by Published Works examines a series of 11 related works which explore topics pertaining to resource discovery, each demonstrating heterogeneity in their digital discovery context. The assembled works are prefaced by nine chapters which seek to review and critically analyse the contribution of each work, as well as provide contextualization within the wider body of research literature. A series of conceptual sub-themes is used to organize and structure the works and the accompanying critical commentary. The thesis first begins by examining issues in distributed discovery contexts by studying collection level metadata (CLM), its application in 'information landscaping' techniques, and its relationship to the efficacy of federated item-level search tools. This research narrative continues but expands in the later works and commentary to consider the application of Knowledge Organization Systems (KOS), particularly within Semantic Web and machine interface contexts, with investigations of semantically aware terminology services in distributed discovery. The necessary modelling of data structures to support resource discovery - and its associated functionalities within digital libraries and repositories - is then considered within the novel context of technology-supported curriculum design repositories, where questions of human-computer interaction (HCI) are also examined. The final works studied as part of the thesis are those which investigate and evaluate the efficacy of open repositories in exposing knowledge commons to resource discovery via web search agents. Through the analysis of the collected works it is possible to identify a unifying theory of resource discovery, with the proposed concept of (meta)data alignment described and presented with a visual model. 
This analysis assists in the identification of a number of research topics worthy of further research; but it also highlights an incremental transition by the present author, from using research to inform the development of technologies designed to support or facilitate resource discovery, particularly at a 'meta' level, to the application of specific technologies to address resource discovery issues in a local context. Despite this variation the research narrative has remained focussed on topics surrounding resource discovery in heterogeneous digital content environments and is noted as having generated a coherent body of work.

    Curadoria digital : o conceito no período de 2000 a 2013 [Digital curation: the concept in the period 2000 to 2013]

    Dissertation (master's), Universidade de Brasília, Faculdade de Ciência da Informação, Programa de Pós-Graduação em Ciência da Informação, 2014. This research aims to provide a brief overview of the concept of digital curation. To this end, a literature review was performed, in addition to a thorough search in specialized Information Science databases, in order to investigate the writings on the topic from 2000 to November 2013, which were subsequently analyzed bibliometrically and substantively. The literature review covers the areas of digital libraries, digital preservation and digital curation, in order to identify a theoretical path toward the emergence of the concept of digital curation from digital preservation and digital libraries. A representative sample of the research output is analyzed in terms of characteristics concerning the form of the documents, such as authorship, authors’ affiliations, publication year, document type, language and keywords assigned. In assessing the textual part of the records collected, the research aimed to determine what the authors understand as curation, in order to clarify and consolidate the definition of digital curation and its importance for the preservation of digital information. We conclude that digital curation is developing rapidly and constitutes an umbrella term that encompasses related definitions focused on the selection, enrichment, processing and preservation of information for future access and use.

    Qualitätskontrolle mittels semantischer Technologien in digitalen Bibliotheken [Quality control by means of semantic technologies in digital libraries]

    Controlled content quality, especially in terms of indexing, is one of the major advantages of using digital libraries in contrast to general Web sources or Web search engines. Therefore, more and more digital libraries offer corpora related to a specialized domain. Beyond simple keyword-based searches, the resulting information systems often rely on entity-centered searches. To offer this kind of search, high-quality document processing is essential. However, considering today’s information flood, the mostly manual effort in acquiring new sources and creating suitable (semantic) metadata for content indexing and retrieval is already prohibitive. A recent solution is the automatic generation of metadata, where mostly statistical techniques such as document classification and entity extraction are becoming more widespread. But in this case neglecting quality assurance is even more problematic, because heuristic generation often fails and the resulting low-quality metadata will directly diminish the quality of service that a digital library provides. Thus, quality assessment of the metadata annotations that an information system uses for subsequent querying of collections has to be enabled. In this thesis we discuss the importance of metadata quality assessment for information systems and the benefits gained from controlled and guaranteed quality.

    Information Retrieval for Multivariate Research Data Repositories

    In this dissertation, I tackle the challenge of information retrieval for multivariate research data by providing novel means of content-based access. Large amounts of multivariate data are produced and collected in different areas of scientific research and industrial applications, including the human or natural sciences, the social or economic sciences and applications like quality control, security and machine monitoring. Archival and re-use of this kind of data has been identified as an important factor in the supply of information to support research and industrial production. Due to increasing efforts in the digital library community, such multivariate data are collected, archived and often made publicly available by specialized research data repositories. A multivariate research data document consists of tabular data with m columns (measurement parameters, e.g., temperature, pressure, humidity, etc.) and n rows (observations). To render such data-sets accessible, they are annotated with meta-data according to a well-defined meta-data standard when being archived. These annotations include the time, location, parameters, title and author (and potentially many more) of the document concerned. In particular for multivariate data, each column is annotated with the parameter name and unit of its data (e.g., water depth [m]). The task of retrieving and ranking the documents an information seeker is looking for is an important and difficult challenge. To date, access to this data is primarily provided by means of annotated, textual meta-data as described above. An information seeker can search for documents of interest by querying the annotated meta-data. For example, an information seeker can retrieve all documents that were obtained in a specific region or within a certain period of time. Similarly, she can search for data-sets that contain a particular measurement via its parameter name or search for data-sets that were produced by a specific scientist. 
However, retrieval via textual annotations is limited and does not allow for content-based search, e.g., retrieving data which contains a particular measurement pattern like a linear relationship between water depth and water pressure, or which is similar to example data the information seeker provides. In this thesis, I deal with this challenge and develop novel indexing and retrieval schemes to extend the established, meta-data based access to multivariate research data. By analyzing and indexing the data patterns occurring in multivariate data, one can support new techniques for content-based retrieval and exploration, well beyond meta-data based query methods. This allows information seekers to query for multivariate data-sets that exhibit patterns similar to an example data-set they provide. Furthermore, information seekers can specify one or more particular patterns they are looking for, to retrieve multivariate data-sets that contain similar patterns. To this end, I also develop visual-interactive techniques to support information seekers in formulating such queries, which are inherently more complex than textual search strings. These techniques include providing an overview of potentially interesting patterns to search for, which interactively adapts to the user's query as it is being entered. Furthermore, based on the pattern description of each multivariate data document, I introduce a similarity measure for multivariate data. This allows scientists to quickly discover data similar (or contradictory) to their own measurements.
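
    To make the notion of a pattern-based similarity measure concrete, here is a minimal sketch under stated assumptions: each data-set is described by the Pearson correlation of every pair of its parameters, and two data-sets are compared on the parameter pairs they share. The descriptor, the function names, and the scaling to [0, 1] are illustrative assumptions; the dissertation's actual pattern description and similarity measure are not given in the abstract.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear pattern
    return cov / math.sqrt(var_x * var_y)


def pattern_descriptor(table):
    """Describe a data-set by the correlation of each parameter pair.

    `table` maps a parameter name (column) to its list of observations.
    """
    return {
        (a, b): pearson(table[a], table[b])
        for a, b in combinations(sorted(table), 2)
    }


def pattern_similarity(table_a, table_b):
    """Similarity in [0, 1] over the parameter pairs both data-sets share."""
    desc_a = pattern_descriptor(table_a)
    desc_b = pattern_descriptor(table_b)
    shared = desc_a.keys() & desc_b.keys()
    if not shared:
        return 0.0
    mean_diff = sum(abs(desc_a[p] - desc_b[p]) for p in shared) / len(shared)
    # Correlations lie in [-1, 1], so the mean difference lies in [0, 2].
    return 1.0 - mean_diff / 2.0
```

    Under this sketch, two data-sets that both show water pressure rising linearly with water depth score close to 1, while a data-set showing the opposite relationship scores close to 0, which corresponds to the "contradictory" case mentioned above.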
