
    Towards Cleaning-up Open Data Portals: A Metadata Reconciliation Approach

    This paper presents an approach for metadata reconciliation, curation and linking for Open Governmental Data Portals (ODPs). ODPs have lately become the standard solution for governments willing to make their public data available to society. Portal managers use several types of metadata to organize the datasets, one of the most important being tags. However, the tagging process is subject to many problems, such as synonyms, ambiguity and incoherence, among others. As our empirical analysis of ODPs shows, these issues are currently prevalent in most ODPs and effectively hinder the reuse of Open Data. To address these problems, we develop and implement an approach for tag reconciliation in Open Data Portals, encompassing local actions related to individual portals and global actions for adding a semantic metadata layer above individual portals. The local part aims to enhance the quality of tags in a single portal, and the global part is meant to interlink ODPs by establishing relations between tags. Comment: 8 pages, 10 figures. Under revision for ICSC201

    Web Video in Numbers - An Analysis of Web-Video Metadata

    Web video is often used as a source of data in various fields of study. While specialized subsets of web video, mainly earmarked for dedicated purposes, are often analyzed in detail, there is little information available about the properties of web video as a whole. In this paper we present insights gained from the analysis of the metadata associated with more than 120 million videos harvested from two popular web video platforms, Vimeo and YouTube, in 2016, and compare their properties with those found in commonly used video collections. This comparison reveals that existing collections do not (or no longer) properly reflect the properties of web video "in the wild". Comment: Dataset available from http://download-dbis.dmi.unibas.ch/WWIN

    An analytical study of content and context of keywords on physics

    This paper is based on the analysis of author-assigned and title keywords, and their constituent component words, collected from 769 articles published in the journal Low Temperature Physics from 2006 to 2010. The total number of distinct keywords is 1155, of which 869 are single keywords with a total frequency of occurrence of 2287. The single keywords have been categorized into four broad classes, viz. eponymous word, form word, acronym and semantic word. A semantic word bears several contexts and thus may be considered relevant in several other subject areas. The probable subject areas have been found with the aid of two popular online reference tools. The semantic words are further categorized into twelve classes according to their contexts. Some parameters have been defined on the basis of associations among the words and the resulting formation of keywords, namely Word Association Density, Word Association Coefficient and Keyword Formation Density. The values of these parameters have been observed for different word categories. This study provides statistics on how word associations lead to keyword formation. It also makes the allied subject domains predictable.

    A Semantic Framework for the Analysis of Privacy Policies
