19,982 research outputs found

    Exploring The Value Of Folksonomies For Creating Semantic Metadata

    No full text
    Finding good keywords to describe resources is an on-going problem: typically we select such words manually from a thesaurus of terms, or they are created using automatic keyword extraction techniques. Folksonomies are an increasingly well populated source of unstructured tags describing web resources. This paper explores the value of the folksonomy tags as potential source of keyword metadata by examining the relationship between folksonomies, community produced annotations, and keywords extracted by machines. The experiment has been carried-out in two ways: subjectively, by asking two human indexers to evaluate the quality of the generated keywords from both systems; and automatically, by measuring the percentage of overlap between the folksonomy set and machine generated keywords set. The results of this experiment show that the folksonomy tags agree more closely with the human generated keywords than those automatically generated. The results also showed that the trained indexers preferred the semantics of folksonomy tags compared to keywords extracted automatically. These results can be considered as evidence for the strong relationship of folksonomies to the human indexer’s mindset, demonstrating that folksonomies used in the del.icio.us bookmarking service are a potential source for generating semantic metadata to annotate web resources

    Evolutionary Subject Tagging in the Humanities; Supporting Discovery and Examination in Digital Cultural Landscapes

    Get PDF
    In this paper, the authors attempt to identify problematic issues for subject tagging in the humanities, particularly those associated with information objects in digital formats. In the third major section, the authors identify a number of assumptions that lie behind the current practice of subject classification that we think should be challenged. We move then to propose features of classification systems that could increase their effectiveness. These emerged as recurrent themes in many of the conversations with scholars, consultants, and colleagues. Finally, we suggest next steps that we believe will help scholars and librarians develop better subject classification systems to support research in the humanities.NEH Office of Digital Humanities: Digital Humanities Start-Up Grant (HD-51166-10

    Intelligent XML Tag Classification Techniques for XML Encryption Improvement

    Get PDF
    Flexibility, friendliness, and adaptability have been key components to use XML to exchange information across different networks providing the needed common syntax for various messaging systems. However excess usage of XML as a communication medium shed the light on security standards used to protect exchanged messages achieving data confidentiality and privacy. This research presents a novel approach to secure XML messages being used in various systems with efficiency providing high security measures and high performance. system model is based on two major modules, the first to classify XML messages and define which parts of the messages to be secured assigning an importance level for each tag presented in XML message and then using XML encryption standard proposed earlier by W3C [3] to perform a partial encryption on selected parts defined in classification stage. As a result, study aims to improve both the performance of XML encryption process and bulk message handling to achieve data cleansing efficiently

    Content repositories and social networking : can there be synergies?

    Get PDF
    This paper details the novel application of Web 2.0 concepts to current services offered to Social Scientists by the ReDReSS project, carried out by the Centre for e-Science at Lancaster University. We detail plans to introduce Social Bookmarking and Social Networking concepts into the repository software developed by the project. This will result in the improved discovery of e-Science concepts and training to Social Scientists and allow for much improved linking of resources in the repository. We describe plans that use Social Networking and Social Bookmarking concepts, using Open Standards, which will promote collaboration between researchers by using information gathered on user’s use of the repository and information about the user. This will spark collaborations that would not normally be possible in the academic repository context

    Folksonomy: the New Way to Serendipity

    Get PDF
    Folksonomy expands the collaborative process by allowing contributors to index content. It rests on three powerful properties: the absence of a prior taxonomy, multi-indexation and the absence of thesaurus. It concerns a more exploratory search than an entry in a search engine. Its original relationship-based structure (the three-way relationship between users, content and tags) means that folksonomy allows various modalities of curious explorations: a cultural exploration and a social exploration. The paper has two goals. Firstly, it tries to draw a general picture of the various folksonomy websites. Secundly, since labelling lacks any standardisation, folksonomies are often under threat of invasion by noise. This paper consequently tries to explore the different possible ways of regulating the self-generated indexation process.taxonomy; indexation; innovation and user-created content

    A matter of words: NLP for quality evaluation of Wikipedia medical articles

    Get PDF
    Automatic quality evaluation of Web information is a task with many fields of applications and of great relevance, especially in critical domains like the medical one. We move from the intuition that the quality of content of medical Web documents is affected by features related with the specific domain. First, the usage of a specific vocabulary (Domain Informativeness); then, the adoption of specific codes (like those used in the infoboxes of Wikipedia articles) and the type of document (e.g., historical and technical ones). In this paper, we propose to leverage specific domain features to improve the results of the evaluation of Wikipedia medical articles. In particular, we evaluate the articles adopting an "actionable" model, whose features are related to the content of the articles, so that the model can also directly suggest strategies for improving a given article quality. We rely on Natural Language Processing (NLP) and dictionaries-based techniques in order to extract the bio-medical concepts in a text. We prove the effectiveness of our approach by classifying the medical articles of the Wikipedia Medicine Portal, which have been previously manually labeled by the Wiki Project team. The results of our experiments confirm that, by considering domain-oriented features, it is possible to obtain sensible improvements with respect to existing solutions, mainly for those articles that other approaches have less correctly classified. Other than being interesting by their own, the results call for further research in the area of domain specific features suitable for Web data quality assessment

    Ensuring the discoverability of digital images for social work education : an online tagging survey to test controlled vocabularies

    Get PDF
    The digital age has transformed access to all kinds of educational content not only in text-based format but also digital images and other media. As learning technologists and librarians begin to organise these new media into digital collections for educational purposes, older problems associated with cataloguing and classifying non-text media have re-emerged. At the heart of this issue is the problem of describing complex and highly subjective images in a reliable and consistent manner. This paper reports on the findings of research designed to test the suitability of two controlled vocabularies to index and thereby improve the discoverability of images stored in the Learning Exchange, a repository for social work education and research. An online survey asked respondents to "tag", a series of images and responses were mapped against the two controlled vocabularies. Findings showed that a large proportion of user generated tags could be mapped to the controlled vocabulary terms (or their equivalents). The implications of these findings for indexing and discovering content are discussed in the context of a wider review of the literature on "folksonomies" (or user tagging) versus taxonomies and controlled vocabularies
    corecore