247 research outputs found

    Generation of Classificatory Metadata for Web Resources using Social Tags

    With the increasing popularity of social tagging systems, the potential for using social tags as a source of metadata is being explored. Social tagging systems can simplify the involvement of a large number of users and improve the metadata generation process, especially for semantic metadata. This research aims to find a method to categorize web resources using social tags as metadata. In this research, social tagging systems are a mechanism to allow non-professional catalogers to participate in metadata generation. Because social tags are not from a controlled vocabulary, there are issues that have to be addressed in finding quality terms to represent the content of a resource. This research examines ways to deal with those issues to obtain a set of tags representing the resource from the tags provided by users. Two measures of a tag's importance are introduced. Annotation Dominance (AD) measures how much users agree on a tag term. Cross Resources Annotation Discrimination (CRAD) discriminates among tags in the collection; it is designed to remove tags that are used too broadly or too narrowly in the collection. Further, the study suggests a process to identify and manage compound tags. The research aims to select important annotations (meta-terms) and remove meaningless ones (noise) from the tag set. This study, therefore, suggests two main measurements for getting a subset of tags with classification potential. To evaluate the proposed approach to finding classificatory metadata candidates, we rely on users' relevance judgments comparing suggested tag terms and expert metadata terms. Human judges rate how relevant each term is for the given resource on an n-point scale.
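The abstract names the two measures but does not give their formulas, so the following is only a rough sketch under stated assumptions. It treats AD as the share of a resource's taggers who used a given tag, and it approximates the CRAD filter as a band on the share of resources a tag appears on. The function names, the `low` and `high` thresholds, and the lowercase normalization are all hypothetical and are not taken from the study.

```python
from collections import defaultdict


def annotation_dominance(annotations):
    """Assumed reading of AD: for each resource, the share of its taggers
    who used each tag. `annotations` is a list of (user, resource, tag)."""
    users_by_resource = defaultdict(set)
    users_by_tag = defaultdict(lambda: defaultdict(set))
    for user, resource, tag in annotations:
        tag = tag.strip().lower()  # hypothetical normalization step
        users_by_resource[resource].add(user)
        users_by_tag[resource][tag].add(user)
    return {
        resource: {
            tag: len(users) / len(users_by_resource[resource])
            for tag, users in tags.items()
        }
        for resource, tags in users_by_tag.items()
    }


def discriminating_tags(annotations, low, high):
    """Assumed stand-in for the CRAD filter: keep tags whose share of
    resources lies in [low, high], dropping very broad and very rare tags."""
    resources = set()
    resources_by_tag = defaultdict(set)
    for _user, resource, tag in annotations:
        resources.add(resource)
        resources_by_tag[tag.strip().lower()].add(resource)
    total = len(resources)
    return {
        tag
        for tag, tagged in resources_by_tag.items()
        if low <= len(tagged) / total <= high
    }


# Toy data, invented for illustration only.
sample = [
    ("u1", "r1", "Python"), ("u2", "r1", "python"),
    ("u3", "r1", "web"), ("u1", "r1", "typo"),
    ("u1", "r2", "python"), ("u2", "r2", "web"),
    ("u1", "r3", "web"),
    ("u2", "r4", "web"),
]
ad = annotation_dominance(sample)
kept = discriminating_tags(sample, low=0.3, high=0.75)
```

On this toy data, "web" appears on every resource and "typo" on only one, so both fall outside the band, while "python" is retained. Where the thresholds should sit is a design choice the abstract does not specify.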

    #MPLP: a Comparison of Domain Novice and Expert User-generated Tags in a Minimally Processed Digital Archive

    The high costs of creating and maintaining digital archives have precluded many archives from providing users with digital content or increasing the amount of digitized materials. Studies have shown users increasingly demand immediate online access to archival materials with detailed descriptions (access points). The adoption of minimal processing in digital archives limits the access points to the folder or series level rather than the item-level description users desire. User-generated content, such as tags, could supplement the minimally processed metadata, though users are reluctant to trust or use unmediated tags. This dissertation project explores the potential for controlling/mediating the supplemental metadata from user-generated tags through inclusion of only expert domain user-generated tags. The study was designed to answer three research questions with two parts each: 1(a) What are the similarities and differences between tags generated by expert and novice users in a minimally processed digital archive?, 1(b) Are there differences between expert and novice users' opinions of the tagging experience and tag creation considerations?, 2(a) In what ways do tags generated by expert and/or novice users in a minimally processed collection correspond with metadata in a traditionally processed digital archive?, 2(b) Does user knowledge affect the proportion of tags matching unselected metadata in a minimally processed digital archive?, 3(a) In what ways do tags generated by expert and/or novice users in a minimally processed collection correspond with existing users' search terms in a digital archive?, and 3(b) Does user knowledge affect the proportion of tags matching query terms in a minimally processed digital archive? The dissertation project was a mixed-methods, quasi-experimental design focused on tag generation within a sample minimally processed digital archive. The study used a sample collection of fifteen documents and fifteen photographs.
Sixty participants, divided into two groups (novices and experts) based on assessed prior knowledge of the sample collection's domain, generated tags for fifteen documents and fifteen photographs (a minimum of one tag per object). Participants completed a pre-questionnaire identifying prior knowledge and use of social tagging and archives. Additionally, participants provided their opinions regarding factors associated with tagging, including the tagging experience and considerations while creating tags, through structured and open-ended questions in a post-questionnaire. An open-coding analysis of the created tags developed a coding scheme of six major categories and six subcategories. Application of the coding scheme categorized all generated tags. Additional descriptive statistics summarized the number of tags created by each domain group (expert, novice) for all objects and divided by format (photograph, document). T-tests and Chi-square tests explored the associations (and associative strengths) between domain knowledge and the number of tags created or types of tags created for all objects and divided by format. The subsequent analysis compared the tags with the metadata from the existing collection not displayed within the sample collection participants used. Descriptive statistics summarized the proportion of tags matching unselected metadata, and Chi-square tests analyzed the findings for associations with domain knowledge. Finally, the author extracted existing users' query terms from one month of server-log data and compared them with the generated tags and unselected metadata. Descriptive statistics summarized the proportion of tags and unselected metadata matching query terms, and Chi-square tests analyzed the findings for associations with domain knowledge. Based on the findings, the author discussed the theoretical and practical implications of including social tags within a minimally processed digital archive.

    Using Technology Enabled Qualitative Research to Develop Products for the Social Good, An Overview

    This paper discusses the potential benefits of the convergence of three recent trends for the design of socially beneficial products and services: the increasing application of qualitative research techniques in a wide range of disciplines, the rapid mainstreaming of social media and mobile technologies, and the emergence of software as a service. It presents a scenario facilitating the complex data collection, analysis, storage, and reporting required for the qualitative research recommended for designing relevant solutions to address the needs of the underserved. A pilot study is used as a basis for describing the infrastructure and services required to realize this scenario. Implications for innovation of enhanced forms of qualitative research are presented.

    Semantically-enhanced image tagging system

    In multimedia databases, data are images, audio, video, texts, etc. Research interest in these types of databases has increased in the last decade or so, especially with the advent of the Internet and Semantic Web. Fundamental research issues range from unified data modelling and retrieval of data items to the dynamic nature of updates. The thesis builds on findings in Semantic Web and retrieval techniques and explores novel tagging methods for identifying data items. Tagging systems, which enable users to add tags to Internet resources such as images, video and audio to make them more manageable, have become popular. Collaborative tagging is concerned with the relationship between people and resources. Most of these resources have metadata in machine-processable format and enable users to use free-text keywords (so-called tags) as search techniques. This research references some tagging systems, e.g. Flickr, delicious and myweb2.0. The limitations of such techniques include polysemy (one word with different meanings), synonymy (different words with one meaning), different lexical forms (singular, plural, and conjugated words) and misspelling errors or alternate spellings. The work presented in this thesis introduces semantic characterization of web resources that describes the structure and organization of tagging, aiming to extend the existing Multimedia Query using similarity measures to cater for collaborative tagging. In addition, we discuss the semantic difficulties of tagging systems, suggesting improvements in their accuracy. The scope of our work is classified as follows: (i) Increase the accuracy and confidence of multimedia tagging systems. (ii) Increase the similarity measures of images by integrating varieties of measures. To address the first shortcoming, we use WordNet as a semantic lingual ontology resource in a tagging system for social sharing and retrieval of images.
For the second shortcoming, we use the similarity measures in different ways to recognise the multimedia tagging system. Fundamental to our work is the novel information model that we have constructed for our computation. This is based on the fact that an image is a rich object that can be characterised and formulated in n dimensions, each of which contains valuable information that helps increase the accuracy of the search. For example, an image of a tree in a forest contains more information than an image of the same tree in a different environment. In this thesis we characterise a data item (an image) by a primary description, followed by n secondary descriptions. As n increases, the accuracy of the search improves. We give various techniques to analyse data and its associated query. To increase the accuracy of the tagging system we have performed different experiments on many images using similarity measures and various techniques from VoI (Value of Information). The findings have shown the linkage/integration between similarity measures and that VoI improves searches and helps/guides a tagger in choosing the most adequate tags.

    Semantic technologies: from niche to the mainstream of Web 3? A comprehensive framework for web Information modelling and semantic annotation

    Context: Web information technologies developed and applied in the last decade have considerably changed the way web applications operate and have revolutionised information management and knowledge discovery. Social technologies, user-generated classification schemes and formal semantics have a far-reaching sphere of influence. They promote collective intelligence, support interoperability, enhance sustainability and instigate innovation. Contribution: The research carried out and consequent publications follow the various paradigms of semantic technologies, assess each approach, evaluate its efficiency, identify the challenges involved and propose a comprehensive framework for web information modelling and semantic annotation, which is the thesis’ original contribution to knowledge. The proposed framework assists web information modelling, facilitates semantic annotation and information retrieval, enables system interoperability and enhances information quality. Implications: Semantic technologies coupled with social media and end-user involvement can instigate innovative influence with wide organisational implications that can benefit a considerable range of industries. The scalable and sustainable business models of social computing and the collective intelligence of organisational social media can be resourcefully paired with internal research and knowledge from interoperable information repositories, back-end databases and legacy systems. Semantified information assets can free human resources so that they can be used to better serve business development, support innovation and increase productivity

    Web 2.0: Where does Europe stand?

    This report provides a techno-economic analysis of Web 2.0 and an assessment of Europe's position in Web 2.0 applications. Firstly, it introduces the phenomenon of Web 2.0 and its main characteristics: technologies, applications, and user roles. It then provides an overview of its adoption, value chain and business models, before moving to an analysis of its drivers, industrial impact and disruptive potential. Finally, the report assesses the position of the European Web 2.0 applications industry and its prospects for growth. "Web 2.0" is defined as a set of applications (blogs, wikis, social tagging, social gaming etc.), technologies (including AJAX, syndication feeds, mash-ups, and wiki engines) and user roles. The most pertinent characteristic of Web 2.0, as compared to the previous "version" of the Web, is that it enables users to become, with little effort, a co-provider of content. The available figures about recent Web 2.0 diffusion deliver two messages. Firstly, its spread is extremely rapid by any standards, although not uniform for all its applications. Secondly, the intensity with which users participate differs a lot. At the centre of the Web 2.0 value chain are the providers of Web 2.0 applications who may be pure Web 2.0 players or traditional players from related industries such as the media and Web 1.0 industry. They provide opportunities for users to network and/or to create content. As yet, no dominant revenue model for Web 2.0 content-hosting sites has been established, although advertising is the most common one. The content hosting platforms may in turn choose to remunerate content creators through different revenue-sharing schemes, or simply rely on their voluntary contributions. We discuss four aspects of Web 2.0 which may have a disruptive impact on industry: (1) Providers of Web 2.0 applications are becoming increasingly numerous and large, and contribute to growth and employment. 
(2) They already constitute an important threat to other industries, in particular content industries. The content industry is responding by diversifying into Web 2.0. (3) Web 2.0 applications and software are being increasingly adopted by the enterprise and public sectors as tools for improving internal work processes, managing customer and public relations, innovation, recruitment and networking. (4) The growth of Web 2.0 leads to a derived demand in the supply of ICT hardware and software. Europe's current position in the supply and development of Web 2.0 applications is rather weak. Although Web 2.0 is used almost as much in Europe as it is in Asia and the US, Web 2.0 applications are largely provided by US companies, while Europe and all other regions are left behind. About two thirds of the major Web 2.0 applications are provided by US companies, with similar shares for revenues, employees, and even higher shares for innovation indicators such as patents, venture capital and R&D expenditures. The corresponding shares for the EU are around 10%. Nevertheless, Europe could have the advantage in some areas of the Web 2.0 landscape, for example social gaming, social networking, and Mobile 2.0. JRC.J.4-Information Societ

    Social software for music

    Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia. Universidade do Porto. 200

    CUSTOMER REVIEWS ANALYSIS WITH DEEP NEURAL NETWORKS FOR E-COMMERCE RECOMMENDER SYSTEMS

    The first part of this thesis systematically reviews research trends from 2011 to 2018 in terms of the challenges and problems of developing a recommendation system, areas of application, proposed methodologies, evaluation criteria used to assess performance, and limitations and drawbacks that require investigation and improvement. The study provides an overview for those who are interested in this field to understand current and future research opportunities. The second part of this thesis proposes a new methodology to consider customer reviews in recommender systems. An essential prerequisite of an effective recommender system is providing helpful information regarding users and items to generate high-quality recommendations. Customer reviews are a rich source of information that can offer insights into recommender systems. However, dealing with customer feedback in text format, as unstructured data, is challenging. Our research includes extracting features from customer reviews and using them to evaluate the similarity of users in order to generate recommendations. To do so, we have developed a glossary of features for each product category using Latent Dirichlet Allocation. We then employed a deep neural network to extract deep features from the users-attributes matrix to deal with sparsity, ambiguity, and redundancy. Finally, we applied matrix factorization as the collaborative filtering method to provide recommendations. The experimental results on an Amazon dataset demonstrate that our methodology improves the performance of the recommender system by incorporating information from reviews when compared to the baselines.
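The abstract does not say which matrix factorization variant was used, so the following is only a minimal sketch of one common form: latent factors fitted by stochastic gradient descent on observed ratings. It is not the thesis's implementation and omits the LDA and deep-feature stages entirely. The function names, hyperparameters, and toy ratings are all hypothetical.

```python
import random


def predict(p_row, q_row):
    """Predicted rating: dot product of user and item factor vectors."""
    return sum(p * q for p, q in zip(p_row, q_row))


def rmse(ratings, P, Q):
    """Root mean squared error over the observed (user, item, rating) triples."""
    squared = sum((r - predict(P[u], Q[i])) ** 2 for u, i, r in ratings)
    return (squared / len(ratings)) ** 0.5


def init_factors(n_rows, k, rng):
    """Small random starting factors; the range is an arbitrary choice."""
    return [[rng.uniform(0.0, 1.0) for _ in range(k)] for _ in range(n_rows)]


def train(ratings, P, Q, epochs=500, lr=0.02, reg=0.02):
    """SGD updates with L2 regularization; P and Q are modified in place."""
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P[u], Q[i])
            for f in range(len(P[u])):
                pu, qi = P[u][f], Q[i][f]  # keep old values for both updates
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


# Toy (user, item, rating) triples, invented for illustration only.
ratings = [
    (0, 0, 5.0), (0, 1, 3.0),
    (1, 0, 4.0), (1, 2, 1.0),
    (2, 1, 2.0), (2, 2, 5.0),
]
rng = random.Random(0)
P = init_factors(3, 2, rng)  # 3 users, 2 latent factors
Q = init_factors(3, 2, rng)  # 3 items, 2 latent factors
before = rmse(ratings, P, Q)
train(ratings, P, Q)
after = rmse(ratings, P, Q)
```

Comparing `before` and `after` shows the training error falling on the observed ratings. Evaluating recommendation quality would require held-out data, which this sketch does not cover.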