87 research outputs found

    Usefulness of social tagging in organizing and providing access to the web: An analysis of indexing consistency and quality

    This dissertation research points out major problems with current Knowledge Organization (KO) systems, such as subject gateways or web directories: (1) the current systems use traditional knowledge organization systems based on controlled vocabulary, which is not well suited to web resources, and (2) information is organized by professionals, not by users, which means it does not reflect users’ current needs as they are intuitively and instantaneously expressed. In order to explore users’ needs, I examined social tags, which are user-generated uncontrolled vocabulary. As investment in professionally developed subject gateways and web directories diminishes (support for both BUBL and Intute, examined in this study, is being discontinued), understanding the characteristics of social tagging becomes even more critical. Several researchers have discussed social tagging behavior and its usefulness for classification or retrieval; however, further research is needed to qualitatively and quantitatively investigate social tagging in order to verify its quality and benefit. This research particularly examined the indexing consistency of social tagging in comparison to professional indexing to examine the quality and efficacy of tagging. The data analysis was divided into three phases: analysis of indexing consistency, analysis of tagging effectiveness, and analysis of tag attributes. Most indexing consistency studies have been conducted with a small number of professional indexers, and they have tended to exclude users. Furthermore, the studies have mainly focused on physical library collections. This dissertation research bridged these gaps by (1) extending the scope of resources to various web documents indexed by users and (2) employing the Information Retrieval (IR) Vector Space Model (VSM)-based indexing consistency method, since it is suitable for dealing with a large number of indexers.
As a second phase, an analysis of tagging effectiveness with tagging exhaustivity and tag specificity was conducted to ameliorate the drawbacks of consistency analysis based on only the quantitative measures of vocabulary matching. Finally, to investigate tagging patterns and behaviors, a content analysis of tag attributes was conducted based on the FRBR model. The findings revealed that there was greater consistency over all subjects among taggers compared to that for two groups of professionals. The analysis of tagging exhaustivity and tag specificity in relation to tagging effectiveness was conducted to ameliorate difficulties associated with limitations in the analysis of indexing consistency based on only the quantitative measures of vocabulary matching. Examination of the exhaustivity and specificity of social tags provided insights into particular characteristics of tagging behavior and its variation across subjects. To further investigate the quality of tags, a Latent Semantic Analysis (LSA) was conducted to determine to what extent tags are conceptually related to professionals’ keywords, and it was found that tags of higher specificity tended to have a higher semantic relatedness to professionals’ keywords. This leads to the conclusion that a term’s power as a differentiator is related to its semantic relatedness to documents. The findings on tag attributes identified the important bibliographic attributes of tags beyond describing the subjects or topics of a document. The findings also showed that tags have essential attributes matching those defined in FRBR. Furthermore, in terms of specific subject areas, the findings identified for the first time that taggers exhibited different tagging behaviors representing distinctive features and tendencies on web documents characterizing heterogeneous digital media resources.
These results have led to the conclusion that there should be an increased awareness of diverse user needs by subject in order to improve metadata in practical applications. This dissertation research is the first necessary step toward utilizing social tagging in digital information organization by verifying the quality and efficacy of social tagging. It combined both quantitative (statistics) and qualitative (content analysis using FRBR) approaches to vocabulary analysis of tags, which provided a more complete examination of the quality of tags. Through the detailed analysis of tag properties undertaken in this dissertation, we have a clearer understanding of the extent to which social tagging can be used to replace (and in some cases to improve upon) professional indexing.

    Special Libraries, February 1951

    Volume 42, Issue 2

    Pictures in words : indexing, folksonomy and representation of subject content in historic photographs

    Subject access to images is a major issue for image collections. Research is needed to understand how indexing and tagging contribute to make the subjects of historic photographs accessible. This thesis firstly investigates the evidence of cognitive dissonance between indexers and users in the way they attribute subjects to historic photographs, and, secondly, how indexers and users might work together to enhance subject description. It analyses how current indexing and social tagging represent the subject content of historic photographs. It also suggests a practical way indexers can work with taggers to deal with the classic problem of resource constraints and to enhance metadata to make photo collections more accessible. In an original application of the Shatford/Panofsky classification matrix within the applications domain of historic images, patterns of subject attribution are explored between taggers and professional indexers. The study was conducted in two stages. The first stage (Studies A to D) investigated how professional indexers and taggers represent the subject content of historic photographs and revealed differences based on Shatford/Panofsky. The indexers (Study A) demonstrated a propensity for specific and generic subjects and almost complete avoidance of abstracts. In contrast, a pilot study with users (Study B) and with baseline taggers (Studies C and D) showed their propensity for generics and equal inclination to specifics and abstracts. The evidence supports the conclusion that indexers and users approach the subject content of historic photographs differently, demonstrating cognitive dissonance, a conflict between how they appear to think about and interpret images. The second stage (Study E) demonstrated that an online training intervention affected tagging behaviour. The intervention resulted in increased tagging and fuller representation of all subject facets according to the Shatford/Panofsky classification matrix. 
The evidence showed that trained taggers tagged more generic and abstract facets than untrained taggers. Importantly, this suggests that training supports the annotation of the higher levels of subject content and so potentially provides enhanced intellectual access. The research demonstrated a practical way institutions can work with taggers to extend the representation of subject content in historic photographs. Improved subject description is critical for intellectual access and retrieval in the cultural heritage space. Through systematic application of the training method, a richer corpus of descriptors might be created that enhances machine-based information retrieval via automatic extraction.

    Crowdsourcing for image metadata : a comparison between game-generated tags and professional descriptors

    One way to address the challenge of creating metadata for digitized image collections is to rely on user-created index terms, typically by harvesting tags from the collaborative information services known as folksonomies or by allowing the users to tag directly in the catalog. An alternative method, only recently applied in cultural heritage institutions, is Human Computation Games, a crowdsourcing tool that relies on user agreement to create valid tags. This study contributes to the research by investigating tags (at various degrees of validation) generated by a Human Computation Game and comparing them to descriptors assigned to the same images by professional indexers. The analysis is done by classifying tags and descriptors by term-category, as well as by measuring overlap between the tags and the descriptors on both the syntactic (matching on terms) and semantic (matching on meaning) levels. The findings show that validated tags tend to describe ‘artifacts/objects’ and that game-generated tags will typically represent what is in the picture, rather than what it is about. Descriptors also primarily belonged to this term-category but also had a substantial number of ‘Proper nouns’, mainly named locations. Tags generated by the game but not validated by player agreement had a higher frequency of ‘subjective/narrative’ tags, but also more errors. It was determined that the exact (character-for-character) overlap, i.e. the number of common terms compared to the entire pool of tags and descriptors, was slightly less than 5% for all types of tags. By extending the analysis to include fuzzy (word-stem) matching, the overlap more than doubled. The semantic overlap was established with thesaurus relations between a sample of tags and descriptors, and adopting this more inclusive view of overlap resulted in an increase in the percentage of tags that were matched to descriptors.
More than half of the validated tags had some thesaurus relation to a descriptor added by a professional indexer. Approximately 60% of the thesaurus relations between descriptors and valid tags were either ‘same’ or ‘equivalent’, roughly 20% were associative, and 20% were hierarchical. For the hierarchical relations it was found that tags typically describe images at a less specific level than descriptors. Joint Master Degree in Digital Library Learning (DILL

    Special Libraries, Fall 1985

    Volume 76, Issue 4

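    Vocabulário controlado e indexação social de imagens de arquitetura: um sistema de organização do conhecimento em ambiente colaborativo [Controlled vocabulary and social indexing of architecture images: a knowledge organization system in a collaborative environment]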

    This paper reports the research carried out to develop a controlled vocabulary in a collaborative web environment that allows social indexing through the tagging of images posted by both personal and institutional users. Created for the preservation and dissemination of images of Brazilian architecture, Arquigrafia is also a social network formed by students, teachers, researchers, professionals and others interested in the photography of architecture and urban spaces. It was therefore necessary to analyze the list of tags to improve the consistency of indexing, seeking a conceptual organization of the domain terms through the application of terminological methodology, and also aiming to align the terms of the vocabulary under construction with other knowledge organization systems.

    The place of cataloguing and classification in the curricula of South African universities

    Bibliography: pages 361-372. The aim of this study is to determine the place of cataloguing and classification in the library and information science curricula of South African universities today, and to determine whether new developments and changes are being taken into consideration in compiling the syllabus comprising bibliographic description and subject analysis. With this in mind, attention has been given to the following: (a) a reconstruction of general developments by means of a review of the history of cataloguing and classification, from ancient to present times; (b) a review of the comprehensive development of education for librarianship overseas and in South Africa; and (c) an investigation of the present position of bibliographic description and subject analysis in the library and information science curricula of South African universities.

    Special Libraries, February 1964

    Volume 55, Issue 2

    The Simple Knowledge Organization System (SKOS): a situation report for the HIVE Project

    HIVE (Helping Interdisciplinary Vocabularies Engineering) is a project funded by the IMLS (Institute of Museums and Library Services) and, indirectly, in Dryad, both projects being collaborations between the Metadata Research Center and the National Evolutionary Synthesis Center (NESCent) in Durham, North Carolina. The development of HIVE is intended to solve this problem through a proposal for automatic metadata generation that allows the dynamic integration of specific controlled vocabularies. To support the integration of vocabularies, SKOS (Simple Knowledge Organisation System) was selected, a World Wide Web Consortium (W3C) standard for representing knowledge organization systems or vocabularies, such as thesauri, classification schemes, subject heading systems and taxonomies, within the framework of the Semantic Web. This report provides an exhaustive analysis of the current state of SKOS application. The study includes a detailed review of the scientific literature and web resources on the model, and a selection of key projects, initiatives, tools and research groups, along with any other type of information that could be relevant to achieving the objectives of the HIVE project. It also analyzes the importance of SKOS for achieving semantic interoperability and sets out a series of recommendations for the members of the HIVE project.