5 research outputs found

    Using semantic indexing to improve searching performance in web archives

    The sheer volume of electronic documents being published on the Web can be overwhelming for users if the searching aspect is not properly addressed. This problem is particularly acute inside archives and repositories containing large collections of web resources or, more precisely, web pages and other web objects. With the existing search capabilities in web archives, results can be compromised by the size of the data, content heterogeneity, and changes in scientific terminologies and meanings. During the course of this research, we will explore whether semantic web technologies, particularly ontology-based annotation and retrieval, could improve the precision of search results in multi-disciplinary web archives.

    The First International Conference on Building and Exploring Web Based Environments-WEB2013

    The sheer volume of electronic documents being published on the Web can be overwhelming for users if the searching aspect is not properly addressed. This problem is particularly acute inside archives and repositories containing large collections of web resources or, more precisely, web pages and other web objects. With the existing search capabilities in web archives, results can be compromised by the size of the data, content heterogeneity, and changes in scientific terminologies and meanings. During the course of this research, we will explore whether semantic web technologies, particularly ontology-based annotation and retrieval, could improve the precision of search results in multi-disciplinary web archives.

    Providing context to Web collections: A survey of Archive-It users

    This study describes a survey of users of the Internet Archive's Archive-It Web-archiving tool. It examines the descriptive metadata practices of Web archivists, how Web archives are accessed, and which variables facilitate or impede metadata implementation in Web collections. Whereas books often contain contextual information bound between their covers, archival materials require additional explanation of context. The Web is the most transient of electronic records, and although it is currently being preserved at a higher rate than ever before, treatment of Web collections is still not up to archival standards. Through a better understanding of current Web archiving metadata practices, this study hopes to help lay the groundwork for future best practices. Master of Science in Information Science

    Crowd-annotation and LoD-based semantic indexing of content in multi-disciplinary web repositories to improve search results

    Searching for relevant information in multi-disciplinary web repositories is a topic of increasing interest in the computer science research community. To date, methods and techniques for extracting useful and relevant information from online repositories of research data have largely been based on static full-text indexing, which entails a ‘produce once and use forever’ strategy. That strategy is fast becoming insufficient due to increasing data volume, concept obsolescence, and the complexity and heterogeneity of content types in web repositories. We propose that automatic semantic annotation of content in web repositories (using Linked Open Data, or LoD, sources), without domain-specific ontologies, can sustain search performance by retrieving highly relevant results. Secondly, we claim that expert crowd-annotation of content on top of automatic semantic annotation can enrich the semantic index over time and augment the contextual value of content in web repositories, so that it remains findable despite changes in language, terminology and scientific concepts. We deployed a custom-built annotation, indexing and searching environment in a web repository website, which expert annotators have used to annotate webpages with free text and vocabulary terms. We present our findings based on the annotation and tagging data added on top of LoD-based annotations, and on the overall modus operandi. We also analyze and demonstrate that adding expert annotations to the existing semantic index improves the relationship between query and documents, as measured by Cosine Similarity Measures (CSM).
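
    To make the cosine similarity comparison in the abstract above concrete, here is a minimal sketch. It assumes a plain bag-of-words term vector, and every term, variable and function name in it is hypothetical; it is not the authors' indexing environment or data. It only illustrates how adding expert annotation terms to a document's index entry can raise the query-document cosine similarity.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term-frequency vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def term_vector(*term_lists) -> Counter:
    """Merge one or more lists of index terms into a single term-frequency vector."""
    counts = Counter()
    for terms in term_lists:
        counts.update(terms)
    return counts


# Hypothetical toy data (assumption): terms for one query and one document.
query = term_vector(["climate", "migration"])
automatic_terms = ["population", "movement", "climate"]  # stand-in for LoD-based annotations
expert_terms = ["migration", "displacement"]             # stand-in for expert crowd-annotations

# Similarity against the automatic index alone, then against the enriched index.
before = cosine_similarity(query, term_vector(automatic_terms))
after = cosine_similarity(query, term_vector(automatic_terms, expert_terms))

print(f"automatic index only: {before:.3f}")
print(f"with expert terms:    {after:.3f}")
```

    In this toy case the enriched vector scores higher because the expert terms overlap with the query; terms that do not match the query would instead lengthen the document vector and lower the score.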

    Una mirada a la ciencia de la información desde los nuevos contextos paradigmáticos de la posmodernidad (A look at information science from the new paradigmatic contexts of postmodernity)

    Change always comes with elements of innovation, generated by something or someone to meet needs or demands. New processes, products, knowledge and tools are created to enable advances, perspectives and conditions that did not previously exist. Understanding these changes and the elements of innovation, as well as the needs and demands that drove them, requires temporal and situational contextualization grounded in theoretical and methodological studies that, beyond defining, conceptualizing, describing and characterizing, can organize knowledge in a reflective and critical way. Such studies make it possible to observe problems and new paths, and support better cultural integration of professionals and society in general in the face of the challenges of the "unsettling" that change provokes, even though change is always associated with improved social well-being.