    An examination of automatic video retrieval technology on access to the contents of an historical video archive

    Purpose – This paper aims to provide an initial understanding of the constraints that historical video collections pose to video retrieval technology and the potential that online access offers to both archive and users.
    Design/methodology/approach – A small and unique collection of videos on customs and folklore was used as a case study. Multiple methods were employed to investigate the effectiveness of technology and the modality of user access. Automatic keyframe extraction was tested on the visual content while the audio stream was used for automatic classification of speech and music clips. The user access (search vs browse) was assessed in a controlled user evaluation. A focus group and a survey provided insight on the actual use of the analogue archive. The results of these multiple studies were then compared and integrated (triangulation).
    Findings – The amateur material challenged automatic techniques for video and audio indexing, thus suggesting that the technology must be tested against the material before deciding on a digitisation strategy. Two user interaction modalities, browsing vs searching, were tested in a user evaluation. Results show users preferred searching, but browsing becomes essential when the search engine fails in matching query and indexed words. Browsing was also valued for serendipitous discovery; however, the organisation of the archive was judged cryptic and therefore of limited use. This indicates that the categorisation of an online archive should be thought of in terms of users who might not understand the current classification. The focus group and the survey showed clearly the advantage of online access even when the quality of the video surrogate is poor. The evidence gathered suggests that the creation of a digital version of a video archive requires a rethinking of the collection in terms of the new medium: a new archive should be specially designed to exploit the potential that the digital medium offers. Similarly, users' needs have to be considered before designing the digital library interface, as needs are likely to be different from those imagined.
    Originality/value – This paper is the first attempt to understand the advantages offered and limitations held by video retrieval technology for small video archives like those often found in special collections.

    ICONCLASS - Klasifikacijski sustav za umjetnost i ikonografiju [ICONCLASS - A classification system for art and iconography]

    Documentation is a crucial activity for any museum or art institution. Today its importance is growing, because the metadata a museum provides is essential for retrieving information from the vast amount of data in the modern world. The goal of this study is to discuss the design of thesauri, how they work, and what purpose they serve in documenting museum objects. It further discusses content indexing together with aboutness, isness and ofness, in order to draw a parallel with Panofsky's categories in iconography. The central focus of the work is an analysis of Iconclass, its features, and its usage. Additionally, it concentrates on new developments in machine learning within artificial intelligence that use Iconclass to generate and automate new data and connections. Finally, it gives a brief overview of folksonomy and social tagging.

    Information extraction from the web using a search engine

    A model for information retrieval driven by conceptual spaces

    A retrieval model describes the transformation of a query into a set of documents. The question is: what drives this transformation? For semantic information retrieval models, this transformation is driven by the content and structure of the semantic models. In this case, Knowledge Organization Systems (KOSs) are the semantic models that encode the meaning employed for monolingual and cross-language retrieval. The focus of this research is the relationship between these representations of meaning and their role and potential in augmenting the effectiveness of existing retrieval models. The proposed approach is unique in explicitly interpreting a semantic reference as a pointer to a concept in the semantic model that activates all its linked neighboring concepts. What distinguishes it from other approaches is the formalization of the information retrieval model and the integration of knowledge resources from the Linguistic Linked Open Data cloud. Preprocessing the semantic model with Formal Concept Analysis enables the extraction of conceptual spaces (formal contexts) that are based on sub-graphs of the original structure of the semantic model. The types of conceptual spaces built in this case are limited to the KOSs' structural relations relevant to retrieval: exact match, broader, narrower, and related. They capture the definitional and relational aspects of the concepts in the semantic model. Each formal context is also assigned an operational role in the flow of processes of the retrieval system, enabling a clear path towards the implementation of monolingual and cross-lingual systems. A retrieval system constructed by following this model's theoretical description showed statistically significant results in evaluation, in both monolingual and bilingual settings, when no methods for query expansion were used. The test suite was run on the Cross-Language Evaluation Forum Domain Specific 2004-2006 collection with additional extensions to match the specifics of this model.
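The idea of a semantic reference activating its linked neighbouring concepts can be illustrated with a toy sketch. This is not the model from the abstract: it omits Formal Concept Analysis and the Linked Open Data resources entirely, and the names `KOS`, `DOCS`, `activate` and `retrieve`, as well as the sample concepts and the overlap-count scoring, are assumptions made purely for illustration. Only the four relation types (exact match, broader, narrower, related) come from the text.

```python
# Relation types named in the abstract; everything else here is hypothetical.
RELATIONS = ("exact_match", "broader", "narrower", "related")

# Toy KOS: concept -> relation -> set of linked concepts (invented sample data).
KOS = {
    "unemployment": {
        "exact_match": {"Arbeitslosigkeit"},   # cross-language equivalent
        "broader": {"labour market"},
        "narrower": {"youth unemployment"},
        "related": {"job search"},
    },
}

# Toy document index: document id -> concepts the document is annotated with.
DOCS = {
    "d1": {"youth unemployment", "education"},
    "d2": {"Arbeitslosigkeit", "labour market"},
    "d3": {"housing"},
}


def activate(concept, kos, relations=RELATIONS):
    """Return the concept plus every neighbour linked by one of the relations."""
    activated = {concept}
    links = kos.get(concept, {})
    for relation in relations:
        activated |= links.get(relation, set())
    return activated


def retrieve(query_concepts, doc_index, kos, relations=RELATIONS):
    """Rank documents by the number of activated concepts they share."""
    activated = set()
    for concept in query_concepts:
        activated |= activate(concept, kos, relations)
    scores = {}
    for doc_id, doc_concepts in doc_index.items():
        overlap = len(activated & doc_concepts)
        if overlap:
            scores[doc_id] = overlap
    # Highest overlap first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


print(retrieve(["unemployment"], DOCS, KOS))  # ['d2', 'd1']
```

Restricting `relations` to `("exact_match",)` in this sketch returns only `d2`, which loosely mirrors how a single relation type could be given its own operational role, for example in a cross-language setting.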

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    After addressing the state of the art during the first year of Chorus and establishing the existing landscape in multimedia search engines, we identified and analyzed gaps within the European research effort during our second year. In this period we focused on three directions: technological issues, user-centred issues and use-cases, and socio-economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of the functional breakdown of a generic multimedia search engine, and secondly, representative use-case descriptions with a related discussion of the requirements for technological challenges. Both studies were carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as national initiative coordinators. Based on the feedback obtained, we identified two types of gaps: core technological gaps that involve research challenges, and "enablers", which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal challenges.

    NarDis: Narrativizing Disruption - How exploratory search can support media researchers to interpret ‘disruptive’ media events as lucid narratives

    This project investigates how CLARIAH’s exploratory search and linked open data (LOD) browser DIVE+ supports media researchers in constructing narratives about events, especially ‘disruptive’ events such as terrorist attacks and natural disasters. The project approaches this question by conducting user studies to examine how researchers use and create narratives with exploratory search tools, particularly DIVE+, to understand media events. These user studies were organized as workshops (using co-creation as an iterative approach to map search practices and storytelling data, including: focus groups & interviews; tasks & talk aloud protocols; surveys/questionnaires; and research diaries) and included more than 100 (digital) humanities researchers across Europe. Insights from these workshops show that exploratory search does facilitate the development of new research questions around disruptive events. DIVE+ triggers academic curiosity by suggesting alternative connections between entities. Besides yielding insight into the research practices of (digital) humanities researchers and how these can be supported with digital tools, the pilot also culminated in improvements to the DIVE+ browser. The pilot helped optimize the browser’s functionalities, making it possible for users to annotate paths of search narratives and save these in CLARIAH’s overarching, personalised user space. The pilot was widely promoted at (inter)national conferences, and DIVE+ won the international LODLAM (Linked Open Data in Libraries, Archives and Museums) Challenge Grand Prize in Venice (2017).

    A Linked term bank of copyright-related terms

    A multilingual term bank of copyright-related terms has been published, connecting WIPO definitions, IATE terms and definitions from Creative Commons licenses. These terms have been hierarchically arranged, spanning multiple languages and targeting different jurisdictions. The term bank has been published as a TBX dump file and is publicly accessible as linked data. Models for the RDF data structure are based on Lemon and W3C Recommendations. The term bank has been used to annotate common licenses in the RDFLicense dataset.