2,311 research outputs found

    A Deep Siamese Network for Scene Detection in Broadcast Videos

    Get PDF
    We present a model that automatically divides broadcast videos into coherent scenes by learning a distance measure between shots. Experiments are performed to demonstrate the effectiveness of our approach by comparing our algorithm against recent proposals for automatic scene segmentation. We also propose an improved performance measure that aims to reduce the gap between numerical evaluation and expected results, and propose and release a new benchmark dataset.Comment: ACM Multimedia 201

    Scaling Up Medical Visualization : Multi-Modal, Multi-Patient, and Multi-Audience Approaches for Medical Data Exploration, Analysis and Communication

    Get PDF
    Medisinsk visualisering er en av de mest applikasjonsrettede omrĂ„dene av visualiseringsforsking. Tett samarbeid med medisinske eksperter er nĂždvendig for Ă„ tolke medisinsk bildedata og lage betydningsfulle visualiseringsteknikker og visualiseringsapplikasjoner. Kreft er en av de vanligste dĂždsĂ„rsakene, og med Ăžkende gjennomsnittsalder i i-land Ăžker ogsĂ„ antallet diagnoser av gynekologisk kreft. Moderne avbildningsteknikker er et viktig verktĂžy for Ă„ vurdere svulster og produsere et Ăžkende antall bildedata som radiologer mĂ„ tolke. I tillegg til antallet bildemodaliteter, Ăžker ogsĂ„ antallet pasienter, noe som fĂžrer til at visualiseringslĂžsninger mĂ„ bli skalert opp for Ă„ adressere den Ăžkende kompleksiteten av multimodal- og multipasientdata. Dessuten er ikke medisinsk visualisering kun tiltenkt medisinsk personale, men har ogsĂ„ som mĂ„l Ă„ informere pasienter, pĂ„rĂžrende, og offentligheten om risikoen relatert til visse sykdommer, og mulige behandlinger. Derfor har vi identifisert behovet for Ă„ skalere opp medisinske visualiseringslĂžsninger for Ă„ kunne hĂ„ndtere multipublikumdata. Denne avhandlingen adresserer skaleringen av disse dimensjonene i forskjellige bidrag vi har kommet med. FĂžrst presenterer vi teknikkene vĂ„re for Ă„ skalere visualiseringer i flere modaliteter. Vi introduserer en visualiseringsteknikk som tar i bruk smĂ„ multipler for Ă„ vise data fra flere modaliteter innenfor et bildesnitt. Dette lar radiologer utforske dataen effektivt uten Ă„ mĂ„tte bruke flere sidestilte vinduer. I det neste steget utviklet vi en analyseplatform ved Ă„ ta i bruk «radiomic tumor profiling» pĂ„ forskjellige bildemodaliteter for Ă„ analysere kohortdata og finne nye biomarkĂžrer fra bilder. BiomarkĂžrer fra bilder er indikatorer basert pĂ„ bildedata som kan forutsi variabler relatert til kliniske utfall. «Radiomic tumor profiling» er en teknikk som genererer mulige biomarkĂžrer fra bilder basert pĂ„ fĂžrste- og andregrads statistiske mĂ„linger. Applikasjonen lar medisinske eksperter analysere multiparametrisk bildedata for Ă„ finne mulige korrelasjoner mellom kliniske parameter og data fra «radiomic tumor profiling». Denne tilnĂŠrmingen skalerer i to dimensjoner, multimodal og multipasient. I en senere versjon la vi til funksjonalitet for Ă„ skalere multipublikumdimensjonen ved Ă„ gjĂžre applikasjonen vĂ„r anvendelig for livmorhalskreft- og prostatakreftdata, i tillegg til livmorkreftdataen som applikasjonen var designet for. I et senere bidrag fokuserer vi pĂ„ svulstdata pĂ„ en annen skala og muliggjĂžr analysen av svulstdeler ved Ă„ bruke multimodal bildedata i en tilnĂŠrming basert pĂ„ hierarkisk gruppering. Applikasjonen vĂ„r finner mulige interessante regioner som kan informere fremtidige behandlingsavgjĂžrelser. I et annet bidrag, en digital sonderingsinteraksjon, fokuserer vi pĂ„ multipasientdata. Bildedata fra flere pasienter kan sammenlignes for Ă„ finne interessante mĂžnster i svulstene som kan vĂŠre knyttet til hvor aggressive svulstene er. Til slutt skalerer vi multipublikumdimensjonen med en likhetsvisualisering som er anvendelig for forskning pĂ„ livmorkreft, pĂ„ bilder av nevrologisk kreft, og maskinlĂŠringsforskning pĂ„ automatisk segmentering av svulstdata. Som en kontrast til de allerede fremhevete bidragene, fokuserer vĂ„rt siste bidrag, ScrollyVis, hovedsakelig pĂ„ multipublikumkommunikasjon. Vi muliggjĂžr skapelsen av dynamiske og vitenskapelige “scrollytelling”-opplevelser for spesifikke eller generelle publikum. Slike historien kan bli brukt i spesifikke brukstilfeller som kommunikasjon mellom lege og pasient, eller for Ă„ kommunisere vitenskapelige resultater via historier til et generelt publikum i en digital museumsutstilling. VĂ„re foreslĂ„tte applikasjoner og interaksjonsteknikker har blitt demonstrert i brukstilfeller og evaluert med domeneeksperter og fokusgrupper. Dette har fĂžrt til at noen av vĂ„re bidrag allerede er i bruk pĂ„ andre forskingsinstitusjoner. Vi Ăžnsker Ă„ evaluere innvirkningen deres pĂ„ andre vitenskapelige felt og offentligheten i fremtidige arbeid.Medical visualization is one of the most application-oriented areas of visualization research. Close collaboration with medical experts is essential for interpreting medical imaging data and creating meaningful visualization techniques and visualization applications. Cancer is one of the most common causes of death, and with increasing average age in developed countries, gynecological malignancy case numbers are rising. Modern imaging techniques are an essential tool in assessing tumors and produce an increasing number of imaging data radiologists must interpret. Besides the number of imaging modalities, the number of patients is also rising, leading to visualization solutions that must be scaled up to address the rising complexity of multi-modal and multi-patient data. Furthermore, medical visualization is not only targeted toward medical professionals but also has the goal of informing patients, relatives, and the public about the risks of certain diseases and potential treatments. Therefore, we identify the need to scale medical visualization solutions to cope with multi-audience data. This thesis addresses the scaling of these dimensions in different contributions we made. First, we present our techniques to scale medical visualizations in multiple modalities. We introduced a visualization technique using small multiples to display the data of multiple modalities within one imaging slice. This allows radiologists to explore the data efficiently without having several juxtaposed windows. In the next step, we developed an analysis platform using radiomic tumor profiling on multiple imaging modalities to analyze cohort data and to find new imaging biomarkers. Imaging biomarkers are indicators based on imaging data that predict clinical outcome related variables. Radiomic tumor profiling is a technique that generates potential imaging biomarkers based on first and second-order statistical measurements. The application allows medical experts to analyze the multi-parametric imaging data to find potential correlations between clinical parameters and the radiomic tumor profiling data. This approach scales up in two dimensions, multi-modal and multi-patient. In a later version, we added features to scale the multi-audience dimension by making our application applicable to cervical and prostate cancer data and the endometrial cancer data the application was designed for. In a subsequent contribution, we focus on tumor data on another scale and enable the analysis of tumor sub-parts by using the multi-modal imaging data in a hierarchical clustering approach. Our application finds potentially interesting regions that could inform future treatment decisions. In another contribution, the digital probing interaction, we focus on multi-patient data. The imaging data of multiple patients can be compared to find interesting tumor patterns potentially linked to the aggressiveness of the tumors. Lastly, we scale the multi-audience dimension with our similarity visualization applicable to endometrial cancer research, neurological cancer imaging research, and machine learning research on the automatic segmentation of tumor data. In contrast to the previously highlighted contributions, our last contribution, ScrollyVis, focuses primarily on multi-audience communication. We enable the creation of dynamic scientific scrollytelling experiences for a specific or general audience. Such stories can be used for specific use cases such as patient-doctor communication or communicating scientific results via stories targeting the general audience in a digital museum exhibition. Our proposed applications and interaction techniques have been demonstrated in application use cases and evaluated with domain experts and focus groups. As a result, we brought some of our contributions to usage in practice at other research institutes. We want to evaluate their impact on other scientific fields and the general public in future work.Doktorgradsavhandlin

    Indexing, browsing and searching of digital video

    Get PDF
    Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a “piece ” of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 this chapter. In modern society, video is ver

    Measuring concept similarities in multimedia ontologies: analysis and evaluations

    Get PDF
    The recent development of large-scale multimedia concept ontologies has provided a new momentum for research in the semantic analysis of multimedia repositories. Different methods for generic concept detection have been extensively studied, but the question of how to exploit the structure of a multimedia ontology and existing inter-concept relations has not received similar attention. In this paper, we present a clustering-based method for modeling semantic concepts on low-level feature spaces and study the evaluation of the quality of such models with entropy-based methods. We cover a variety of methods for assessing the similarity of different concepts in a multimedia ontology. We study three ontologies and apply the proposed techniques in experiments involving the visual and semantic similarities, manual annotation of video, and concept detection. The results show that modeling inter-concept relations can provide a promising resource for many different application areas in semantic multimedia processing

    Interactive searching and browsing of video archives: using text and using image matching

    Get PDF
    Over the last number of decades much research work has been done in the general area of video and audio analysis. Initially the applications driving this included capturing video in digital form and then being able to store, transmit and render it, which involved a large effort to develop compression and encoding standards. The technology needed to do all this is now easily available and cheap, with applications of digital video processing now commonplace, ranging from CCTV (Closed Circuit TV) for security, to home capture of broadcast TV on home DVRs for personal viewing. One consequence of the development in technology for creating, storing and distributing digital video is that there has been a huge increase in the volume of digital video, and this in turn has created a need for techniques to allow effective management of this video, and by that we mean content management. In the BBC, for example, the archives department receives approximately 500,000 queries per year and has over 350,000 hours of content in its library. Having huge archives of video information is hardly any benefit if we have no effective means of being able to locate video clips which are of relevance to whatever our information needs may be. In this chapter we report our work on developing two specific retrieval and browsing tools for digital video information. Both of these are based on an analysis of the captured video for the purpose of automatically structuring into shots or higher level semantic units like TV news stories. Some also include analysis of the video for the automatic detection of features such as the presence or absence of faces. Both include some elements of searching, where a user specifies a query or information need, and browsing, where a user is allowed to browse through sets of retrieved video shots. We support the presentation of these tools with illustrations of actual video retrieval systems developed and working on hundreds of hours of video content

    "You Tube and I Find" - personalizing multimedia content access

    Full text link
    Recent growth in broadband access and proliferation of small personal devices that capture images and videos has led to explosive growth of multimedia content available everywhereVfrom personal disks to the Web. While digital media capture and upload has become nearly universal with newer device technology, there is still a need for better tools and technologies to search large collections of multimedia data and to find and deliver the right content to a user according to her current needs and preferences. A renewed focus on the subjective dimension in the multimedia lifecycle, fromcreation, distribution, to delivery and consumption, is required to address this need beyond what is feasible today. Integration of the subjective aspects of the media itselfVits affective, perceptual, and physiological potential (both intended and achieved), together with those of the users themselves will allow for personalizing the content access, beyond today’s facility. This integration, transforming the traditional multimedia information retrieval (MIR) indexes to more effectively answer specific user needs, will allow a richer degree of personalization predicated on user intention and mode of interaction, relationship to the producer, content of the media, and their history and lifestyle. In this paper, we identify the challenges in achieving this integration, current approaches to interpreting content creation processes, to user modelling and profiling, and to personalized content selection, and we detail future directions. The structure of the paper is as follows: In Section I, we introduce the problem and present some definitions. In Section II, we present a review of the aspects of personalized content and current approaches for the same. Section III discusses the problem of obtaining metadata that is required for personalized media creation and present eMediate as a case study of an integrated media capture environment. Section IV presents the MAGIC system as a case study of capturing effective descriptive data and putting users first in distributed learning delivery. The aspects of modelling the user are presented as a case study in using user’s personality as a way to personalize summaries in Section V. Finally, Section VI concludes the paper with a discussion on the emerging challenges and the open problems

    Semantic annotation of narrative media objects

    Get PDF

    Exploring digital comics as an edutainment tool: An overview

    Get PDF
    This paper aims t oexplore the growing potential of digital comics and graphic novels as an edutainment tool.Initially, the evolvement of comics medium along with academic and commercial initiatives in designing comicware systems arebriefly discussed. Prominent to this study, the methods and impact of utilizing this visual media with embedded instructional content and student-generated comics in classroom setting are rationallyoutlined.By recognizing the emerging technologies available for supporting and accelerating educational comic development, this article addresses the diverse research challenges and opportunities of innovating effective strategies to enhance comics integrated learning across disciplines
    • 

    corecore