2,542 research outputs found

    Language-based multimedia information retrieval

    This paper describes various methods and approaches for language-based multimedia information retrieval, which have been developed in the projects POP-EYE and OLIVE and which will be developed further in the MUMIS project. All of these projects aim at supporting automated indexing of video material by use of human language technologies. Thus, in contrast to image or sound-based retrieval methods, where both the query language and the indexing methods build on non-linguistic data, these methods attempt to exploit advanced text retrieval technologies for the retrieval of non-textual material. While POP-EYE built on subtitles or captions as the prime language key for disclosing video fragments, OLIVE makes use of speech recognition to automatically derive transcriptions of the sound tracks, generating time-coded linguistic elements which then serve as the basis for text-based retrieval functionality.

    Video anatomy : spatial-temporal video profile

    Indiana University-Purdue University Indianapolis (IUPUI). As massive numbers of videos are uploaded to video websites, smooth video browsing, editing, retrieval, and summarization are in demand. Most of the videos employ several types of camera operations for expanding the field of view, emphasizing events, and expressing cinematic effect. To digest heterogeneous videos in video websites and databases, video clips are profiled into a 2D image scroll containing both spatial and temporal information for video preview. The video profile is visually continuous, compact, scalable, and indexed to each frame. This work analyzes the camera kinematics, including zoom, translation, and rotation, and categorizes camera actions as combinations of these. An automatic video summarization framework is proposed and developed. After conventional video clip segmentation and video segmentation for smooth camera operations, the global flow field under all camera actions is investigated for profiling various types of video. A new algorithm is designed to extract the major flow direction and convergence factor using condensed images. This work then proposes a uniform scheme to segment video clips and sections, sample the video volume across the major flow, and compute the flow convergence factor, in order to obtain an intrinsic scene space less influenced by the camera ego-motion. A motion blur technique is also used to render dynamic targets in the profile. The resulting video profile can be displayed in a video track to guide access to video frames, help video editing, and facilitate applications such as surveillance, visual archiving of environments, video retrieval, and online video preview.

    Cooperative Interactive Distributed Guidance on Mobile Devices

    Mobile devices are quickly becoming an indispensable part of our society. Equipped with numerous communication capabilities, they are increasingly being examined as potential tools for civilian and military usage to aid in distributed remote collaboration for dynamic decision making and physical task completion. With an ever-growing mobile workforce, the need for remote assistance in aiding field workers who are confronted with situations outside their expertise certainly increases. Enhanced capabilities in using mobile devices could significantly improve numerous components of a task's completion (e.g., accuracy and timing). This dissertation considers the design of mobile implementations of technology and communication capabilities to support interactive collaboration between distributed team members. Specifically, this body of research seeks to explore and understand how various multimodal forms of remote assistance affect both the human user's performance and the mobile device's effectiveness when used during cooperative tasks. Additionally, power effects are studied to assess the energy demands on a mobile device supporting multimodal communication. In a series of applied experiments and demonstrations, the effectiveness of a mobile device facilitating multimodal collaboration is analyzed through both empirical data collection and subjective exploration. The utility of the mobile interactive system and its configurations is examined to assess the impact on distributed task performance and collaborative dialogue between pairs. The dissertation formulates and defends an argument that multimodal communication capabilities should be incorporated into mobile communication channels to provide collaborating partners with salient perspectives, with the goal of reaching a mutual understanding of task procedures. The body of research discusses the findings of this investigation and highlights how these findings may influence future mobile research seeking to enhance interactive distributed guidance.

    THE NEW “UNIVERSAL TRUTH” OF THE WORLD WIDE WEB

    We all see that the World Wide Web is constantly evolving and developing. New websites are created continuously and push the limits of the old HTML specs in all respects. HTML4 has been the de facto standard for almost 10 years, and developers are starting to look for new and improved technologies to help them provide greater functionality. In order to give authors flexibility and interoperability and to enable much more interactive and innovative websites and applications, HTML5 introduces and enhances a large set of features, such as new form elements, APIs, multimedia elements, and structure and semantics updates. The development of HTML5, started in 2004, is currently carried out as a joint effort of the W3C HTMLWG and the WHATWG organizations. Many important companies participate in this effort, including the largest browser developers: Microsoft, Mozilla, Opera and Apple. The specification of the new "to be" standard is still a work in progress, and quite a way lies ahead before its completion. Taking this into account, it is possible that the features presented below have already been modified or will change in the near future. Keywords: html5, cross-platform, css3, JavaScript, mobile application development, flexibility, interoperability

    Locational wireless and social media-based surveillance

    The number of smartphones and tablets, as well as the volume of traffic generated by these devices, has been growing constantly over the past decade, and this growth is predicted to continue at an increasing rate over the next five years. Numerous native features built into contemporary smart devices enable highly accurate digital fingerprinting techniques. Furthermore, software developers have been taking advantage of the locational capabilities of these devices by building applications and social media services that enable convenient sharing of information tied to geographical locations. Mass online sharing has resulted in a large volume of locational and personal data being publicly available for extraction. A number of researchers have used this opportunity to design and build tools for a variety of uses, both respectable and nefarious. In addition, due to the peculiarities of the IEEE 802.11 specification, wireless-enabled smart devices disclose a number of attributes that can be observed via passive monitoring. These attributes, coupled with the information that can be extracted using social media APIs, present an opportunity for research into locational surveillance, device fingerprinting and device user identification techniques. This paper presents an in-progress research study and details the findings to date.

    DGD Gallery: Storage, sharing, and publication of digital research data

    We describe a project, called the "Discretization in Geometry and Dynamics Gallery", or DGD Gallery for short, whose goal is to store geometric data and to make it publicly available. The DGD Gallery offers an online web service for the storage, sharing, and publication of digital research data. Comment: 19 pages, 8 figures, to appear in "Advances in Discrete Differential Geometry", ed. A. I. Bobenko, Springer, 201

    Scaling Up Medical Visualization : Multi-Modal, Multi-Patient, and Multi-Audience Approaches for Medical Data Exploration, Analysis and Communication

    Medical visualization is one of the most application-oriented areas of visualization research. Close collaboration with medical experts is essential for interpreting medical imaging data and creating meaningful visualization techniques and visualization applications. Cancer is one of the most common causes of death, and with increasing average age in developed countries, gynecological malignancy case numbers are rising. Modern imaging techniques are an essential tool in assessing tumors and produce an increasing amount of imaging data that radiologists must interpret. Besides the number of imaging modalities, the number of patients is also rising, so visualization solutions must be scaled up to address the rising complexity of multi-modal and multi-patient data. Furthermore, medical visualization is not only targeted toward medical professionals but also has the goal of informing patients, relatives, and the public about the risks of certain diseases and potential treatments. Therefore, we identify the need to scale medical visualization solutions to cope with multi-audience data. This thesis addresses the scaling of these dimensions in the different contributions we made. First, we present our techniques to scale medical visualizations in multiple modalities. We introduce a visualization technique using small multiples to display the data of multiple modalities within one imaging slice. This allows radiologists to explore the data efficiently without having several juxtaposed windows. In the next step, we developed an analysis platform using radiomic tumor profiling on multiple imaging modalities to analyze cohort data and to find new imaging biomarkers. Imaging biomarkers are indicators based on imaging data that predict variables related to clinical outcome. Radiomic tumor profiling is a technique that generates potential imaging biomarkers based on first- and second-order statistical measurements. The application allows medical experts to analyze the multi-parametric imaging data to find potential correlations between clinical parameters and the radiomic tumor profiling data. This approach scales up in two dimensions, multi-modal and multi-patient. In a later version, we added features to scale the multi-audience dimension by making our application applicable to cervical and prostate cancer data, in addition to the endometrial cancer data the application was designed for. In a subsequent contribution, we focus on tumor data on another scale and enable the analysis of tumor sub-parts by using the multi-modal imaging data in a hierarchical clustering approach. Our application finds potentially interesting regions that could inform future treatment decisions. In another contribution, the digital probing interaction, we focus on multi-patient data. The imaging data of multiple patients can be compared to find interesting tumor patterns potentially linked to the aggressiveness of the tumors. Lastly, we scale the multi-audience dimension with our similarity visualization, which is applicable to endometrial cancer research, neurological cancer imaging research, and machine learning research on the automatic segmentation of tumor data. In contrast to the previously highlighted contributions, our last contribution, ScrollyVis, focuses primarily on multi-audience communication. We enable the creation of dynamic scientific scrollytelling experiences for a specific or general audience. Such stories can be used for specific use cases such as patient-doctor communication, or for communicating scientific results via stories targeting the general audience in a digital museum exhibition. Our proposed applications and interaction techniques have been demonstrated in application use cases and evaluated with domain experts and focus groups. As a result, some of our contributions are already in practical use at other research institutes. We want to evaluate their impact on other scientific fields and the general public in future work. Doctoral thesis.
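
    The first-order statistical measurements mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the tumor region is already available as a flat list of voxel intensities; the function name `first_order_features`, the choice of features, and the sample values are hypothetical and are not taken from the thesis. Second-order (texture) measurements are not covered.

```python
from math import sqrt
from typing import Dict, Sequence


def first_order_features(intensities: Sequence[float]) -> Dict[str, float]:
    """Compute simple first-order statistics over the voxels of one region.

    Assumption: `intensities` holds the intensity of every voxel inside a
    segmented region; spatial arrangement is ignored, which is what makes
    these statistics "first-order".
    """
    n = len(intensities)
    if n == 0:
        raise ValueError("region contains no voxels")
    mean = sum(intensities) / n
    # Population variance (divide by n), chosen here for simplicity.
    variance = sum((v - mean) ** 2 for v in intensities) / n
    std = sqrt(variance)
    # Skewness is the third standardized moment; defined as 0 for a flat region.
    third_moment = sum((v - mean) ** 3 for v in intensities) / n
    skewness = third_moment / std ** 3 if std > 0 else 0.0
    return {
        "mean": mean,
        "variance": variance,
        "skewness": skewness,
        "min": min(intensities),
        "max": max(intensities),
    }


# Hypothetical voxel intensities for one region.
features = first_order_features([1.0, 2.0, 3.0, 4.0, 5.0])
```

    Computing one such feature dictionary per patient and per imaging modality would give a table that can be compared against clinical parameters, which is the kind of correlation analysis the abstract describes.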

    After Over-Privileged Permissions: Using Technology and Design to Create Legal Compliance

    Consumers in the mobile ecosystem can putatively protect their privacy with the use of application permissions. However, this requires mobile device owners to understand permissions and their privacy implications. Yet few consumers appreciate the nature of permissions within the mobile ecosystem, often failing to notice the privacy permissions that are altered when updating an app. Even more concerning is the lack of understanding of the wide use of third-party libraries, most of which are installed with automatic permissions, that is, permissions that must be granted to allow the application to function appropriately. Unsurprisingly, many of these third-party permissions violate consumers' privacy expectations and thereby become "over-privileged" to the user. Consequently, there is a mismatch in privacy expectations between what is practiced by the private sector and what is deemed appropriate by the public sector. Despite the growing attention given to privacy in the mobile ecosystem, legal literature has largely ignored the implications of mobile permissions. This article seeks to address this omission by analyzing the impacts of mobile permissions and the privacy harms experienced by consumers of mobile applications. The authors call for a review of industry self-regulation and of the overreliance upon simple notice and consent. Instead, the authors set out a plan for greater attention to be paid to socio-technical solutions, focusing on better privacy protections and technology embedded within the automatic permission-based application ecosystem.

    LOCALIZED TEMPORAL PROFILE OF SURVEILLANCE VIDEO

    Surveillance videos are recorded pervasively, and their retrieval currently still relies on human operators. As an intermediate representation, this work develops a new temporal profile of video that conveys accurate temporal information in the video while keeping certain spatial characteristics of targets of interest for recognition. The profile is obtained at critical positions where major target flow appears. We set a sampling line crossing the motion direction to profile passing targets in the temporal domain. In order to add spatial information to the temporal profile to a certain extent, we integrate multiple profiles from a set of lines with a blending method to reflect the target motion direction and position in the temporal profile. Unlike mosaicing/montage methods for video synopsis in the spatial domain, our temporal profile has no limit on the time length, and the created profile significantly reduces the data size for brief indexing and fast search of video.
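
    The sampling-line idea in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: frames are modeled as nested lists of grayscale values, the sampling line is a fixed vertical pixel column, and plain averaging stands in for the unspecified blending method. The names `temporal_profile`, `blended_profile`, and `make_frame` are hypothetical.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame stored as rows of pixel intensities


def temporal_profile(frames: List[Frame], line_x: int) -> List[List[int]]:
    """Collect the pixels under one vertical sampling line from every frame.

    The result has one row per pixel on the line and one column per frame,
    so its horizontal axis is time rather than space.
    """
    if not frames:
        return []
    height = len(frames[0])
    return [[frame[y][line_x] for frame in frames] for y in range(height)]


def blended_profile(frames: List[Frame], line_xs: List[int]) -> List[List[float]]:
    """Average the profiles of several sampling lines.

    Assumption: simple averaging is used here only as a stand-in for the
    blending method, which the abstract does not specify.
    """
    profiles = [temporal_profile(frames, x) for x in line_xs]
    if not profiles or not profiles[0]:
        return []
    height, length = len(profiles[0]), len(profiles[0][0])
    return [
        [sum(p[y][t] for p in profiles) / len(profiles) for t in range(length)]
        for y in range(height)
    ]


def make_frame(t: int, width: int = 4, height: int = 2) -> Frame:
    """Hypothetical test clip: a bright target that moves one pixel right per frame."""
    return [[255 if x == t else 0 for x in range(width)] for _ in range(height)]


clip = [make_frame(t) for t in range(4)]
profile = temporal_profile(clip, line_x=2)      # target crosses the line at frame 2
blended = blended_profile(clip, line_xs=[1, 2])  # crossing is spread over frames 1 and 2
```

    In this toy clip the single-line profile records only when the target crosses, while the two-line blend spreads the crossing over consecutive frames, which hints at how several lines can carry motion direction into the profile.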

    DeepFakes and Beyond: A Survey of Face Manipulation and Fake Detection

    Free access to large-scale public databases, together with the fast progress of deep learning techniques, in particular Generative Adversarial Networks, has led to the generation of very realistic fake content, with corresponding implications for society in this era of fake news. This survey provides a thorough review of techniques for manipulating face images, including DeepFake methods, and of methods to detect such manipulations. In particular, four types of facial manipulation are reviewed: i) entire face synthesis, ii) identity swap (DeepFakes), iii) attribute manipulation, and iv) expression swap. For each manipulation group, we provide details regarding manipulation techniques, existing public databases, and key benchmarks for technology evaluation of fake detection methods, including a summary of results from those evaluations. Among all the aspects discussed in the survey, we pay special attention to the latest generation of DeepFakes, highlighting its improvements and challenges for fake detection. In addition to the survey information, we also discuss open issues and future trends that should be considered to advance the field.