5,156 research outputs found

    Content-based video copy detection using multimodal analysis

    Ankara: The Department of Computer Engineering and the Institute of Engineering and Science of Bilkent University, 2009. Thesis (Master's), Bilkent University, 2009. Includes bibliographical references, leaves 67-76.

    The huge and increasing amount of video broadcast through networks has raised the need for automatic video copy detection for copyright protection. Recent developments in multimedia technology introduced content-based copy detection (CBCD) as a new research field, an alternative to the watermarking approach for identifying video sequences. This thesis presents a multimodal framework for matching video sequences using a three-step approach. First, a high-level face detector identifies facial frames/shots in a video clip; matching faces with extended body regions gives the flexibility to discriminate the same person (e.g., an anchorman or a political leader) in different events or scenes. In the second step, a spatiotemporal sequence matching technique is employed to match video clips/segments that are similar in terms of activity. Finally, the non-facial shots are matched using low-level visual features. In addition, we utilize a fuzzy-logic approach to extract the color histograms used to detect shot boundaries in heavily manipulated video clips. Methods for detecting noise, frame droppings, and picture-in-picture transformation windows, and for extracting masks for still regions, are also proposed and evaluated. The proposed method was tested on the query and reference dataset of the CBCD task of TRECVID 2008, and our results were compared with those of the top-8 most successful techniques submitted to that task. Experimental results show that the proposed method performs better than most state-of-the-art techniques in terms of both effectiveness and efficiency.

    Küçüktunç, Onur. M.S.
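    The shot-boundary step lends itself to a short illustration. The sketch below, assuming OpenCV-style frame access, thresholds the Bhattacharyya distance between consecutive frames' color histograms; it is a simplified stand-in for the thesis's fuzzy-logic histogram extraction, and names such as detect_shot_boundaries and the 0.5 threshold are illustrative, not taken from the thesis.

        # Simplified sketch: histogram-based shot-boundary detection.
        # Requires OpenCV (cv2); the fixed threshold stands in for the
        # fuzzy-logic formulation used in the thesis.
        import cv2

        def frame_histogram(frame, bins=16):
            # Normalized 3-D BGR color histogram of a single frame.
            hist = cv2.calcHist([frame], [0, 1, 2], None,
                                [bins] * 3, [0, 256] * 3)
            return cv2.normalize(hist, hist).flatten()

        def detect_shot_boundaries(path, thresh=0.5):
            # Report frame indices where consecutive histograms diverge.
            cap = cv2.VideoCapture(path)
            boundaries, prev, idx = [], None, 0
            while True:
                ok, frame = cap.read()
                if not ok:
                    break
                hist = frame_histogram(frame)
                if prev is not None:
                    # Bhattacharyya distance in [0, 1]; ~1 means dissimilar.
                    dist = cv2.compareHist(prev, hist,
                                           cv2.HISTCMP_BHATTACHARYYA)
                    if dist > thresh:
                        boundaries.append(idx)
                prev, idx = hist, idx + 1
            cap.release()
            return boundaries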

    Reconstruction of phylogenies for images and videos

    Advisors: Anderson de Rezende Rocha, Zanoni Dias. Thesis (doctorate), Universidade Estadual de Campinas, Instituto de Computação.

    Abstract: Digital documents (e.g., images and videos) have become powerful tools of communication with the advent of social networks. In this new reality, it is very common for these documents to be published, shared, modified, and often republished by multiple users on different Web channels. Additionally, with the popularization of image editing software and online editing tools, in many cases not only exact duplicates but also manipulated versions of the original source (near duplicates) become available. This document sharing also facilitates the spread of abusive content (e.g., child pornography), copyright infringement and, in some cases, defamatory content that adversely affects the public image of people or corporations (e.g., defamatory images of politicians and celebrities, people in embarrassing situations, etc.). Several researchers have successfully developed approaches for the detection and recognition of near-duplicate documents, aiming at identifying similar copies of a given multimedia document (e.g., image, video, etc.) published on the Internet. Only recently, however, has research gone beyond near-duplicate detection to find the ancestral relationships between the near duplicates and the original source of a document. This requires approaches that calculate the dissimilarity between near duplicates and automatically reconstruct structures representing the relationships among them. This problem is referred to in the literature as Multimedia Phylogeny. Solutions for multimedia phylogeny can help solve problems in forensics, content-based document retrieval, and illegal-content tracking, for instance. In this thesis, we design and develop approaches to solve the phylogeny reconstruction problem for digital images and videos. For images, we propose approaches that address two main points: (i) forest reconstruction, important in scenarios with a set of semantically similar images generated by different sources or at different times; and (ii) new measures for calculating the dissimilarity between near duplicates, since this calculation directly impacts the quality of the reconstructed phylogeny. The results obtained with our image-phylogeny approaches proved effective, identifying the roots of the forests (the original images of an evolution sequence) with up to 95% accuracy. For video phylogeny, we develop a new approach that temporally aligns the video sequences before calculating their dissimilarity, since, under real-world conditions, a pair of videos can be temporally misaligned, have frames removed, or be compressed, for example. For this problem, the proposed methods find the roots of the trees with up to 87% accuracy.

    Doctorate in Computer Science. Grant 2013/05815-2, FAPESP; CAPES.
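    A minimal sketch of the reconstruction step may help: given an asymmetric pairwise dissimilarity matrix, a greedy Kruskal-style pass builds a tree (or forest) whose parentless nodes are the candidate originals. This follows the general minimum-spanning-tree spirit of the phylogeny literature rather than the exact algorithms of the thesis; the matrix d and the function names are hypothetical.

        # Simplified sketch: phylogeny reconstruction from a pairwise
        # dissimilarity matrix d, where d[i, j] is the cost of assuming
        # document j was derived from document i (asymmetric).
        import numpy as np

        def reconstruct_phylogeny(d):
            n = d.shape[0]
            parent = [None] * n
            # Consider all directed edges i -> j, cheapest first.
            edges = sorted((d[i, j], i, j)
                           for i in range(n) for j in range(n) if i != j)
            for cost, i, j in edges:
                if parent[j] is not None:
                    continue  # j already has an assigned ancestor
                # Skip the edge if i descends from j (would create a cycle).
                a = i
                while a is not None and a != j:
                    a = parent[a]
                if a is None:
                    parent[j] = i
            # Parentless nodes are the roots: candidate original documents.
            roots = [j for j in range(n) if parent[j] is None]
            return parent, roots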

    BlogForever: D3.1 Preservation Strategy Report

    This report describes preservation planning approaches and strategies recommended by the BlogForever project as a core component of a weblog repository design. More specifically, we start by discussing why we would want to preserve weblogs in the first place and what exactly it is that we are trying to preserve. We then review past and present work and highlight why current practices in web archiving do not adequately address the needs of weblog preservation. We make three distinctive contributions in this volume: a) we propose transferable practical workflows for applying a combination of established metadata and repository standards in developing a weblog repository; b) we provide an automated approach to identifying significant properties of weblog content that uses the notion of communities, and we show how this affects previous strategies; and c) we propose a sustainability plan that draws upon community knowledge through innovative repository design.

    Fine-grained Incident Video Retrieval with Video Similarity Learning.

    PhD thesis. In this thesis, we address the problem of Fine-grained Incident Video Retrieval (FIVR) using video similarity learning methods. FIVR is a video retrieval task that aims to retrieve all videos depicting the same incident as a given query video; related video retrieval tasks adopt either very narrow or very broad scopes, considering only near-duplicate or same-event videos. To formulate the case of same-incident videos, we define three video associations that take into account the spatio-temporal spans captured by video pairs. To cover the benchmarking needs of FIVR, we construct a large-scale dataset, called FIVR-200K, consisting of 225,960 YouTube videos from major news events crawled from Wikipedia. The dataset contains four annotation labels according to the FIVR definitions; hence, it can simulate several retrieval scenarios with the same video corpus. To address FIVR, we propose two video-level approaches leveraging features extracted from intermediate layers of Convolutional Neural Networks (CNN). The first is an unsupervised method that relies on a modified Bag-of-Words scheme, which generates video representations from the aggregation of frame descriptors based on learned visual codebooks. The second is a supervised method based on Deep Metric Learning, which learns an embedding function that maps videos into a feature space where relevant video pairs lie closer than irrelevant ones. However, video-level approaches generate global video representations, losing all spatial and temporal relations between compared videos. Therefore, we propose a video similarity learning approach that captures fine-grained relations between videos for accurate similarity calculation. We train a CNN architecture to compute video-to-video similarity from refined frame-to-frame similarity matrices derived from a pairwise region-level similarity function. The proposed approaches have been extensively evaluated on FIVR-200K and other large-scale datasets, demonstrating their superiority over other video retrieval methods and highlighting the challenging aspects of the FIVR problem.
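    As a concrete illustration of the frame-to-frame similarity matrices mentioned above, the sketch below computes a video-to-video score from L2-normalized frame descriptors using a symmetric chamfer-style aggregation. It is a hand-crafted stand-in for the trainable CNN refinement the thesis proposes; the function names are hypothetical.

        # Simplified sketch: video similarity from a frame-to-frame
        # similarity matrix, aggregated chamfer-style (no learning).
        import numpy as np

        def frame_similarity_matrix(x, y):
            # x: (n, d), y: (m, d) frame descriptors -> (n, m) cosine matrix.
            x = x / np.linalg.norm(x, axis=1, keepdims=True)
            y = y / np.linalg.norm(y, axis=1, keepdims=True)
            return x @ y.T

        def video_similarity(x, y):
            # Average, over each video's frames, of the best match in the
            # other video; symmetric in the two inputs.
            s = frame_similarity_matrix(x, y)
            return 0.5 * (s.max(axis=1).mean() + s.max(axis=0).mean())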

    Multimedia Forensics

    This book is open access. Media forensics has never been more relevant to societal life. Not only does media content represent an ever-increasing share of the data traveling on the net and the preferred means of communication for most users, it has also become an integral part of the most innovative applications in the digital information ecosystem serving various sectors of society, from entertainment to journalism to politics. Undoubtedly, advances in deep learning and computational imaging have contributed significantly to this outcome. The underlying technologies that drive this trend, however, also pose a profound challenge to establishing trust in what we see, hear, and read, and make media content the preferred target of malicious attacks. In this new threat landscape, powered by innovative imaging technologies and sophisticated tools based on autoencoders and generative adversarial networks, this book fills an important gap. It presents a comprehensive review of state-of-the-art forensic capabilities relating to media attribution, integrity and authenticity verification, and counter-forensics. Its content is developed to give practitioners, researchers, photo and video enthusiasts, and students a holistic view of the field.

    Using Web Archives to Enrich the Live Web Experience Through Storytelling

    Much of our cultural discourse occurs primarily on the Web. Thus, Web preservation is a fundamental precondition for multiple disciplines. Archiving Web pages into themed collections is a method for ensuring these resources are available for posterity. Services such as Archive-It exist to allow institutions to develop, curate, and preserve collections of Web resources. Understanding the contents and boundaries of these archived collections is a challenge for most people, resulting in a paradox: the larger the collection, the harder it is to understand. Meanwhile, as the sheer volume of data on the Web grows, storytelling is becoming a popular technique in social media for selecting Web resources to support a particular narrative or story. In this dissertation, we address the problem of understanding archived collections by proposing the Dark and Stormy Archive (DSA) framework, in which we integrate storytelling social media and Web archives. In the DSA framework, we identify, evaluate, and select candidate Web pages from archived collections that summarize the holdings of these collections, arrange them in chronological order, and then visualize these pages using tools that users are already familiar with, such as Storify. To inform our work on generating stories from archived collections, we start by building a baseline for the structural characteristics of popular (i.e., most-viewed) human-generated stories by investigating stories from Storify. Furthermore, we examined the entire population of Archive-It collections to better understand the characteristics of the collections we intend to summarize. We then filter off-topic pages from the collections using different methods to detect when an archived page in a collection has gone off-topic. We created a gold standard dataset from three Archive-It collections to evaluate the proposed methods at different thresholds. From the gold standard dataset, we identified five behaviors for the TimeMaps (a list of archived copies of a page) based on the page's aboutness. Based on a dynamic slicing algorithm, we divide the collection and cluster the pages in each slice. We then select the best representative page from each cluster based on different quality metrics (e.g., the replay quality and the quality of the snippet generated from the page). Finally, we put the selected pages in chronological order and visualize them using Storify. To evaluate the DSA framework, we obtained a ground truth dataset of hand-crafted stories from Archive-It collections generated by expert archivists. We used Amazon's Mechanical Turk to evaluate the automatically generated stories against the stories created by domain experts. The results show that the automatically generated stories from the DSA are indistinguishable from those created by human subject domain experts, while both kinds of stories (automatic and human) are easily distinguished from randomly generated stories.
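    The off-topic filtering step admits a compact illustration: compare each archived snapshot of a page against its first capture and flag snapshots whose textual similarity drops below a threshold. The sketch below uses TF-IDF cosine similarity; the 0.15 threshold and the function name are illustrative assumptions, since the dissertation evaluates several similarity measures and thresholds.

        # Simplified sketch: flag off-topic snapshots in a TimeMap by
        # comparing each snapshot's extracted text to the first capture.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        def off_topic_snapshots(snapshot_texts, threshold=0.15):
            # snapshot_texts: extracted text per archived copy, oldest first.
            tfidf = TfidfVectorizer(stop_words="english").fit_transform(
                snapshot_texts)
            sims = cosine_similarity(tfidf[0], tfidf).ravel()  # vs. first capture
            return [i for i, s in enumerate(sims) if i > 0 and s < threshold]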