10 research outputs found

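    Analysis of User Interaction on Social Media in Preventing Hoax Videos and a High-Level Detection Architecture Model (original title: Analisis Interaksi Pengguna di Media Sosial Dalam Mencegah Video Hoax dan Model Arsitektur Deteksi Tingkat Tinggi)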

    The spread of hoax news with reused video content on social media is a remarkable phenomenon that affects not only adult users but all age groups. Its most noticeable effect is division in society, because video that has already aired or existed previously is taken as strong evidence to validate the content being viewed. It is important to detect hoax news with reused video content and to stop its negative effects on individuals and society. This study introduces a high-level detection architecture model for a system that analyzes hoax news with reused or repeated video content on social media. The architecture is designed using deep learning video processing, speech-to-text, and several content-based and context-based features. The spread of hoax content with reused video can hopefully be prevented if it is filtered before it appears on the timeline. It is hoped that this architecture model can serve as a reference for building a real system.

    Graph Based Video Sequence Matching & BoF Method for Video Copy detection

    In this paper we propose a video copy detection method that uses Bag-of-Features (BoF) and displays the matching frames of the videos as an acyclic graph. The method uses both local (line, texture, color) and global (Scale Invariant Feature Transform, i.e. SIFT) features. The video is first divided into frames using a dual-threshold method, which eliminates redundant frames and selects unique key frames. Binary features, known as Bag of Features (BoF), are then extracted from each key frame and stored in the database as a matrix. When a query video is uploaded, the same features are extracted and compared against the stored database to detect a copied video. If the video is detected as a copy, the Graph-Based Sequence Matching Method displays the actual matched sequence of key frames as an acyclic graph. DOI: 10.17762/ijritcc2321-8169.15067
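A generic Bag-of-Features comparison can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's pipeline: the codebook, the toy 2-D descriptors, and the helper names `nearest_codeword`, `bof_histogram`, and `histogram_intersection` are all hypothetical, and the abstract does not say which similarity measure the authors use.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to the descriptor (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[i])),
    )


def bof_histogram(descriptors, codebook):
    """Normalized histogram of codeword assignments for one key frame."""
    hist = [0] * len(codebook)
    for descriptor in descriptors:
        hist[nearest_codeword(descriptor, codebook)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for two normalized histograms (assumed measure)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical two-word codebook and toy descriptors for two key frames.
codebook = [[0.0, 0.0], [10.0, 10.0]]
frame_a = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0], [10.0, 9.0]]
frame_b = [[0.0, 0.0], [1.0, 1.0], [0.0, 2.0], [10.0, 10.0]]

hist_a = bof_histogram(frame_a, codebook)
hist_b = bof_histogram(frame_b, codebook)
score = histogram_intersection(hist_a, hist_b)
```

In a copy detection setting, a query key frame would be flagged as a candidate match when its score against a stored key frame exceeds a chosen threshold.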

    TransVCL: Attention-enhanced Video Copy Localization Network with Flexible Supervision

    Video copy localization aims to precisely localize all the copied segments within a pair of untrimmed videos in video retrieval applications. Previous methods typically start from a frame-to-frame similarity matrix generated by cosine similarity between frame-level features of the input video pair, and then detect and refine the boundaries of copied segments on the similarity matrix under temporal constraints. In this paper, we propose TransVCL: an attention-enhanced video copy localization network, which is optimized directly from initial frame-level features and trained end-to-end with three main components: a customized Transformer for feature enhancement, a correlation and softmax layer for similarity matrix generation, and a temporal alignment module for copied segment localization. In contrast to previous methods that require a handcrafted similarity matrix, TransVCL incorporates long-range temporal information between the feature sequence pair using self- and cross-attention layers. With the joint design and optimization of the three components, the similarity matrix can be learned to present more discriminative copied patterns, leading to significant improvements over previous methods on segment-level labeled datasets (VCSL and VCDB). Besides the state-of-the-art performance in the fully supervised setting, the attention architecture enables TransVCL to further exploit unlabeled or simply video-level labeled data. Additional experiments that supplement video-level labeled datasets, including SVD and FIVR, reveal the high flexibility of TransVCL from full supervision to semi-supervision (with or without video-level annotation). Code is publicly available at https://github.com/transvcl/TransVCL. Comment: Accepted by the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI2023)
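The frame-to-frame similarity matrix that the abstract attributes to previous methods can be sketched as below. This is a minimal pure-Python illustration of the handcrafted baseline, not TransVCL itself; the toy 2-D feature vectors and the function names `l2_normalize` and `similarity_matrix` are assumptions made for the example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def similarity_matrix(query_feats, ref_feats):
    """Cosine similarity between every query frame and every reference frame.

    Entry [i][j] compares query frame i with reference frame j.
    """
    q = [l2_normalize(v) for v in query_feats]
    r = [l2_normalize(v) for v in ref_feats]
    return [[sum(a * b for a, b in zip(qv, rv)) for rv in r] for qv in q]


# Toy frame-level features: 3 query frames, 2 reference frames.
query = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ref = [[2.0, 0.0], [0.0, 3.0]]
sim = similarity_matrix(query, ref)
```

Copied segments tend to appear in such a matrix as diagonal streaks of high similarity, which is what the boundary detection step in earlier methods looks for.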

    Perceptual Video Hashing for Content Identification and Authentication

    Perceptual hashing has been broadly used in the literature to identify similar contents for video copy detection. It has also been adopted to detect malicious manipulations for video authentication. However, targeting both applications with a single system using the same hash would be highly desirable, as this saves storage space and reduces computational complexity. This paper proposes a perceptual video hashing system for content identification and authentication. The objective is to design a hash extraction technique that can withstand signal processing operations on the one hand and detect malicious attacks on the other. The proposed system relies on a new signal calibration technique for extracting the hash using the discrete cosine transform (DCT) and the discrete sine transform (DST). This consists of determining the number of samples, called the normalizing shift, by which a digital signal must be shifted so that the shifted version matches a certain pattern according to its DCT/DST coefficients. The rationale for the calibration idea is that the normalizing shift resists signal processing operations while remaining sensitive to local tampering (i.e., replacing a small portion of the signal with a different one). While the same hash serves both applications, two different similarity measures have been proposed for video identification and authentication, respectively. Through intensive experiments with various types of video distortions and manipulations, the proposed system has been shown to outperform related state-of-the-art video hashing techniques in terms of identification and authentication, with the added ability to locate tampered regions.

    DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval

    In this paper, we address the problem of high performance and computationally efficient content-based video retrieval in large-scale datasets. Current methods typically propose either: (i) fine-grained approaches employing spatio-temporal representations and similarity calculations, achieving high performance at a high computational cost or (ii) coarse-grained approaches representing/indexing videos as global vectors, where the spatio-temporal structure is lost, providing low performance but also having low computational cost. In this work, we propose a Knowledge Distillation framework, which we call Distill-and-Select (DnS), that starting from a well-performing fine-grained Teacher Network learns: a) Student Networks at different retrieval performance and computational efficiency trade-offs and b) a Selection Network that at test time rapidly directs samples to the appropriate student to maintain both high retrieval performance and high computational efficiency. We train several students with different architectures and arrive at different trade-offs of performance and efficiency, i.e., speed and storage requirements, including fine-grained students that store index videos using binary representations. Importantly, the proposed scheme allows Knowledge Distillation in large, unlabelled datasets -- this leads to good students. We evaluate DnS on five public datasets on three different video retrieval tasks and demonstrate a) that our students achieve state-of-the-art performance in several cases and b) that our DnS framework provides an excellent trade-off between retrieval performance, computational speed, and storage space. In specific configurations, our method achieves similar mAP with the teacher but is 20 times faster and requires 240 times less storage space. Our collected dataset and implementation are publicly available: https://github.com/mever-team/distill-and-select

    An image-based approach to video copy detection with spatio-temporal post-filtering

    This paper introduces a video copy detection system which efficiently matches individual frames and then verifies their spatio-temporal consistency. The approach for matching frames relies on a recent local feature indexing method, which is at the same time robust to significant video transformations and efficient in terms of memory usage and computation time. We match either keyframes or uniformly sampled frames. To further improve the results, a verification step robustly estimates a spatio-temporal model between the query video and the potentially corresponding video segments. Experimental results evaluate the different parameters of our system and measure the trade-off between accuracy and efficiency. We show that our system obtains excellent results for the TRECVID 2008 copy detection task.
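One simple way to check the temporal part of such consistency is to vote on the time offset between matched frames and keep only the matches that agree with the dominant offset. The sketch below assumes this voting scheme purely for illustration; the abstract does not specify the authors' estimator, and the names `offset_bin` and `temporally_consistent`, the 1-second bin width, and the toy timestamps are hypothetical.

```python
from collections import Counter


def offset_bin(query_t, ref_t, bin_width):
    """Quantize the time shift (reference minus query) into an integer bin."""
    return round((ref_t - query_t) / bin_width)


def temporally_consistent(matches, bin_width=1.0):
    """Return the dominant time offset and the frame matches that agree with it.

    matches: list of (query_time, ref_time) pairs in seconds.
    """
    if not matches:
        return None, []
    votes = Counter(offset_bin(q, r, bin_width) for q, r in matches)
    best_bin, _ = votes.most_common(1)[0]
    kept = [(q, r) for q, r in matches if offset_bin(q, r, bin_width) == best_bin]
    return best_bin * bin_width, kept


# Toy frame matches: four agree on a shift of about 10 s, one is an outlier.
matches = [(0.0, 10.0), (1.0, 11.1), (2.0, 11.9), (3.0, 40.0), (4.0, 14.0)]
offset, kept = temporally_consistent(matches)
```

A full spatio-temporal model would also estimate spatial parameters such as scale and translation between frames; this sketch covers only the time shift.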

    R-Forest for Approximate Nearest Neighbor Queries in High Dimensional Space

    Searching high dimensional space has been a challenge and an area of intense research for many years. The dimensionality curse has rendered most existing index methods all but useless, causing people to research other techniques. In my dissertation I try to resurrect one of the best known index structures, R-Tree, which most have given up on as a viable method of answering high dimensional queries. I point out the various advantages of R-Tree as a method for answering approximate nearest neighbor queries, and the advantages of locality sensitive hashing and locality sensitive B-Tree, which are the most successful methods today. I started by looking at improving the maintenance of R-Tree through bulk loading and insertion. I proposed and implemented a new method for bulk loading the index, which improved on the standard method. I then turned my attention to nearest neighbor queries, which are a much more challenging problem, especially in high dimensional space. Initially I developed a set of heuristics, easily implemented in R-Tree, which improved the efficiency of high dimensional approximate nearest neighbor queries. To further refine my method I took another approach, developing a new model, known as R-Forest, which takes advantage of space partitioning while still using R-Tree as its index structure. With this new approach I was able to implement new heuristics and can show that R-Forest, comprised of a set of R-Trees, is a viable solution to high dimensional approximate nearest neighbor queries when compared to established methods.