
    A Deep learning based food recognition system for lifelog images

    In this paper, we propose a deep learning based system for food recognition from personal life archive images. The system first identifies the eating moments based on multi-modal information, then tries to focus on and enhance the food images available in these moments, and finally exploits GoogLeNet as the core of the learning process to recognise the food category of the images. Preliminary results, experimenting on the food recognition module of the proposed system, show that it achieves 95.97% classification accuracy on food images taken from the personal life archives of several lifeloggers, which can potentially be extended and applied in broader scenarios and to different types of food categories.

    Deepfake Detection: Analyzing Model Generalization Across Architectures, Datasets, and Pre-Training Paradigms

    As deepfake technology gains traction, reliable detection systems become crucial. Recent research has introduced various deep learning-based detection systems, yet they exhibit limitations in generalising effectively across diverse data distributions that differ from the training data. Our study focuses on understanding the generalisation challenge by exploring different aspects such as deep learning model architectures, pre-training strategies and datasets. Through a comprehensive comparative analysis, we evaluate multiple supervised and self-supervised deep learning models for deepfake detection. Specifically, we evaluate eight supervised deep learning architectures and two transformer-based models pre-trained using self-supervised strategies (DINO, CLIP) on four different deepfake detection benchmarks (FakeAVCeleb, CelebDF-V2, DFDC and FaceForensics++). Our analysis encompasses both intra-dataset and inter-dataset evaluations, with the objective of identifying the top-performing models and the datasets that equip trained models with optimal generalisation capabilities, and of assessing the influence of image augmentations on model performance. We also investigate the trade-off between model size, efficiency and performance. Our main goal is to provide insights into the effectiveness of different deep learning architectures (transformers, CNNs), training strategies (supervised, self-supervised) and deepfake detection benchmarks. Following an extensive empirical analysis, we conclude that Transformer models surpass CNN models in deepfake detection. Furthermore, we show that the FaceForensics++ and DFDC datasets equip models with comparably better generalisation capabilities than the FakeAVCeleb and CelebDF-V2 datasets. Our analysis also demonstrates that image augmentations can be beneficial in achieving improved performance, particularly for Transformer models.
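    The intra-dataset and inter-dataset evaluations described above reduce to a train-on-one, test-on-all accuracy matrix. As a hedged illustration (not the paper's models or data), the sketch below builds such a matrix with a nearest-centroid classifier standing in for the deepfake detectors; the dataset names and the classifier are purely illustrative.

    ```python
    import numpy as np

    def nearest_centroid_fit(X, y):
        """Train a stand-in detector: one centroid per class (labels 0..k-1)."""
        return np.stack([X[y == c].mean(axis=0) for c in np.unique(y)])

    def nearest_centroid_predict(centroids, X):
        """Predict the class of the closest centroid."""
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        return d.argmin(axis=1)

    def cross_dataset_matrix(datasets):
        """datasets: dict name -> (X, y). Returns acc[a][b] = accuracy when
        training on dataset a and testing on dataset b; the diagonal is the
        intra-dataset score, the off-diagonal entries measure generalisation."""
        names = list(datasets)
        acc = {}
        for a in names:
            model = nearest_centroid_fit(*datasets[a])
            acc[a] = {
                b: float((nearest_centroid_predict(model, datasets[b][0])
                          == datasets[b][1]).mean())
                for b in names
            }
        return acc
    ```

    Replacing the centroid model with any trained detector yields the same evaluation protocol: a large gap between diagonal and off-diagonal entries signals poor generalisation.
    
    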

    CLIPping the Deception: Adapting Vision-Language Models for Universal Deepfake Detection

    The recent advancements in Generative Adversarial Networks (GANs) and the emergence of Diffusion models have significantly streamlined the production of highly realistic and widely accessible synthetic content. As a result, there is a pressing need for effective general-purpose detection mechanisms to mitigate the potential risks posed by deepfakes. In this paper, we explore the effectiveness of pre-trained vision-language models (VLMs) when paired with recent adaptation methods for universal deepfake detection. Following previous studies in this domain, we employ only a single dataset (ProGAN) in order to adapt CLIP for deepfake detection. However, in contrast to prior research, which relies solely on the visual part of CLIP while ignoring its textual component, our analysis reveals that retaining the text part is crucial. Consequently, the simple and lightweight Prompt Tuning based adaptation strategy that we employ outperforms the previous SOTA approach by 5.01% mAP and 6.61% accuracy while utilizing less than one third of the training data (200k images as compared to 720k). To assess the real-world applicability of our proposed models, we conduct a comprehensive evaluation across various scenarios. This involves rigorous testing on images sourced from 21 distinct datasets, including those generated by GAN-based, Diffusion-based and commercial tools.
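    Prompt tuning, as mentioned above, optimizes a small set of prompt vectors while both CLIP encoders stay frozen. The toy NumPy sketch below illustrates only this idea, not the paper's method: random matrices stand in for the frozen encoders, the data is synthetic, and every name and dimension is an assumption.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    d_in, d_emb, n_cls, n = 16, 8, 2, 50

    # Random matrices stand in for CLIP's *frozen* image and text encoders.
    W_img = 0.1 * rng.normal(size=(d_emb, d_in))
    W_txt = 0.1 * rng.normal(size=(d_emb, d_in))
    cls_tok = rng.normal(size=(n_cls, d_in))   # fixed "class name" embeddings

    # Toy "real" vs "fake" image inputs: one Gaussian blob per class.
    means = 2.0 * rng.normal(size=(n_cls, d_in))
    X = np.vstack([means[c] + 0.3 * rng.normal(size=(n, d_in)) for c in range(n_cls)])
    y = np.repeat(np.arange(n_cls), n)

    prompts = np.zeros((n_cls, d_in))          # the ONLY learnable parameters

    def forward(X, prompts):
        z = X @ W_img.T                        # frozen image features
        f = (cls_tok + prompts) @ W_txt.T      # frozen text encoder over tuned prompts
        logits = z @ f.T
        e = np.exp(logits - logits.max(axis=1, keepdims=True))
        return z, e / e.sum(axis=1, keepdims=True)

    def xent(probs, y):
        return float(-np.mean(np.log(probs[np.arange(len(y)), y] + 1e-12)))

    loss_before = xent(forward(X, prompts)[1], y)
    for _ in range(300):                       # gradient descent on prompts only
        z, probs = forward(X, prompts)
        g = probs.copy()
        g[np.arange(len(y)), y] -= 1.0         # d loss / d logits
        prompts -= 0.05 * (g.T @ z) @ W_txt / len(y)   # encoders never updated
    loss_after = xent(forward(X, prompts)[1], y)
    ```

    The point of the sketch is the parameter count: only `prompts` changes, which is why prompt tuning can adapt a large frozen model with a fraction of the data a full fine-tune would need.
    
    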

    Hybrid Transformer Network for Deepfake Detection

    Deepfake media is becoming widespread nowadays because of the easily available tools and mobile apps that can generate realistic-looking deepfake videos/images without requiring any technical knowledge. With further advances in this field in the near future, the quantity and quality of deepfake media are also expected to grow, making deepfake media a likely new practical tool to spread mis/disinformation. Because of these concerns, deepfake media detection tools are becoming a necessity. In this study, we propose a novel hybrid transformer network utilizing an early feature fusion strategy for deepfake video detection. Our model employs two different CNN networks, i.e., (1) XceptionNet and (2) EfficientNet-B4, as feature extractors. We train both feature extractors along with the transformer in an end-to-end manner on the FaceForensics++ and DFDC benchmarks. Our model, while having a relatively straightforward architecture, achieves results comparable to other, more advanced state-of-the-art approaches when evaluated on the FaceForensics++ and DFDC benchmarks. Besides this, we also propose novel face cut-out augmentations, as well as random cut-out augmentations. We show that the proposed augmentations improve the detection performance of our model and reduce overfitting. In addition, we show that our model is capable of learning from a considerably small amount of data.
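    The random cut-out augmentation mentioned above zeroes a randomly placed patch so the detector cannot over-rely on any single image region. A minimal sketch follows; the paper's face cut-outs additionally target facial regions, which is not reproduced here.

    ```python
    import numpy as np

    def random_cutout(img, patch, rng):
        """Return a copy of `img` with one randomly placed square patch of
        side `patch` set to zero -- a simple regularising augmentation."""
        h, w = img.shape[:2]
        out = img.copy()
        top = int(rng.integers(0, h - patch + 1))
        left = int(rng.integers(0, w - patch + 1))
        out[top:top + patch, left:left + patch] = 0
        return out
    ```

    Applied on the fly during training, each epoch sees the same frame with a different region hidden, which is what reduces overfitting.
    
    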

    Detecting Out-of-Context Image-Caption Pairs in News: A Counter-Intuitive Method

    The growth of misinformation and re-contextualized media in social media and news leads to an increasing need for fact-checking methods. Concurrently, the advancement in generative models makes cheapfakes and deepfakes both easier to make and harder to detect. In this paper, we present a novel approach that uses generative image models to our advantage for detecting Out-of-Context (OOC) use of image-caption pairs in news. We present two new datasets with a total of 6800 images generated using two different generative models: (1) DALL-E 2 and (2) Stable Diffusion. We are confident that the method proposed in this paper can advance research on generative models in the field of cheapfake detection, and that the resulting datasets can be used to train and evaluate new models aimed at detecting cheapfakes. We run a preliminary qualitative and quantitative analysis to evaluate the performance of each image generation model for this task, and evaluate a handful of methods for computing image similarity. (Comment: ACM International Conference on Content-Based Multimedia Indexing, CBMI '23.)
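    One way to read the counter-intuitive idea above: generate images from the caption, then treat low similarity between the news image and every generated image as a hint of out-of-context use. The sketch below assumes precomputed feature vectors and a hypothetical threshold; it illustrates only the scoring step, not the paper's pipeline or its similarity methods.

    ```python
    import numpy as np

    def cosine(a, b):
        """Cosine similarity between two feature vectors."""
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    def ooc_flag(news_feat, generated_feats, threshold=0.5):
        """Flag an image-caption pair as out-of-context when the news image is
        dissimilar to every image generated from its caption.
        `threshold` is an illustrative value, not taken from the paper."""
        best = max(cosine(news_feat, g) for g in generated_feats)
        return best < threshold, best
    ```
    
    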

    A usage analytics model for analysing user behaviour in IBM academic cloud

    Usage in the software domain refers to knowledge about how end-users use an application and how the application responds to the users' actions. Usage can be revealed by monitoring the users' interactions with the application. However, in the cloud environment, it is non-trivial to understand the interactions of the users by using only monitoring solutions. For example, a user's behaviour, a user's usage patterns, and which features of a cloud application are critical for a user, to name a few, cannot be extracted using only the existing monitoring tools. Understanding this information requires additional analysis, which can be done using usage analytics. For this purpose, in this paper, we propose a novel process model design for incorporating usage analytics in a cloud environment. We evaluate this proposed process model in the context of academic applications and services in the cloud, with a focus on the IBM Academic Cloud.

    An Overview of User-level Usage Monitoring in Cloud Environment

    Cloud computing monitors applications and virtual and physical resources to ensure performance and capacity, manage workloads, optimize future application updates, and so on. Current state-of-the-art monitoring solutions in the cloud focus on monitoring at the application/service level and at the virtual and physical (infrastructure) levels. While some researchers have identified the importance of monitoring users, there is still a need for developing, implementing and evaluating solutions in this domain. In this paper, we propose a novel approach to extract end-users' usage of cloud services from their interactions with the interfaces provided to access those services, which we call User-level Usage Monitoring. We provide the principles necessary for the usage data extraction process and analyse existing cloud monitoring techniques against the identified principles. Understanding end-user usage patterns and behaviour can help developers and architects assess how applications work and which features of an application are critical for its users.

    DCU at the NTCIR-13 Lifelog-2 Task

    In this work, we outline the submissions of the Dublin City University (DCU) team, the task organisers, to the NTCIR-13 Lifelog-2 Task. We submitted runs to the Lifelog Semantic Access (LSAT) and Lifelog Insight (LIT) sub-tasks.

    Replay detection and multi-stream synchronization in CS:GO game streams using content-based Image retrieval and Image signature matching

    GameStory: The 2019 Video Game Analytics Challenge poses two main tasks: (1) replay detection and multi-stream synchronization, and (2) game story summarization. In this paper, we propose a data-driven approach to the first task. Our solution determines the replays that lie between two logo-transition endpoints and synchronizes them with their sources by extracting frames from the videos and then applying image processing and retrieval techniques. In detail, we use the Bag of Visual Words approach to detect the logo-transition endpoints, which contain multiple replays in between, then employ an image signature matching algorithm for multi-stream synchronization and replay boundary refinement. The best configuration of our proposed solution achieves the second-highest scores in all evaluation metrics of the challenge.
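    Image-signature matching for stream synchronization can be illustrated as follows: each frame is reduced to a compact signature (here, the signs of neighbouring block-mean differences, in the spirit of Goldberg-style image signatures), and the offset between two streams is the shift that minimizes the mean signature distance. This is a simplified sketch under those assumptions, not the exact algorithm used in the paper.

    ```python
    import numpy as np

    def frame_signature(frame, grid=8):
        """Reduce a grayscale frame to the signs of horizontal neighbour
        differences between block means (robust to global brightness shifts)."""
        h, w = frame.shape
        bh, bw = h // grid, w // grid
        blocks = frame[:bh * grid, :bw * grid].reshape(grid, bh, grid, bw).mean(axis=(1, 3))
        return np.sign(np.diff(blocks, axis=1))

    def signature_distance(a, b):
        """Fraction of disagreeing signature cells."""
        return float(np.mean(a != b))

    def best_offset(sigs_a, sigs_b, max_off):
        """Shift of stream B relative to stream A (in frames) that minimizes
        the mean signature distance over all overlapping frame pairs."""
        best, best_d = 0, float("inf")
        for off in range(-max_off, max_off + 1):
            pairs = [(i, i + off) for i in range(len(sigs_a))
                     if 0 <= i + off < len(sigs_b)]
            if not pairs:
                continue
            d = np.mean([signature_distance(sigs_a[i], sigs_b[j]) for i, j in pairs])
            if d < best_d:
                best, best_d = off, d
        return best
    ```

    Because the signatures are tiny compared to full frames, scanning every candidate offset between two long streams stays cheap.
    
    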

    A RESEARCH ON MULTI-OBJECTIVE OPTIMIZATION OF THE GRINDING PROCESS USING SEGMENTED GRINDING WHEEL BY TAGUCHI-DEAR METHOD

    In this study, multi-objective optimization was applied to the surface grinding process of SAE420 steel. Aluminum oxide grinding wheels grooved with 15, 18, and 20 grooves were used in the experimental process. The Taguchi method was applied to design the experimental matrix. Four input parameters were chosen for each experiment: the number of grooves on the cylindrical surface of the grinding wheel, the workpiece velocity, the feed rate, and the cutting depth. Four output parameters were measured for each experiment: the machined surface roughness and the system vibrations in the three directions (X, Y, Z). The DEAR technique was applied to determine the values of the input parameters that obtain the minimum values of machined surface roughness and vibrations in the three directions. Using this technique, the optimal values of grinding wheel groove number, workpiece velocity, feed rate, and cutting depth were 18 grooves, 15 m/min, 2 mm/stroke, and 0.005 mm, respectively. A verification experiment was performed using the optimal values of the input parameters. The validation results for surface roughness and vibrations in the X, Y, and Z directions were 0.826 µm, 0.531 µm, 0.549 µm, and 0.646 µm, respectively. These results were greatly improved compared to the normal experimental results. The Taguchi method and the DEAR technique can be applied to improve the quality of the ground surface and to reduce the vibrations of the technological system, restraining the increase of cutting forces in the grinding process. Finally, a direction for future research was also proposed in this study.
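    The DEAR ranking step can be sketched as follows. Since all four responses (roughness and the three vibration components) are smaller-is-better, one common formulation transforms each value to its reciprocal, weights it by its share of the column total, and sums the weighted values into a single multi-response performance index (MRPI); the experiment with the highest MRPI is the best compromise. The exact weighting used in the paper may differ, so treat this as an illustrative sketch only.

    ```python
    import numpy as np

    def dear_rank(y):
        """y: (n_experiments, n_responses), all responses smaller-is-better.
        Reciprocals convert to larger-is-better; each value is then weighted
        by its share of the column total and summed into one MRPI per row."""
        x = 1.0 / np.asarray(y, dtype=float)       # smaller-is-better -> larger-is-better
        w = x / x.sum(axis=0, keepdims=True)       # per-column weight shares
        return (w * x).sum(axis=1)                 # multi-response performance index

    # The experiment with the highest index is the recommended setting:
    # best = int(np.argmax(dear_rank(results)))
    ```

    An experiment that dominates every response (smallest roughness and smallest vibration in all three axes) always receives the highest index under this aggregation.
    
    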