776 research outputs found

A Visual Computing Unified Application Using Deep Learning and Computer Vision Techniques

Authors: G. Shruthi; J. Sowmya B.; Meeradevi; P Dayananda; Rohith S.; S. Supreeth; Seema S.
Publication venue: International Federation of Engineering Education Societies (IFEES)
Publication date: 12/01/2024

Vision Studio aims to apply a diverse range of modern deep learning and computer vision techniques to provide a broad array of image and video processing functions. Deep learning is a class of machine learning algorithms that use multiple layers to progressively extract higher-level features from raw input. This is useful when the input is a matrix of pixels from a photo or from the frames of a video. Computer vision is a field of artificial intelligence that teaches computers to interpret and understand visual data. The main functions implemented are deepfake creation, digital aging (de-aging), image animation, and deepfake detection. Deepfake creation lets users apply deep learning methods, particularly autoencoders, to overlay source images onto a target video, producing a video in which the source person appears to imitate or say what the target person does. Digital aging uses generative adversarial networks (GANs) to simulate the aging process of an individual. Image animation uses first-order motion models to create highly realistic animations from a source image and a driving video. Deepfake detection relies on efficient convolutional neural networks (CNNs), primarily the EfficientNet family of models.

Online-Journals.org (International Association of Online Engineering)

Towards Understanding of Deepfake Videos in the Wild

Authors: Abuadbba Alsharif; Cho Beomsang; Kim Jiwon; Le Binh M.; Moore Kristen; Tariq Shahroz; Woo Simon
Publication date: 06/09/2023

Deepfakes have become a growing concern in recent years, prompting researchers to develop benchmark datasets and detection algorithms to tackle the issue. However, existing datasets suffer from significant drawbacks that hamper their effectiveness. Notably, these datasets fail to encompass the latest deepfake videos produced by state-of-the-art methods that are being shared across various platforms. This limitation impedes the ability to keep pace with the rapid evolution of generative AI techniques employed in real-world deepfake production. Our contribution in this IRB-approved study is to bridge this knowledge gap about current real-world deepfakes by providing in-depth analysis. We first present the largest, most diverse, and most recent deepfake dataset collected from the wild to date (RWDF-23), consisting of 2,000 deepfake videos collected from 4 platforms (Reddit, YouTube, TikTok, and Bilibili), targeting 4 different languages and created in 21 countries. By expanding the dataset's scope beyond previous research, we capture a broader range of real-world deepfake content, reflecting the ever-evolving landscape of online platforms. We also conduct a comprehensive analysis encompassing various aspects of deepfakes, including creators, manipulation strategies, purposes, and real-world content production methods. This allows us to gain valuable insights into the nuances and characteristics of deepfakes in different contexts. Lastly, in addition to the video content, we collect viewer comments and interactions, enabling us to explore how internet users engage with deepfake content. By considering this rich contextual information, we aim to provide a holistic understanding of the evolving deepfake phenomenon and its impact on online platforms.

arXiv.org e-Print Archive

Deep Insights of Deepfake Technology : A Review

Authors: Mahmud Bahar Uddin; Sharmin Afsana
Publication date: 07/01/2023

Under the aegis of computer vision and deep learning, a new technique has emerged that allows anyone to make highly realistic but fake videos and images, and even to manipulate voices. This technology is widely known as Deepfake technology. Although making fake videos or images of things or individuals may seem like an interesting technique, the results can spread as misinformation via the internet. Deepfake content could be dangerous for individuals as well as for communities, organizations, countries, religions, etc. Because Deepfake content creation involves a high level of expertise combined with several deep learning algorithms, the results look almost real and genuine and are difficult to distinguish from authentic content. In this paper, a wide range of articles has been examined to understand Deepfake technology more extensively. We examined these articles to find insights such as what Deepfake is, who is responsible for it, whether it has any benefits, and what the challenges of this technology are. We also examined several creation and detection techniques. Our study revealed that although Deepfake is a threat to our societies, proper measures and strict regulations could prevent this.

arXiv.org e-Print Archive

Evaluating the Performance of Vision Transformer Architecture for Deepfake Image Classification

Author: Govindasamy Devesan
Publication venue: Technological University Dublin
Publication date: 01/01/2022

Deepfake classification has seen some impressive results lately: by experimenting with various deep learning methodologies, researchers have been able to design state-of-the-art techniques. This study applies Transformers, an existing technology from Natural Language Processing (NLP) where it has become the de facto standard for text processing, to Computer Vision. Transformers use a mechanism called “self-attention”, which differs from the mechanisms used in CNNs and LSTMs. This study uses a novel technique that treats images as 16x16 words (Dosovitskiy et al., 2021) to train a deep neural network with “self-attention” blocks to detect deepfakes. It creates position embeddings of the image patches, which are passed to the Transformer block to classify the modified images from the CELEB-DF-v2 dataset. Furthermore, the difference between the mean accuracy of this model and that of an existing state-of-the-art detection technique based on a residual CNN is tested for statistical significance. The two models are compared mainly on accuracy and loss. The Vision Transformer based model achieved state-of-the-art performance with 97.07% accuracy, compared with 91.78% accuracy for the ResNet-18 model.

Arrow@TUDublin
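
The abstract above describes treating an image as a sequence of 16x16 patches ("words") and attaching position embeddings before the Transformer block. The following is a minimal sketch of that input step only, not the thesis's implementation. The function names `split_into_patches` and `add_position_embeddings`, the toy 32x32 grayscale image, and the random position table (a stand-in for learned parameters) are all assumptions made for illustration. A full Vision Transformer would also apply a learned linear projection to each patch and prepend a class token, both omitted here.

```python
import random


def split_into_patches(image, patch_size=16):
    """Split an H x W grayscale image (list of rows) into flattened patches.

    Patches are returned in row-major order. H and W must be multiples
    of patch_size.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be multiples of patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size block into a single vector.
            patch = [image[r][c]
                     for r in range(top, top + patch_size)
                     for c in range(left, left + patch_size)]
            patches.append(patch)
    return patches


def add_position_embeddings(patches, position_table):
    """Add one position vector to each patch vector, element-wise."""
    return [[value + offset for value, offset in zip(patch, position)]
            for patch, position in zip(patches, position_table)]


# Hypothetical toy input: a 32x32 image yields four 16x16 patches.
rng = random.Random(0)
image = [[rng.random() for _ in range(32)] for _ in range(32)]
patches = split_into_patches(image)

# Random stand-in for a learned position-embedding table (an assumption).
position_table = [[rng.gauss(0.0, 0.02) for _ in range(256)] for _ in patches]
tokens = add_position_embeddings(patches, position_table)

print(len(tokens), len(tokens[0]))  # 4 256
```

Each resulting token keeps the pixel content of its patch plus a per-position offset, which is what lets an otherwise order-agnostic self-attention layer tell patches apart by location.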