A Visual Computing Unified Application Using Deep Learning and Computer Vision Techniques
Vision Studio applies a range of modern deep learning and computer vision techniques to provide a broad set of image and video processing functions. Deep learning is a class of machine learning algorithms that use multiple layers to progressively extract higher-level features from raw input, which is useful when the input is a matrix of pixels from a photo or of frames from a video. Computer vision is a field of artificial intelligence that trains computers to interpret and understand the visual world. The main functions implemented are deepfake creation, digital aging (de-aging), image animation, and deepfake detection. Deepfake creation lets users apply deep learning methods, particularly autoencoders, to overlay source images onto a target video, producing a video in which the source person appears to do or say what the target person does. Digital aging uses generative adversarial networks (GANs) to simulate the aging process of an individual. Image animation uses first-order motion models to create highly realistic animations from a source image and a driving video. Deepfake detection relies on efficient convolutional neural networks (CNNs), primarily the EfficientNet family of models.
Towards Understanding of Deepfake Videos in the Wild
Deepfakes have become a growing concern in recent years, prompting
researchers to develop benchmark datasets and detection algorithms to tackle
the issue. However, existing datasets suffer from significant drawbacks that
hamper their effectiveness. Notably, these datasets fail to encompass the
latest deepfake videos produced by state-of-the-art methods that are being
shared across various platforms. This limitation impedes the ability to keep
pace with the rapid evolution of generative AI techniques employed in
real-world deepfake production. In this IRB-approved study, we aim to bridge
this knowledge gap by providing an in-depth analysis of current real-world
deepfakes. We first present RWDF-23, the largest, most diverse, and most recent
deepfake dataset collected from the wild to date, consisting of 2,000 deepfake
videos collected from 4 platforms (Reddit, YouTube, TikTok, and Bilibili),
targeting 4 different languages and created in 21 countries. By expanding
the dataset's scope beyond the previous research, we capture a broader range of
real-world deepfake content, reflecting the ever-evolving landscape of online
platforms. Also, we conduct a comprehensive analysis encompassing various
aspects of deepfakes, including creators, manipulation strategies, purposes,
and real-world content production methods. This allows us to gain valuable
insights into the nuances and characteristics of deepfakes in different
contexts. Lastly, in addition to the video content, we collect viewer
comments and interactions, enabling us to explore how internet users engage
with deepfake content. By considering this rich contextual information,
we aim to provide a holistic understanding of the evolving deepfake
phenomenon and its impact on online platforms.
Deep Insights of Deepfake Technology : A Review
Under the aegis of computer vision and deep learning technology, a new
technique has emerged that lets anyone make highly realistic but fake videos
and images, and even manipulate voices. This technology is widely known as
Deepfake technology. Although it may seem an interesting way to make fake
videos or images of things or individuals, its output can spread as
misinformation via the internet. Deepfake content could be dangerous for
individuals as well as for communities, organizations, countries, religions,
etc. Because Deepfake content creation combines a high level of expertise with
several deep learning algorithms, the result seems almost real and genuine
and is difficult to distinguish from authentic content. In this paper, a wide
range of articles has been examined to understand Deepfake technology more
extensively. We examined these articles for insights such as what Deepfake
is, who is responsible for it, whether Deepfake has any benefits, and what
challenges the technology poses. We have also examined several creation and
detection techniques. Our study revealed that although Deepfake is a threat to
our societies, proper measures and strict regulations could prevent this.
Evaluating the Performance of Vision Transformer Architecture for Deepfake Image Classification
Deepfake classification has seen impressive results lately: by experimenting with various deep learning methodologies, researchers have designed several state-of-the-art techniques. This study applies Transformers, an existing technology that has become the de facto standard for text processing in Natural Language Processing (NLP), to Computer Vision. Transformers use a mechanism called “self-attention”, which differs from the mechanisms used in CNNs and LSTMs. This study uses a novel technique that treats images as 16x16 words (Dosovitskiy et al., 2021) to train a deep neural network with “self-attention” blocks to detect deepfakes. It creates position embeddings of the image patches, which are passed to the Transformer block to classify the modified images from the CELEB-DF-v2 dataset. Furthermore, the difference between the mean accuracy of this model and that of an existing state-of-the-art detection technique based on a residual CNN is tested for statistical significance. The two models are compared mainly on accuracy and loss. This study shows the state-of-the-art results obtained using this novel technique.
The Vision Transformer based model achieved state-of-the-art performance with 97.07% accuracy, compared with 91.78% accuracy for the ResNet-18 model.
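As a rough illustration of the "images as 16x16 words" idea described in this abstract, the sketch below splits an image into non-overlapping 16x16 patches, flattens each patch into a token, and adds a position embedding to every token. It is a minimal sketch, not the study's implementation: the 224x224 RGB input size, the function names, and the placeholder position embeddings are all assumptions made for illustration, and the learned linear projection and the self-attention blocks of a real Vision Transformer are omitted.

```python
PATCH_SIZE = 16  # from the abstract: images are treated as 16x16 "words"


def split_into_patches(image, patch_size=PATCH_SIZE):
    """Split an H x W x C image (nested lists) into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be multiples of patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = []
            for row in image[top:top + patch_size]:
                for pixel in row[left:left + patch_size]:
                    patch.extend(pixel)  # append this pixel's channel values
            patches.append(patch)
    return patches


def add_position_embeddings(tokens, position_embeddings):
    """Add one position vector to each token, element by element."""
    return [
        [value + offset for value, offset in zip(token, position)]
        for token, position in zip(tokens, position_embeddings)
    ]


# Assumed input: an all-zero 224x224 RGB image (size not stated in the abstract).
image = [[[0.0, 0.0, 0.0] for _ in range(224)] for _ in range(224)]
patches = split_into_patches(image)

# Placeholder embeddings: in a trained model these are learned parameters.
position_embeddings = [
    [float(index)] * len(patches[0]) for index in range(len(patches))
]
tokens = add_position_embeddings(patches, position_embeddings)

print(len(tokens), len(tokens[0]))  # 196 768
```

With these assumed dimensions, the image yields 14 x 14 = 196 tokens of length 16 x 16 x 3 = 768, which is the sequence a Transformer block would then process.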