Deepfake videos have become a major concern in the modern digital media landscape, undermining the trustworthiness of information and communication channels. Because facial deepfakes are increasingly realistic, humans often struggle to distinguish a fake video from a genuine one. Detecting such misleading material is the first step in preventing deepfakes from spreading through social media. This work introduces the Spatio-temporal Intelligent Deepfake Detector (STIDD), a deep learning system that combines enhanced spatial and temporal modeling techniques. The proposed framework uses a pre-trained EfficientNetV2-B0 model to efficiently extract spatial features from each frame, and Bidirectional Long Short-Term Memory (BiLSTM) layers to capture temporal relationships across video sequences. We evaluate STIDD on the FaceForensics++ (FF++) dataset, covering all five manipulation techniques (DeepFakes, FaceSwap, Face2Face, FaceShifter, and NeuralTextures). Experimental results show that STIDD achieves precision, recall, and F1-scores above 0.99 and a final test accuracy of 99.51% on the combined FF++ test set. These results demonstrate that integrating sophisticated spatial feature extraction with strong temporal modeling allows STIDD to achieve high detection performance while remaining computationally efficient at just 0.39 Giga Floating-Point Operations (GFLOPs) per inference.
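The spatial-then-temporal pipeline described above can be sketched in Keras as follows. This is a minimal illustration, not the authors' implementation: the abstract confirms only the EfficientNetV2-B0 backbone and BiLSTM layers, so the frame count, input resolution, LSTM width, and sigmoid classification head here are all assumptions chosen for the sketch.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_stidd_sketch(num_frames=16, frame_size=224):
    """Hypothetical STIDD-style model: per-frame EfficientNetV2-B0
    spatial features followed by a BiLSTM over the frame sequence."""
    # Spatial feature extractor applied to each frame. weights=None here
    # avoids a download; the paper uses a pre-trained backbone.
    backbone = tf.keras.applications.EfficientNetV2B0(
        include_top=False, weights=None, pooling="avg",
        input_shape=(frame_size, frame_size, 3))

    # Input: a clip of num_frames RGB frames.
    inputs = layers.Input(shape=(num_frames, frame_size, frame_size, 3))
    # Run the CNN on every frame independently to get one vector per frame.
    x = layers.TimeDistributed(backbone)(inputs)
    # Bidirectional LSTM captures temporal relationships across frames.
    x = layers.Bidirectional(layers.LSTM(128))(x)
    # Binary output: real vs. fake (head size is an assumption).
    outputs = layers.Dense(1, activation="sigmoid")(x)
    return models.Model(inputs, outputs)
```

Wrapping the backbone in `TimeDistributed` keeps the spatial extractor shared across all frames, so the temporal layers see a sequence of per-frame embeddings rather than raw pixels.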