
    Detecting Deepfake Videos in Data Scarcity Conditions by Means of Video Coding Features

    The most powerful deepfake detection methods developed so far are based on deep learning, requiring that large amounts of training data representative of the specific task are available to the trainer. In this paper, we propose a feature-based method for video deepfake detection that can work in data scarcity conditions, that is, when only very few examples are available to the forensic analyst. The proposed method is based on video coding analysis and relies on a simple footprint obtained from the motion prediction modes in the video sequence. The footprint is extracted from video sequences and used to train a simple linear Support Vector Machine classifier. The effectiveness of the proposed method is validated experimentally on three different datasets, namely, a synthetic street video dataset and two datasets of deepfake face videos.
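The pipeline this abstract describes (a per-sequence footprint fed to a linear SVM) can be sketched as follows. The footprint here is a normalized histogram of per-block motion prediction modes, an illustrative stand-in for the paper's actual codec feature, and the SVM is a bare hinge-loss subgradient trainer rather than a library implementation; all data is synthetic:

```python
import numpy as np

def mode_histogram(modes, n_modes=4):
    """Normalized histogram of per-block motion prediction modes for one
    sequence (hypothetical representation; the paper's exact footprint differs)."""
    h = np.bincount(np.asarray(modes), minlength=n_modes).astype(float)
    return h / h.sum()

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Plain hinge-loss subgradient descent; stands in for the linear SVM
    the paper trains on the footprint features."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) < 1:          # margin violated: pull toward xi
                w = (1 - lr * lam) * w + lr * yi * xi
                b += lr * yi
            else:                               # only shrink (regularize)
                w = (1 - lr * lam) * w
    return w, b

rng = np.random.default_rng(0)
# Synthetic stand-in: pristine videos favour one mode mix, deepfakes another.
real = [mode_histogram(rng.choice(4, 300, p=[.5, .3, .1, .1])) for _ in range(20)]
fake = [mode_histogram(rng.choice(4, 300, p=[.1, .2, .3, .4])) for _ in range(20)]
X = np.vstack(real + fake)
y = np.array([1] * 20 + [-1] * 20)
w, b = train_linear_svm(X, y)
acc = np.mean(np.sign(X @ w + b) == y)
```

With such low-dimensional features, even a handful of training sequences per class suffices, which is the data-scarcity point of the paper.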

    Lend Me a Hand: Auxiliary Image Data Helps Interaction Detection

    In social settings, people interact in close proximity. When analyzing such encounters from video, we are typically interested in distinguishing between a large number of different interactions. Here, we address training deformable part models (DPMs) for the detection of such interactions from video, in both space and time. When we consider a large number of interaction classes, we face two challenges. First, we need to distinguish between interactions that are visually more similar. Second, it becomes more difficult to obtain sufficient specific training examples for each interaction class. In this paper, we address both challenges and focus on the latter. Specifically, we introduce a method to train body part detectors from nonspecific images with pose information. Such resources are widely available. We introduce a training scheme and an adapted DPM formulation to allow for the inclusion of this auxiliary data. We perform cross-dataset experiments to evaluate the generalization performance of our method. We demonstrate that our method can still achieve decent performance from as few as five training examples.
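One way to picture the inclusion of auxiliary data is as weighted training: scarce class-specific examples count fully, plentiful auxiliary examples count less. The sketch below uses a per-example-weighted logistic regression as a hypothetical stand-in for the adapted DPM formulation; all data, weights, and distributions are synthetic:

```python
import numpy as np

def fit_weighted_logreg(X, y, w, epochs=500, lr=0.5):
    """Logistic regression with per-example weights: a hypothetical stand-in
    for the adapted DPM formulation, where a few specific examples are
    combined with many down-weighted auxiliary ones."""
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ theta)))          # predicted probabilities
        theta -= lr * (X.T @ (w * (p - y))) / w.sum()   # weighted gradient step
    return theta

rng = np.random.default_rng(1)
# Five specific examples per class (the data-scarce setting) ...
X_spec = np.vstack([rng.normal(-2, 1, (5, 2)), rng.normal(2, 1, (5, 2))])
y_spec = np.array([0] * 5 + [1] * 5)
# ... plus many auxiliary examples from a related but shifted distribution.
X_aux = np.vstack([rng.normal(-2.2, 1, (100, 2)), rng.normal(2.2, 1, (100, 2))])
y_aux = np.array([0] * 100 + [1] * 100)
X = np.vstack([X_spec, X_aux])
y = np.concatenate([y_spec, y_aux])
w = np.concatenate([np.ones(10), np.full(200, 0.3)])   # auxiliary data counts less
theta = fit_weighted_logreg(X, y, w)
acc = np.mean((X @ theta > 0).astype(int) == y)
```

The down-weighting lets the auxiliary pool stabilize the decision boundary without letting its slightly different distribution dominate the five specific examples per class.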

    Deep Poselets for Human Detection

    We address the problem of detecting people in natural scenes using a part approach based on poselets. We propose a bootstrapping method that allows us to collect millions of weakly labeled examples for each poselet type. We use these examples to train a Convolutional Neural Net to discriminate different poselet types and separate them from the background class. We then use the trained CNN as a way to represent poselet patches with a Pose Discriminative Feature (PDF) vector -- a compact 256-dimensional feature vector that is effective at discriminating pose from appearance. We train the poselet model on top of PDF features and combine them with object-level CNNs for detection and bounding box prediction. The resulting model leads to state-of-the-art performance for human detection on the PASCAL datasets.

    One-shot learning of object categories

    Learning visual models of object categories notoriously requires hundreds or thousands of training examples. We show that it is possible to learn much information about a category from just one, or a handful, of images. The key insight is that, rather than learning from scratch, one can take advantage of knowledge coming from previously learned categories, no matter how different these categories might be. We explore a Bayesian implementation of this idea. Object categories are represented by probabilistic models. Prior knowledge is represented as a probability density function on the parameters of these models. The posterior model for an object category is obtained by updating the prior in the light of one or more observations. We test a simple implementation of our algorithm on a database of 101 diverse object categories. We compare category models learned by an implementation of our Bayesian approach to models learned by maximum likelihood (ML) and maximum a posteriori (MAP) methods. We find that on a database of more than 100 categories, the Bayesian approach produces informative models when the number of training examples is too small for other methods to operate successfully.
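The normal-normal conjugate case gives a minimal picture of the Bayesian update the abstract describes: a prior (standing in for knowledge from previously learned categories) is updated with a single observation, and the resulting estimate is shrunk toward the prior instead of trusting the lone example the way ML must. All numbers are toy values, not from the paper:

```python
import numpy as np

def posterior_mean(prior_mu, prior_var, obs, obs_var):
    """Conjugate normal-normal update: posterior over a category parameter
    given an array of one or more observations.  Toy stand-in for the
    paper's density over model parameters."""
    n = len(obs)
    post_var = 1.0 / (1.0 / prior_var + n / obs_var)
    post_mu = post_var * (prior_mu / prior_var + obs.sum() / obs_var)
    return post_mu, post_var

prior_mu, prior_var = 0.0, 1.0     # prior built from previously learned categories
obs = np.array([2.0])              # a single training example of the new category
ml_estimate = obs.mean()           # what ML does with one example: trust it fully
post_mu, post_var = posterior_mean(prior_mu, prior_var, obs, obs_var=1.0)
# The posterior mean lands halfway between prior and observation here,
# and the posterior variance is smaller than either source alone.
```

As more observations arrive, the likelihood term dominates and the posterior mean converges to the ML estimate, which is exactly the regime where the abstract says other methods start to work.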

    Incremental Training of a Detector Using Online Sparse Eigen-decomposition

    The ability to efficiently and accurately detect objects plays a very crucial role in many computer vision tasks. Recently, offline object detectors have shown tremendous success. However, one major drawback of offline techniques is that a complete set of training data has to be collected beforehand. In addition, once learned, an offline detector cannot make use of newly arriving data. To alleviate these drawbacks, online learning has been adopted with the following objectives: (1) the technique should be computationally and storage efficient; (2) the updated classifier must maintain its high classification accuracy. In this paper, we propose an effective and efficient framework for learning an adaptive online greedy sparse linear discriminant analysis (GSLDA) model. Unlike many existing online boosting detectors, which usually apply exponential or logistic loss, our online algorithm makes use of LDA's learning criterion, which not only aims to maximize the class-separation criterion but also incorporates the asymmetrical property of training data distributions. We provide a better alternative to online boosting algorithms in the context of training a visual object detector. We demonstrate the robustness and efficiency of our methods on handwritten digit and face datasets. Our results confirm that object detection tasks benefit significantly when trained in an online manner.
    Comment: 14 pages
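A heavily simplified sketch of the online idea: keep running per-class means and a shared within-class scatter (Welford-style updates), so each new sample is folded in without revisiting stored data. This is ordinary two-class Fisher LDA on synthetic data, not the paper's greedy sparse variant:

```python
import numpy as np

class OnlineLDA:
    """Streaming two-class Fisher discriminant: running class means plus a
    shared within-class scatter, updated sample by sample.  A much-simplified
    stand-in for the paper's greedy sparse LDA (GSLDA)."""
    def __init__(self, dim):
        self.n = np.zeros(2)
        self.mean = np.zeros((2, dim))
        self.scatter = np.eye(dim) * 1e-3  # small ridge keeps the solve stable

    def update(self, x, label):
        c = int(label)
        self.n[c] += 1
        delta = x - self.mean[c]            # deviation from the old class mean
        self.mean[c] += delta / self.n[c]
        self.scatter += np.outer(delta, x - self.mean[c])  # Welford-style SSQ

    def predict(self, x):
        w = np.linalg.solve(self.scatter, self.mean[1] - self.mean[0])
        thresh = w @ (self.mean[0] + self.mean[1]) / 2.0
        return int(w @ x > thresh)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
model = OnlineLDA(dim=2)
for xi, yi in zip(X, y):                    # one pass, one sample at a time
    model.update(xi, yi)
acc = np.mean([model.predict(xi) == yi for xi, yi in zip(X, y)])
```

Storage is O(d^2) regardless of how many samples stream past, which is the efficiency objective the abstract lists; the asymmetry handling and sparsity of the actual GSLDA are omitted here.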

    Deep Feature-based Face Detection on Mobile Devices

    We propose a deep feature-based face detector for mobile devices to detect a user's face acquired by the front-facing camera. The proposed method is able to detect faces in images containing extreme pose and illumination variations as well as partial faces. The main challenge in developing deep feature-based algorithms for mobile devices is the constrained nature of the mobile platform and the non-availability of CUDA-enabled GPUs on such devices. Our implementation takes into account the special nature of the images captured by the front-facing camera of mobile devices and exploits the GPUs present in mobile devices without CUDA-based frameworks, to meet these challenges.
    Comment: ISBA 201