15 research outputs found

    An end-to-end convolutional selective autoencoder approach to Soybean Cyst Nematode eggs detection

    Get PDF
    This paper proposes a novel selective autoencoder approach within the framework of deep convolutional networks. The crux of the idea is to train a deep convolutional autoencoder to suppress undesired parts of an image frame while allowing the desired parts resulting in efficient object detection. The efficacy of the framework is demonstrated on a critical plant science problem. In the United States, approximately $1 billion is lost per annum due to a nematode infection on soybean plants. Currently, plant-pathologists rely on labor-intensive and time-consuming identification of Soybean Cyst Nematode (SCN) eggs in soil samples via manual microscopy. The proposed framework attempts to significantly expedite the process by using a series of manually labeled microscopic images for training followed by automated high-throughput egg detection. The problem is particularly difficult due to the presence of a large population of non-egg particles (disturbances) in the image frames that are very similar to SCN eggs in shape, pose and illumination. Therefore, the selective autoencoder is trained to learn unique features related to the invariant shapes and sizes of the SCN eggs without handcrafting. After that, a composite non-maximum suppression and differencing is applied at the post-processing stage.Comment: A 10 pages, 8 figures International Conference on Machine Leaning(ICML) Submissio

    Sharp Multiple Instance Learning for DeepFake Video Detection

    Full text link
    With the rapid development of facial manipulation techniques, face forgery has received considerable attention in multimedia and computer vision community due to security concerns. Existing methods are mostly designed for single-frame detection trained with precise image-level labels or for video-level prediction by only modeling the inter-frame inconsistency, leaving potential high risks for DeepFake attackers. In this paper, we introduce a new problem of partial face attack in DeepFake video, where only video-level labels are provided but not all the faces in the fake videos are manipulated. We address this problem by multiple instance learning framework, treating faces and input video as instances and bag respectively. A sharp MIL (S-MIL) is proposed which builds direct mapping from instance embeddings to bag prediction, rather than from instance embeddings to instance prediction and then to bag prediction in traditional MIL. Theoretical analysis proves that the gradient vanishing in traditional MIL is relieved in S-MIL. To generate instances that can accurately incorporate the partially manipulated faces, spatial-temporal encoded instance is designed to fully model the intra-frame and inter-frame inconsistency, which further helps to promote the detection performance. We also construct a new dataset FFPMS for partially attacked DeepFake video detection, which can benefit the evaluation of different methods at both frame and video levels. Experiments on FFPMS and the widely used DFDC dataset verify that S-MIL is superior to other counterparts for partially attacked DeepFake video detection. In addition, S-MIL can also be adapted to traditional DeepFake image detection tasks and achieve state-of-the-art performance on single-frame datasets.Comment: Accepted at ACM MM 2020. 11 pages, 8 figures, with appendi

    Is object localization for free? – Weakly-supervised learning with convolutional neural networks

    Get PDF
    International audienceSuccessful methods for visual object recognition typically rely on training datasets containing lots of richly annotatedimages. Detailed image annotation, e.g. by object bounding boxes, however, is both expensive and often subjective.We describe a weakly supervised convolutional neural network (CNN) for object classification that relies onlyon image-level labels, yet can learn from cluttered scenes containing multiple objects. We quantify its object classification and object location prediction performance on the Pascal VOC 2012 (20 object classes) and the much larger Microsoft COCO (80 object classes) datasets. We find that the network (i) outputs accurate image-level labels, (ii) predicts approximate locations (but not extents) of objects, and (iii) performs comparably to its fully-supervised counterparts using object bounding box annotation for training

    Master of Science

    Get PDF
    thesisMultiple Instance Learning (MIL) is a type of supervised learning with missing data. Here, each example (a.k.a. bag) has one or more instances. In the training set, we have only labels at bag level. The task is to label both bags and instances from the test set. In most practical MIL problems, there is a relationship between the instances of a bag. Capturing this relationship may help learn the underlying concept better. We present an algorithm that uses the structure of bags along with the features of instances. The key idea is to allow a structured support vector machine (SVM) to "guess" at the true underlying structure, so long as it is consistent with the bag labels. This idea is formalized and a new cutting plane algorithm is proposed for optimization. To verify this idea, we implemented our algorithm for a particular kind of structure - hidden markov models. We performed experiments on three datasets and found this algorithm to work better than the existing algorithms in MIL. We present the details of these experiments and the effects of varying different hyperparameters in detail. The key contribution from our work is a very simple loss function with only one hyperparameter that needs to be tuned using a small portion of the training set. The thesis of this work is that it is possible and desirable to exploit the structural relationship between instances in a bag, even though that structure is not observed at training time (i.e., correct labels for all the instances are unknown). Our work opens a new direction to solving the MIL problem. We suggest a few ideas to further our work in this direction

    Learning from Ambiguity

    Get PDF
    There are many learning problems for which the examples given by the teacher are ambiguously labeled. In this thesis, we will examine one framework of learning from ambiguous examples known as Multiple-Instance learning. Each example is a bag, consisting of any number of instances. A bag is labeled negative if all instances in it are negative. A bag is labeled positive if at least one instance in it is positive. Because the instances themselves are not labeled, each positive bag is an ambiguous example. We would like to learn a concept which will correctly classify unseen bags. We have developed a measure called Diverse Density and algorithms for learning from multiple-instance examples. We have applied these techniques to problems in drug design, stock prediction, and image database retrieval. These serve as examples of how to translate the ambiguity in the application domain into bags, as well as successful examples of applying Diverse Density techniques
    corecore