An end-to-end convolutional selective autoencoder approach to Soybean Cyst Nematode eggs detection
This paper proposes a novel selective autoencoder approach within the
framework of deep convolutional networks. The crux of the idea is to train a
deep convolutional autoencoder to suppress undesired parts of an image frame
while preserving the desired parts, resulting in efficient object detection. The
efficacy of the framework is demonstrated on a critical plant science problem.
In the United States, approximately $1 billion is lost per annum due to a
nematode infection on soybean plants. Currently, plant pathologists rely on
labor-intensive and time-consuming identification of Soybean Cyst Nematode
(SCN) eggs in soil samples via manual microscopy. The proposed framework
attempts to significantly expedite the process by using a series of manually
labeled microscopic images for training followed by automated high-throughput
egg detection. The problem is particularly difficult due to the presence of a
large population of non-egg particles (disturbances) in the image frames that
are very similar to SCN eggs in shape, pose and illumination. Therefore, the
selective autoencoder is trained to learn unique features related to the
invariant shapes and sizes of the SCN eggs without handcrafting. After that, a
composite non-maximum suppression and differencing is applied at the
post-processing stage. Comment: 10 pages, 8 figures. International Conference on Machine Learning (ICML) submission.
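The selective training target described above (keep egg regions, suppress everything else) can be sketched as a simple masking operation. This is a minimal illustration; the mask construction, network architecture, and training loop are assumptions, as the abstract does not specify them:

```python
import numpy as np

def selective_target(frame, keep_mask):
    """Build the training target for a selective autoencoder: pixels inside
    the desired (egg) mask are kept, everything else is suppressed to zero."""
    return frame * keep_mask

# toy 4x4 frame with a hypothetical "egg" occupying the top-left 2x2 block
frame = np.arange(1, 17, dtype=float).reshape(4, 4)
mask = np.zeros((4, 4))
mask[:2, :2] = 1.0
target = selective_target(frame, mask)
```

An autoencoder trained on (frame, target) pairs learns to reproduce only the desired object regions, which is what makes the subsequent detection step efficient.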
Sharp Multiple Instance Learning for DeepFake Video Detection
With the rapid development of facial manipulation techniques, face forgery
has received considerable attention in multimedia and computer vision community
due to security concerns. Existing methods are mostly designed for single-frame
detection trained with precise image-level labels or for video-level prediction
by only modeling the inter-frame inconsistency, leaving potential high risks
for DeepFake attackers. In this paper, we introduce a new problem of partial
face attack in DeepFake video, where only video-level labels are provided but
not all the faces in the fake videos are manipulated. We address this problem
by multiple instance learning framework, treating faces and input video as
instances and bag respectively. A sharp MIL (S-MIL) is proposed which builds
direct mapping from instance embeddings to bag prediction, rather than from
instance embeddings to instance prediction and then to bag prediction in
traditional MIL. Theoretical analysis proves that the gradient vanishing in
traditional MIL is relieved in S-MIL. To generate instances that can accurately
incorporate the partially manipulated faces, spatial-temporal encoded instance
is designed to fully model the intra-frame and inter-frame inconsistency, which
further helps to promote the detection performance. We also construct a new
dataset FFPMS for partially attacked DeepFake video detection, which can
benefit the evaluation of different methods at both frame and video levels.
Experiments on FFPMS and the widely used DFDC dataset verify that S-MIL is
superior to other counterparts for partially attacked DeepFake video detection.
In addition, S-MIL can also be adapted to traditional DeepFake image detection
tasks and achieve state-of-the-art performance on single-frame datasets. Comment: Accepted at ACM MM 2020. 11 pages, 8 figures, with appendix.
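The "sharp" direct mapping from instance scores to a bag prediction can be illustrated with generalized-mean pooling; this is a hedged sketch of the idea (a smooth approximation of max pooling in which every instance receives gradient), not the paper's exact S-MIL formulation:

```python
def sharp_bag_score(instance_scores, r=5.0):
    """Generalized-mean ("sharp") pooling: maps instance scores directly to a
    bag score. As r grows it approaches max pooling, but unlike a hard max,
    every instance contributes to the gradient."""
    n = len(instance_scores)
    return (sum(s ** r for s in instance_scores) / n) ** (1.0 / r)

bag = sharp_bag_score([0.9, 0.1, 0.1])  # dominated by the strongest instance
```

With a hard max only one instance per bag gets a gradient signal, which is the vanishing-gradient issue the paper attributes to traditional MIL; a sharp pooling function spreads that signal across instances.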
Is object localization for free? – Weakly-supervised learning with convolutional neural networks
Successful methods for visual object recognition typically rely on training datasets containing lots of richly annotated images. Detailed image annotation, e.g. by object bounding boxes, however, is both expensive and often subjective. We describe a weakly supervised convolutional neural network (CNN) for object classification that relies only on image-level labels, yet can learn from cluttered scenes containing multiple objects. We quantify its object classification and object location prediction performance on the Pascal VOC 2012 (20 object classes) and the much larger Microsoft COCO (80 object classes) datasets. We find that the network (i) outputs accurate image-level labels, (ii) predicts approximate locations (but not extents) of objects, and (iii) performs comparably to its fully-supervised counterparts using object bounding box annotation for training.
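The readout the abstract describes (accurate image-level labels plus approximate locations without extents) can be sketched as spatial max pooling over a per-class score map; the score-map shape and readout below are illustrative assumptions, not the paper's exact architecture:

```python
import numpy as np

def classify_and_locate(score_map):
    """Read out image-level labels and rough locations from a per-class
    spatial score map via max pooling (a position only, no extent)."""
    C, H, W = score_map.shape
    flat = score_map.reshape(C, H * W)
    img_scores = flat.max(axis=1)                            # image-level class scores
    locs = [divmod(int(i), W) for i in flat.argmax(axis=1)]  # (row, col) peak per class
    return img_scores, locs

# toy map: class 0 peaks at (1, 2), class 1 at (0, 0)
score_map = np.zeros((2, 3, 3))
score_map[0, 1, 2] = 5.0
score_map[1, 0, 0] = 3.0
img_scores, locs = classify_and_locate(score_map)
```

Because only the pooled image-level score is supervised, the spatial argmax localizes the object "for free", which is why positions but not extents are recovered.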
Master of Science thesis
Multiple Instance Learning (MIL) is a type of supervised learning with missing data. Here, each example (a.k.a. bag) has one or more instances. In the training set, we have labels only at the bag level. The task is to label both bags and instances from the test set. In most practical MIL problems, there is a relationship between the instances of a bag. Capturing this relationship may help learn the underlying concept better. We present an algorithm that uses the structure of bags along with the features of instances. The key idea is to allow a structured support vector machine (SVM) to "guess" at the true underlying structure, so long as it is consistent with the bag labels. This idea is formalized and a new cutting plane algorithm is proposed for optimization. To verify this idea, we implemented our algorithm for a particular kind of structure: hidden Markov models. We performed experiments on three datasets and found this algorithm to work better than existing MIL algorithms. We present the details of these experiments and the effects of varying different hyperparameters. The key contribution of our work is a very simple loss function with only one hyperparameter, which is tuned using a small portion of the training set. The thesis of this work is that it is possible and desirable to exploit the structural relationship between instances in a bag, even though that structure is not observed at training time (i.e., correct labels for all the instances are unknown). Our work opens a new direction for solving the MIL problem. We suggest a few ideas to further our work in this direction.
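The constraint the structured SVM's "guess" must satisfy is the standard MIL consistency rule, which can be written as a one-line check; this sketches only the generic rule, not the thesis's cutting-plane algorithm:

```python
def consistent_with_bag(instance_labels, bag_label):
    """MIL consistency check: a guessed instance labeling agrees with the
    bag label iff a positive bag contains at least one positive instance
    and a negative bag contains none."""
    return int(any(instance_labels)) == bag_label
```

Any latent instance labeling the learner proposes during training is admissible only if it passes this check against the observed bag label.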
Learning from Ambiguity
There are many learning problems for which the examples given by the teacher are ambiguously labeled. In this thesis, we will examine one framework of learning from ambiguous examples known as Multiple-Instance learning. Each example is a bag, consisting of any number of instances. A bag is labeled negative if all instances in it are negative. A bag is labeled positive if at least one instance in it is positive. Because the instances themselves are not labeled, each positive bag is an ambiguous example. We would like to learn a concept that will correctly classify unseen bags. We have developed a measure called Diverse Density and algorithms for learning from multiple-instance examples. We have applied these techniques to problems in drug design, stock prediction, and image database retrieval. These serve as examples of how to translate the ambiguity in the application domain into bags, as well as successful examples of applying Diverse Density techniques.
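The Diverse Density measure can be sketched with a noisy-or model over one-dimensional instances: it is high at a concept point that is close to some instance of every positive bag and far from all instances of negative bags. The exponential instance probability and scale parameter below are illustrative assumptions:

```python
import math

def diverse_density(t, pos_bags, neg_bags, scale=1.0):
    """Noisy-or Diverse Density of a candidate concept point t (1-D here for
    simplicity): the product over positive bags of the probability that at
    least one instance is positive, times the probability that every negative
    instance is negative."""
    def p_inst(x):  # Pr(instance x is positive given concept t)
        return math.exp(-scale * (x - t) ** 2)
    dd = 1.0
    for bag in pos_bags:
        dd *= 1.0 - math.prod(1.0 - p_inst(x) for x in bag)
    for bag in neg_bags:
        dd *= math.prod(1.0 - p_inst(x) for x in bag)
    return dd

pos_bags = [[0.0, 5.0], [0.1, -4.0]]  # each positive bag has an instance near 0
neg_bags = [[5.0]]                    # the negative instance rules out t = 5
```

Maximizing this quantity over t (by gradient ascent in the original work) recovers a concept point consistent with all bag labels.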