
    Improving Consistency and Correctness of Sequence Inpainting using Semantically Guided Generative Adversarial Network

    Contemporary benchmark methods for image inpainting are based on deep generative models and specifically leverage adversarial loss to yield realistic reconstructions. However, these models cannot be directly applied to image/video sequences because of an intrinsic drawback: the reconstructions might be independently realistic, but, when visualized as a sequence, they often lack fidelity to the original uncorrupted sequence. The fundamental reason is that these methods try to find the best-matching latent representation near the natural image manifold without any explicit distance-based loss. In this paper, we present a semantically conditioned Generative Adversarial Network (GAN) for sequence inpainting. The conditional information constrains the GAN to map a latent representation to a point on the image manifold respecting the underlying pose and semantics of the scene. To the best of our knowledge, this is the first work that simultaneously addresses consistency and correctness of generative-model-based inpainting. We show that our generative model learns to disentangle pose and appearance information; this independence is exploited by our model to generate highly consistent reconstructions. The conditional information also helps the generator network produce sharper images than the original GAN formulation, which leads to more appealing inpainting performance. Though generic, our algorithm is targeted at inpainting on faces. When applied to the CelebA and YouTube Faces datasets, the proposed method yields a significant improvement over the current benchmark, both in terms of quantitative evaluation (Peak Signal-to-Noise Ratio) and human visual scoring, over diversified combinations of resolutions and deformations.
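
    As a rough illustration of the kind of objective described above, the sketch below combines an adversarial term with an explicit distance-based reconstruction term, with both the generator and the discriminator conditioned on a pose/semantic vector c. The module names, the L1 distance choice, and the weight lam are assumptions for illustration, not the paper's implementation.

    # Minimal sketch of one training step of a semantically conditioned GAN for
    # inpainting. G(corrupted, c) and D(image, c) are assumed interfaces.
    import torch
    import torch.nn.functional as F

    def conditional_gan_step(G, D, corrupted, original, c, opt_g, opt_d, lam=10.0):
        # --- Discriminator update: real vs. generated, both conditioned on c ---
        fake = G(corrupted, c).detach()
        d_real, d_fake = D(original, c), D(fake, c)
        loss_d = F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) \
               + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake))
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()

        # --- Generator update: adversarial term plus an explicit distance loss ---
        fake = G(corrupted, c)
        d_fake = D(fake, c)
        loss_adv = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
        loss_rec = F.l1_loss(fake, original)      # pulls reconstructions toward the sequence
        loss_g = loss_adv + lam * loss_rec
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()
        return loss_d.item(), loss_g.item()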

    CNN-Based Joint Clustering and Representation Learning with Feature Drift Compensation for Large-Scale Image Data

    Given a large unlabeled set of images, how to efficiently and effectively group them into clusters based on extracted visual representations remains a challenging problem. To address this problem, we propose a convolutional neural network (CNN) that jointly solves clustering and representation learning in an iterative manner. In the proposed method, given an input image set, we first randomly pick k samples and extract their features as initial cluster centroids using the proposed CNN with an initial model pre-trained on the ImageNet dataset. Mini-batch k-means is then performed to assign cluster labels to individual input samples, one randomly sampled mini-batch at a time, until all images are processed. Subsequently, the proposed CNN iteratively updates its parameters and the centroids of the image clusters based on stochastic gradient descent. We also propose a feature drift compensation scheme to mitigate the drift error caused by feature mismatch in representation learning. Experimental results demonstrate that the proposed method outperforms state-of-the-art clustering schemes in terms of accuracy and storage complexity on large-scale image sets containing millions of images. Comment: 9 pages; to appear in IEEE Transactions on Multimedia (Special Issue on Large-Scale Multimedia Data Retrieval, Classification, and Understanding).
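
    A minimal sketch of the mini-batch k-means step described above, operating on CNN features; the feature extractor, the learning-rate schedule, and the omitted drift-compensation step are assumptions rather than the authors' exact procedure.

    # Standard mini-batch k-means update applied to a batch of CNN features.
    # The paper's feature drift compensation (re-aligning old centroids with the
    # updated feature space) is not shown here.
    import numpy as np

    def minibatch_kmeans_step(features, centroids, counts):
        """features: (B, d) CNN features; centroids: (K, d); counts: (K,) ints."""
        # assign each feature to its nearest centroid
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(axis=1)
        # per-sample centroid update with a decaying per-cluster learning rate
        for x, j in zip(features, labels):
            counts[j] += 1
            eta = 1.0 / counts[j]
            centroids[j] = (1.0 - eta) * centroids[j] + eta * x
        return labels, centroids, counts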

    Cobalt: BFT Governance in Open Networks

    We present Cobalt, a novel atomic broadcast algorithm that works in networks with non-uniform trust and no global agreement on participants, and that is probabilistically guaranteed to make forward progress even in the presence of maximal faults and arbitrary asynchrony. The exact properties that Cobalt satisfies make it particularly applicable to designing an efficient decentralized "voting network" that allows a public, open-entry group of nodes to agree on changes to some shared set of rules in a fair and consistent manner, while tolerating some trusted nodes and arbitrarily many untrusted nodes behaving maliciously. We also define a new set of properties that must be satisfied by any safe decentralized governance algorithm, all of which Cobalt satisfies. Comment: 49 pages, 0 figures.

    Composite Shape Modeling via Latent Space Factorization

    We present a novel neural network architecture, termed Decomposer-Composer, for semantic structure-aware 3D shape modeling. Our method utilizes an auto-encoder-based pipeline and produces a novel factorized shape embedding space, where the semantic structure of the shape collection translates into a data-dependent sub-space factorization, and where shape composition and decomposition become simple linear operations on the embedding coordinates. We further propose to model shape assembly using an explicit learned part deformation module, which utilizes a 3D spatial transformer network to perform an in-network volumetric grid deformation and allows us to train the whole system end-to-end. The resulting network enables part-level shape manipulation unattainable by existing approaches. Our extensive ablation study, comparison to baseline methods, and qualitative analysis demonstrate the improved performance of the proposed method.
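
    A toy sketch of what "composition and decomposition as simple linear operations on the embedding coordinates" can look like, assuming learned per-part projection matrices P_k that sum to the identity; the projectors and the mixing example are illustrative assumptions, not the Decomposer-Composer implementation.

    # Decomposition splits an embedding into per-part components; composition
    # sums part components, possibly taken from different shapes.
    import numpy as np

    def decompose(e, projectors):
        """Split a shape embedding e (d,) into per-part embeddings P_k @ e."""
        return [P @ e for P in projectors]

    def compose(part_embeddings):
        """Recombine part embeddings into a full shape embedding."""
        return np.sum(part_embeddings, axis=0)

    # e.g. mix part 0 of shape A with part 1 of shape B:
    #   parts_a = decompose(e_a, projectors); parts_b = decompose(e_b, projectors)
    #   e_mixed = compose([parts_a[0], parts_b[1]])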

    Shape of the Cloak: Formal Analysis of Clock Skew-Based Intrusion Detection System in Controller Area Networks

    This paper presents a new masquerade attack called the cloaking attack and provides formal analyses of clock skew-based Intrusion Detection Systems (IDSs) that detect masquerade attacks on the Controller Area Network (CAN) in automobiles. In the cloaking attack, the adversary manipulates the inter-transmission times of spoofed messages by adding delays so as to emulate a desired clock skew and avoid detection. In order to predict and characterize the impact of the cloaking attack in terms of the attack success probability on a given CAN bus and IDS, we develop formal models for two clock skew-based IDSs, namely the state-of-the-art (SOTA) IDS and its adaptation to the widely used Network Time Protocol (NTP), using parameters of the attacker, the detector, and the hardware platform. To the best of our knowledge, this is the first paper to provide formal analyses of clock skew-based IDSs in automotive CAN. We implement the cloaking attack on two hardware testbeds, a prototype and a real vehicle (the University of Washington (UW) EcoCAR), and demonstrate its effectiveness against both the SOTA and NTP-based IDSs. We validate our formal analyses through extensive experiments for different messages, IDS settings, and vehicles. By comparing each predicted attack success probability curve against its experimental counterpart, we find that the average prediction error is within 3.0% for the SOTA IDS and 5.7% for the NTP-based IDS. Comment: Part of this work was presented at ACM/IEEE ICCPS 2018; to be published in IEEE Transactions on Information Forensics and Security.
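
    A stripped-down sketch of the clock-skew fingerprint and of the cloaking idea, assuming a fixed nominal message period and a simple least-squares skew estimate; the actual SOTA and NTP-based IDS models in the paper use batched offsets and recursive updates, so this is only an illustration of the principle.

    import numpy as np

    def estimate_skew(arrival_times, period):
        """Estimate clock skew (dimensionless) from message arrival timestamps.
        The accumulated offset t_i - t_0 - i*period grows roughly linearly in
        elapsed time when the sender's clock runs fast or slow; the slope of
        that line is (approximately) the skew."""
        t = np.asarray(arrival_times, dtype=float)
        n = np.arange(len(t))
        offsets = t - t[0] - n * period
        elapsed = t - t[0]
        return float((elapsed @ offsets) / (elapsed @ elapsed))   # LS slope through origin

    def cloaked_periods(period, target_skew, n):
        """Cloaking idea: the masquerading node stretches (or shrinks) its
        inter-transmission time by about period*target_skew so the receiver-side
        skew estimate matches the impersonated ECU."""
        return np.full(n, period * (1.0 + target_skew))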

    Universal consistency and minimax rates for online Mondrian Forests

    We establish the consistency of Mondrian Forests, a randomized classification algorithm that can be implemented online. First, we amend the original Mondrian Forest algorithm, which considers a fixed lifetime parameter; the fact that this parameter is fixed hinders the statistical consistency of the original procedure. Our modified Mondrian Forest algorithm grows trees with increasing lifetime parameters λ_n and uses an alternative updating rule, allowing it to work in an online fashion as well. Second, we provide a theoretical analysis establishing simple conditions for consistency. Our analysis also exhibits a surprising fact: the algorithm achieves the minimax rate (optimal rate) for the estimation of a Lipschitz regression function, which is a strong extension of previous results to arbitrary dimension. Comment: NIPS 2017.
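
    For context, the minimax rate referred to above is the classical one for estimating an L-Lipschitz regression function in dimension d under squared loss; this is a standard fact restated here as a reminder rather than taken from the abstract:

    % Minimax rate for Lipschitz regression in dimension d (squared risk)
    \inf_{\hat f_n}\; \sup_{f \in \mathrm{Lip}(L)}
    \mathbb{E}\big[(\hat f_n(X) - f(X))^2\big] \;\asymp\; n^{-2/(d+2)}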

    MINE: Mutual Information Neural Estimation

    We argue that the estimation of mutual information between high-dimensional continuous random variables can be achieved by gradient descent over neural networks. We present a Mutual Information Neural Estimator (MINE) that is linearly scalable in dimensionality as well as in sample size, trainable through back-propagation, and strongly consistent. We present a handful of applications in which MINE can be used to minimize or maximize mutual information. We apply MINE to improve adversarially trained generative models. We also use MINE to implement the Information Bottleneck, applying it to supervised classification; our results demonstrate substantial improvement in flexibility and performance in these settings. Comment: 19 pages, 6 figures.
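
    A minimal sketch of the Donsker-Varadhan objective behind MINE, with a generic MLP as the statistics network; the architecture, the optimizer, and the omission of the paper's bias-corrected gradient are assumptions, not the authors' exact setup.

    # Donsker-Varadhan bound: I(X;Z) >= E_P[T(x,z)] - log E_{P_X x P_Z}[exp(T(x,z))].
    # Maximizing the bound over T's parameters by gradient ascent estimates I(X;Z).
    import math
    import torch
    import torch.nn as nn

    class StatisticsNetwork(nn.Module):
        """Generic MLP T_theta(x, z); the architecture is an assumption."""
        def __init__(self, dim_x, dim_z, hidden=128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(dim_x + dim_z, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),
            )
        def forward(self, x, z):
            return self.net(torch.cat([x, z], dim=1))

    def mine_lower_bound(T, x, z):
        """DV estimate from a joint mini-batch (x, z); the bias-corrected
        gradient (moving average of the partition term) is omitted here."""
        t_joint = T(x, z).mean()
        z_marg = z[torch.randperm(z.size(0))]          # break pairing -> product of marginals
        t_marg = T(x, z_marg).squeeze(1)
        log_mean_exp = torch.logsumexp(t_marg, dim=0) - math.log(t_marg.size(0))
        return t_joint - log_mean_exp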

    Learning Condensed and Aligned Features for Unsupervised Domain Adaptation Using Label Propagation

    Unsupervised domain adaptation, which aims to learn a task in one domain using labeled data from another domain, has emerged to address the labeling issue in supervised learning, especially because it is difficult to obtain massive amounts of labeled data in practice. Existing methods have succeeded by reducing the difference between the embedded features of the two domains, but the performance is still unsatisfactory compared to the supervised learning scheme. This is attributable to embedded features that lie near each other yet neither align perfectly nor form clearly separable clusters. We propose a novel domain adaptation method based on label propagation and cycle consistency that lets the feature clusters from the two domains overlap exactly and become clearly separable, yielding high accuracy. Specifically, we introduce cycle consistency to enforce the relationship between corresponding clusters, and exploit label propagation to associate the data through the manifold structure instead of a one-to-one relation. Hence, we successfully form aligned and discriminative clusters. We present empirical results of our method for various domain adaptation scenarios and visualize the embedded features to show that our method is critical for better domain adaptation.
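
    A small sketch of the graph-based label propagation ingredient (in the closed-form style of Zhou et al.), which is the kind of manifold association the method exploits; the affinity graph construction and how this plugs into the adaptation loop are assumptions for illustration, not the authors' code.

    import numpy as np

    def label_propagation(W, Y, alpha=0.99):
        """W: (n, n) symmetric affinity matrix over source+target features.
        Y: (n, c) one-hot labels for labeled (source) rows, zeros for target rows.
        Returns soft label scores F = (I - alpha*S)^-1 Y, row-normalized."""
        d = W.sum(axis=1)
        D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
        S = D_inv_sqrt @ W @ D_inv_sqrt                 # symmetrically normalized affinity
        n = W.shape[0]
        F = np.linalg.solve(np.eye(n) - alpha * S, Y)   # closed-form propagation
        return F / np.maximum(F.sum(axis=1, keepdims=True), 1e-12)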

    Structure-Aware Shape Synthesis

    We propose a new procedure to guide the training of a data-driven shape generative model using a structure-aware loss function. Complex 3D shapes can often be summarized by a coarsely defined structure that is consistent and robust across a variety of observations. However, existing synthesis techniques do not account for structure during training and thus often generate implausible, structurally unrealistic shapes. During training, we impose structural constraints in order to enforce consistency and structure across the entire manifold. We propose a novel methodology for training 3D generative models that incorporates structural information into an end-to-end training pipeline. Comment: Accepted to 3DV 2018.

    Fighting Fake News: Image Splice Detection via Learned Self-Consistency

    Advances in photo editing and manipulation tools have made it significantly easier to create fake imagery. Learning to detect such manipulations, however, remains a challenging problem due to the lack of sufficient amounts of manipulated training data. In this paper, we propose a learning algorithm for detecting visual image manipulations that is trained using only a large dataset of real photographs. The algorithm uses the automatically recorded photo EXIF metadata as a supervisory signal for training a model to determine whether an image is self-consistent -- that is, whether its content could have been produced by a single imaging pipeline. We apply this self-consistency model to the task of detecting and localizing image splices. The proposed method obtains state-of-the-art performance on several image forensics benchmarks despite never seeing any manipulated images during training. That said, it is merely a step in the long quest for a truly general-purpose visual forensics tool.
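
    A hedged sketch of how pairwise patch-consistency predictions can be aggregated into a splice response map; the predictor interface and the simple mean aggregation below stand in for the paper's learned EXIF-consistency model and its clustering step, and are assumptions for illustration.

    import numpy as np

    def splice_response_map(patches, consistency):
        """patches: list of image patches; consistency: callable returning a
        score in [0, 1] for whether two patches share one imaging pipeline.
        Returns one score per patch; low scores mark likely spliced regions."""
        n = len(patches)
        C = np.ones((n, n))
        for i in range(n):
            for j in range(i + 1, n):
                C[i, j] = C[j, i] = consistency(patches[i], patches[j])
        return C.mean(axis=1)   # average agreement of each patch with the rest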