Improving Consistency and Correctness of Sequence Inpainting using Semantically Guided Generative Adversarial Network
Contemporary benchmark methods for image inpainting are based on deep
generative models and specifically leverage adversarial loss for yielding
realistic reconstructions. However, these models cannot be directly applied to
image/video sequences because of an intrinsic drawback: the reconstructions
may be independently realistic, but, when visualized as a sequence, they often
lack fidelity to the original uncorrupted sequence. The fundamental reason is
that these methods try to find the best-matching latent space representation
near the natural image manifold without any explicit distance-based loss. In
this paper, we present a semantically conditioned Generative Adversarial
Network (GAN) for sequence inpainting. The conditional information constrains
the GAN to map a latent representation to a point in image manifold respecting
the underlying pose and semantics of the scene. To the best of our knowledge,
this is the first work which simultaneously addresses consistency and
correctness of generative model based inpainting. We show that our generative
model learns to disentangle pose and appearance information; this independence
is exploited by our model to generate highly consistent reconstructions. The
conditional information also helps the generator network in the GAN produce
sharper images than the original GAN formulation does, which leads to more
appealing inpainting results. Though generic, our algorithm was targeted at
inpainting on faces. When applied to the CelebA and YouTube Faces datasets,
the proposed method yields a significant improvement over the current
benchmark, both in quantitative evaluation (Peak Signal-to-Noise Ratio) and
in human visual scoring, across diversified combinations of resolutions and
deformations.
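Since the quantitative comparison is in terms of Peak Signal-to-Noise Ratio, a minimal PSNR implementation may make the metric concrete. This is a generic sketch, not the authors' evaluation code; the flat-list image representation and example pixel values are invented:

```python
import math

def psnr(reference, reconstruction, max_value=255.0):
    """Peak Signal-to-Noise Ratio (dB) between two equally sized images,
    given here as flat lists of pixel intensities."""
    if len(reference) != len(reconstruction):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# An MSE of 4 against an 8-bit range gives 10*log10(255**2 / 4) ≈ 42.1 dB.
score = psnr([10, 20, 30, 40], [12, 18, 32, 38])
```

Higher PSNR means the inpainted sequence is numerically closer to the uncorrupted original.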
CNN-Based Joint Clustering and Representation Learning with Feature Drift Compensation for Large-Scale Image Data
Given a large unlabeled set of images, how to efficiently and effectively
group them into clusters based on extracted visual representations remains a
challenging problem. To address this problem, we propose a convolutional neural
network (CNN) to jointly solve clustering and representation learning in an
iterative manner. In the proposed method, given an input image set, we first
randomly pick k samples and extract their features as initial cluster centroids
using the proposed CNN with an initial model pre-trained on the ImageNet
dataset. Mini-batch k-means is then performed to assign cluster labels to
individual input samples for a mini-batch of images randomly sampled from the
input image set, until all images are processed. Subsequently, the proposed
method simultaneously and iteratively updates the CNN parameters and the
centroids of the image clusters via stochastic gradient descent. We also
propose a feature drift compensation scheme to mitigate the drift error
caused by feature mismatch in representation learning. Experimental results
demonstrate that the proposed method outperforms state-of-the-art clustering
schemes in terms of accuracy and storage complexity on large-scale image sets
containing millions of images.
Comment: 9 pages; to appear in IEEE Transactions on Multimedia (Special Issue
on Large-Scale Multimedia Data Retrieval, Classification, and Understanding).
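The clustering step described above (random initial centroids, mini-batch assignment, incremental centroid updates) can be sketched in the style of Sculley's mini-batch k-means. The data, seed, and batch size below are invented, and the real method operates on CNN features rather than raw tuples:

```python
import random

def minibatch_kmeans(points, k, batch_size=4, iters=100, seed=0):
    """Sculley-style mini-batch k-means: assign a small random batch,
    then nudge each winning centroid with a per-centroid learning rate.
    (A generic sketch; the paper clusters CNN features, not raw tuples.)"""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    counts = [0] * k
    for _ in range(iters):
        batch = rng.sample(points, min(batch_size, len(points)))
        for p in batch:
            # assign the sample to its nearest centroid
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(centroids[i], p)))
            counts[j] += 1
            lr = 1.0 / counts[j]  # per-centroid learning rate
            centroids[j] = [(1 - lr) * c + lr * x for c, x in zip(centroids[j], p)]
    return centroids

# Two well-separated 2-D blobs; each centroid should land near one blob.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
cents = sorted(minibatch_kmeans(data, k=2))
```

In the paper this assignment step alternates with SGD updates of the CNN itself, which is what makes drift compensation necessary.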
Cobalt: BFT Governance in Open Networks
We present Cobalt, a novel atomic broadcast algorithm that works in networks
with non-uniform trust and no global agreement on participants, and is
probabilistically guaranteed to make forward progress even in the presence of
maximal faults and arbitrary asynchrony. The exact properties that Cobalt
satisfies make it particularly applicable to designing an efficient
decentralized "voting network" that allows a public, open-entry group of nodes
to agree on changes to some shared set of rules in a fair and consistent manner
while tolerating some trusted nodes and arbitrarily many untrusted nodes
behaving maliciously. We also define a new set of properties which must be
satisfied by any safe decentralized governance algorithm, and all of which
Cobalt satisfies.
Comment: 49 pages, 0 figures.
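Cobalt targets open networks with non-uniform trust, but the classical uniform-trust intuition is still useful background: with n equally trusted nodes, Byzantine agreement requires n ≥ 3f + 1. A tiny illustrative helper (not part of Cobalt's actual, more general trust model):

```python
def max_tolerable_faults(n: int) -> int:
    """Largest f with n >= 3f + 1: the classical bound on how many
    Byzantine nodes an n-node atomic-broadcast protocol can tolerate.
    (Uniform-trust intuition only; Cobalt generalizes beyond it.)"""
    return (n - 1) // 3

# e.g. a 4-node network tolerates 1 Byzantine node, a 10-node network 3.
faults_4 = max_tolerable_faults(4)
faults_10 = max_tolerable_faults(10)
```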
Composite Shape Modeling via Latent Space Factorization
We present a novel neural network architecture, termed Decomposer-Composer,
for semantic structure-aware 3D shape modeling. Our method utilizes an
auto-encoder-based pipeline, and produces a novel factorized shape embedding
space, where the semantic structure of the shape collection translates into a
data-dependent sub-space factorization, and where shape composition and
decomposition become simple linear operations on the embedding coordinates. We
further propose to model shape assembly using an explicit learned part
deformation module, which utilizes a 3D spatial transformer network to perform
an in-network volumetric grid deformation, and which allows us to train the
whole system end-to-end. The resulting network allows us to perform part-level
shape manipulation, unattainable by existing approaches. Our extensive ablation
study, comparison to baseline methods and qualitative analysis demonstrate the
improved performance of the proposed method.
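The claim that composition and decomposition become simple linear operations on embedding coordinates can be illustrated with invented 4-D embeddings and hand-written sub-space masks; the actual method learns a data-dependent factorization of a neural embedding space:

```python
def project(vec, mask):
    """Decomposition: confine an embedding to one factor sub-space
    (here an axis-aligned 0/1 mask; the real sub-spaces are learned)."""
    return [v * m for v, m in zip(vec, mask)]

def compose(parts, masks):
    """Composition: sum the parts' embeddings, each projected onto its
    own sub-space -- a purely linear operation on coordinates."""
    out = [0.0] * len(parts[0])
    for part, mask in zip(parts, masks):
        for i, v in enumerate(project(part, mask)):
            out[i] += v
    return out

# Two invented 4-D part embeddings occupying disjoint factor sub-spaces.
chair_seat = [1.0, 2.0, 0.0, 0.0]
chair_back = [0.0, 0.0, 3.0, 4.0]
whole = compose([chair_seat, chair_back], [[1, 1, 0, 0], [0, 0, 1, 1]])
```

Because the sub-spaces are disjoint, projecting the composed embedding back onto a part's mask recovers that part exactly.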
Shape of the Cloak: Formal Analysis of Clock Skew-Based Intrusion Detection System in Controller Area Networks
This paper presents a new masquerade attack called the cloaking attack and
provides formal analyses for clock skew-based Intrusion Detection Systems
(IDSs) that detect masquerade attacks in the Controller Area Network (CAN) in
automobiles. In the cloaking attack, the adversary manipulates the message
inter-transmission times of spoofed messages by adding delays so as to emulate
a desired clock skew and avoid detection. In order to predict and characterize
the impact of the cloaking attack in terms of the attack success probability on
a given CAN bus and IDS, we develop formal models for two clock skew-based
IDSs, i.e., the state-of-the-art (SOTA) IDS and its adaptation to the widely
used Network Time Protocol (NTP), using parameters of the attacker, the
detector, and the hardware platform. To the best of our knowledge, this is the
first paper that provides formal analyses of clock skew-based IDSs in
automotive CAN. We implement the cloaking attack on two hardware testbeds, a
prototype and a real vehicle (the University of Washington (UW) EcoCAR), and
demonstrate its effectiveness against both the SOTA and NTP-based IDSs. We
validate our formal analyses through extensive experiments for different
messages, IDS settings, and vehicles. By comparing each predicted attack
success probability curve against its experimental curve, we find that the
average prediction error is within 3.0% for the SOTA IDS and 5.7% for the
NTP-based IDS.
Comment: Part of this work was presented at ACM/IEEE ICCPS 2018; to be
published in IEEE Transactions on Information Forensics & Security.
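The clock skew that these IDSs fingerprint, and that the cloaking attack emulates, can be estimated from message arrival times alone. Below is a least-squares sketch under an idealized, noise-free model; the message period, skew value, and estimator details are assumptions, not the paper's exact formulation:

```python
def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def estimate_skew(arrivals, period):
    """Estimate clock skew as the slope of the accumulated timing
    offset against elapsed time (idealized, noise-free model)."""
    offsets = [a - i * period for i, a in enumerate(arrivals)]
    return slope(arrivals, offsets)

# A transmitter whose clock runs 0.1% fast. A cloaking attacker would add
# inter-transmission delays to shift this slope toward the victim's skew.
period = 0.1  # assumed nominal 100 ms message period
arrivals = [i * period * 1.001 for i in range(100)]
est = estimate_skew(arrivals, period)
```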
Universal consistency and minimax rates for online Mondrian Forests
We establish the consistency of an algorithm of Mondrian Forests, a
randomized classification algorithm that can be implemented online. First, we
amend the original Mondrian Forest algorithm, that considers a fixed lifetime
parameter. Indeed, the fact that this parameter is fixed hinders the
statistical consistency of the original procedure. Our modified Mondrian Forest
algorithm grows trees with increasing lifetime parameters and uses an
alternative updating rule, allowing it to work in an online fashion as well.
Second, we provide a theoretical analysis establishing simple conditions for
consistency. Our theoretical analysis also exhibits a surprising fact: our
algorithm achieves the minimax rate (optimal rate) for the estimation of a
Lipschitz regression function, which is a strong extension of previous results
to an arbitrary dimension.
Comment: NIPS 201
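For reference, the minimax rate in question is presumably the classical one for Lipschitz regression in dimension d, stated here from standard nonparametric theory rather than from the paper itself:

```latex
% Minimax squared-risk rate for a Lipschitz regression function
% f : [0,1]^d \to \mathbb{R} estimated from n i.i.d. samples:
\[
  \inf_{\hat f_n}\; \sup_{f \in \mathrm{Lip}(L)}\;
  \mathbb{E}\!\left[\big(\hat f_n(X) - f(X)\big)^2\right]
  \;\asymp\; n^{-2/(d+2)}
\]
```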
MINE: Mutual Information Neural Estimation
We argue that the estimation of mutual information between high dimensional
continuous random variables can be achieved by gradient descent over neural
networks. We present a Mutual Information Neural Estimator (MINE) that is
linearly scalable in dimensionality as well as in sample size, trainable
through back-prop, and strongly consistent. We present a handful of
applications on which MINE can be used to minimize or maximize mutual
information. We apply MINE to improve adversarially trained generative models.
We also use MINE to implement Information Bottleneck, applying it to supervised
classification; our results demonstrate substantial improvement in flexibility
and performance in these settings.
Comment: 19 pages, 6 figures.
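MINE builds on the Donsker-Varadhan representation, under which E_P[T] − log E_{P⊗P}[e^T] is a lower bound on mutual information for any test function T. The discrete example below checks this numerically; the distribution and test function are invented, and MINE parameterizes T with a neural network instead:

```python
import math
from itertools import product

def mutual_information(pxy, px, py):
    """Exact mutual information (nats) of a discrete joint distribution."""
    return sum(p * math.log(p / (px[x] * py[y]))
               for (x, y), p in pxy.items() if p > 0)

def dv_lower_bound(pxy, px, py, T):
    """Donsker-Varadhan value E_P[T] - log E_{P x P}[exp(T)]. For ANY
    test function T this lower-bounds I(X;Y); MINE maximizes it over a
    neural-network-parameterized T via gradient descent."""
    e_p = sum(p * T(x, y) for (x, y), p in pxy.items())
    e_q = sum(px[x] * py[y] * math.exp(T(x, y)) for x, y in product(px, py))
    return e_p - math.log(e_q)

# Correlated binary variables and a hand-picked (suboptimal) test function.
pxy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
px, py = {0: 0.5, 1: 0.5}, {0: 0.5, 1: 0.5}
T = lambda x, y: 1.0 if x == y else -1.0
mi = mutual_information(pxy, px, py)
dv = dv_lower_bound(pxy, px, py, T)
```

Optimizing over T tightens the bound toward the true mutual information, which is what makes the estimator strongly consistent.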
Learning Condensed and Aligned Features for Unsupervised Domain Adaptation Using Label Propagation
Unsupervised domain adaptation aiming to learn a specific task for one domain
using another domain data has emerged to address the labeling issue in
supervised learning, especially because it is difficult to obtain massive
amounts of labeled data in practice. The existing methods have succeeded by
reducing the difference between the embedded features of both domains, but the
performance is still unsatisfactory compared to the supervised learning scheme.
This is attributable to embedded features that lie near each other but do
not align perfectly or form clearly separable clusters. We propose a
novel domain adaptation method based on label propagation and cycle consistency
to let the clusters of the features from the two domains overlap exactly and
become clear for high accuracy. Specifically, we introduce cycle consistency to
enforce the relationship between each cluster and exploit label propagation to
achieve the association between the data from the perspective of the manifold
structure instead of a one-to-one relation. Hence, we successfully form
aligned and discriminative clusters. We present the empirical results of our
method for various domain adaptation scenarios and visualize the embedded
features to show that our method is key to better domain adaptation.
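A stripped-down form of label propagation on a similarity graph shows the mechanism the method exploits. The graph, seeds, and clamped-seed update rule here are hypothetical; the paper propagates labels over embedded features across two domains:

```python
def label_propagation(edges, seeds, n, classes, iters=100):
    """Toy label propagation: seed nodes keep their labels; every other
    node repeatedly averages its neighbours' class scores."""
    nbrs = [[] for _ in range(n)]
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)
    scores = [[0.0] * classes for _ in range(n)]
    for node, c in seeds.items():
        scores[node][c] = 1.0
    for _ in range(iters):
        new = []
        for v in range(n):
            if v in seeds:
                new.append(scores[v][:])  # clamp seed labels
            else:
                new.append([
                    sum(scores[u][c] for u in nbrs[v]) / max(len(nbrs[v]), 1)
                    for c in range(classes)
                ])
        scores = new
    return [max(range(classes), key=lambda c: scores[v][c]) for v in range(n)]

# A 4-node chain with labeled endpoints: labels flow to the middle nodes.
labels = label_propagation([(0, 1), (1, 2), (2, 3)], {0: 0, 3: 1}, n=4, classes=2)
```

Propagating over the manifold structure like this, rather than forcing one-to-one matches, is what lets the clusters from the two domains overlap.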
Structure-Aware Shape Synthesis
We propose a new procedure to guide training of a data-driven shape
generative model using a structure-aware loss function. Complex 3D shapes often
can be summarized using a coarsely defined structure which is consistent and
robust across variety of observations. However, existing synthesis techniques
do not account for structure during training, and thus often generate
implausible and structurally unrealistic shapes. During training, we impose
structural constraints in order to enforce consistency and structure across the
entire manifold. We propose a novel methodology for training 3D generative
models that incorporates structural information into an end-to-end training
pipeline.
Comment: Accepted to 3DV 201
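One plausible way to read "structural constraints during training" is an auxiliary penalty added to the reconstruction objective. The weighting scheme and flat-list voxel representation below are invented for illustration, not the paper's formulation:

```python
def structure_aware_loss(pred, target, struct_pred, struct_target, weight=0.5):
    """Hypothetical total loss: fine-grained reconstruction MSE plus a
    weighted MSE on a coarse structure summary of the shape."""
    recon = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    struct = sum((p - t) ** 2 for p, t in zip(struct_pred, struct_target)) / len(struct_pred)
    return recon + weight * struct

# A shape matching the coarse structure is penalized only for fine detail;
# violating the structure adds a further penalty.
detail_only = structure_aware_loss([1.0, 0.0], [0.0, 0.0], [1.0], [1.0])
with_struct = structure_aware_loss([1.0, 0.0], [0.0, 0.0], [1.0], [0.0])
```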
Fighting Fake News: Image Splice Detection via Learned Self-Consistency
Advances in photo editing and manipulation tools have made it significantly
easier to create fake imagery. Learning to detect such manipulations, however,
remains a challenging problem due to the lack of sufficient amounts of
manipulated training data. In this paper, we propose a learning algorithm for
detecting visual image manipulations that is trained only using a large dataset
of real photographs. The algorithm uses the automatically recorded photo EXIF
metadata as a supervisory signal for training a model to determine whether an
image is self-consistent -- that is, whether its content could have been
produced by a single imaging pipeline. We apply this self-consistency model to
the task of detecting and localizing image splices. The proposed method obtains
state-of-the-art performance on several image forensics benchmarks, despite
never seeing any manipulated images during training. That said, it is merely
a step in the long quest for a truly general-purpose visual forensics tool.
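The self-consistency idea, whether two patches could have come from the same imaging pipeline, can be caricatured by directly comparing EXIF-style attributes. The tag names and the fraction-of-agreement score are invented; the paper instead trains a network to predict metadata consistency from pixels:

```python
def exif_consistency(tags_a, tags_b):
    """Toy proxy: fraction of shared EXIF-style attributes on which two
    image patches agree; a low score hints at a splice."""
    keys = set(tags_a) & set(tags_b)
    if not keys:
        return 0.0
    return sum(tags_a[k] == tags_b[k] for k in keys) / len(keys)

# Patches from one photo agree on pipeline attributes; spliced ones do not.
same = exif_consistency({"Model": "X100", "ISO": 200}, {"Model": "X100", "ISO": 200})
spliced = exif_consistency({"Model": "X100", "ISO": 200}, {"Model": "Q2", "ISO": 800})
```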