Constraint-Based Visual Generation
In the last few years, the systematic application of deep learning to visual
generation has produced impressive results that, among others, clearly
benefit from the extensive exploration of convolutional architectures. In this
paper, we propose a general approach to visual generation that combines
learning capabilities with logic descriptions of the target to be generated.
The process of generation is regarded as a constraint satisfaction problem,
where the constraints describe a set of properties that characterize the
target. Interestingly, the constraints can also involve logic variables, while
all of them are converted into real-valued functions by means of the t-norm
theory. We use deep architectures to model the involved variables, and propose
a computational scheme in which the learning process enforces satisfaction of
the constraints. We present some examples in which the theory can naturally be
used, including the modeling of GANs and auto-encoders, and report promising
results on the generation of handwritten characters and face
transformations.
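As a minimal sketch of how logic constraints can become real-valued functions through a t-norm, the snippet below uses the product t-norm and its residuum. The choice of the product t-norm, the function names, and the loss definition are illustrative assumptions, not details taken from the abstract.

```python
# Product t-norm connectives over truth values in [0, 1].
# Assumption: the abstract only says "t-norm theory"; the product
# t-norm is used here purely as one common, differentiable choice.

def t_and(a: float, b: float) -> float:
    """Conjunction under the product t-norm."""
    return a * b


def t_or(a: float, b: float) -> float:
    """Disjunction (probabilistic sum, the dual t-conorm)."""
    return a + b - a * b


def t_not(a: float) -> float:
    """Standard strong negation."""
    return 1.0 - a


def t_implies(a: float, b: float) -> float:
    """Residuum of the product t-norm (Goguen implication)."""
    return 1.0 if a <= b else b / a


def constraint_loss(truth_values: list[float]) -> float:
    """Hypothetical penalty: average distance from full satisfaction."""
    return sum(1.0 - t for t in truth_values) / len(truth_values)
```

In a learning setting, the truth values would be outputs of neural predicates, and minimizing such a loss pushes the networks toward satisfying the constraints.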
Adaptive laser link reconfiguration using constraint propagation
This paper describes Harris AI research performed on the Adaptive Link Reconfiguration (ALR) study for Rome Lab, and focuses on the application of constraint propagation to the problem of link reconfiguration for the proposed space-based Strategic Defense System (SDS) Brilliant Pebbles (BP) communications system. According to the concept of operations at the time of the study, laser communications will exist between BPs and to ground entry points. Long-term links typical of RF transmission will not exist. This study addressed an initial implementation of BPs based on the Global Protection Against Limited Strikes (GPALS) SDI mission. The number of satellites and rings studied was representative of this problem. An orbital dynamics program was used to generate line-of-sight data for the modeled architecture. This was input into a discrete event simulation implemented in the Harris-developed COnstraint Propagation Expert System (COPES) Shell, developed initially on the Rome Lab BM/C3 study. Using a model of the network and several heuristics, the COPES shell was used to develop the Heuristic Adaptive Link Ordering (HALO) Algorithm to rank and order potential laser links according to probability of communication. A reduced set of links based on this ranking would then be used by a routing algorithm to select the next hop. This paper includes an overview of Constraint Propagation as an Artificial Intelligence technique and its embodiment in the COPES shell. It describes the design and implementation of both the simulation of the GPALS BP network and the HALO algorithm in COPES. This is described using a Data Flow Diagram, State Transition Diagrams, and Structured English PDL. It describes a laser communications model and the heuristics involved in rank-ordering the potential communication links.
The generation of simulation data is described along with its interface via COPES to the Harris-developed View Net graphical tool for visual analysis of communications networks. Conclusions are presented, including a graphical analysis of results depicting the ordered set of links versus the set of all possible links based on the computed Bit Error Rate (BER). Finally, future research is discussed, which includes enhancements to the HALO algorithm, network simulation, and the addition of an intelligent routing algorithm for BP.
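The rank-and-reduce step can be illustrated with a short sketch. The abstract does not give the HALO heuristics, so the code below simply assumes that each candidate link carries an estimated bit error rate and that a lower value means a higher probability of communication; the `Link` type and `rank_links` function are hypothetical names.

```python
from typing import NamedTuple


class Link(NamedTuple):
    """Hypothetical candidate laser link between two nodes."""
    source: str
    target: str
    bit_error_rate: float  # assumed estimate; lower is better


def rank_links(links: list[Link], keep: int) -> list[Link]:
    """Order links from most to least promising and keep the top ones.

    This stands in for the ranking step only; the real HALO heuristics
    are not described in the abstract.
    """
    ranked = sorted(links, key=lambda link: link.bit_error_rate)
    return ranked[:keep]
```

A routing algorithm would then choose the next hop from the reduced list rather than from all possible links.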
Discrete Multi-modal Hashing with Canonical Views for Robust Mobile Landmark Search
Mobile landmark search (MLS) has recently received increasing attention for its
great practical value. However, it remains unsolved due to two important
challenges. One is high bandwidth consumption of query transmission, and the
other is the huge visual variations of query images sent from mobile devices.
In this paper, we propose a novel hashing scheme, named canonical view based
discrete multi-modal hashing (CV-DMH), to handle these problems via a novel
three-stage learning procedure. First, a submodular function is designed to
measure visual representativeness and redundancy of a view set. With it,
canonical views, which capture key visual appearances of a landmark with limited
redundancy, are efficiently discovered with an iterative mining strategy.
Second, multi-modal sparse coding is applied to transform visual features from
multiple modalities into an intermediate representation. It can robustly and
adaptively characterize visual contents of varied landmark images with certain
canonical views. Finally, compact binary codes are learned on intermediate
representation within a tailored discrete binary embedding model which
preserves visual relations of images measured with canonical views and removes
the involved noise. In this part, we develop a new augmented Lagrangian
multiplier (ALM) based optimization method to solve directly for the discrete
binary codes. This lets us not only handle the discrete constraint explicitly
but also take the bit-uncorrelated and balance constraints into account.
Experiments on real-world landmark datasets demonstrate the superior
performance of CV-DMH over several state-of-the-art methods.
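The first stage, selecting representative views with little redundancy through a submodular function, can be sketched with greedy maximization. The abstract does not define the function, so the facility-location coverage objective, the greedy loop, and all names below are assumptions swapped in for illustration.

```python
def coverage(sim: list[list[float]], selected: list[int]) -> float:
    """Facility-location score: how well the selected views cover all views.

    Assumption: this standard submodular function stands in for the
    paper's own representativeness/redundancy measure.
    """
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in sim)


def greedy_canonical_views(sim: list[list[float]], k: int) -> list[int]:
    """Pick up to k view indices, each time adding the largest marginal gain."""
    selected: list[int] = []
    candidates = set(range(len(sim)))
    for _ in range(min(k, len(sim))):
        base = coverage(sim, selected)
        best = max(
            sorted(candidates),
            key=lambda j: coverage(sim, selected + [j]) - base,
        )
        selected.append(best)
        candidates.remove(best)
    return selected
```

Because a view that resembles an already chosen one adds little coverage, the greedy loop tends to spread its picks across visually distinct groups of images.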
PDANet: Polarity-consistent Deep Attention Network for Fine-grained Visual Emotion Regression
Existing methods for visual emotion analysis mainly focus on coarse-grained
emotion classification, i.e., assigning a dominant discrete emotion category
to an image. However, these methods cannot adequately reflect the complexity and
subtlety of emotions. In this paper, we study the fine-grained regression
problem of visual emotions based on convolutional neural networks (CNNs).
Specifically, we develop a Polarity-consistent Deep Attention Network (PDANet),
a novel network architecture that integrates attention into a CNN with an
emotion polarity constraint. First, we propose to incorporate both spatial and
channel-wise attentions into a CNN for visual emotion regression, which jointly
considers the local spatial connectivity patterns along each channel and the
interdependency between different channels. Second, we design a novel
regression loss, i.e. polarity-consistent regression (PCR) loss, based on the
weakly supervised emotion polarity to guide the attention generation. By
optimizing the PCR loss, PDANet can generate a polarity-preserving attention map
and thus improve the emotion regression performance. Extensive experiments are
conducted on the IAPS, NAPS, and EMOTIC datasets, and the results demonstrate
that the proposed PDANet outperforms the state-of-the-art approaches by a large
margin for fine-grained visual emotion regression. Our source code is released
at: https://github.com/ZizhouJia/PDANet
Comment: Accepted by ACM Multimedia 201
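To make the idea of a polarity-aware regression loss concrete, here is a rough sketch. The abstract does not state the PCR formula, so the form below (squared error, scaled up when the prediction falls on the wrong side of a neutral point) is purely an assumption, as are the function name and the `neutral` and `penalty` parameters.

```python
def polarity_aware_loss(
    pred: float,
    target: float,
    neutral: float = 0.0,
    penalty: float = 1.0,
) -> float:
    """Hypothetical polarity-aware regression loss for one sample.

    Squared error is multiplied by (1 + penalty) when the predicted and
    target values lie on opposite sides of the neutral point. This is an
    illustrative stand-in, not the loss defined in the paper.
    """
    squared_error = (pred - target) ** 2
    same_polarity = (pred - neutral) * (target - neutral) >= 0
    return squared_error if same_polarity else squared_error * (1.0 + penalty)
```

Under this assumed form, two predictions with the same absolute error are treated differently when only one of them agrees with the target's emotional polarity.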