Deconvolutional Latent-Variable Model for Text Sequence Matching
A latent-variable model is introduced for text matching, inferring sentence
representations by jointly optimizing generative and discriminative objectives.
To alleviate typical optimization challenges in latent-variable models for
text, we employ deconvolutional networks as the sequence decoder (generator),
providing learned latent codes with more semantic information and better
generalization. Our model, trained in an unsupervised manner, yields stronger
empirical predictive performance than a decoder based on Long Short-Term Memory
(LSTM), with fewer parameters and considerably faster training. Further, we
apply it to text sequence-matching problems. The proposed model significantly
outperforms several strong sentence-encoding baselines, especially in the
semi-supervised setting.
Comment: Accepted by AAAI-2018
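To make the deconvolutional decoder concrete, below is a minimal sketch in PyTorch: a fixed-length latent code is expanded into a sequence of token logits through stacked 1-D transposed convolutions. The layer widths, kernel sizes, and strides are illustrative assumptions, not the architecture from the paper.

```python
import torch.nn as nn

class DeconvTextDecoder(nn.Module):
    """Sketch of a deconvolutional sequence decoder: the latent code is
    treated as a length-1 feature map and progressively upsampled into
    per-position vocabulary logits."""
    def __init__(self, latent_dim=300, vocab_size=20000, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            # (batch, latent_dim, 1) -> progressively longer feature maps
            nn.ConvTranspose1d(latent_dim, hidden, kernel_size=5, stride=2),
            nn.ReLU(),
            nn.ConvTranspose1d(hidden, hidden, kernel_size=5, stride=2),
            nn.ReLU(),
            nn.ConvTranspose1d(hidden, vocab_size, kernel_size=5, stride=2),
        )

    def forward(self, z):
        # z: (batch, latent_dim); add a length-1 "time" axis, then expand
        return self.net(z.unsqueeze(-1))  # (batch, vocab_size, seq_len)
```

Because all output positions are produced in parallel rather than token by token, such a decoder trains considerably faster than an LSTM of comparable size, consistent with the speed advantage reported above.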
On convergence and accuracy of Gaussian belief propagation
Gaussian belief propagation (BP) is known to be an efficient message-passing algorithm for calculating approximate marginal probability density functions (PDFs) from a high-dimensional Gaussian PDF. When Gaussian BP converges, the mean it calculates (the Gaussian BP mean) is known to equal the mean of the exact marginal PDF, while the variance it calculates (the Gaussian BP variance) is only an approximation to the variance of the exact marginal PDF. Since Gaussian BP is not guaranteed to converge, it is important to know under what conditions it converges. Moreover, due to the unknown accuracy of the Gaussian BP variance, it is also meaningful to analyze and improve its accuracy. This thesis focuses on the issues of convergence and accuracy in Gaussian BP.
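For concreteness, below is a minimal synchronous Gaussian BP sketch for a pairwise model in information form, p(x) ∝ exp(-½xᵀJx + hᵀx); it follows the standard parameterization rather than anything specific to this thesis.

```python
import numpy as np

def gaussian_bp(J, h, max_iters=100, tol=1e-8):
    """Synchronous Gaussian BP on p(x) ~ exp(-0.5 x'Jx + h'x).
    J must be symmetric with nonzero diagonal; edges are the nonzero
    off-diagonal entries. Returns approximate marginal means/variances."""
    n = J.shape[0]
    nbrs = [set(k for k in np.flatnonzero(J[i]) if k != i) for i in range(n)]
    # messages m_{i->j} parameterized by a precision and a linear term
    Jm = {(i, j): 0.0 for i in range(n) for j in nbrs[i]}
    hm = {(i, j): 0.0 for i in range(n) for j in nbrs[i]}
    for _ in range(max_iters):
        Jm_new, hm_new = {}, {}
        for (i, j) in Jm:
            # aggregate incoming messages at i, excluding the one from j
            Jhat = J[i, i] + sum(Jm[(k, i)] for k in nbrs[i] if k != j)
            hhat = h[i] + sum(hm[(k, i)] for k in nbrs[i] if k != j)
            Jm_new[(i, j)] = -J[i, j] ** 2 / Jhat
            hm_new[(i, j)] = -J[i, j] * hhat / Jhat
        delta = max(abs(Jm_new[e] - Jm[e]) for e in Jm)
        Jm, hm = Jm_new, hm_new
        if delta < tol:
            break
    # beliefs: local potential plus all incoming messages
    Jb = np.array([J[i, i] + sum(Jm[(k, i)] for k in nbrs[i]) for i in range(n)])
    hb = np.array([h[i] + sum(hm[(k, i)] for k in nbrs[i]) for i in range(n)])
    return hb / Jb, 1.0 / Jb  # approximate means and variances
```

On a tree the returned means and variances are exact; on loopy graphs, when the iteration converges, the means are exact but the variances are only approximate, which is exactly the accuracy gap analyzed here.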
First, the convergence condition of Gaussian BP variances is investigated. In particular, by analyzing the message-updating functions of Gaussian BP, a necessary and sufficient convergence condition for the Gaussian BP variances is derived under both synchronous and asynchronous scheduling. The relationship between the proposed convergence condition and the existing one is also established analytically. Compared with the best existing convergence condition, which is only sufficient and requires computing the spectral radius of an infinite-dimensional matrix, the proposed condition not only supplies the missing necessary part but can also be verified efficiently.
Next, based on the convergence condition of Gaussian BP variances, the necessary and sufficient convergence conditions of beliefs (parameterized by Gaussian BP variance and mean) are derived in the scenarios of synchronous scheduling with or without damping. The results theoretically confirm the extensively reported conjecture that damping is helpful to improve the convergence of Gaussian BP. Under asynchronous scheduling, a sufficient convergence condition of beliefs is also derived. Relationships between the proposed convergence conditions and existing ones are established analytically, demonstrating that the existing conditions are implied by the proposed ones.
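Damping is typically implemented as a convex combination of the old and newly computed messages inside the synchronous update; a minimal sketch against the code above (the damping factor and the choice to damp both message parameters are assumptions):

```python
# Inside the synchronous loop, replace the plain message swap
# `Jm, hm = Jm_new, hm_new` with a damped update; gamma = 1 recovers
# the original (undamped) iteration.
gamma = 0.7  # hypothetical damping factor in (0, 1]
Jm = {e: (1 - gamma) * Jm[e] + gamma * Jm_new[e] for e in Jm}
hm = {e: (1 - gamma) * hm[e] + gamma * hm_new[e] for e in hm}
```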
Finally, the accuracy of Gaussian BP variances is analyzed and improved. In particular, an explicit error expression for the Gaussian BP variances is first derived. Through a novel representation of this error expression, a distributed message-passing algorithm is proposed to improve the accuracy of the Gaussian BP variance. It is proved that the upper bound on the residual error of the improved variance decreases monotonically as the number of nodes in a particular set increases, and eventually vanishes as the remaining graph becomes loop-free after removal of the selected nodes.
Leveraging Contaminated Datasets to Learn Clean-Data Distribution with Purified Generative Adversarial Networks
Generative adversarial networks (GANs) are known for their strong ability to capture the underlying distribution of training instances. Since the seminal GAN work, many variants have been proposed. However, existing GANs are almost always built on the assumption that the training dataset is clean. In many real-world applications this may not hold: the training dataset may be contaminated by a proportion of undesired instances. When trained on such datasets, existing GANs learn a mixture distribution of desired and contaminated instances, rather than the distribution of the desired data alone (the target distribution). To learn the target distribution from contaminated datasets, two purified generative adversarial networks (PuriGAN) are developed, in which the discriminators are augmented with the capability to distinguish between target and contaminated instances by leveraging an extra dataset composed solely of contaminated instances. We prove that, under mild conditions, the proposed PuriGANs are guaranteed to converge to the distribution of desired instances. Experimental results on several datasets demonstrate that, when trained on contaminated datasets, the proposed PuriGANs generate much better images from the desired distribution than comparable baselines. In addition, we demonstrate the usefulness of PuriGAN in downstream applications by applying it to semi-supervised anomaly detection on contaminated datasets and to PU-learning. Experimental results show that PuriGAN delivers the best performance over comparable baselines on both tasks.
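As an illustration of the idea (not the paper's exact objective), a discriminator can be pushed up on the contaminated training data while being pushed down on both generator samples and the auxiliary contamination-only dataset; the loss below and the weight lam are assumptions made for the sketch.

```python
import torch
import torch.nn.functional as F

def purigan_style_d_loss(D, x_mixed, x_contam, x_fake, lam=1.0):
    """Hypothetical discriminator loss in the spirit of PuriGAN:
    x_mixed  - batch from the contaminated training set (treated as real)
    x_contam - batch from the auxiliary contamination-only set (negative)
    x_fake   - batch of generator samples (negative)"""
    logit_mixed, logit_contam, logit_fake = D(x_mixed), D(x_contam), D(x_fake)
    loss_real = F.binary_cross_entropy_with_logits(
        logit_mixed, torch.ones_like(logit_mixed))
    loss_fake = F.binary_cross_entropy_with_logits(
        logit_fake, torch.zeros_like(logit_fake))
    # extra term: the contamination-only data is also scored as fake,
    # steering the generator away from the contaminated modes
    loss_contam = F.binary_cross_entropy_with_logits(
        logit_contam, torch.zeros_like(logit_contam))
    return loss_real + loss_fake + lam * loss_contam
```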
Modeling Semantic Composition with Syntactic Hypergraph for Video Question Answering
A key challenge in video question answering is how to realize the cross-modal
semantic alignment between textual concepts and corresponding visual objects.
Existing methods mostly seek to align the word representations with the video
regions. However, word representations are often not able to convey a complete
description of textual concepts, which are in general described by the
compositions of certain words. To address this issue, we propose to first build
a syntactic dependency tree for each question with an off-the-shelf tool and
use it to guide the extraction of meaningful word compositions. Based on the
extracted compositions, a hypergraph is further built by viewing the words as
nodes and the compositions as hyperedges. Hypergraph convolutional networks
(HCN) are then employed to learn the initial representations of word
compositions. Afterwards, an optimal transport based method is proposed to
perform cross-modal semantic alignment for the textual and visual semantic
space. To reflect the cross-modal influences, the cross-modal information is
incorporated into the initial representations, leading to a model named
cross-modality-aware syntactic HCN. Experimental results on three benchmarks
show that our method outperforms all strong baselines. Further analyses
demonstrate the effectiveness of each component, and show that our model is
good at modeling different levels of semantic compositions and filtering out
irrelevant information.
Comment: 11 pages, 7 figures
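To illustrate the HCN step, the sketch below applies one standard hypergraph convolution (in the HGNN form of Feng et al.) to word features, given an incidence matrix whose hyperedges are the extracted word compositions; unit hyperedge weights and the single layer are simplifying assumptions, not the paper's full model.

```python
import torch

def hypergraph_conv(X, H, Theta):
    """One hypergraph convolution layer.
    X:     (n_words, d_in) word features (the nodes)
    H:     (n_words, n_comps) incidence matrix; H[v, e] = 1 if word v
           belongs to composition e extracted from the dependency tree
    Theta: (d_in, d_out) learnable weight matrix"""
    Dv = H.sum(dim=1).clamp(min=1)   # node degrees
    De = H.sum(dim=0).clamp(min=1)   # hyperedge degrees
    Dv_inv_sqrt = Dv.pow(-0.5)
    # normalized two-step propagation: words -> compositions -> words
    X = Dv_inv_sqrt.unsqueeze(1) * X
    X = (H.t() @ X) / De.unsqueeze(1)  # aggregate words into compositions
    X = H @ X                          # scatter back to the words
    X = Dv_inv_sqrt.unsqueeze(1) * X
    return torch.relu(X @ Theta)
```

Stacking such layers, with cross-modal information mixed into the initial features, would correspond to the cross-modality-aware variant described above.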