Dirichlet Bayesian Network Scores and the Maximum Relative Entropy Principle
A classic approach for learning Bayesian networks from data is to identify a
maximum a posteriori (MAP) network structure. In the case of discrete Bayesian
networks, MAP networks are selected by maximising one of several possible
Bayesian Dirichlet (BD) scores; the most famous is the Bayesian Dirichlet
equivalent uniform (BDeu) score from Heckerman et al. (1995). The key
properties of BDeu arise from its uniform prior over the parameters of each
local distribution in the network: it makes structure learning computationally
efficient, it does not require the elicitation of prior knowledge from
experts, and it satisfies score equivalence.
In this paper we will review the derivation and the properties of BD scores,
and of BDeu in particular, and we will link them to the corresponding entropy
estimates to study them from an information theoretic perspective. To this end,
we will work in the context of the foundational work of Giffin and Caticha
(2007), who showed that Bayesian inference can be framed as a particular case
of the maximum relative entropy principle. We will use this connection to show
that BDeu should not be used for structure learning from sparse data, since it
violates the maximum relative entropy principle; and that it is also
problematic from a more classic Bayesian model selection perspective, because
it produces Bayes factors that are sensitive to the value of its only
hyperparameter. Using a large simulation study, we found in our previous work
(Scutari, 2016) that the Bayesian Dirichlet sparse (BDs) score seems to provide
better accuracy in structure learning; in this paper we further show that BDs
does not suffer from the issues above, and we recommend using it instead of
BDeu for sparse data. Finally, we will show that these issues are in fact
different aspects of the same problem and a consequence of the distributional
assumptions of the prior.
Comment: 20 pages, 4 figures; extended version submitted to Behaviormetrika
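As a concrete reference for the scores being compared, below is a minimal sketch of the log-BDeu local score for a single node, assuming discrete data in a pandas DataFrame; the function name log_bdeu and the argument iss (the imaginary sample size, BDeu's only hyperparameter) are illustrative choices, not part of the paper.

```python
# A minimal sketch of the log-BDeu local score for one node given its
# parents, assuming a pandas DataFrame of discrete data. Unobserved parent
# configurations contribute zero, so summing over observed rows is exact.
import numpy as np
import pandas as pd
from scipy.special import gammaln

def log_bdeu(data: pd.DataFrame, node: str, parents: list[str], iss: float = 1.0) -> float:
    r = data[node].nunique()                      # number of states of the node
    q = int(np.prod([data[p].nunique() for p in parents])) if parents else 1
    a_j, a_jk = iss / q, iss / (q * r)            # uniform Dirichlet prior masses

    if parents:
        counts = data.groupby(parents, observed=True)[node].value_counts().unstack(fill_value=0)
    else:
        counts = data[node].value_counts().to_frame().T

    n_jk = counts.to_numpy(dtype=float)           # one row per observed parent configuration
    n_j = n_jk.sum(axis=1)
    score = (gammaln(a_j) - gammaln(a_j + n_j)).sum()
    score += (gammaln(a_jk + n_jk) - gammaln(a_jk)).sum()
    return score
```

The BDs score of Scutari (2016) differs only in dividing iss by the number of parent configurations actually observed in the data rather than by all q of them, so no prior mass is placed on configurations a sparse sample never exhibits.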
Pairwise alignment incorporating dipeptide covariation
Motivation: Standard algorithms for pairwise protein sequence alignment make
the simplifying assumption that amino acid substitutions at neighboring sites
are uncorrelated. This assumption allows implementation of fast algorithms for
pairwise sequence alignment, but it ignores information that could conceivably
increase the power of remote homolog detection. We examine the validity of this
assumption by constructing extended substitution matrices that encapsulate the
observed correlations between neighboring sites, by developing an efficient and
rigorous algorithm for pairwise protein sequence alignment that incorporates
these local substitution correlations, and by assessing the ability of this
algorithm to detect remote homologies. Results: Our analysis indicates that
local correlations between substitutions are not strong on average.
Furthermore, incorporating local substitution correlations into pairwise
alignment did not lead to a statistically significant improvement in remote
homology detection. Therefore, the standard assumption that individual residues
within protein sequences evolve independently of neighboring positions appears
to be an efficient and appropriate approximation.
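To make the idea of an extended substitution matrix concrete, here is a minimal sketch of turning observed dipeptide substitution counts into log-odds scores, in the spirit of the standard BLOSUM-style construction; the variable names and the pseudocount are illustrative assumptions, not the paper's actual procedure.

```python
# A minimal sketch: log-odds scores over dipeptides (pairs of adjacent
# residues) from aligned substitution counts, BLOSUM-style. The input
# counts[a][b] is assumed to hold how often dipeptide a aligns with
# dipeptide b in trusted alignments, symmetrised in advance.
import itertools
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in itertools.product(AMINO_ACIDS, repeat=2)]

def dipeptide_log_odds(counts: dict[str, dict[str, float]], pseudocount: float = 1.0):
    total = sum(counts.get(a, {}).get(b, 0.0) + pseudocount
                for a in DIPEPTIDES for b in DIPEPTIDES)
    # Marginal (background) frequency of each dipeptide in the alignments.
    marginal = {a: sum(counts.get(a, {}).get(b, 0.0) + pseudocount
                       for b in DIPEPTIDES) / total
                for a in DIPEPTIDES}
    scores = {}
    for a in DIPEPTIDES:
        for b in DIPEPTIDES:
            p_ab = (counts.get(a, {}).get(b, 0.0) + pseudocount) / total
            # Log-odds of the observed pairing versus chance under independence.
            scores[a, b] = math.log2(p_ab / (marginal[a] * marginal[b]))
    return scores
```

If neighbouring substitutions were perfectly uncorrelated, each dipeptide score would reduce to the sum of the two single-residue log-odds scores; the abstract's finding is that observed scores stay close to that additive baseline.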
Gibbs Max-margin Topic Models with Data Augmentation
Max-margin learning is a powerful approach to building classifiers and
structured output predictors. Recent work on max-margin supervised topic models
has successfully integrated it with Bayesian topic models to discover
discriminative latent semantic structures and make accurate predictions for
unseen testing data. However, the resulting learning problems are usually hard
to solve because of the non-smoothness of the margin loss. Existing approaches
to building max-margin supervised topic models rely on an iterative procedure
to solve multiple latent SVM subproblems with additional mean-field assumptions
on the desired posterior distributions. This paper presents an alternative
approach by defining a new max-margin loss. Namely, we present Gibbs max-margin
supervised topic models, a latent variable Gibbs classifier to discover hidden
topic representations for various tasks, including classification, regression
and multi-task learning. Gibbs max-margin supervised topic models minimize an
expected margin loss, which is an upper bound of the existing margin loss
derived from an expected prediction rule. By introducing augmented variables
and integrating out the Dirichlet variables analytically by conjugacy, we
develop simple Gibbs sampling algorithms with no restricting assumptions and no
need to solve SVM subproblems. Furthermore, each step of the
"augment-and-collapse" Gibbs sampling algorithms has an analytical conditional
distribution, from which samples can be easily drawn. Experimental results
demonstrate significant improvements on time efficiency. The classification
performance is also significantly improved over competitors on binary,
multi-class and multi-label classification tasks.
Comment: 35 pages
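Since the argument turns on every conditional being available in closed form, the following is a minimal sketch of the "collapse" half of such a sampler: the standard collapsed Gibbs update for a single token's topic in LDA, with the Dirichlet variables integrated out by conjugacy. The max-margin augmentation multiplies these weights by an extra factor from the augmented variables, omitted here; all names (resample_topic, n_dk, n_kw) are illustrative.

```python
# A minimal sketch of a collapsed Gibbs update for one token's topic in LDA.
# n_dk: document-topic counts (D x K); n_kw: topic-word counts (K x V);
# n_k: per-topic totals (K,). alpha, beta are the Dirichlet hyperparameters.
import numpy as np

rng = np.random.default_rng(0)

def resample_topic(d, w, old_k, n_dk, n_kw, n_k, alpha, beta):
    """Resample the topic of one occurrence of word w in document d."""
    V = n_kw.shape[1]
    # Remove the token's current assignment from the count matrices.
    n_dk[d, old_k] -= 1
    n_kw[old_k, w] -= 1
    n_k[old_k] -= 1
    # Analytical conditional: document-topic term times topic-word term.
    weights = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + V * beta)
    new_k = rng.choice(len(weights), p=weights / weights.sum())
    # Add the token back under its new topic.
    n_dk[d, new_k] += 1
    n_kw[new_k, w] += 1
    n_k[new_k] += 1
    return new_k
```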