Incorporating prior knowledge induced from stochastic differential equations in the classification of stochastic observations
In classification, prior knowledge is incorporated in a Bayesian framework by assuming that the feature-label distribution belongs to an uncertainty class of feature-label distributions governed by a prior distribution. A posterior
distribution is then derived from the prior and the sample data. An optimal Bayesian classifier (OBC) minimizes the expected misclassification error relative to the posterior distribution. From an application perspective, prior
construction is critical.
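As a hedged illustration of the setup above, here is a minimal sketch of an optimal Bayesian classifier over a finite uncertainty class of Gaussian feature-label distributions. All models, parameters, and data below are invented for the example and are not taken from the paper; the real construction works over much richer uncertainty classes.

```python
import numpy as np

def normal_pdf(x, mu, sigma=1.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Toy uncertainty class: candidate (mu0, mu1) class-conditional means.
models = [(-1.0, 1.0), (-0.5, 0.5), (-2.0, 2.0)]
prior = np.array([1.0, 1.0, 1.0]) / 3   # prior over the uncertainty class
c = 0.5                                  # class-0 prior probability

# Labeled sample data (feature value, label).
data = [(-1.2, 0), (-0.8, 0), (0.9, 1), (1.1, 1)]

# Posterior over the uncertainty class: prior times sample likelihood.
lik = np.array([
    np.prod([normal_pdf(x, m0 if y == 0 else m1) for x, y in data])
    for m0, m1 in models
])
post = prior * lik
post /= post.sum()

def obc_label(x):
    """OBC decision: compare posterior-weighted 'effective' densities."""
    f0 = sum(p * normal_pdf(x, m0) for p, (m0, _) in zip(post, models))
    f1 = sum(p * normal_pdf(x, m1) for p, (_, m1) in zip(post, models))
    return 0 if c * f0 >= (1 - c) * f1 else 1

print(obc_label(-1.5), obc_label(1.5))  # -> 0 1
```

The key point of the construction is that the OBC classifies against the posterior-weighted mixture of class-conditional densities, which minimizes expected misclassification error over the uncertainty class rather than for any single fixed model.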
EEF: Exponentially Embedded Families with Class-Specific Features for Classification
In this letter, we present a novel exponentially embedded families (EEF)
based classification method, in which the probability density function (PDF) on
raw data is estimated from the PDF on features. With the PDF construction, we
show that class-specific features can be used in the proposed classification
method, instead of a common feature subset for all classes as used in
conventional approaches. We apply the proposed EEF classifier for text
categorization as a case study and derive an optimal Bayesian classification
rule with class-specific feature selection based on the Information Gain (IG)
score. The promising performance on real-life data sets demonstrates the
effectiveness of the proposed approach and indicates its wide potential
applications.
Comment: 9 pages, 3 figures, to be published in IEEE Signal Processing Letters.
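The class-specific selection step can be illustrated with a small sketch: score each binary (term-presence) feature by Information Gain against a one-vs-rest label for each class, and keep a separate top-k subset per class instead of one common subset. The function names and the toy data are invented for the example; the paper's full method then builds the exponentially embedded family on top of these subsets.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (base 2) of a probability vector."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def info_gain(x, y):
    """IG of a binary feature x for a binary one-vs-rest label y."""
    ig = entropy(np.bincount(y, minlength=2) / len(y))
    for v in (0, 1):
        mask = x == v
        if mask.any():
            ig -= mask.mean() * entropy(np.bincount(y[mask], minlength=2) / mask.sum())
    return ig

def class_specific_features(X, labels, k=2):
    """Top-k IG features for each class, scored one-vs-rest."""
    subsets = {}
    for c in np.unique(labels):
        y = (labels == c).astype(int)
        scores = [info_gain(X[:, j], y) for j in range(X.shape[1])]
        subsets[c] = np.argsort(scores)[::-1][:k]
    return subsets

# Toy document-term matrix: features 0 and 1 are informative, 2 is noise.
X = np.array([[1, 0, 1], [1, 0, 0], [0, 1, 1], [0, 1, 0]])
labels = np.array([0, 0, 1, 1])
print(class_specific_features(X, labels))
```

Because scoring is done one-vs-rest, different classes can end up with different feature subsets, which is exactly the degree of freedom the EEF construction is designed to accommodate.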
Testing hypotheses via a mixture estimation model
We consider a novel paradigm for Bayesian testing of hypotheses and Bayesian
model comparison. Our alternative to the traditional construction of posterior
probabilities that a given hypothesis is true or that the data originates from
a specific model is to consider the models under comparison as components of a
mixture model. We therefore replace the original testing problem with an
estimation one that focuses on the probability weight of a given model within a
mixture model. We analyze the sensitivity of the resulting posterior
distribution of the weights to various prior modeling choices for the weights. We stress
that a major appeal in using this novel perspective is that generic improper
priors are acceptable, while not putting convergence in jeopardy. Among other
features, this allows for a resolution of the Lindley-Jeffreys paradox. When
using a reference Beta B(a,a) prior on the mixture weights, we note that the
sensitivity of the posterior estimations of the weights to the choice of a
vanishes as the sample size increases, and advocate the default choice a=0.5,
derived from Rousseau and Mengersen (2011). Another feature of this easily
implemented alternative to the classical Bayesian solution is that the speeds
of convergence of the posterior mean of the weight and of the corresponding
posterior probability are quite similar.
Comment: 25 pages, 6 figures, 2 tables
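A minimal sketch of the idea, under simplifying assumptions of my own (both hypotheses fully specified, so only the weight needs estimating): embed M0: N(0,1) and M1: N(2,1) as components of a mixture a*M0 + (1-a)*M1, put a Beta(0.5, 0.5) prior on a, and run a two-step Gibbs sampler over component allocations and the weight. All numbers here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def normal_pdf(x, mu):
    return np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2 * np.pi)

def gibbs_weight(x, a0=0.5, iters=2000, burn=500):
    """Posterior mean of the mixture weight of M0: N(0,1) vs M1: N(2,1)."""
    alpha = 0.5
    draws = []
    for t in range(iters):
        # Step 1: allocate each observation to a component.
        p0 = alpha * normal_pdf(x, 0.0)
        p1 = (1 - alpha) * normal_pdf(x, 2.0)
        z = rng.random(len(x)) < p0 / (p0 + p1)   # True -> component M0
        # Step 2: conjugate Beta(a0 + n0, a0 + n1) update of the weight.
        alpha = rng.beta(a0 + z.sum(), a0 + (~z).sum())
        if t >= burn:
            draws.append(alpha)
    return np.mean(draws)

x = rng.normal(0.0, 1.0, size=200)   # data actually generated from M0
w = gibbs_weight(x)
print(w)                             # posterior mean weight close to 1
```

The testing conclusion is then read off the posterior on the weight: data generated from M0 drives the estimated weight toward 1, and, as the abstract notes, an improper or reference prior on the component parameters would not jeopardize convergence of this weight, unlike classical Bayes factors.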
Two methods for constructing a gene ontology-based feature network for a Bayesian network classifier and applications to datasets of aging-related genes.
In the context of the classification task of data mining or machine learning, hierarchical feature selection methods exploit hierarchical relationships among features in order to select a subset of features without hierarchical redundancy. Hierarchical feature selection is a new research area in classification research, since nearly all feature selection methods ignore hierarchical relationships among features. This paper proposes two methods for constructing a network of features to be used by a Bayesian Network Augmented Naïve Bayes (BAN) classifier, in datasets of aging-related genes where Gene Ontology (GO) terms are used as hierarchically related predictive features. One of the BAN network construction methods relies on a hierarchical feature selection method to detect and remove hierarchical redundancies among features (GO terms), whilst the other simply uses a conventional, flat feature selection method to select features, without removing the hierarchical redundancies associated with the GO. Both BAN network construction methods may create new edges among nodes (features) in the BAN network that did not exist in the original GO DAG (Directed Acyclic Graph), in order to preserve the generalization-specialization (ancestor-descendant) relationship among selected features. Experiments comparing these two BAN network construction methods, using two different hierarchical feature selection methods and one flat feature selection method, have shown that the best results are obtained by the BAN network construction method using one type of hierarchical feature selection method, namely selecting Hierarchical Information-Preserving features (HIP).
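The hierarchical-redundancy idea can be sketched for a single instance: in the GO, a term annotated as present (1) implies all of its ancestors are present, and a term absent (0) implies all of its descendants are absent, so those implied values carry no extra information. The sketch below is my own simplified rendering of that rule on a toy DAG, not the paper's exact HIP algorithm.

```python
def ancestors(term, parents):
    """All ancestors of `term` in a DAG, given a child -> parents mapping."""
    seen, stack = set(), list(parents.get(term, []))
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents.get(t, []))
    return seen

def hip_features(instance, parents):
    """Keep the most specific positive terms and most general negative terms."""
    redundant = set()
    for term, value in instance.items():
        if value == 1:
            redundant |= ancestors(term, parents)   # implied present
        else:
            # any term with this term as an ancestor is implied absent
            redundant |= {t for t in instance if term in ancestors(t, parents)}
    return {t: v for t, v in instance.items() if t not in redundant}

# Toy DAG (A is the root; C is-a B is-a A; D is-a A), toy annotations.
parents = {"C": ["B"], "B": ["A"], "D": ["A"]}
instance = {"A": 1, "B": 1, "C": 1, "D": 0}
print(hip_features(instance, parents))   # -> {'C': 1, 'D': 0}
```

Here A=1 and B=1 are implied by C=1 and get dropped, leaving only the informative terms; a classifier such as the BAN then sees features without this hierarchical redundancy.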
Poisson Latent Feature Calculus for Generalized Indian Buffet Processes
The purpose of this work is to describe a unified, and indeed simple,
mechanism for non-parametric Bayesian analysis, construction and generative
sampling of a large class of latent feature models which one can describe as
generalized notions of Indian Buffet Processes (IBP). This is done via the
Poisson Process Calculus as it now relates to latent feature models. The IBP
was ingeniously devised by Griffiths and Ghahramani (2005), and its
generative scheme is cast in terms of customers sequentially entering an
Indian Buffet restaurant and selecting previously sampled dishes as well as
new dishes. In this metaphor, dishes correspond to latent features, attributes,
or preferences shared by individuals. The IBP, and its generalizations, represent
an exciting class of models well suited to handle high dimensional statistical
problems now common in this information age. The IBP is based on the usage of
conditionally independent Bernoulli random variables, coupled with completely
random measures acting as Bayesian priors, that are used to create sparse
binary matrices. This Bayesian non-parametric view was a key insight due to
Thibaux and Jordan (2007). One way to think of generalizations is to use
more general random variables. Of note in the current literature are models
employing Poisson and Negative-Binomial random variables. However, unlike their
closely related counterparts, generalized Chinese restaurant processes, the
ability to analyze IBP models in a systematic and general manner is not yet
available. The limitations are both in terms of knowledge about the effects of
different priors and in terms of models based on a wider choice of random
variables. This work will not only provide a thorough description of the
properties of existing models but also provide a simple template to devise and
analyze new models.
Comment: This version provides more details for the multivariate extensions in
section 5. We highlight the case of a simple multinomial distribution and
showcase a multivariate Lévy process prior we call a stable-Beta Dirichlet
process. Section 4.1.1 expanded.
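The restaurant metaphor described in the abstract can be sketched directly as a generative sampler for the standard one-parameter IBP: customer i takes each previously sampled dish k with probability m_k / i (where m_k counts earlier customers who took dish k), then tries Poisson(alpha / i) new dishes, yielding a sparse binary matrix. The code is an illustrative sketch, not the paper's Poisson-calculus machinery.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_ibp(n_customers, alpha):
    """Sequential (restaurant-metaphor) sampler for the one-parameter IBP."""
    dish_counts = []          # dish_counts[k] = m_k, times dish k was taken
    rows = []
    for i in range(1, n_customers + 1):
        # Take each existing dish k with probability m_k / i.
        row = [rng.random() < m / i for m in dish_counts]
        for k, taken in enumerate(row):
            if taken:
                dish_counts[k] += 1
        # Try a Poisson(alpha / i) number of brand-new dishes.
        new = rng.poisson(alpha / i)
        dish_counts.extend([1] * new)
        rows.append(row + [True] * new)
    # Assemble the ragged rows into a sparse binary matrix Z.
    K = len(dish_counts)
    Z = np.zeros((n_customers, K), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z

Z = sample_ibp(10, alpha=2.0)
print(Z.shape)   # (10, K), with K random
```

The Bernoulli draws against m_k / i are exactly the conditionally independent Bernoulli variables the abstract mentions, and the completely random measure (a Beta process in the Thibaux-Jordan view) is what makes the column counts exchangeable across customer orderings.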