5,546 research outputs found
Nonparametric Bayes dynamic modeling of relational data
Symmetric binary matrices representing relations among entities are commonly
collected in many areas. Our focus is on dynamically evolving binary relational
matrices, with interest being in inference on the relationship structure and
prediction. We propose a nonparametric Bayesian dynamic model, which reduces
dimensionality in characterizing the binary matrix through a lower-dimensional
latent space representation, with the latent coordinates evolving in continuous
time via Gaussian processes. By using a logistic mapping function from the
probability matrix space to the latent relational space, we obtain a flexible
and computational tractable formulation. Employing P\`olya-Gamma data
augmentation, an efficient Gibbs sampler is developed for posterior
computation, with the dimension of the latent space automatically inferred. We
provide some theoretical results on flexibility of the model, and illustrate
performance via simulation experiments. We also consider an application to
co-movements in world financial markets
A Bayesian Approach for Model-Based Clustering of Several Binary Dissimilarity Matrices: The dmbc Package in R
We introduce the new package dmbc that implements a Bayesian algorithm for clustering a set of binary dissimilarity matrices within a model-based framework. Specifically, we consider the case when S matrices are available, each describing the dissimilarities among the same n objects, possibly expressed by S subjects (judges), or measured under different experimental conditions, or with reference to different characteristics of the objects themselves. In particular, we focus on binary dissimilarities, taking values 0 or 1 depending on whether or not two objects are deemed as dissimilar. We are interested in analyzing such data using multidimensional scaling (MDS). Differently from standard MDS algorithms, our goal is to cluster the dissimilarity matrices and, simultaneously, to extract an MDS configuration specific for each cluster. To this end, we develop a fully Bayesian three-way MDS approach, where the elements of each dissimilarity matrix are modeled as a mixture of Bernoulli random vectors. The parameter estimates and the MDS configurations are derived using a hybrid Metropolis-Gibbs Markov Chain Monte Carlo algorithm. We also propose a BIC-like criterion for jointly selecting the optimal number of clusters and latent space dimensions. We illustrate our approach referring both to synthetic data and to a publicly available data set taken from the literature. For the sake of efficiency, the core computations in the package are implemented in C/C++. The package also allows the simulation of multiple chains through the support of the parallel package
General method of pattern classification using the two-domain theory
Human beings judge patterns (such as images) by complex mental processes, some of which may not be known, while computing machines extract features. By representing the human judgements with simple measurements and reducing them and the machine extracted features to a common metric space and fitting them by regression, the judgements of human experts rendered on a sample of patterns may be imposed on a pattern population to provide automatic classification
The Data Big Bang and the Expanding Digital Universe: High-Dimensional, Complex and Massive Data Sets in an Inflationary Epoch
Recent and forthcoming advances in instrumentation, and giant new surveys,
are creating astronomical data sets that are not amenable to the methods of
analysis familiar to astronomers. Traditional methods are often inadequate not
merely because of the size in bytes of the data sets, but also because of the
complexity of modern data sets. Mathematical limitations of familiar algorithms
and techniques in dealing with such data sets create a critical need for new
paradigms for the representation, analysis and scientific visualization (as
opposed to illustrative visualization) of heterogeneous, multiresolution data
across application domains. Some of the problems presented by the new data sets
have been addressed by other disciplines such as applied mathematics,
statistics and machine learning and have been utilized by other sciences such
as space-based geosciences. Unfortunately, valuable results pertaining to these
problems are mostly to be found only in publications outside of astronomy. Here
we offer brief overviews of a number of concepts, techniques and developments,
some "old" and some new. These are generally unknown to most of the
astronomical community, but are vital to the analysis and visualization of
complex datasets and images. In order for astronomers to take advantage of the
richness and complexity of the new era of data, and to be able to identify,
adopt, and apply new solutions, the astronomical community needs a certain
degree of awareness and understanding of the new concepts. One of the goals of
this paper is to help bridge the gap between applied mathematics, artificial
intelligence and computer science on the one side and astronomy on the other.Comment: 24 pages, 8 Figures, 1 Table. Accepted for publication: "Advances in
Astronomy, special issue "Robotic Astronomy
- …