Clustering Boolean Tensors

Authors: Saskia Metzler, Pauli Miettinen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Tensor factorizations are computationally hard problems and, in particular, are often significantly harder than their matrix counterparts. In the case of Boolean tensor factorizations -- where the input tensor and all the factors are required to be binary and we use Boolean algebra -- much of that hardness comes from the possibility of overlapping components. Yet, in many applications we are perfectly happy to partition at least one of the modes. In this paper we investigate what consequences this partitioning has on the computational complexity of Boolean tensor factorizations and present a new algorithm for the resulting clustering problem. This algorithm can alternatively be seen as a particularly regularized clustering algorithm that can handle extremely high-dimensional observations. We analyse our algorithms with the goal of maximizing the similarity and argue that this is more meaningful than minimizing the dissimilarity. As a by-product we obtain a PTAS and an efficient 0.828-approximation algorithm for rank-1 binary factorizations. Our algorithm for Boolean tensor clustering achieves high scalability, high similarity, and good generalization to unseen data with both synthetic and real-world data sets.
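The setting described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it only shows a Boolean CP product in which one factor is constrained to be a partition (exactly one 1 per row), together with the similarity and dissimilarity counts the abstract contrasts. The function names and the toy factors A, B and C are assumptions made for this illustration.

```python
import numpy as np

def boolean_cp(A, B, C):
    """Boolean CP product: X[i, j, k] = OR over r of (A[i, r] AND B[j, r] AND C[k, r])."""
    # The integer einsum counts how many components cover each cell;
    # comparing with 0 turns that sum into a Boolean OR.
    counts = np.einsum("ir,jr,kr->ijk", A.astype(int), B.astype(int), C.astype(int))
    return counts > 0

def similarity(X, Y):
    """Number of cells on which two binary tensors agree."""
    return int(np.sum(X == Y))

def dissimilarity(X, Y):
    """Number of cells on which two binary tensors disagree."""
    return int(np.sum(X != Y))

# Hypothetical toy factors with two components.
A = np.array([[1, 0], [1, 1], [0, 1]])          # mode 1, components may overlap
B = np.array([[1, 0], [0, 1]])                  # mode 2
C = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])  # mode 3, one 1 per row: a clustering

X = boolean_cp(A, B, C)

# Corrupt a single cell to compare the two objectives.
Y = X.copy()
Y[0, 0, 0] = np.logical_not(Y[0, 0, 0])

print(similarity(X, Y), dissimilarity(X, Y))
```

Because every row of C contains a single 1, each slice X[:, :, k] is exactly the rank-1 binary matrix of the cluster that k belongs to, so no two components overlap within a slice. Similarity and dissimilarity always sum to the number of cells, which is why the two objectives share optimal solutions yet can differ in how approximation guarantees are stated.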

arXiv.org e-Print Archive

CiteSeerX

MPG.PuRe

Multimodal nested sampling: an efficient and robust alternative to MCMC methods for astronomical data analysis

Authors: F. Feroz, M. P. Hobson
Publication venue: 'Wiley'
Publication date: 23/07/2007

In performing a Bayesian analysis of astronomical data, two difficult problems often emerge. First, in estimating the parameters of some model for the data, the resulting posterior distribution may be multimodal or exhibit pronounced (curving) degeneracies, which can cause problems for traditional MCMC sampling methods. Second, in selecting between a set of competing models, calculation of the Bayesian evidence for each model is computationally expensive. The nested sampling method introduced by Skilling (2004) has greatly reduced the computational expense of calculating evidences and also produces posterior inferences as a by-product. This method has been applied successfully in cosmological applications by Mukherjee et al. (2006), but their implementation was efficient only for unimodal distributions without pronounced degeneracies. Shaw et al. (2007) recently introduced a clustered nested sampling method which is significantly more efficient in sampling from multimodal posteriors and also determines the expectation and variance of the final evidence from a single run of the algorithm, hence providing a further increase in efficiency. In this paper, we build on the work of Shaw et al. and present three new methods for sampling and evidence evaluation from distributions that may contain multiple modes and significant degeneracies; we also present an even more efficient technique for estimating the uncertainty on the evaluated evidence. These methods lead to a further substantial improvement in sampling efficiency and robustness, and are applied to toy problems to demonstrate the accuracy and economy of the evidence calculation and parameter estimation. Finally, we discuss the use of these methods in performing Bayesian object detection in astronomical datasets.

Comment: 14 pages, 11 figures, submitted to MNRAS, some major additions to the previous version in response to the referee's comment.
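As a rough illustration of the basic nested sampling idea that the abstract builds on, the sketch below estimates the evidence for a one-dimensional toy problem. It is plain nested sampling with rejection sampling from the prior, not the clustered or multimodal methods of the paper. The likelihood, the uniform prior on [-5, 5], and all function and parameter names are assumptions chosen for this example.

```python
import numpy as np

def log_likelihood(x):
    # Toy likelihood (assumption): standard normal density in one dimension.
    return -0.5 * x**2 - 0.5 * np.log(2.0 * np.pi)

def nested_sampling(n_live=100, n_iter=500, lo=-5.0, hi=5.0, seed=0):
    """Estimate log-evidence under a uniform prior on [lo, hi]."""
    rng = np.random.default_rng(seed)
    live = rng.uniform(lo, hi, size=n_live)   # live points drawn from the prior
    live_logl = log_likelihood(live)
    z = 0.0
    x_prev = 1.0                              # remaining prior volume
    for i in range(1, n_iter + 1):
        worst = int(np.argmin(live_logl))
        l_star = live_logl[worst]
        x_i = np.exp(-i / n_live)             # expected prior volume after i steps
        z += np.exp(l_star) * (x_prev - x_i)  # likelihood times shell width
        x_prev = x_i
        # Replace the worst point with a prior draw satisfying L > L*.
        # Plain rejection is affordable only for toy problems like this one.
        while True:
            cand = rng.uniform(lo, hi)
            cand_logl = log_likelihood(cand)
            if cand_logl > l_star:
                break
        live[worst] = cand
        live_logl[worst] = cand_logl
    # Add the contribution of the live points that remain at termination.
    z += x_prev * np.mean(np.exp(live_logl))
    return float(np.log(z))

log_z = nested_sampling()
print(log_z, np.log(0.1))
```

For this toy setup the exact evidence is very close to 0.1, since the prior density is 1/10 and the likelihood integrates to nearly 1 over the prior range, so the estimate should land near log(0.1) up to sampling noise. The rejection step is the part that becomes inefficient as the constrained region shrinks, which is the cost that more refined sampling schemes aim to reduce.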

arXiv.org e-Print Archive

Crossref