187 research outputs found
A Bayesian Approach for Clustering Constant-wise Change-point Data
Change-point models deal with ordered data sequences. Their primary goal is
to infer the locations where an aspect of the data sequence changes. In this
paper, we propose and implement a nonparametric Bayesian model for clustering
observations based on their constant-wise change-point profiles via a Gibbs
sampler. Our model incorporates a Dirichlet Process on the constant-wise
change-point structures to cluster observations while performing change-point
estimation simultaneously. Additionally, our approach controls the number of
clusters in the model, not requiring the specification of the number of
clusters a priori. Our method's performance is evaluated on simulated data
under various scenarios and on a publicly available single-cell copy-number
dataset.
Comment: 30 pages, 12 figures
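The Dirichlet Process prior mentioned in this abstract is what allows the number of clusters to remain unspecified in advance. As a rough illustration only (this is not the authors' model or their Gibbs sampler), the Chinese restaurant process view of such a prior can be sketched as below. The function name `sample_crp` and the concentration parameter `alpha` are hypothetical names chosen for this sketch.

```python
import random

def sample_crp(n, alpha, seed=0):
    """Draw cluster labels for n observations from a Chinese restaurant process.

    alpha > 0 is the concentration parameter (an assumption of this sketch):
    larger values make new clusters more likely.
    """
    rng = random.Random(seed)
    labels = []
    counts = []  # counts[k] = number of observations currently in cluster k
    for _ in range(n):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new cluster
        else:
            counts[k] += 1     # join an existing cluster
        labels.append(k)
    return labels, counts

labels, counts = sample_crp(n=50, alpha=1.0)
print(len(counts), "clusters for", len(labels), "observations")
```

The number of clusters is an outcome of the draw rather than an input, which is the property the abstract relies on.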
Bayesian Adaptive Selection of Variables for Function-on-Scalar Regression Models
Considering the field of functional data analysis, we developed a new
Bayesian method for variable selection in function-on-scalar regression (FOSR).
Our approach uses latent variables, allowing an adaptive selection since it can
determine the number of variables and which ones should be selected for a
function-on-scalar regression model. Simulation studies show the proposed
method's main properties, such as its accuracy in estimating the coefficients
and high capacity to select variables correctly. Furthermore, we conducted
comparative studies with the main competing methods, such as the BGLSS method
as well as the group LASSO, the group MCP and the group SCAD. We also used a
COVID-19 dataset and some socioeconomic data from Brazil for real data
application. In short, the proposed Bayesian variable selection model is
extremely competitive, showing significant predictive and selective quality.
Variational Bayesian analysis of survival data using a log-logistic accelerated failure time model
The log-logistic regression model is one of the most commonly used
accelerated failure time (AFT) models in survival analysis, for which
statistical inference methods are mainly established under the frequentist
framework. Recently, Bayesian inference for log-logistic AFT models using
Markov chain Monte Carlo (MCMC) techniques has also been widely developed. In
this work, we develop an alternative approach to MCMC methods and infer the
parameters of the log-logistic AFT model via a mean-field variational Bayes
(VB) algorithm. A piecewise approximation technique is embedded in deriving the
VB algorithm to achieve conjugacy. The proposed VB algorithm is evaluated and
compared with typical frequentist inferences and MCMC inference using simulated
data under various scenarios. A publicly available dataset is employed for
illustration. We demonstrate that the proposed VB algorithm can achieve good
estimation accuracy and has a lower computational cost compared with MCMC
methods.
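For readers unfamiliar with the model named in this abstract, a log-logistic AFT model can be written as log T = beta0 + beta1 * x + sigma * eps, where eps follows a standard logistic distribution. The sketch below only simulates from that model and evaluates its survival function; it does not implement the variational Bayes algorithm of the paper. All names (`simulate_loglogistic_aft`, `survival`, `beta0`, `beta1`, `sigma`) are assumptions made for illustration, with a single scalar covariate.

```python
import math
import random

def simulate_loglogistic_aft(x, beta0, beta1, sigma, rng):
    """Draw one survival time from log T = beta0 + beta1*x + sigma*eps."""
    u = rng.random()
    # Keep u strictly inside (0, 1) so the logit below is finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    eps = math.log(u / (1.0 - u))  # standard logistic draw via inverse CDF
    return math.exp(beta0 + beta1 * x + sigma * eps)

def survival(t, x, beta0, beta1, sigma):
    """S(t | x) = 1 / (1 + exp((log t - mu) / sigma)), mu = beta0 + beta1*x."""
    z = (math.log(t) - (beta0 + beta1 * x)) / sigma
    return 1.0 / (1.0 + math.exp(z))

rng = random.Random(1)
times = [simulate_loglogistic_aft(2.0, 1.0, 0.5, 0.7, rng) for _ in range(2000)]
median_time = math.exp(1.0 + 0.5 * 2.0)  # the model median is exp(mu)
frac_above = sum(t > median_time for t in times) / len(times)
print(round(frac_above, 2))
```

The covariate acts multiplicatively on the time scale: increasing `x` by one unit multiplies the median survival time by exp(beta1).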
J-PLUS: a catalogue of globular cluster candidates around the M 81/M 82/NGC 3077 triplet of galaxies
Globular clusters (GCs) are proxies of the formation assemblies of their host galaxies. However, few studies exist targeting GC systems of spiral galaxies up to several effective radii. Through 12-band Javalambre Photometric Local Universe Survey (J-PLUS) imaging, we study the point sources around the M 81/M 82/NGC 3077 triplet in search of new GC candidates. We develop a tailored classification scheme to search for GC candidates based on their similarity to known GCs via a principal component analysis projection. Our method accounts for missing data and photometric errors. We report 642 new GC candidates in a region of 3.5 deg² around the triplet, ranked according to their Gaia astrometric proper motions when available. We find tantalizing evidence for an overdensity of GC candidate sources forming a bridge connecting M 81 and M 82. Finally, the spatial distribution of the GC candidates' (g − i) colours is consistent with halo/intra-cluster GCs, i.e. the candidates get bluer the further they are from the closest galaxy in the field. We further employ a regression-tree-based model to estimate the metallicity distribution of the GC candidates based on their J-PLUS bands. The metallicity distribution of the sample candidates is broad and displays a bump towards the metal-rich end. Our list increases the population of GC candidates around the triplet threefold, stresses the usefulness of multiband surveys in finding these objects, and provides a testbed for further studies analysing their spatial distribution around nearby (spiral) galaxies.