VARIATIONAL APPROXIMATIONS FOR DENSITY DECONVOLUTION
This thesis considers the problem of density estimation when the variables of interest are subject to measurement error. The measurement error is assumed to be additive and homoscedastic. We specify the density of interest by a Dirichlet Process Mixture Model and establish variational approximation approaches to the density deconvolution problem. Gaussian and Laplacian error distributions are considered, which are representatives of supersmooth and ordinary smooth distributions, respectively. We develop two variational approximation algorithms for Gaussian error deconvolution and one variational approximation algorithm for Laplacian error deconvolution. Their performance is compared with that of deconvoluting kernels and the Markov chain Monte Carlo (MCMC) method in simulation experiments. A conjecture based on a categorization of hidden variables is proposed to explain why the two variational approximation algorithms for Gaussian error deconvolution perform differently. We establish a stochastic variational approximation algorithm for Gaussian error deconvolution, which improves on the performance of the variational approximation algorithm and performs as well as the MCMC method at a faster speed. The stochastic variational approximation algorithm is applied to simulation experiments and an example of physical activity measurements.
Dealing with uncertainty: A rough-set-based approach with the background of classical logic
The representative-based approximation has been widely studied in rough set theory. Hence, rough set approximations can be defined by the system of representatives, which plays a crucial role in set approximation. In the authors’ previous research, a possible use of the similarity-based rough set in first-order logic was investigated. Now our focus has changed to representative-based approximation systems. In this article, the authors show a logical system relying on representative-based set approximation. In our approach, a three-valued partial logic system is introduced. Based on the properties of the approximation space, our theorems prove that, in some cases, there exists an efficient way to evaluate first-order formulae.
Ultra-Scalable Spectral Clustering and Ensemble Clustering
This paper focuses on scalability and robustness of spectral clustering for
extremely large-scale datasets with limited resources. Two novel algorithms are
proposed, namely, ultra-scalable spectral clustering (U-SPEC) and
ultra-scalable ensemble clustering (U-SENC). In U-SPEC, a hybrid representative
selection strategy and a fast approximation method for K-nearest
representatives are proposed for the construction of a sparse affinity
sub-matrix. By interpreting the sparse sub-matrix as a bipartite graph, the
transfer cut is then utilized to efficiently partition the graph and obtain the
clustering result. In U-SENC, multiple U-SPEC clusterers are further integrated
into an ensemble clustering framework to enhance the robustness of U-SPEC while
maintaining high efficiency. Based on the ensemble generation via multiple
U-SPECs, a new bipartite graph is constructed between objects and base
clusters and then efficiently partitioned to achieve the consensus clustering
result. It is noteworthy that both U-SPEC and U-SENC have nearly linear time
and space complexity, and are capable of robustly and efficiently partitioning
ten-million-level nonlinearly-separable datasets on a PC with 64GB memory.
Experiments on various large-scale datasets have demonstrated the scalability
and robustness of our algorithms. The MATLAB code and experimental data are
available at https://www.researchgate.net/publication/330760669.
Comment: To appear in IEEE Transactions on Knowledge and Data Engineering, 201
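The pipeline described in this abstract (pick representatives, link each object to its K nearest representatives, treat the resulting sparse affinity sub-matrix as a bipartite graph, and partition it through a small eigenproblem on the representative side) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB code: representatives are chosen uniformly at random instead of by the hybrid strategy, nearest representatives are found exactly instead of approximately, matrices are dense, and the function names and default parameters are hypothetical.

```python
import numpy as np

def _kmeans(U, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-first initialization."""
    centers = [U[0]]
    for _ in range(k - 1):
        gaps = np.min([((U - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(gaps)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = U[labels == j].mean(axis=0)
    return labels

def representative_spectral_clustering(X, n_clusters, n_reps=40,
                                       n_neighbors=5, seed=0):
    """Cluster rows of X via an object-to-representative bipartite graph."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Assumption: uniform random representatives (a simplification).
    reps = X[rng.choice(n, size=n_reps, replace=False)]
    # Exact squared distances from every object to every representative.
    d2 = ((X[:, None, :] - reps[None, :, :]) ** 2).sum(axis=2)
    # Keep only the n_neighbors nearest representatives of each object.
    nearest = np.argsort(d2, axis=1)[:, :n_neighbors]
    rows = np.arange(n)[:, None]
    near_d2 = d2[rows, nearest]
    sigma2 = near_d2.mean() + 1e-12  # Gaussian kernel bandwidth
    B = np.zeros((n, n_reps))        # affinity sub-matrix (dense here)
    B[rows, nearest] = np.exp(-near_d2 / (2.0 * sigma2))
    # Transfer-cut style reduction: solve the eigenproblem on the small
    # representative graph W_r = B^T D_x^{-1} B, then lift to all objects.
    P = B / B.sum(axis=1, keepdims=True)
    W_r = B.T @ P
    d_r = W_r.sum(axis=1) + 1e-12
    S = W_r / np.sqrt(np.outer(d_r, d_r))  # normalized affinity
    _, vecs = np.linalg.eigh(S)            # eigenvalues in ascending order
    V = vecs[:, -n_clusters:] / np.sqrt(d_r)[:, None]
    U = P @ V                              # spectral embedding of the objects
    return _kmeans(U, n_clusters)

# Toy usage: two well-separated Gaussian blobs in the plane.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.3, size=(100, 2)),
               rng.normal(10.0, 0.3, size=(100, 2))])
labels = representative_spectral_clustering(X, n_clusters=2)
```

The point of the reduction is that the eigendecomposition runs on an `n_reps` by `n_reps` matrix rather than on the full object-by-object affinity matrix, which is what keeps the cost close to linear in the number of objects when `B` is stored sparsely.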
Convex Clustering via Optimal Mass Transport
We consider approximating distributions within the framework of optimal mass
transport and specialize to the problem of clustering data sets. Distances
between distributions are measured in the Wasserstein metric. The main problem
we consider is that of approximating sample distributions by ones with sparse
support. This provides a new viewpoint to clustering. We propose different
relaxations of a cardinality function which penalizes the size of the support
set. We establish that a certain relaxation provides the tightest convex lower
approximation to the cardinality penalty. We compare the performance of
alternative relaxations in a numerical study on clustering.
Comment: 12 pages, 12 figures
Faster Clustering via Preprocessing
We examine the efficiency of clustering a set of points, when the
encompassing metric space may be preprocessed in advance. In computational
problems of this genre, there is a first stage of preprocessing, whose input is
a collection of points $M$; the next stage receives as input a query set
$Q \subset M$, and should report a clustering of $Q$ according to some
objective, such as 1-median, in which case the answer is a point $a \in M$
minimizing $\sum_{q \in Q} d_M(a, q)$.
We design fast algorithms that approximately solve such problems under
standard clustering objectives like $p$-center and $p$-median, when the metric
$M$ has low doubling dimension. By leveraging the preprocessing stage, our
algorithms achieve query time that is near-linear in the query size $n = |Q|$,
and is (almost) independent of the total number of points $m = |M|$.
Comment: 24 pages
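To make the 1-median objective concrete, the following minimal sketch evaluates it by brute force: given a collection of points and a query set, it returns the point of the collection with the smallest total distance to the query points. It assumes Euclidean distance, and the function name and sample data are hypothetical. It is a naive baseline whose cost grows with the size of the whole collection, not the preprocessing-based algorithm the abstract describes.

```python
from math import dist

def one_median(points, query):
    """Return the point in `points` minimizing the sum of distances to `query`."""
    return min(points, key=lambda a: sum(dist(a, q) for q in query))

# Hypothetical data: a small collection and a query set drawn from it.
M = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (1.0, 1.0)]
Q = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
center = one_median(M, Q)  # (1.0, 0.0), total distance 2.0
```

Each query here scans every point of the collection, so the running time is proportional to the product of the collection size and the query size. Removing that dependence on the collection size is the role of the preprocessing stage.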