Relabelling Algorithms for Large Dataset Mixture Models
Mixture models are flexible tools in density estimation and classification
problems. Bayesian estimation of such models typically relies on sampling from
the posterior distribution using Markov chain Monte Carlo. Label switching
arises because the posterior is invariant to permutations of the component
parameters. Methods for dealing with label switching have been studied fairly
extensively in the literature, with the most popular approaches being those
based on loss functions. However, many of these algorithms turn out to be too
slow in practice, and can be infeasible as the size and dimension of the data
grow. In this article, we review earlier solutions which can scale up well for
large data sets, and compare their performances on simulated and real datasets.
In addition, we propose a new, and computationally efficient algorithm based on
a loss function interpretation, and show that it can scale up well in larger
problems. We conclude with some discussion and recommendations for all the
methods studied.
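As a small illustration of the label switching issue described in this abstract, the sketch below checks that a Gaussian mixture log-likelihood is unchanged when the component parameters are permuted, and then applies a simple ordering constraint on the means. This is an assumption-laden toy example: the function names are hypothetical, and sorting by mean is a basic identifiability constraint, not the loss-function-based algorithm the article proposes.

```python
import math

def mixture_loglik(data, weights, means, sds):
    """Log-likelihood of 1-D data under a Gaussian mixture."""
    total = 0.0
    for x in data:
        density = sum(
            w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
            for w, m, s in zip(weights, means, sds)
        )
        total += math.log(density)
    return total

def relabel_by_mean(weights, means, sds):
    """Relabel one posterior draw by sorting components by mean.

    Hypothetical helper: an ordering constraint, not the article's method.
    """
    order = sorted(range(len(means)), key=lambda k: means[k])
    return (
        [weights[k] for k in order],
        [means[k] for k in order],
        [sds[k] for k in order],
    )

data = [-1.0, 0.2, 3.1]
# The same mixture written with two different labellings of its components.
ll_a = mixture_loglik(data, [0.3, 0.7], [0.0, 3.0], [1.0, 0.5])
ll_b = mixture_loglik(data, [0.7, 0.3], [3.0, 0.0], [0.5, 1.0])
print(math.isclose(ll_a, ll_b))  # the likelihood cannot tell the labellings apart
```

Because both labellings give the same likelihood, a sampler can move between them freely, which is why a relabelling step is applied to the draws afterwards.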
Complexity and Algorithms for the Discrete Fréchet Distance Upper Bound with Imprecise Input
We study the problem of computing the upper bound of the discrete Fréchet
distance for imprecise input, and prove that the problem is NP-hard. This
solves an open problem posed in 2010 by Ahn et al. If shortcuts are
allowed, we show that the upper bound of the discrete Fréchet distance with
shortcuts for imprecise input can be computed in polynomial time and we present
several efficient algorithms. Comment: 15 pages, 8 figures
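For readers unfamiliar with the quantity this abstract refers to, the following is a minimal sketch of the standard dynamic program for the discrete Fréchet distance between two sequences of precise points. It does not handle imprecise input or shortcuts, which are the subject of the paper; the function name is an assumption made for illustration.

```python
from math import dist, inf

def discrete_frechet(p, q):
    """Discrete Fréchet distance between two non-empty point sequences."""
    n, m = len(p), len(q)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    # ca[i][j] holds the coupling distance of the prefixes p[:i+1] and q[:j+1].
    ca = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Advance on p, on q, or on both, whichever is cheapest so far.
                best_prev = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[-1][-1]

print(discrete_frechet([(0, 0), (1, 0), (2, 0)], [(0, 1), (1, 1), (2, 1)]))
```

The table has one cell per pair of points, so this sketch runs in time proportional to the product of the two sequence lengths.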
El Niño-related summer precipitation anomalies in Southeast Asia modulated by the Atlantic multidecadal oscillation
How the Atlantic Multidecadal Oscillation (AMO) affects El Niño-related signals in Southeast Asia is investigated in this study on a subseasonal scale. Based on observational and reanalysis data, as well as numerical model simulations, El Niño-related precipitation anomalies are analyzed for AMO positive and negative phases, which reveals a time-dependent modulation of the AMO: (i) In May–June, the AMO influences the precipitation in Southern China (SC) and the Indochina peninsula (ICP) by modulating the El Niño-related air-sea interaction over the western North Pacific (WNP). During negative AMO phases, cold sea surface temperature anomalies (SSTAs) over the WNP favor the maintaining of the WNP anomalous anticyclone (WNPAC). The associated southerly (westerly) anomalies on the northwest (southwest) flank of the WNPAC enhance (reduce) the climatological moisture transport to SC (the ICP) and result in wetter (drier) than normal conditions. In contrast, during positive AMO phases, weak SSTAs over the WNP lead to limited influence of El Niño on precipitation in Southeast Asia. (ii) In July–August, the teleconnection impact from the North Atlantic is more manifest than that in May–June. During positive AMO phases, the warmer than normal North Atlantic favors anomalous wave trains, which propagate along the "great circle route" and result in positive pressure anomalies over SC, consequently suppressing precipitation in SC and the ICP. During negative AMO phases, the anomalous wave trains tend to propagate eastward from Europe to Northeast Asia along the summer Asian jet, exerting limited influence on Southeast Asia.
Mining top-k granular association rules for recommendation
Recommender systems are important for e-commerce companies as well as
researchers. Recently, granular association rules have been proposed for
cold-start recommendation. However, existing approaches reserve only globally
strong rules; therefore some users may receive no recommendation at all. In
this paper, we propose to mine the top-k granular association rules for each
user. First we define three measures of granular association rules. These are
the source coverage which measures the user granule size, the target coverage
which measures the item granule size, and the confidence which measures the
strength of the association. With the confidence measure, rules can be ranked
according to their strength. Then we propose algorithms for training the
recommender and suggesting items to each user. Experiments are undertaken on a
publicly available data set, MovieLens. Results indicate that an appropriate
granule setting can avoid over-fitting and, at the same time, help obtain
high recommendation accuracy. Comment: 12 pages, 5 figures, submitted to Advances in Granular Computing and
Advances in Rough Sets, 2013. arXiv admin note: substantial text overlap with
arXiv:1305.137
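The per-user ranking step mentioned in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the rule representation (user granule, item granule, confidence), the granule names, and the function name are all hypothetical, and the confidence values are taken as given rather than computed with the paper's definitions.

```python
import heapq
from collections import defaultdict

def top_k_rules_per_user(rules, user_granules, k):
    """Keep, for each user, the k applicable rules with the highest confidence.

    rules: iterable of (user_granule, item_granule, confidence) tuples (assumed format).
    user_granules: dict mapping each user to the set of user granules they belong to.
    """
    by_granule = defaultdict(list)
    for granule, items, confidence in rules:
        by_granule[granule].append((confidence, items))
    result = {}
    for user, granules in user_granules.items():
        # Collect every rule whose source granule contains this user.
        candidates = [rule for g in granules for rule in by_granule.get(g, [])]
        result[user] = heapq.nlargest(k, candidates, key=lambda rule: rule[0])
    return result

# Hypothetical toy rules and granule memberships.
rules = [
    ("young", "action", 0.8),
    ("young", "drama", 0.3),
    ("student", "comedy", 0.6),
    ("retired", "drama", 0.9),
]
users = {"u1": {"young", "student"}, "u2": {"retired"}}
print(top_k_rules_per_user(rules, users, k=2))
```

Ranking within each user's own candidate set, rather than applying one global confidence threshold, is what lets a user whose rules are all weak still receive their best available rules.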