50,193 research outputs found

Relabelling Algorithms for Large Dataset Mixture Models

Author: Fan Yanan
Zhu Wanchuang
Publication date: 10/03/2014

Mixture models are flexible tools in density estimation and classification problems. Bayesian estimation of such models typically relies on sampling from the posterior distribution using Markov chain Monte Carlo. Label switching arises because the posterior is invariant to permutations of the component parameters. Methods for dealing with label switching have been studied fairly extensively in the literature, with the most popular approaches being those based on loss functions. However, many of these algorithms turn out to be too slow in practice, and can be infeasible as the size and dimension of the data grow. In this article, we review earlier solutions that scale well to large data sets, and compare their performance on simulated and real datasets. In addition, we propose a new and computationally efficient algorithm based on a loss function interpretation, and show that it scales well to larger problems. We conclude with some discussion and recommendations on all the methods studied.
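
The permutation invariance mentioned in this abstract can be checked numerically. The sketch below is not taken from the paper: it assumes a hypothetical three-component univariate Gaussian mixture with made-up data and parameters, and it only demonstrates that the likelihood is unchanged under relabelling (the posterior inherits this when the prior is also symmetric in the components). It does not implement any relabelling algorithm.

```python
import math
from itertools import permutations


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_loglik(data, weights, means, sds):
    """Log-likelihood of data under a finite Gaussian mixture."""
    total = 0.0
    for x in data:
        density = sum(w * normal_pdf(x, m, s)
                      for w, m, s in zip(weights, means, sds))
        total += math.log(density)
    return total


# Hypothetical data and parameters, chosen only for illustration.
data = [-2.1, -1.9, 0.1, 1.8, 2.2]
weights = [0.3, 0.5, 0.2]
means = [-2.0, 0.0, 2.0]
sds = [0.5, 1.0, 0.7]

base = mixture_loglik(data, weights, means, sds)

# Relabel the components in every possible order and recompute.
logliks = []
for perm in permutations(range(len(weights))):
    logliks.append(mixture_loglik(
        data,
        [weights[i] for i in perm],
        [means[i] for i in perm],
        [sds[i] for i in perm],
    ))

# All 3! = 6 labellings give the same value up to rounding error.
print(all(abs(v - base) < 1e-9 for v in logliks))
```

Because every labelling is equally supported by the data, an MCMC sampler can move between them, which is why component-wise posterior summaries need a relabelling step first.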

arXiv.org e-Print Archive

CiteSeerX

Complexity and Algorithms for the Discrete Fréchet Distance Upper Bound with Imprecise Input

Author: Fan Chenglin
Zhu Binhai
Publication date: 10/09/2015

We study the problem of computing the upper bound of the discrete Fréchet distance for imprecise input, and prove that the problem is NP-hard. This solves an open problem posed in 2010 by Ahn et al. If shortcuts are allowed, we show that the upper bound of the discrete Fréchet distance with shortcuts for imprecise input can be computed in polynomial time and we present several efficient algorithms. Comment: 15 pages, 8 figures
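
For orientation, the quantity this abstract builds on can be sketched for ordinary (precise) input. The code below is the standard dynamic program for the discrete Fréchet distance between two point sequences. It is not one of the paper's algorithms and handles neither imprecise input nor shortcuts; the function name `discrete_frechet` and the sample curves are assumptions made for illustration.

```python
import math


def discrete_frechet(p, q):
    """Discrete Fréchet distance between point sequences p and q.

    Standard O(len(p) * len(q)) dynamic program for precise input.
    """
    n, m = len(p), len(q)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # ca[i][j] = discrete Fréchet distance between p[:i+1] and q[:j+1]
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])  # Euclidean distance (Python 3.8+)
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Best of: advance on p, advance on both, advance on q.
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]


# Two parallel horizontal polylines, one unit apart (hypothetical input).
curve_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
curve_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
print(discrete_frechet(curve_a, curve_b))
```

With imprecise input, each vertex is only known to lie in a region, and the abstract's upper-bound problem asks for the largest value this distance can take over all placements; the sketch above evaluates it for a single fixed placement only.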

arXiv.org e-Print Archive

CiteSeerX

El Niño-related summer precipitation anomalies in Southeast Asia modulated by the Atlantic multidecadal oscillation

Author: Fan K.
Fan Y.
Fraedrich K.
Zhu X.
Publication venue: 'American Meteorological Society'
Publication date: 30/10/2019

How the Atlantic Multidecadal Oscillation (AMO) affects El Niño-related signals in Southeast Asia is investigated in this study on a subseasonal scale. Based on observational and reanalysis data, as well as numerical model simulations, El Niño-related precipitation anomalies are analyzed for AMO positive and negative phases, which reveals a time-dependent modulation of the AMO: (i) In May–June, the AMO influences the precipitation in Southern China (SC) and the Indochina peninsula (ICP) by modulating the El Niño-related air-sea interaction over the western North Pacific (WNP). During negative AMO phases, cold sea surface temperature anomalies (SSTAs) over the WNP favor the maintaining of the WNP anomalous anticyclone (WNPAC). The associated southerly (westerly) anomalies on the northwest (southwest) flank of the WNPAC enhance (reduce) the climatological moisture transport to SC (the ICP) and result in wetter (drier) than normal conditions. In contrast, during positive AMO phases, weak SSTAs over the WNP lead to limited influence of El Niño on precipitation in Southeast Asia. (ii) In July–August, the teleconnection impact from the North Atlantic is more manifest than that in May–June. During positive AMO phases, the warmer than normal North Atlantic favors anomalous wave trains, which propagate along the "great circle route" and result in positive pressure anomalies over SC, consequently suppressing precipitation in SC and the ICP. During negative AMO phases, the anomalous wave trains tend to propagate eastward from Europe to Northeast Asia along the summer Asian jet, exerting limited influence on Southeast Asia.

MPG.PuRe

Mining top-k granular association rules for recommendation

Author: Min Fan
Zhu William
Publication date: 21/05/2013

Recommender systems are important for e-commerce companies as well as researchers. Recently, granular association rules have been proposed for cold-start recommendation. However, existing approaches reserve only globally strong rules; therefore some users may receive no recommendation at all. In this paper, we propose to mine the top-k granular association rules for each user. First we define three measures of granular association rules. These are the source coverage, which measures the user granule size; the target coverage, which measures the item granule size; and the confidence, which measures the strength of the association. With the confidence measure, rules can be ranked according to their strength. Then we propose algorithms for training the recommender and suggesting items to each user. Experiments are undertaken on the publicly available MovieLens data set. Results indicate that an appropriate granule setting can avoid over-fitting and, at the same time, help obtain high recommendation accuracy. Comment: 12 pages, 5 figures, submitted to Advances in Granular Computing and Advances in Rough Sets, 2013. arXiv admin note: substantial text overlap with arXiv:1305.137

arXiv.org e-Print Archive

Crossref