Quasi-random numbers for copula models
The present work addresses the question of how sampling algorithms for commonly
applied copula models can be adapted to account for quasi-random numbers.
Besides sampling methods such as the conditional distribution method (based on
a one-to-one transformation), it is also shown that typically faster sampling
methods (based on stochastic representations) can be used to improve upon
classical Monte Carlo methods when pseudo-random number generators are replaced
by quasi-random number generators. This opens the door to quasi-random numbers
for models well beyond independent margins or the multivariate normal
distribution. Detailed examples (in the context of finance and insurance),
illustrations and simulations are given and software has been developed and
provided in the R packages copula and qrng.
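The abstract's authors provide their implementation in the R packages copula and qrng; as an illustration only, the basic idea of replacing pseudo-random with quasi-random uniforms in a copula sampler can be sketched in Python. This is a minimal sketch for a bivariate Gaussian copula with an assumed correlation parameter, not the paper's code:

```python
# Illustrative sketch: sampling a bivariate Gaussian copula from
# quasi-random (scrambled Sobol) numbers instead of pseudo-random ones.
# The correlation parameter rho is an assumed example value.
import numpy as np
from scipy.stats import norm, qmc

rho = 0.5
L = np.linalg.cholesky([[1.0, rho], [rho, 1.0]])

# Low-discrepancy points on (0,1)^2 via a scrambled Sobol sequence.
sobol = qmc.Sobol(d=2, scramble=True, seed=42)
u = sobol.random_base2(m=10)   # 2^10 = 1024 points

# Map uniforms to correlated normals, then back to the copula scale.
z = norm.ppf(u) @ L.T          # correlated standard normals
v = norm.cdf(z)                # samples from the Gaussian copula
print(v.shape)                 # (1024, 2)
```

Because the Sobol points fill the unit square more evenly than pseudo-random draws, estimators built from `v` typically converge faster than under classical Monte Carlo, which is the effect the abstract describes.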
Faster Random Walks By Rewiring Online Social Networks On-The-Fly
Many online social networks feature restrictive web interfaces which only
allow the query of a user's local neighborhood through the interface. To enable
analytics over such an online social network through its restrictive web
interface, many recent efforts reuse the existing Markov Chain Monte Carlo
methods such as random walks to sample the social network and support analytics
based on the samples. The problem with such an approach, however, is the large
amount of queries often required (i.e., a long "mixing time") for a random walk
to reach a desired (stationary) sampling distribution.
In this paper, we consider a novel problem of enabling a faster random walk
over online social networks by "rewiring" the social network on-the-fly.
Specifically, we develop Modified TOpology (MTO)-Sampler which, by using only
information exposed by the restrictive web interface, constructs a "virtual"
overlay topology of the social network while performing a random walk, and
ensures that the random walk follows the modified overlay topology rather than
the original one. We show that MTO-Sampler not only provably enhances the
efficiency of sampling, but also achieves significant savings on query cost
over real-world online social networks such as Google Plus, Epinions, etc.
Comment: 15 pages, 14 figures; technical report for ICDE 2013 paper. The
appendix contains proofs of all theorems.
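The restrictive-interface setting described above can be made concrete with a small sketch. The code below is not the MTO-Sampler itself but the classical baseline it improves on: a Metropolis-Hastings random walk that only ever queries a node's local neighborhood, with the graph and function names being illustrative assumptions:

```python
# Baseline (not MTO-Sampler): a Metropolis-Hastings random walk that
# queries only a node's neighbors, mimicking a restrictive web interface.
import random

graph = {                      # toy social graph standing in for the network
    'a': ['b', 'c'], 'b': ['a', 'c', 'd'],
    'c': ['a', 'b', 'd'], 'd': ['b', 'c'],
}

def neighbors(u):
    # The only query the restrictive interface allows: local neighborhood.
    return graph[u]

def mh_walk(start, steps, rng=None):
    """Random walk whose stationary distribution is uniform over nodes."""
    rng = rng or random.Random(0)
    u, samples = start, []
    for _ in range(steps):
        v = rng.choice(neighbors(u))
        # Accept with prob min(1, deg(u)/deg(v)) to undo degree bias.
        if rng.random() < min(1.0, len(neighbors(u)) / len(neighbors(v))):
            u = v
        samples.append(u)
    return samples

print(mh_walk('a', 5))
```

The long mixing time of such walks is exactly the query-cost problem the paper targets; MTO-Sampler additionally rewires a virtual overlay topology during the walk to shorten it.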
Large Scale Spectral Clustering Using Approximate Commute Time Embedding
Spectral clustering is a novel clustering method which can detect complex
shapes of data clusters. However, it requires the eigen decomposition of the
graph Laplacian matrix, whose cost is proportional to O(n^3) and thus is not
suitable for large scale systems. Recently, many methods have been proposed to
accelerate the computational time of spectral clustering. These approximate
methods usually involve sampling techniques through which much of the information in the
original data may be lost. In this work, we propose a fast and accurate
spectral clustering approach using an approximate commute time embedding, which
is similar to the spectral embedding. The method requires neither sampling
techniques nor the computation of any eigenvector. Instead it uses random
projection and a linear time solver to find the approximate embedding. The
experiments in several synthetic and real datasets show that the proposed
approach has better clustering quality and is faster than the state-of-the-art
approximate spectral clustering methods.
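The random-projection idea behind an approximate commute time embedding can be sketched as follows. This is an assumed illustration on a toy graph: a dense least-squares solve stands in for the near-linear-time Laplacian solver the abstract alludes to, and the graph, projection size, and variable names are all hypothetical:

```python
# Sketch of a random-projection commute time embedding (illustrative
# assumptions; dense lstsq replaces a fast linear-time Laplacian solver).
import numpy as np

A = np.array([[0, 1, 1, 0],            # toy 4-node graph adjacency
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], float)
D = np.diag(A.sum(1))
L = D - A                              # graph Laplacian, L = B.T @ B below

# Signed edge-incidence matrix B: one row per edge.
edges = [(i, j) for i in range(4) for j in range(i + 1, 4) if A[i, j]]
B = np.zeros((len(edges), 4))
for r, (i, j) in enumerate(edges):
    B[r, i], B[r, j] = 1.0, -1.0

rng = np.random.default_rng(0)
k = 3                                  # O(log n / eps^2) rows in theory
Q = rng.choice([-1.0, 1.0], size=(k, len(edges))) / np.sqrt(k)

# Each column of Z embeds one node; squared distances between columns
# approximate commute distances up to Johnson-Lindenstrauss error.
Z = np.linalg.lstsq(L, (Q @ B).T, rcond=None)[0].T
print(Z.shape)                         # (3, 4)
```

The key point matching the abstract: no eigenvector is ever computed, and the embedding dimension k is small and independent of the number of nodes, which is what makes the approach scale.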
Variance Reduction Techniques in Monte Carlo Methods
Monte Carlo methods are simulation algorithms that estimate a numerical quantity in a statistical model of a real system; these algorithms are executed by computer programs. Variance reduction techniques (VRT) are needed even though computer speed has been increasing dramatically ever since the introduction of computers. This increased computer power has stimulated simulation analysts to develop ever more realistic models, so the net result has not been faster execution of simulation experiments; e.g., some modern simulation models need hours or days for a single 'run' (one replication of one scenario, i.e., one combination of simulation input values). Moreover, some simulation models represent rare events (which have extremely small probabilities of occurrence), so even modern computers would take 'forever' (centuries) to execute a single run, were it not that special VRT can reduce these excessively long runtimes to practical magnitudes.
Keywords: common random numbers; antithetic random numbers; importance sampling; control variates; conditioning; stratified sampling; splitting; quasi Monte Carlo
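One of the techniques listed in the keywords, antithetic random numbers, can be shown in a minimal sketch. The integrand and sample sizes below are assumed examples, not taken from the abstract:

```python
# Minimal sketch of one VRT: antithetic random numbers for estimating
# E[f(U)] with U ~ Uniform(0,1). Example integrand: E[e^U] = e - 1.
import math
import random

def f(u):
    return math.exp(u)

def crude_mc(n, rng):
    # Plain Monte Carlo estimator with n independent draws.
    return sum(f(rng.random()) for _ in range(n)) / n

def antithetic_mc(n, rng):
    # Pair each draw u with 1 - u; the negative correlation between
    # f(u) and f(1 - u) reduces the variance of the pair average.
    total = 0.0
    for _ in range(n // 2):
        u = rng.random()
        total += (f(u) + f(1.0 - u)) / 2.0
    return total / (n // 2)

rng = random.Random(1)
print(abs(antithetic_mc(10_000, rng) - (math.e - 1)))  # estimation error
```

With the same total number of function evaluations, the antithetic estimator typically has a markedly smaller error than the crude one here, because exp is monotone and f(u), f(1-u) are negatively correlated.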