
    Regenerative Simulation for Queueing Networks with Exponential or Heavier Tail Arrival Distributions

    Multiclass open queueing networks find wide applications in communication, computer and fabrication networks. Often one is interested in steady-state performance measures associated with these networks. Conceptually, under mild conditions, a regenerative structure exists in multiclass networks, making them amenable to regenerative simulation for estimating the steady-state performance measures. However, identification of a regenerative structure in these networks is typically difficult. A well-known exception is when all the interarrival times are exponentially distributed, where the instants corresponding to customer arrivals to an empty network constitute a regenerative structure. In this paper, we consider networks in which the interarrival times are generally distributed but have exponential or heavier tails. We show that these distributions can be decomposed into a mixture of sums of independent random variables such that at least one of the components is exponentially distributed. This allows an easily implementable embedded regenerative structure in the Markov process. We show that under mild conditions on the network primitives, the regenerative mean and standard deviation estimators are consistent and satisfy a joint central limit theorem useful for constructing asymptotically valid confidence intervals. We also show that, amongst all such interarrival time decompositions, the one with the largest mean exponential component minimizes the asymptotic variance of the standard deviation estimator.
    Comment: A preliminary version of this paper will appear in Proceedings of the Winter Simulation Conference, Washington, DC, 201
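
    As a concrete illustration of the regenerative structure in the exponential-arrival special case described above, here is a minimal Python sketch of a regenerative ratio estimator for the steady-state mean number-in-system of an M/M/1 queue, with cycles beginning whenever an arrival finds the system empty. The parameter values and the delta-method confidence interval are standard textbook choices, not taken from the paper, and the sketch does not implement the paper's interarrival-time decomposition.

```python
import numpy as np

def mm1_cycle(lam, mu, rng):
    """Simulate one regenerative cycle of an M/M/1 queue. A cycle starts
    when an arrival finds the system empty and ends just before the next
    such arrival. Returns (area under queue-length path, cycle length)."""
    t, n, area = 0.0, 1, 0.0
    t_arr = rng.exponential(1 / lam)   # next arrival epoch
    t_dep = rng.exponential(1 / mu)    # next departure epoch
    while n > 0:
        t_next = min(t_arr, t_dep)
        area += n * (t_next - t)
        t = t_next
        if t_arr <= t_dep:             # arrival
            n += 1
            t_arr = t + rng.exponential(1 / lam)
        else:                          # departure
            n -= 1
            t_dep = t + rng.exponential(1 / mu) if n > 0 else np.inf
    # the idle period until the next arrival belongs to the same cycle;
    # by memorylessness it is again Exponential(lam)
    return area, t + rng.exponential(1 / lam)

rng = np.random.default_rng(0)
lam, mu, n_cycles = 0.8, 1.0, 20_000
A, T = np.array([mm1_cycle(lam, mu, rng) for _ in range(n_cycles)]).T

r = A.sum() / T.sum()              # regenerative ratio estimator
Z = A - r * T                      # delta-method linearization
se = np.sqrt(np.mean(Z**2) / n_cycles) / T.mean()
print(f"mean number in system: {r:.3f} +/- {1.96 * se:.3f} "
      f"(theory: {lam / (mu - lam):.3f})")
```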

    Column Subset Selection and Nyström Approximation via Continuous Optimization

    We propose a continuous optimization algorithm for the Column Subset Selection Problem (CSSP) and Nyström approximation. The CSSP and Nyström method construct low-rank approximations of matrices based on a predetermined subset of columns. It is well known that choosing the best column subset of size k is a difficult combinatorial problem. In this work, we show how one can approximate the optimal solution by defining a penalized continuous loss function which is minimized via stochastic gradient descent. We show that the gradients of this loss function can be estimated efficiently using matrix-vector products with a data matrix X in the case of the CSSP or a kernel matrix K in the case of the Nyström approximation. We provide numerical results for a number of real datasets showing that this continuous optimization is competitive against existing methods.
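
    To give a flavour of how column selection can be relaxed into a continuous problem, the PyTorch sketch below gates each column with a sigmoid-parametrized score, measures reconstruction error through a ridge fit on the gated columns, and penalizes the deviation of the total score from the target size k. The loss, the penalty weight lam, and the use of exact autograd on a dense solve are illustrative assumptions; the paper's loss function and its matrix-vector-product gradient estimators differ.

```python
import torch

def cssp_relaxation(X, k, lam=10.0, ridge=1e-3, steps=500, lr=0.1):
    """Hypothetical continuous relaxation of the CSSP (illustration only).
    s = sigmoid(w) in (0,1)^p softly gates the columns; the loss is the
    ridge reconstruction error of X from the gated columns plus a penalty
    driving sum(s) toward the target subset size k."""
    n, p = X.shape
    w = torch.zeros(p, requires_grad=True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        s = torch.sigmoid(w)
        Xs = X * s                                  # softly gated columns
        G = Xs.T @ Xs + ridge * torch.eye(p)
        B = torch.linalg.solve(G, Xs.T @ X)         # ridge fit: X ~ Xs @ B
        loss = ((X - Xs @ B) ** 2).sum() + lam * (s.sum() - k) ** 2
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.topk(torch.sigmoid(w).detach(), k).indices  # hard subset

torch.manual_seed(0)
X = torch.randn(100, 30)
print(cssp_relaxation(X, k=5))
```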

    Generalized Linear Models via the Lasso: To Scale or Not to Scale?

    The Lasso regression is a popular regularization method for feature selection in statistics. Prior to computing the Lasso estimator in both linear and generalized linear models, it is common to rescale the feature matrix so that all the features are standardized. Without this standardization, the argument goes, the Lasso estimate will depend on the units used to measure the features. We propose a new type of iterative rescaling of the features in the context of generalized linear models. Whilst existing Lasso algorithms perform a single scaling as a preprocessing step, the proposed rescaling is applied iteratively throughout the Lasso computation until convergence. We provide numerical examples, with both real and simulated data, illustrating that the proposed iterative rescaling can significantly improve the statistical performance of the Lasso estimator without incurring any significant additional computational cost.
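
    The abstract does not spell out the rescaling rule, so the following sketch is only a guess at the general shape of such a scheme, here for an l1-penalized logistic regression with scikit-learn: standardize the columns under the current IRLS working weights, refit, and repeat until the scales stabilize. The specific weight update is an assumption for illustration, not the paper's algorithm.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def iteratively_rescaled_l1_logistic(X, y, C=1.0, max_iter=20, tol=1e-6):
    """Illustrative guess at an iterative rescaling scheme (not the paper's
    rule): rescale each column by its root mean square under the current
    IRLS weights w_i = p_i * (1 - p_i), refit the l1-penalized model, and
    repeat until the scales stop changing."""
    scale = X.std(axis=0) + 1e-12          # start from plain standardization
    for _ in range(max_iter):
        clf = LogisticRegression(penalty="l1", C=C, solver="liblinear")
        clf.fit(X / scale, y)
        p = clf.predict_proba(X / scale)[:, 1]
        w = p * (1 - p)                    # IRLS weights at the current fit
        new_scale = np.sqrt((w[:, None] * X**2).mean(axis=0)) + 1e-12
        if np.max(np.abs(new_scale - scale)) < tol:
            break
        scale = new_scale
    return clf, scale

rng = np.random.default_rng(4)
X = rng.standard_normal((200, 10))
y = (X[:, 0] - X[:, 3] + 0.5 * rng.standard_normal(200) > 0).astype(int)
clf, scale = iteratively_rescaled_l1_logistic(X, y)
print(np.round(clf.coef_.ravel(), 2))
```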

    Variance Reduction for Matrix Computations with Applications to Gaussian Processes

    In addition to recent developments in computing speed and memory, methodological advances have contributed to significant gains in the performance of stochastic simulation. In this paper, we focus on variance reduction for matrix computations via matrix factorization. We provide insights into existing variance reduction methods for estimating the entries of large matrices. Popular methods do not exploit the reduction in variance that is possible when the matrix is factorized. We show how computing the square-root factorization of the matrix can, in some important cases, achieve arbitrarily better stochastic performance. In addition, we propose a factorized estimator for the trace of a product of matrices and numerically demonstrate that the estimator can be up to 1,000 times more efficient on certain problems of estimating the log-likelihood of a Gaussian process. Additionally, we provide a new estimator of the log-determinant of a positive semi-definite matrix where the log-determinant is treated as a normalizing constant of a probability density.
    Comment: 20 pages, 3 figures
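
    The benefit of factorization is easy to demonstrate on the trace of a product of two symmetric positive semi-definite matrices. The sketch below compares Hutchinson sampling of the product AB directly against sampling the psd matrix C^T B C, where A = C C^T is a Cholesky factor; both are unbiased for tr(AB), and in this psd setting one can check that tr((AB)^2) <= tr(A^2 B^2), so the factorized variance is never larger. The test matrices and sample sizes are arbitrary illustrative choices, not the paper's experiments.

```python
import numpy as np

def hutchinson(mv, p, n_samples, rng):
    """One-sample Hutchinson estimates of tr(M): z^T M z for N(0, I) z.
    Returning all samples lets us compare the spread of the estimators."""
    zs = rng.standard_normal((n_samples, p))
    return np.array([z @ mv(z) for z in zs])

rng = np.random.default_rng(1)
p = 50
Qa, Qb = rng.standard_normal((2, p, p))
A, B = Qa @ Qa.T, Qb @ Qb.T          # symmetric psd test matrices
C = np.linalg.cholesky(A)            # square-root factor: A = C C^T

naive = hutchinson(lambda z: A @ (B @ z), p, 10_000, rng)         # z^T A B z
fact = hutchinson(lambda z: C.T @ (B @ (C @ z)), p, 10_000, rng)  # z^T C^T B C z

print(f"exact tr(AB) = {np.trace(A @ B):.1f}")
print(f"naive:      mean {naive.mean():.1f}, std {naive.std():.1f}")
print(f"factorized: mean {fact.mean():.1f}, std {fact.std():.1f}")
```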

    COMBSS: Best Subset Selection via Continuous Optimization

    The problem of best subset selection in linear regression is considered with the aim of finding a fixed-size subset of features that best fits the response. This is particularly challenging when the total number of available features is very large compared to the number of data samples. Existing optimal methods for solving this problem tend to be slow, while fast methods tend to have low accuracy. Ideally, a new method would perform best subset selection faster than existing optimal methods with comparable accuracy, or be more accurate than methods of comparable computational speed. Here, we propose a novel continuous optimization method that identifies a subset solution path: a small set of models of varying size, consisting of candidates for the single best subset of features, that is optimal in a specific sense in linear regression. Our method turns out to be fast, making best subset selection possible when the number of features is well in excess of thousands. Because of its outstanding overall performance, framing the best subset selection challenge as a continuous optimization problem opens new research directions for feature extraction in a large variety of regression models.
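
    For context, the combinatorial baseline that makes the problem hard is easy to state: an exhaustive search over all size-k subsets is exact, but its cost grows like binom(p, k) least-squares fits, which is precisely what continuous formulations of this kind aim to sidestep. The brute-force sketch below is that generic baseline, not the proposed method.

```python
import numpy as np
from itertools import combinations

def best_subset_exhaustive(X, y, k):
    """Exact best subset of size k by brute force: one least-squares fit
    per subset, so the cost grows like comb(p, k) and is feasible only
    for small p."""
    best_rss, best_S = np.inf, None
    for S in combinations(range(X.shape[1]), k):
        Xs = X[:, S]
        beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        rss = np.sum((y - Xs @ beta) ** 2)
        if rss < best_rss:
            best_rss, best_S = rss, S
    return best_S, best_rss

rng = np.random.default_rng(2)
X = rng.standard_normal((60, 12))
beta_true = np.zeros(12)
beta_true[[1, 4, 7]] = 2.0
y = X @ beta_true + 0.1 * rng.standard_normal(60)
print(best_subset_exhaustive(X, y, k=3))   # should recover columns (1, 4, 7)
```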

    Rare Events in Random Geometric Graphs

    This work introduces and compares approaches for estimating rare-event probabilities related to the number of edges in the random geometric graph on a Poisson point process. In the one-dimensional setting, we derive closed-form expressions for a variety of conditional probabilities related to the number of edges in the random geometric graph and develop conditional Monte Carlo algorithms for estimating rare-event probabilities on this basis. We rigorously prove a reduction in variance compared to the crude Monte Carlo estimators and illustrate the magnitude of the improvements in a simulation study. In higher dimensions, we use conditional Monte Carlo to remove the fluctuations in the estimator coming from the randomness in the Poisson number of nodes. Finally, building on conceptual insights from large-deviations theory, we illustrate that importance sampling using a Gibbsian point process can further substantially reduce the estimation variance.
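
    To fix ideas, below is a crude Monte Carlo baseline for the kind of rare-event probability in question: sample a Poisson number of uniform points in [0,1]^d, count pairs within the connection radius r, and average the indicator. Its relative error grows like 1/sqrt(n p) as the probability p shrinks, which is what motivates the conditional Monte Carlo and importance-sampling estimators described above. The parameter values are arbitrary, and the sketch implements only the baseline those methods improve on.

```python
import numpy as np
from scipy.spatial.distance import pdist

def edge_count(points, r):
    """Edges of the geometric graph: unordered pairs within distance r."""
    return int(np.count_nonzero(pdist(points) <= r)) if len(points) > 1 else 0

def crude_mc(lam, r, m, d=2, n_rep=100_000, seed=3):
    """Crude Monte Carlo estimate of p = P(#edges >= m) for the geometric
    graph on a Poisson(lam) number of uniform points in [0, 1]^d."""
    rng = np.random.default_rng(seed)
    hits = sum(
        edge_count(rng.random((rng.poisson(lam), d)), r) >= m
        for _ in range(n_rep)
    )
    p_hat = hits / n_rep
    rel_err = np.sqrt((1 - p_hat) / max(hits, 1))  # ~ 1/sqrt(n p); blows up as p -> 0
    return p_hat, rel_err

print(crude_mc(lam=30, r=0.05, m=15))
```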