1,920 research outputs found

Catching the head, tail, and everything in between: a streaming algorithm for the degree distribution

Author: McGregor Andrew
Seshadhri C.
Simpson Olivia
Publication date: 25/11/2015

The degree distribution is one of the most fundamental graph properties of interest for real-world graphs. It has been widely observed in numerous domains that graphs typically have a tailed or scale-free degree distribution. While the average degree is usually quite small, the variance is quite high and there are vertices with degrees at all scales. We focus on the problem of approximating the degree distribution of a large streaming graph, with small storage. We design an algorithm headtail, whose main novelty is a new estimator of infrequent degrees using truncated geometric random variables. We give a mathematical analysis of headtail and show that it has excellent behavior in practice. We can process streams with millions of edges with storage less than 1% and get extremely accurate approximations for all scales in the degree distribution. We also introduce a new notion of Relative Hausdorff distance between tailed histograms. Existing notions of distances between distributions are not suitable, since they ignore infrequent degrees in the tail. The Relative Hausdorff distance measures deviations at all scales, and is a more suitable distance for comparing degree distributions. By tracking this new measure, we are able to give strong empirical evidence of the convergence of headtail.
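For orientation, the quantity being approximated can be computed exactly when storage is not a concern. The sketch below is not the headtail algorithm; it is a plain exact baseline that keeps one counter per vertex, which is precisely the storage cost a streaming method tries to avoid. The function names and the toy edge list are illustrative assumptions, and the stream is assumed to be undirected with no repeated edges.

```python
from collections import Counter

def degree_histogram(edges):
    """Exact degree histogram {degree: number of vertices} of an undirected edge stream."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())

def tail_counts(histogram):
    """Cumulative tail {d: number of vertices with degree >= d}."""
    result = {}
    running = 0
    for d in sorted(histogram, reverse=True):
        running += histogram[d]
        result[d] = running
    return result

# Toy stream (made up for illustration): a has degree 3, b and c have 2, d has 1.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
hist = degree_histogram(edges)
tail = tail_counts(hist)
print(dict(hist))  # {3: 1, 2: 2, 1: 1}
print(tail)        # {3: 1, 2: 3, 1: 4}
```

The cumulative tail view makes rare high-degree vertices visible, which is the part of the distribution the abstract says existing distance measures tend to ignore.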

arXiv.org e-Print Archive

Crossref

Comparison of ruin probability approximations in case of real data

Author: Sergidou E.K.
Publication date: 01/01/2015

Repository TU/e

Pure OAI Repository

On finite-time ruin probabilities with reinsurance cycles influenced by large claims

Author: Mathieu Bargès
Stéphane Loisel
Xavier Venel

Market cycles play a great role in reinsurance. Cycle transitions are not independent from the claim arrival process: a large claim or a high number of claims may accelerate cycle transitions. To take this into account, a semi-Markovian risk model is proposed and analyzed. A refined Erlangization method is developed to compute the finite-time ruin probability of a reinsurance company. As this model needs the claim amounts to be Phase-type distributed, we explain how to fit mixtures of Erlang distributions to long-tailed distributions. Numerical applications and comparisons to results obtained from simulation methods are given. The impact of dependency between claim amounts and phase changes is studied.
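To make the "mixtures of Erlang distributions" concrete, here is a minimal sketch that evaluates the distribution function of such a mixture. It assumes all components share one rate parameter, a common convention in Erlang-mixture fitting that the abstract does not confirm; the fitting procedure itself is not shown, and the weights and shapes below are made-up values.

```python
import math

def erlang_cdf(x, shape, rate):
    """CDF of an Erlang(shape, rate) variable: 1 - exp(-rate*x) * sum_{n<shape} (rate*x)^n / n!."""
    if x <= 0:
        return 0.0
    lx = rate * x
    term = 1.0   # (rate*x)^0 / 0!
    total = 1.0
    for n in range(1, shape):
        term *= lx / n
        total += term
    return 1.0 - math.exp(-lx) * total

def erlang_mixture_cdf(x, weights, shapes, rate):
    """CDF of a mixture of Erlang components with a shared rate (assumption)."""
    return sum(w * erlang_cdf(x, k, rate) for w, k in zip(weights, shapes))

# Hypothetical mixture: 70% Erlang(1), 30% Erlang(5), shared rate 0.5.
weights, shapes, rate = [0.7, 0.3], [1, 5], 0.5
survival_at_10 = 1.0 - erlang_mixture_cdf(10.0, weights, shapes, rate)
print(survival_at_10)
```

Mixing a small shape with larger ones lets the mixture place extra mass far from the origin while remaining Phase-type, which is the property the risk model requires.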

Research Papers in Economics

Markov Chain Modeling for Multi-Server Clusters

Author: Hua Zhili
Publication venue: W&M ScholarWorks
Publication date: 01/01/2005

College of William & Mary: W&M Publish

Bayesian prediction of the transient behaviour and busy period in short and long-tailed GI/G/1 queueing systems

Author: Ausín M. Concepción
Lillo Rosa E.
Wiper Michael P.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007

Bayesian inference for the transient behavior and duration of a busy period in a single server queueing system with general, unknown distributions for the interarrival and service times is investigated. Both the interarrival and service time distributions are approximated using the dense family of Coxian distributions. A suitable reparameterization allows the definition of a non-informative prior and Bayesian inference is then undertaken using reversible jump Markov chain Monte Carlo methods. An advantage of the proposed procedure is that heavy tailed interarrival and service time distributions such as the Pareto can be well approximated. The proposed procedure for estimating the system measures is based on recent theoretical results for the Coxian/Coxian/1 system. A numerical technique is developed for every MCMC iteration so that the transient queue length and waiting time distributions and the duration of a busy period can be estimated. The approach is illustrated with both simulated and real data.
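As a reminder of what a Coxian distribution is, the sketch below samples from one and computes its mean. It uses the standard textbook parameterization (a chain of exponential phases, where after each phase the process either continues to the next phase or stops); the reparameterization and the MCMC scheme from the abstract are not reproduced. Parameter names and values are illustrative assumptions.

```python
import random

def coxian_mean(rates, continue_probs):
    """Mean of a Coxian distribution.

    rates[i] is the exponential rate of phase i; continue_probs[i] is the
    probability of moving from phase i to phase i+1 (one entry fewer than rates).
    """
    mean = 0.0
    reach = 1.0  # probability that phase i is reached at all
    for i, rate in enumerate(rates):
        mean += reach / rate
        if i < len(continue_probs):
            reach *= continue_probs[i]
    return mean

def sample_coxian(rates, continue_probs, rng):
    """Draw one value: sum exponential phase durations until the chain stops."""
    total = 0.0
    for i, rate in enumerate(rates):
        total += rng.expovariate(rate)
        if i == len(rates) - 1 or rng.random() >= continue_probs[i]:
            break
    return total

# Hypothetical two-phase Coxian: rate 1.0, then rate 2.0 with probability 0.5.
rng = random.Random(0)
draws = [sample_coxian([1.0, 2.0], [0.5], rng) for _ in range(20000)]
print(coxian_mean([1.0, 2.0], [0.5]), sum(draws) / len(draws))
```

Adding more phases with small rates and small continuation probabilities is how the family approximates heavy tails such as the Pareto.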

Repositorio da Universidade da Coruña

A note on marginal posterior simulation via higher-order tail area approximations

Author: Ruli Erlis
Sartori Nicola
Ventura Laura
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2012

We explore the use of higher-order tail area approximations for Bayesian simulation. These approximations give rise to an alternative simulation scheme to MCMC for Bayesian computation of marginal posterior distributions for a scalar parameter of interest, in the presence of nuisance parameters. Its advantage over MCMC methods is that samples are drawn independently with lower computational time and the implementation requires only standard maximum likelihood routines. The method is illustrated by a genetic linkage model, a normal regression with censored data and a logistic regression model.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Padova

Statistical distributions for service times

Author: Adedigba Adebolanle Iyabo
Publication venue: 'University of Saskatchewan Library'

Queueing models have been used extensively in the design of call centres. In particular, a queueing model will be used to describe a help desk which is a form of a call centre. The design of the queueing model involves modelling the arrival and service processes of the system. Conventionally, the arrival process is assumed to be Poisson and service times are assumed to be exponentially distributed. But it has been proposed that practically these are seldom the case. Past research reveals that the log-normal distribution can be used to model the service times in call centres. Also, services may involve stages/tasks before completion. This motivates the use of a phase-type distribution to model the underlying stages of service. This research work focuses on developing statistical models for the overall service times and the service times by job types in a particular help desk. The assumption of exponential service times was investigated and a log-normal distribution was fitted to service times of this help desk. Each stage of the service in this help desk was modelled as a phase in the phase-type distribution. Results from the analysis carried out in this work confirmed the irrelevance of the assumption of exponential service times to this help desk and it was apparent that log-normal distributions provided a reasonable fit to the service times. A phase-type distribution with three phases fitted the overall service times and the service times of administrative and miscellaneous jobs very well. For the service times of e-mail and network jobs, a phase-type distribution with two phases served as a good model. Finally, log-normal models of service times in this help desk were approximated using an order three phase-type distribution.
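Fitting a log-normal distribution to service times can be sketched with the closed-form maximum likelihood estimates: the mean and (population) standard deviation of the logged observations. This is a generic illustration, not the thesis's procedure, and the durations below are synthetic values chosen so the result is easy to check by hand.

```python
import math
import statistics

def fit_lognormal(samples):
    """Maximum likelihood estimates (mu, sigma) of a log-normal distribution."""
    logs = [math.log(x) for x in samples]
    mu = statistics.fmean(logs)
    sigma = statistics.pstdev(logs)  # the MLE divides by n, not n - 1
    return mu, sigma

def lognormal_mean(mu, sigma):
    """Mean of a log-normal distribution with parameters mu and sigma."""
    return math.exp(mu + sigma ** 2 / 2)

# Synthetic service times whose logarithms are exactly 1, 2 and 3.
service_times = [math.exp(1.0), math.exp(2.0), math.exp(3.0)]
mu, sigma = fit_lognormal(service_times)
print(mu, sigma, lognormal_mean(mu, sigma))
```

One quick check of the exponential assumption is the coefficient of variation of the raw data: an exponential distribution has a coefficient of exactly 1, so values far from 1 argue against it.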

eCommons@USASK

University of Saskatchewan Research Archive

Aggregate matrix-analytic techniques and their applications

Author: Riska Alma
Publication venue: W&M ScholarWorks
Publication date: 01/01/2002

The complexity of computer systems affects the complexity of modeling techniques that can be used for their performance analysis. In this dissertation, we develop a set of techniques that are based on tractable analytic models and enable efficient performance analysis of computer systems. Our approach is three pronged: first, we propose new techniques to parameterize measurement data with Markovian-based stochastic processes that can be further used as input into queueing systems; second, we propose new methods to efficiently solve complex queueing models; and third, we use the proposed methods to evaluate the performance of clustered Web servers and propose new load balancing policies based on this analysis. We devise two new techniques for fitting measurement data that exhibit high variability into Phase-type (PH) distributions. These techniques apply known fitting algorithms in a divide-and-conquer fashion. We evaluate the accuracy of our methods from both the statistics and the queueing systems perspective. In addition, we propose a new methodology for fitting measurement data that exhibit long-range dependence into Markovian Arrival Processes (MAPs). We propose a new methodology, ETAQA, for the exact solution of M/G/1-type processes, GI/M/1-type processes, and their intersection, i.e., quasi birth-death (QBD) processes. ETAQA computes an aggregate steady state probability distribution and a set of measures of interest. ETAQA is numerically stable and computationally superior to alternative solution methods. Apart from ETAQA, we propose a new methodology for the exact solution of a class of GI/G/1-type processes based on aggregation/decomposition. Finally, we demonstrate the applicability of the proposed techniques by evaluating load balancing policies in clustered Web servers. We address the high variability in the service process of Web servers by dedicating the servers of a cluster to requests of similar sizes and propose new, content-aware load balancing policies. Detailed analysis shows that the proposed policies achieve high user-perceived performance and, by continuously adapting their scheduling parameters to the current workload characteristics, provide good performance under conditions of transient overload.
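The idea of dedicating servers to requests of similar sizes can be sketched as a size-interval dispatcher. The code below is one plausible reading, not the dissertation's policy: boundaries are chosen from a window of recent request sizes so that each server receives roughly the same total amount of work, and recomputing them periodically stands in for the adaptation the abstract mentions. Function names and the sample sizes are hypothetical.

```python
from bisect import bisect_left

def equal_load_boundaries(recent_sizes, n_servers):
    """Upper size limits for every server but the last, splitting total work evenly."""
    ordered = sorted(recent_sizes)
    total = sum(ordered)
    boundaries = []
    running = 0
    for size in ordered:
        running += size
        # Close the current size range once it holds its share of the total work.
        share = total * (len(boundaries) + 1) / n_servers
        if len(boundaries) < n_servers - 1 and running >= share:
            boundaries.append(size)
    return boundaries

def pick_server(request_size, boundaries):
    """Index of the server whose size range contains request_size."""
    return bisect_left(boundaries, request_size)

# Made-up window of recent request sizes; total work is 16 units.
recent = [1, 1, 1, 1, 4, 8]
limits = equal_load_boundaries(recent, n_servers=2)
print(limits)                                      # [4]
print([pick_server(s, limits) for s in recent])    # [0, 0, 0, 0, 0, 1]
```

In this toy window, server 0 handles the five small requests and server 1 handles the single large one, so each carries 8 units of work even though the request counts differ widely.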

College of William & Mary: W&M Publish