35 research outputs found
On identifiability of MAP processes
Two types of transitions can be found in the Markovian Arrival process or MAP: with
and without arrivals. In transient transitions the chain jumps from one state to another
with no arrival; in effective transitions, a single arrival occurs. We assume that in
practice, only arrival times are observed in a MAP. This leads us to define and study the
Effective Markovian Arrival process or E-MAP. In this work we define identifiability of
MAPs in terms of equivalence between the corresponding E-MAPs and study conditions
under which two sets of parameters induce identical laws for the observable process, in
the case of two- and three-state MAPs. We illustrate and discuss our results with examples.
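The two kinds of transitions described in this abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes the common (D0, D1) rate-matrix parameterization of a MAP, in which the off-diagonal entries of D0 are rates of transitions without arrivals and the entries of D1 are rates of transitions with a single arrival. The function name simulate_map and the numerical rates are hypothetical, chosen only for illustration. The function returns the arrival times alone, which is the information assumed to be observable.

```python
import random


def simulate_map(D0, D1, n_arrivals, state=0, seed=None):
    """Return the first n_arrivals arrival times of a MAP given (D0, D1).

    Assumed parameterization (not from the source): each row of D0 + D1
    sums to zero and every diagonal entry of D0 is strictly negative.
    """
    rng = random.Random(seed)
    m = len(D0)
    t = 0.0
    arrivals = []
    while len(arrivals) < n_arrivals:
        rate = -D0[state][state]        # total exit rate of the current state
        t += rng.expovariate(rate)      # exponential holding time
        # Candidate events: jump to j without an arrival (D0, j != state)
        # or jump to j with one arrival (D1, any j, including j == state).
        events = [(j, False) for j in range(m) if j != state]
        events += [(j, True) for j in range(m)]
        weights = [D0[state][j] for j in range(m) if j != state]
        weights += [D1[state][j] for j in range(m)]
        state, is_arrival = rng.choices(events, weights=weights, k=1)[0]
        if is_arrival:
            arrivals.append(t)          # only arrival epochs are recorded
    return arrivals


# Hypothetical two-state example; each row of D0 + D1 sums to zero.
D0 = [[-3.0, 1.0], [0.5, -2.0]]
D1 = [[1.5, 0.5], [0.5, 1.0]]
times = simulate_map(D0, D1, n_arrivals=5, seed=1)
print(times)
```

Because the hidden state path is discarded, different (D0, D1) pairs can in principle produce arrival-time sequences with the same law, which is the identifiability question the abstract raises.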
Call center data modeling: a queueing science approach based on Markovian arrival processes
In this paper we analyze the well-known “Anonymous bank” call center dataset from a queueing science viewpoint. For this purpose, fitted distributions for the inter-arrival times, the service times, and customer patience are integrated in a simulator to infer quantities of interest related to call center managerial decisions, such as waiting times, abandonment rates and queue lengths. In particular, it is shown how a type of Markov renewal process, the Markovian arrival process (MAP), is able to capture some of the characteristic properties of arrivals in a modern call center, such as overdispersion and positive correlation between arrival counts. The work provides a new inference approach for the MAP based on the count process descriptors and presents new properties concerning the dependence structure of the cumulative number of arrivals in a MAP.
Analysis of an aggregate loss model in a Markov renewal regime
In this article we consider an aggregate loss model with dependent losses. The loss occurrence process is governed by a two-state Markovian arrival process (MAP2), a Markov renewal process that allows for (1) correlated inter-loss times, (2) non-exponentially distributed inter-loss times, and (3) overdispersed loss counts. Some quantities of interest to measure persistence in the loss occurrence process are obtained. Given a real operational risk database, the aggregate loss model is estimated by fitting the inter-loss times and severities separately. The MAP2 is estimated via direct maximization of the likelihood function, and severities are modeled by the heavy-tailed double Pareto lognormal distribution. In comparison with the fit provided by the Poisson process, the results show that taking into account the dependence and overdispersion in the inter-loss time distribution leads to higher capital charges.
Empirical Bayes fairness in linear regression
Bias in data may lead to prediction procedures that discriminate against individuals from sensitive groups. In this paper we propose a Bayesian method for parameter estimation in the linear regression model that takes fairness into account. For a given prior structure, namely the Normal-Gamma prior, and a specific choice of unfairness measure (which does not require the use of private, sensitive information), our method computes a constrained empirical Bayes estimator that induces fairness in posterior predictions. Specifically, the model hyper-parameters are calculated as the optimal solution of a constrained optimization problem in which the marginal likelihood is maximized over a region where the unfairness measure is bounded above and thus controlled. This guarantees that average posterior predictions are, at most, as unfair as the upper bound allows. Experiments with synthetic and real data sets show the competitiveness of our approach in terms of both fairness of the obtained solution and prediction error.
Inference for double Pareto lognormal queues with applications
In this article we describe a method for carrying out Bayesian inference for the double
Pareto lognormal (dPlN) distribution which has recently been proposed as a model for
heavy-tailed phenomena. We apply our approach to inference for the dPlN/M/1 and
M/dPlN/1 queueing systems. These systems cannot be analyzed using standard
techniques because the dPlN distribution does not possess a Laplace transform
in closed form. This difficulty is overcome using some recent approximations for the
Laplace transform for the Pareto/M/1 system. Our procedure is illustrated with
applications in internet traffic analysis and risk theory.
Fitting procedure for the two-state Batch Markov modulated Poisson process
The Batch Markov Modulated Poisson Process (BMMPP) is a subclass of the versatile Batch Markovian Arrival process (BMAP) which has been proposed for the modeling of dependent events occurring in batches (such as group arrivals, failures or risk events). This paper focuses on exploring the possibilities of the BMMPP for the modeling of real phenomena involving point processes with group arrivals. The first result in this sense is the characterization of the two-state BMMPP with maximum batch size equal to K, the BMMPP2(K), by a set of moments related to the inter-event time and batch size distributions. This characterization leads to a sequential fitting approach via a moment matching method. The performance of the novel fitting approach is illustrated on both simulated data and a real teletraffic data set, and compared to that of the EM algorithm. In addition, as an extension of the inference approach, the queue length distribution at departures in the BMMPP/M/1 queueing system is also estimated.
Bayesian analysis of the stationary MAP2
In this article we describe a method for carrying out Bayesian estimation for the two-state stationary Markov arrival process (MAP(2)), which has been proposed as a versatile model in a number of contexts. The approach is illustrated on both simulated and real data sets, where the performance of the MAP(2) is compared against that of the well-known MMPP2. As an extension of the method, we estimate the queue length and virtual waiting time distributions of a stationary MAP(2)/G/1 queueing system, a matrix generalization of the M/G/1 queue that allows for dependent inter-arrival times. Our procedure is illustrated with applications in Internet traffic analysis. Research partially supported by research grants and projects MTM2015-65915-R, ECO2015-66593-P (Ministerio de Economía y Competitividad, Spain) and P11-FQM-7603, FQM-329 (Junta de Andalucía, Spain). The authors thank both the Associate Editor and referee for their constructive comments from which the paper greatly benefited.
Cost-sensitive probabilistic predictions for support vector machines
Support vector machines (SVMs) are widely used and constitute one of the most thoroughly examined machine learning models for two-class classification. Classification in SVM is based on a score procedure, yielding a deterministic classification rule, which can be transformed into a probabilistic rule (as implemented in off-the-shelf SVM libraries), but is not probabilistic in nature. On the other hand, the tuning of the regularization parameters in SVM is known to require a high computational effort and generates pieces of information that are not fully exploited, since they are not used to build a probabilistic classification rule. In this paper we propose a novel approach to generate probabilistic outputs for the SVM. The new method has the following three properties. First, it is designed to be cost-sensitive, and thus the different importance of sensitivity (or true positive rate, TPR) and specificity (true negative rate, TNR) is readily accommodated in the model. As a result, the model can deal with imbalanced datasets, which are common in operational business problems such as churn prediction or credit scoring. Second, the SVM is embedded in an ensemble method to improve its performance, making use of the valuable information generated in the parameter tuning process. Finally, the probability estimation is done via bootstrap estimates, avoiding the use of parametric models as competing approaches do. Numerical tests on a wide range of datasets show the advantages of our approach over benchmark procedures.
Non-identifiability of the two state Markovian Arrival process
In this paper we consider the problem of identifiability of the two-state Markovian
Arrival process (MAP2). In particular, we show that the MAP2 is not identifiable and
conditions are given under which two different sets of parameters induce identical
stationary laws for the observable process.
Robust newsvendor problem with autoregressive demand
This paper explores the classic single-item newsvendor problem under a novel setting which combines temporal dependence and tractable robust optimization. First, the demand is modeled as a time series which follows an autoregressive process AR(p), p ≥ 1. Second, a robust approach to maximize the worst-case revenue is proposed: a robust distribution-free autoregressive forecasting method, which copes with non-stationary time series, is formulated. A closed-form expression for the optimal solution is found for the problem for p = 1; for the remaining values of p, the problem is expressed as a nonlinear convex optimization program, to be solved numerically. The optimal solution under the robust method is compared with those obtained under two versions of the classic approach, in which either the demand distribution is unknown, and assumed to have no autocorrelation, or it is assumed to follow an AR(p) process with normal error terms. Numerical experiments show that our proposal usually outperforms the previous benchmarks, not only with regard to robustness, but also in terms of the average revenue. Ministerio de Economía y Competitividad; Junta de Andalucía