A bivariate count model with discrete Weibull margins
Multivariate discrete data arise in many fields (statistical quality control, epidemiology, failure and reliability analysis, etc.) and modelling such data is a relevant task. Here we consider the construction of a bivariate model with discrete Weibull margins, based on Farlie-Gumbel-Morgenstern copula, analyse its properties especially in terms of attainable correlation, and propose several methods for the point estimation of its parameters. Two of them are the standard one-step and two-step maximum likelihood procedures; the other two are based on an approximate method of moments and on the method of proportion, which represent intuitive alternatives for estimating the dependence parameter. A Monte Carlo simulation study is presented, comprising more than one hundred artificial settings, which empirically assesses the performance of the different estimation techniques in terms of statistical properties and computational cost. For illustrative purposes, the model and related inferential procedures are fitted and applied to two datasets taken from the literature, concerning failure data, presenting either positive or negative correlation between the two observed variables. The applications show that the proposed bivariate discrete Weibull distribution can model correlated counts even better than existing and well-established joint distributions
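The copula construction described in this abstract can be sketched in a few lines. A minimal illustration, assuming the type I discrete Weibull with survival function P(X ≥ x) = q^(x^β) and the standard conditional-inversion sampler for the FGM copula; all function names and parameters are illustrative, not the paper's code:

```python
import math
import random

def dw_quantile(u, q, beta):
    """Quantile of the type I discrete Weibull: smallest x with F(x) >= u,
    where F(x) = 1 - q**((x + 1)**beta) for x = 0, 1, 2, ..."""
    if u <= 1 - q:  # F(0) = 1 - q already covers u
        return 0
    # invert F: x = ceil((log(1 - u) / log(q)) ** (1 / beta)) - 1
    return max(0, math.ceil((math.log(1 - u) / math.log(q)) ** (1 / beta)) - 1)

def fgm_pair(theta, rng):
    """One (u, v) draw from the FGM copula C(u,v) = uv[1 + theta(1-u)(1-v)]."""
    u, w = rng.random(), rng.random()
    a = theta * (1 - 2 * u)
    if abs(a) < 1e-12:
        return u, w
    # conditional CDF: w = v + a*v*(1-v)  =>  a*v**2 - (1+a)*v + w = 0
    v = ((1 + a) - math.sqrt((1 + a) ** 2 - 4 * a * w)) / (2 * a)
    return u, v

def sample_bdw(n, q1, beta1, q2, beta2, theta, seed=0):
    """n draws from the bivariate model: FGM copula, discrete Weibull margins."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        u, v = fgm_pair(theta, rng)
        out.append((dw_quantile(u, q1, beta1), dw_quantile(v, q2, beta2)))
    return out
```

The dependence parameter theta lies in [-1, 1], which is why the attainable correlation of FGM-based models is limited, a property the paper analyses in detail.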
Symmetric and Asymmetric Distributions
In recent years, the advances and abilities of computer software have substantially increased the number of scientific publications that seek to introduce new probabilistic modelling frameworks, including continuous and discrete approaches, and univariate and multivariate models. Many of these theoretical and applied statistical works are related to distributions that try to break the symmetry of the normal distribution and other similar symmetric models, mainly using Azzalini's scheme. This strategy uses a symmetric distribution as a baseline case, then an extra parameter is added to the parent model to control the skewness of the new family of probability distributions. The most widespread and popular model is the one based on the normal distribution that produces the skewed normal distribution. In this Special Issue on symmetric and asymmetric distributions, works related to this topic are presented, as well as theoretical and applied proposals that have connections with and implications for this topic. Immediate applications of this line of work include different scenarios such as economics, environmental sciences, biometrics, engineering, health, etc. This Special Issue comprises nine works that follow this methodology derived using a simple process while retaining the rigor that the subject deserves. Readers of this Issue will surely find future lines of work that will enable them to achieve fruitful research results
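Azzalini's scheme mentioned in this abstract is concrete enough to sketch: the skew-normal density is f(z) = 2·φ(z)·Φ(αz), where φ and Φ are the standard normal density and CDF, and α = 0 recovers the symmetric baseline. A minimal illustration (the function name and signature are assumptions):

```python
import math

def skew_normal_pdf(x, alpha, loc=0.0, scale=1.0):
    """Azzalini's skew-normal density: f(z) = 2*phi(z)*Phi(alpha*z),
    with z = (x - loc)/scale; alpha = 0 recovers the normal baseline."""
    z = (x - loc) / scale
    phi = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)  # standard normal pdf
    Phi = 0.5 * (1 + math.erf(alpha * z / math.sqrt(2)))   # standard normal cdf
    return 2 * phi * Phi / scale
```

Positive α shifts mass to the right tail, negative α to the left, while the factor 2·Φ(αz) guarantees the density still integrates to one for any α.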
Asymmetric multivariate normal mixture GARCH
An asymmetric multivariate generalization of the recently proposed class of normal mixture GARCH models is developed. Issues of parametrization and estimation are discussed. Conditions for covariance stationarity and the existence of the fourth moment are derived, and expressions for the dynamic correlation structure of the process are provided. In an application to stock market returns, it is shown that the disaggregation of the conditional (co)variance process generated by the model provides substantial intuition. Moreover, the model exhibits a strong performance in calculating out-of-sample Value-at-Risk measures
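A univariate sketch of the normal mixture GARCH recursion underlying this abstract (the paper's model is multivariate and asymmetric; this simplification only illustrates the per-component variance recursions, and all names and parameter values are assumptions — in the full model the component means are also constrained so the mixture has zero mean):

```python
import math
import random

def simulate_mix_garch(n, weights, omegas, alphas, betas, means, seed=0):
    """Univariate normal mixture GARCH(1,1) sketch: each component j keeps
    its own conditional variance recursion
        h_j[t] = omega_j + alpha_j * eps[t-1]**2 + beta_j * h_j[t-1],
    and each innovation is drawn from component j with probability weights[j]."""
    rng = random.Random(seed)
    k = len(weights)
    # start each recursion at its unconditional level (requires alpha + beta < 1)
    h = [om / (1 - a - b) for om, a, b in zip(omegas, alphas, betas)]
    eps = []
    for _ in range(n):
        j = rng.choices(range(k), weights=weights)[0]
        e = means[j] + math.sqrt(h[j]) * rng.gauss(0.0, 1.0)
        eps.append(e)
        h = [om + a * e * e + b * hj
             for om, a, b, hj in zip(omegas, alphas, betas, h)]
    return eps
```

Because every component's variance reacts to the same realized shock, the mixture disaggregates the conditional variance into regimes (e.g. a calm and a turbulent component), which is the intuition the application in the paper exploits.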
A Review of Probabilistic Methods of Assessment of Load Effects in Bridges
This paper reviews a range of statistical approaches to illustrate the influence of data quality and quantity on the probabilistic modelling of traffic load effects. It also aims to demonstrate the importance of long-run simulations in calculating characteristic traffic load effects. The popular methods of Peaks Over Threshold and Generalized Extreme Value are considered, as well as other methods including the Box-Cox approach, fitting to a Normal distribution, and the Rice formula. For these five methods, curves are fitted to the tails of the daily maximum data. Bayesian Updating and Predictive Likelihood, which require the entire data set for fitting, are also assessed. The accuracy of each method in calculating 75-year characteristic values and probability of failure, using different quantities of data, is assessed. The nature of the problem is first introduced by a simple numerical example with a known theoretical answer. It is then extended to more realistic problems, where long-run simulations are used to provide benchmark results against which each method is compared. Increasing the quantity of data in the sample improves the accuracy of the approximations but does not completely eliminate the uncertainty associated with extrapolation. Results also show that the accuracy of the estimated characteristic values and probabilities of failure is more a function of data quality than of extrapolation technique. This highlights the importance of long-run simulations as a means of reducing the errors associated with the extrapolation process
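The 75-year characteristic value targeted by the extreme value methods in this abstract follows from the standard GEV return-level formula; a minimal sketch (the parameterization follows the usual GEV convention, and the function name and arguments are illustrative):

```python
import math

def gev_return_level(mu, sigma, xi, T):
    """Return level z_T exceeded by the annual maximum with probability 1/T,
    for a GEV(mu, sigma, xi) fit to block maxima:
        z_T = mu - (sigma/xi) * (1 - y**(-xi)),  y = -log(1 - 1/T)   (xi != 0)
        z_T = mu - sigma * log(y)                                    (Gumbel, xi = 0)"""
    y = -math.log(1.0 - 1.0 / T)
    if abs(xi) < 1e-9:
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))
```

A positive shape parameter xi gives a heavier upper tail, and hence a larger 75-year value, than the Gumbel (xi = 0) fit with the same location and scale, which is one reason the fitted tail behaviour drives the extrapolation error discussed above.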
Validation of three new measure-correlate-predict models for the long-term prospection of the wind resource
The estimation of the long-term wind resource at a prospective site based on a relatively short on-site measurement campaign is an indispensable task in the development of a commercial wind farm. The typical industry approach is based on the measure-correlate-predict (MCP) method, where a relational model between the site wind velocity data and the data obtained from a suitable reference site is built from concurrent records. In a subsequent step, a long-term prediction for the prospective site is obtained from a combination of the relational model and the historic reference data. In the present paper, a systematic study is presented where three new MCP models, together with two published reference models (a simple linear regression and the variance ratio method), have been evaluated based on concurrent synthetic wind speed time series for two sites, simulating the prospective and the reference site. The synthetic method has the advantage of generating time series with the desired statistical properties, including Weibull scale and shape factors, required to evaluate the five methods under all plausible conditions. In this work, first a systematic discussion of the statistical fundamentals behind MCP methods is provided, and three new models, one based on a nonlinear regression and two (termed kernel methods) derived from the use of conditional probability density functions, are proposed. All models are evaluated by using five metrics under a wide range of values of the correlation coefficient, the Weibull scale, and the Weibull shape factor. Only one of the models, a kernel method based on bivariate Weibull probability functions, is capable of accurately predicting all performance metrics studied
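The two published reference models named in this abstract, simple linear regression and the variance ratio method, can be sketched directly from their standard definitions (a minimal illustration, not the paper's implementation):

```python
def _stats(xs):
    """Sample mean and (n-1) standard deviation."""
    n = len(xs)
    m = sum(xs) / n
    s = (sum((x - m) ** 2 for x in xs) / (n - 1)) ** 0.5
    return m, s

def mcp_linear(site, ref):
    """Ordinary least-squares MCP: predicted site speed = b0 + b1 * ref speed."""
    mx, sx = _stats(ref)
    my, sy = _stats(site)
    n = len(ref)
    cov = sum((x - mx) * (y - my) for x, y in zip(ref, site)) / (n - 1)
    b1 = cov / sx ** 2
    b0 = my - b1 * mx
    return lambda x: b0 + b1 * x

def mcp_variance_ratio(site, ref):
    """Variance-ratio MCP: matches both the mean and the standard deviation of
    the site data, avoiding the variance shrinkage of plain regression."""
    mx, sx = _stats(ref)
    my, sy = _stats(site)
    return lambda x: my + (sy / sx) * (x - mx)
```

Both take concurrent site and reference records and return a predictor that is then applied to the long historic reference series; they differ in whether the predicted series reproduces the site variance, which matters for Weibull shape estimation.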
A joint probability approach for the confluence flood frequency analysis
The flood frequency analysis at or near the confluence of two tributaries is of interest because it is necessary for the design of highway drainage structures. However, the shortage of hydrological data at the confluence point makes flood estimation challenging. This thesis presents a practical procedure for flood frequency analysis at the confluence of two streams by multivariate simulation of the annual peak flows of the tributaries, based on joint probability and Monte Carlo simulation. Copulas are introduced to identify the joint probability. The results of two case studies are compared with the floods estimated by univariate flood frequency analysis based on the observed data. The results are also compared with those of the National Flood Frequency program developed by the United States Geological Survey. The results of the proposed model are very close to those of the univariate flood frequency analysis
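The joint-probability simulation described in this abstract can be sketched with an off-the-shelf copula and margin choice. A minimal illustration assuming a Gaussian copula and Gumbel (EV1) annual-peak margins; both choices are placeholders, since the thesis identifies the copula and margins from data:

```python
import math
import random

def gumbel_ppf(p, loc, scale):
    """Inverse CDF of the Gumbel (EV1) distribution, a common annual-peak model."""
    return loc - scale * math.log(-math.log(p))

def std_normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def simulate_confluence(n, rho, margins, seed=0):
    """Draw n joint annual peaks (q1, q2) for the two tributaries via a
    Gaussian copula with correlation rho; `margins` holds the (loc, scale)
    of the assumed Gumbel margin for each tributary."""
    rng = random.Random(seed)
    peaks = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1 - rho * rho) * rng.gauss(0.0, 1.0)
        u1, u2 = std_normal_cdf(z1), std_normal_cdf(z2)  # correlated uniforms
        peaks.append((gumbel_ppf(u1, *margins[0]), gumbel_ppf(u2, *margins[1])))
    return peaks
```

Design quantiles at the confluence are then read off as empirical quantiles of a function of the simulated pairs (e.g. the combined flow), which is the Monte Carlo step of the procedure.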
Statistical modeling of skewed data using newly formed parametric distributions
Several newly formed continuous parametric distributions are introduced to analyze skewed data. Firstly, a two-parameter smooth continuous lognormal-Pareto composite distribution is introduced for modeling highly positively skewed data. The new density is a lognormal density up to an unknown threshold value and a Pareto density for the remainder. The resulting density is similar in shape to the lognormal density, yet its upper tail is larger than the lognormal density and the tail behavior is quite similar to the Pareto density. Parameter estimation methods and the goodness-of-fit criterion for the new distribution are presented. A large actuarial data set is analyzed to illustrate the better fit and applicability of the new distribution over other leading distributions. Secondly, the Odd Weibull family is introduced for modeling data with a wide variety of hazard functions. This three-parameter family is derived by considering the distributions of the odds of the Weibull and inverse Weibull families. As a result, the Odd Weibull family is not only useful for testing goodness-of-fit of the Weibull and inverse Weibull as submodels, but it is also convenient for modeling and fitting different data sets, especially in the presence of censoring and truncation. This newly formed family not only possesses all five major hazard shapes: constant, increasing, decreasing, bathtub-shaped and unimodal failure rates, but also exhibits a wide variety of density shapes. The model parameters for exact, grouped, censored and truncated data are estimated in two different ways because the inverse transformation of the Odd Weibull family does not change its density function. Examples are provided based on survival, reliability, and environmental sciences data to illustrate the variety of density and hazard shapes by analyzing complete and incomplete data.
Thirdly, the two-parameter logistic-sinh distribution is introduced for modeling highly negatively skewed data with extreme observations. The resulting family provides not only negatively skewed densities with thick tails, but also a variety of monotonic density shapes. The advantages of using the proposed family are demonstrated and compared through well-known examples. Finally, the folded parametric families are introduced to model positively skewed data containing zero values
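The lognormal-Pareto composite described above can be sketched by matching the two pieces at the threshold. A simplified illustration: here continuity at the threshold fixes the mixing weight, whereas the paper's two-parameter model additionally imposes differentiability; all names and the four-parameter signature are assumptions:

```python
import math

def lognormal_pdf(x, mu, sigma):
    return math.exp(-((math.log(x) - mu) ** 2) / (2 * sigma ** 2)) \
        / (x * sigma * math.sqrt(2 * math.pi))

def lognormal_cdf(x, mu, sigma):
    return 0.5 * (1 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2))))

def make_composite(mu, sigma, theta, alpha):
    """Continuous lognormal-Pareto composite: a truncated lognormal on (0, theta]
    and a Pareto(alpha, theta) on (theta, inf); the mixing weight r is chosen
    so the density is continuous at the threshold theta."""
    f1 = lognormal_pdf(theta, mu, sigma) / lognormal_cdf(theta, mu, sigma)
    f2 = alpha / theta                 # Pareto pdf at its left endpoint
    r = f2 / (f1 + f2)                 # continuity: r*f1 == (1-r)*f2
    def pdf(x):
        if x <= theta:
            return r * lognormal_pdf(x, mu, sigma) / lognormal_cdf(theta, mu, sigma)
        return (1 - r) * alpha * theta ** alpha / x ** (alpha + 1)
    return pdf
```

Because each piece is a proper (truncated) density, the mixture integrates to one automatically, and the Pareto piece gives the heavier upper tail that motivates the model for actuarial losses.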
A new flexible family of continuous distributions: the additive Odd-G family
This paper introduces a new family of distributions based on the additive model structure. Three submodels of the proposed family are studied in detail. Two simulation studies were performed to evaluate the maximum likelihood estimators of the model parameters. The log location-scale regression model based on a new generalization of the Weibull distribution is introduced. Three datasets were used to show the importance of the proposed family. Based on the empirical results, we concluded that the proposed family is quite competitive compared to other models
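The additive model structure underlying the proposed family can be illustrated with the classical additive Weibull, whose hazard is the sum of a decreasing and an increasing term (a generic sketch of the additive idea, not the paper's Odd-G family; names and parameters are assumptions):

```python
import math

def additive_weibull_hazard(x, a, b, c, d):
    """Hazard of the additive Weibull:
        h(x) = (b/a)*(x/a)**(b-1) + (d/c)*(x/c)**(d-1).
    With b < 1 < d the first term decreases and the second increases,
    producing the bathtub shape that additive families are built to capture."""
    return (b / a) * (x / a) ** (b - 1) + (d / c) * (x / c) ** (d - 1)

def additive_weibull_cdf(x, a, b, c, d):
    """F(x) = 1 - exp(-[(x/a)**b + (x/c)**d]): cumulative hazards add."""
    return 1.0 - math.exp(-((x / a) ** b + (x / c) ** d))
```

Adding cumulative hazards is what makes the construction "additive": each submodel contributes one failure mode, and the system fails at the first of the two.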