Pakistan Journal of Statistics and Operation Research
    861 research outputs found

    Identifying an Emerging HIV Epidemic in Punjab, Pakistan: Forecasting Trends using Prophet Model and Classical ARIMA Model (2020–2025)

    Pakistan has witnessed concerning shifts in its HIV epidemic, especially in Punjab, where HIV and AIDS incidence continues to rise. This study compares the predictive accuracy of the Prophet model, a machine learning model, with classical ARIMA configurations for monthly HIV and AIDS case forecasting in Punjab. Methods: Monthly surveillance data (January 2020–October 2025) from the Punjab AIDS Control Program (PACP) were used to train and validate Prophet and multiple ARIMA models. Model performance was assessed using RMSE, MAE, MAPE, BIC, and Ljung-Box Q tests. Forward forecasts were generated for HIV-reactive and AIDS (CD4 < 200) cases through 2026. Results: Prophet outperformed all ARIMA models in forecasting HIV-reactive cases, achieving the lowest RMSE (132.6) and MAPE (16.4%). For AIDS case projections, all models exhibited high error rates (Prophet MAPE > 300%), with ARIMA(0,1,0)(0,1,1)₁₂ performing better (MAPE ~174%). The forecasts estimate approximately 8,490 new HIV cases in 2026, with uncertainty bounds reaching nearly 15,000 cases, indicating a continued upward trajectory; for AIDS, the count in 2026 may rise to 25,596 new cases, though forecasting AIDS remains a challenge. The results demonstrate the superior ability of the Prophet model to capture non-linear trends and seasonality in HIV surveillance data. Conclusion: The superior performance of the Prophet model reflects its ability to model nonlinear and seasonally irregular HIV surveillance data. Integrating machine learning techniques such as Prophet into provincial HIV programs can enhance planning and accelerate progress toward the UNAIDS 95-95-95 targets.
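
The accuracy measures named in this abstract (RMSE, MAE, MAPE) can be illustrated with a minimal sketch. The function name `forecast_errors` and the monthly counts are hypothetical and are not taken from the PACP data or from the authors' code; the formulas are the standard textbook definitions.

```python
import math

def forecast_errors(actual, predicted):
    """Return (RMSE, MAE, MAPE in percent) for paired observations."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    mae = sum(abs(e) for e in errors) / len(errors)
    # MAPE divides by the actual value, so months with zero cases are skipped.
    pct = [abs(e / a) for e, a in zip(errors, actual) if a != 0]
    mape = 100.0 * sum(pct) / len(pct) if pct else float("nan")
    return rmse, mae, mape

# Hypothetical monthly case counts, for illustration only.
actual = [100, 200, 400]
predicted = [110, 180, 400]
rmse, mae, mape = forecast_errors(actual, predicted)
```

Because MAPE scales each error by the observed count, it can become very large when monthly counts are small. This may be one reason a series can show a MAPE above 100% even when its absolute errors are modest.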

    Parameter Estimation for the Bivariate Compound Zero-Truncated Poisson-Gamma Model under Different Data Scenarios

    The bivariate compound zero-truncated Poisson-gamma distribution models the sum of a random number of bivariate gamma variables, where the count follows a zero-truncated Poisson distribution. This makes it well suited to applications in actuarial science, climatology, and reliability engineering, where zero outcomes are inherently absent. Because the probability density function involves an infinite sum and is intractable, direct maximum likelihood estimation is computationally challenging. In this study, we use standard (exact) maximum likelihood estimation when event counts are observed (complete data and Scenario A) and employ the saddle-point approximation only when counts are latent (Scenario B). We develop a stable maximum likelihood procedure based on the saddle-point approximation. We derive the cumulative distribution function from the cumulant generating function and obtain the probability density function using numerical differentiation. Detailed derivations, implementation guidelines in the R programming language, and a parameter initialization strategy using the method of moments are provided. A simulation study using various sample sizes demonstrates the accuracy, consistency, and superiority of this method over moment-based estimators. Computational challenges and limitations are discussed, along with potential extensions that model the dependence structure using copulas. In addition, we develop a likelihood ratio test and a formal symmetry test (for example, H0: α1 = α2, β1 = β2) to compare nested specifications, enabling principled inference on symmetry and overall model adequacy.
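
The data-generating mechanism described in this abstract can be sketched as a small simulation. This is an illustration under stated assumptions, not the authors' R implementation: the two gamma components are assumed independent given the count (the abstract does not specify their dependence), and the names `sample_ztp` and `sample_compound` and all parameter values are hypothetical.

```python
import math
import random

def sample_ztp(lam, rng):
    """Draw N >= 1 from a zero-truncated Poisson(lam) by CDF inversion."""
    u = rng.random()
    k = 1
    p = lam * math.exp(-lam) / (1.0 - math.exp(-lam))  # P(N = 1)
    cdf = p
    while u > cdf and p > 0.0:
        k += 1
        p *= lam / k  # P(N = k) obtained from P(N = k - 1)
        cdf += p
    return k

def sample_compound(lam, shape1, scale1, shape2, scale2, rng):
    """One draw (N, S1, S2); both sums share the same count N.

    Assumption: the two gamma components are independent given N.
    """
    n = sample_ztp(lam, rng)
    s1 = sum(rng.gammavariate(shape1, scale1) for _ in range(n))
    s2 = sum(rng.gammavariate(shape2, scale2) for _ in range(n))
    return n, s1, s2

rng = random.Random(2024)
draws = [sample_compound(2.0, 1.5, 1.0, 3.0, 0.5, rng) for _ in range(5000)]
mean_count = sum(n for n, _, _ in draws) / len(draws)
```

Keeping `n` in each draw corresponds to the observed-count setting. Discarding it and keeping only the two sums mimics the latent-count setting (Scenario B), where the density involves an infinite sum over the unobserved count.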

    An Alternative Exponential Model for Skewed Real Data: Characterizations, Bayesian, Non-Bayesian Estimation and Distributional Validation Testing

    This paper presents a novel two-parameter exponential model, with particular attention to its practical application to skewed data. The mathematical properties of the distribution are established clearly and succinctly. Three distinct characterizations of the distribution are also given. The parameters of the new model are estimated using a range of established methods, including the Bayesian technique. For censored data, maximum likelihood estimation is considered. Pitman's closeness criterion is employed to compare the likelihood-based estimates with the Bayesian estimates. Bayesian estimates are computed under three loss functions: generalized quadratic, Linex, and entropy. Numerous simulation experiments are conducted to assess the performance of the estimation methods. The BB algorithm is employed to compare the Bayesian technique with censored maximum likelihood estimation. The Nikulin-Rao-Robson (NKRR) statistic is applied in two empirical studies using skewed real-world data sets, along with a simulation study in the uncensored setting. Two further applications are presented in the same context. The findings illustrate the efficacy of the proposed approaches for distributional validation and estimation.

    A Novel Chen Extension for Risk Analysis with MOOP and PORT-VAR Assessments under Hydrological Flow Data and Financial Case Study

    This paper introduces a new extension of the Chen distribution, designed to better model extreme low-flow events in hydrology and rare events in the medical field. The proposed model incorporates asymmetrical and heavy-tailed behavior, making it particularly useful for analyzing extreme values in complex real datasets. We derive the mathematical properties of the BGC distribution and apply two advanced analytical techniques: the Mean-of-Order-P (MOOP) method to determine the optimal value of P (referred to as Opt-P), and the Peaks Over Random Threshold Value-at-Risk (PORT-VaR) approach to identify and assess critical extreme events. These methods are applied to real datasets including relief times, minimum river flow data from the Cuiabá River, and U.S. indemnity losses from general liability claims. The MOOP analysis shows that increasing the order P leads to reduced Mean Squared Error (MSE) and bias, indicating improved estimation accuracy. For example, in the relief times dataset, MSE decreases from 0.64 at P=1 to 0.3844 at P=5. Similarly, for the minimum flow data, MSE drops from 4402.88 to 3684.27 with increasing P, highlighting the benefits of higher-order statistics in capturing central tendencies. Using PORT-VaR, we analyze extreme peaks under varying confidence levels (50%, 70%, 90%, and 99%) and compute key risk indicators such as Value-at-Risk (VaR), Tail Value-at-Risk (TVaR), Mean Excess Loss (MEXL), Tail Variance (TV), and Tail Mean Variance (TMV). In the relief times dataset, VaR increases from 1.70 at 50% confidence to 3.055 at 99% confidence, demonstrating growing risk exposure at higher confidence levels. For the minimum flow data, VaR rises from 115.925 at 50% to 157.169 at 99%, underscoring the importance of adaptive risk thresholds in managing water scarcity and dam safety. A financial case study using U.S. indemnity loss data further validates the robustness of the BGC model in capturing tail behavior and estimating extreme risks. At the 99% confidence level, VaR reaches 170400 (in thousands of USD), and MEXL is 203411, illustrating the nonlinear growth of risk in heavy-tailed insurance claims. Finally, a comparative study under a historical financial claims data through an application
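
The risk indicators listed in this abstract can be illustrated with empirical versions computed from a sample. This is a sketch under assumptions: it uses common textbook definitions (TVaR as the mean of losses above VaR, MEXL = TVaR - VaR, TV as the variance of those losses, TMV = TVaR + π·TV), which may differ from the model-based formulas in the paper. The function name `tail_risk`, the weight `pi`, and the data are hypothetical.

```python
import math
import statistics

def tail_risk(losses, q, pi=0.5):
    """Empirical VaR, TVaR, MEXL, TV and TMV at confidence level q."""
    data = sorted(losses)
    n = len(data)
    # Empirical quantile: smallest value with at least a fraction q at or below it.
    var_q = data[max(math.ceil(q * n) - 1, 0)]
    tail = [x for x in data if x > var_q]
    if not tail:
        raise ValueError("no observations exceed VaR; lower q or add data")
    tvar = statistics.mean(tail)      # mean loss beyond VaR
    tv = statistics.pvariance(tail)   # variance of losses beyond VaR
    return {
        "VaR": var_q,
        "TVaR": tvar,
        "MEXL": tvar - var_q,
        "TV": tv,
        "TMV": tvar + pi * tv,
    }

# Hypothetical losses, for illustration only.
risk = tail_risk([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], q=0.5)
```

For these ten values the tail beyond VaR is 6 through 10, so the indicators can be checked by hand. Raising `q` moves VaR further into the tail, which matches the abstract's observation that VaR grows with the confidence level.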

    The Kth-Order Equilibrium Rayleigh Distribution: Characterization and Estimation

    This paper introduces a new extension of the Rayleigh distribution, named the Kth-order equilibrium Rayleigh distribution (KERD), obtained by applying the Kth-order equilibrium method. Various statistical properties of the new distribution, including its aging behavior and stochastic ordering relations, are analyzed. Explicit expressions are derived for moments, conditional moments, incomplete moments, the mean residual function, the mean waiting function, entropy measures, and order statistics. Characterizations of the distribution are examined. The maximum likelihood method is used to estimate the parameters. A simulation study using the Anderson–Darling test statistic is carried out to analyze the asymptotic behavior of the maximum likelihood estimators. The behavior of the bias and mean square error is observed as the sample size increases. The applications of the new distribution are demonstrated using two real-life datasets. Finally, KERD and its sub-models are compared in terms of fit using information criteria.

    On the Uniqueness and Structural Identification of Some Univariate Continuous Probability Distributions

    This paper examines characterizations of five recently proposed (2022-2025) univariate continuous probability distributions. These characterizations are based on: (i) a simple relationship between two truncated moments; (ii) the reverse hazard function. It should be mentioned that for characterization (i) the cumulative distribution function need not have a closed form and depends on the solution of a first-order differential equation, which provides a bridge between probability and differential equations

    A closed-form estimator of R = P(X < Y) based on ranked set sampling for a family of statistical distributions with application in agriculture

    In this article, we derive a closed-form estimator for the probability R = P(X < Y) based on a ranked set sampling (RSS) scheme when the random variables X and Y are assumed to follow the Lehmann Type-II (L-II) family of distributions. Estimating R through the maximum likelihood (ML) method within the RSS framework does not yield an analytical solution because of the non-linear components present in the likelihood equations. In this context, we employ a modified maximum likelihood (MML) estimation approach to derive a closed-form estimator for R. Estimates of R under both the ML and MML techniques, along with their corresponding asymptotic confidence intervals, are determined and compared in a simulation study under one member of the L-II family, the inverse Topp-Leone distribution. Finally, the simulation results are supported by a real example from the field of agriculture.
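
To make the target quantity concrete, the sketch below checks R = P(X < Y) by simple Monte Carlo. It is not the RSS-based ML or MML estimator of the paper. It assumes the usual Lehmann Type-II form F(x) = 1 - (1 - G(x))^α with a common baseline G for X and Y, in which case R = α1/(α1 + α2). The uniform baseline G(x) = x on (0, 1), the function names, and the parameter values are hypothetical choices for illustration.

```python
import random

def sample_l2_uniform(alpha, rng):
    """Inverse-transform draw from F(x) = 1 - (1 - x)**alpha on (0, 1)."""
    return 1.0 - rng.random() ** (1.0 / alpha)

def monte_carlo_r(alpha_x, alpha_y, n, rng):
    """Estimate R = P(X < Y) from n independent simulated pairs."""
    hits = sum(
        sample_l2_uniform(alpha_x, rng) < sample_l2_uniform(alpha_y, rng)
        for _ in range(n)
    )
    return hits / n

rng = random.Random(7)
alpha_x, alpha_y = 2.0, 3.0
r_exact = alpha_x / (alpha_x + alpha_y)  # holds under the common-baseline assumption
r_mc = monte_carlo_r(alpha_x, alpha_y, 20000, rng)
```

Under these assumptions R depends only on the two shape parameters, so an estimator of R reduces to estimators of α1 and α2.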

    A Novel Generated G Family for Risk Analysis and Assessment under Different Non-Bayesian Methods: Properties, Characterizations and Applications to USA House Prices and UK Insurance Claims Data

    This study proposes a new and versatile family of continuous probability models known as the log-exponential generated (LEG) distributions, with particular emphasis on the log-exponential generated Weibull (LEGW) model as its prominent representative. By introducing an additional layer of parameterization, the family offers improved adaptability in shaping distributional forms, especially regarding skewness and heavy-tailed behavior. The LEGW formulation proves especially relevant for reliability data and for capturing rare but impactful events where asymmetry plays a major role. The work details the theoretical framework of the family through explicit expressions for its cumulative distribution function (CDF) and probability density function (PDF), alongside the corresponding hazard rate function (HRF). Several analytical characteristics are also investigated, including series representations and behavior in the extreme tail. To demonstrate practical value, the paper conducts risk evaluations employing sophisticated key risk indicators (KRIs) such as Value-at-Risk (VaR), Tail Value-at-Risk (TVaR), and tail mean-variance measure (TMVq) across multiple quantile levels. Parameter estimation is addressed using several techniques, including maximum likelihood estimation (MLE), the Cramér–von Mises approach (CVM), and the Anderson–Darling estimator (ADE), in addition to their right-tail adjusted (RTADE) and left-tail adjusted variants (LTADE) to better capture extreme behaviors. Comparative performance analyses are carried out using both controlled simulation scenarios and real data from the insurance and housing sectors to test robustness under heavy-tail conditions. The findings highlight the effectiveness of the LEGW model in applied risk assessment, supported by evidence from insurance claims and economic datasets

    Enhancing Food Security Analysis in South Sulawesi Using Robust Mixed Geographically and Temporally Weighted Regression with M-Estimator

    MGTWR (Mixed Geographically and Temporally Weighted Regression) combines a global linear regression model with GTWR by incorporating spatial and temporal dimensions. However, it remains sensitive to outliers, which can reduce accuracy. To address this limitation, a robust regression approach with the M-estimator was applied to model the food security index in South Sulawesi Province from 2018 to 2022. The resulting Robust MGTWR (RMGTWR) model demonstrated improved performance, with a lower AIC and high explanatory power. Key factors influencing food security include the ratio of normative consumption per capita to net production, the percentage of households whose food expenditure exceeds 65% of total spending, the percentage of households without access to electricity, the percentage of households without access to clean water, and the percentage of stunted toddlers. These findings highlight the effectiveness of RMGTWR with the M-estimator in addressing data irregularities and provide valuable insights for policymakers in designing targeted strategies to strengthen food security in South Sulawesi Province.
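
The idea behind M-estimation can be sketched on an ordinary straight-line fit. This is only an illustration of the robustness mechanism, not the RMGTWR model: it assumes Huber weights with tuning constant 1.345 and a MAD-based scale, whereas the abstract does not say which weight function the authors used. The function names and data are hypothetical.

```python
import statistics

def weighted_line(x, y, w):
    """Weighted least-squares fit; returns (intercept, slope)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def huber_line(x, y, c=1.345, iterations=50):
    """Huber M-estimate via iteratively reweighted least squares."""
    intercept, slope = weighted_line(x, y, [1.0] * len(x))  # start from OLS
    for _ in range(iterations):
        resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
        scale = statistics.median(abs(r) for r in resid) / 0.6745  # MAD scale
        if scale == 0.0:
            break  # at least half the points are fitted exactly
        # Full weight for small residuals, shrinking weight for large ones.
        w = [1.0 if abs(r) <= c * scale else c * scale / abs(r) for r in resid]
        intercept, slope = weighted_line(x, y, w)
    return intercept, slope

# Hypothetical data: the line y = 2x + 1 with one gross outlier at the end.
x = list(range(10))
y = [2 * xi + 1 for xi in x]
y[9] += 30
ols_fit = weighted_line(x, y, [1.0] * len(x))
robust_fit = huber_line(x, y)
```

The single outlier pulls the ordinary least-squares slope well above 2, while the reweighted fit gives that point less influence and stays closer to the underlying line.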

    A New Odd-Burr Pareto Distribution: Statistical Properties, Estimation, and Applications

    This study introduces the Odd-Burr Pareto (OBu-P) distribution, a novel and flexible model developed by combining the Burr and Pareto distributions using the T-X generator approach (Alizadeh et al. 2017). The OBu-P distribution can be used to model complex phenomena characterized by heavy tails. The paper provides the OBu-P distribution’s statistical properties, including its moments, incomplete moments, quantile functions, and limiting behaviours, as well as its generating functions and order statistics. Maximum likelihood estimation is used to estimate the parameters of the OBu-P distribution. The flexibility of the distribution is demonstrated in a real-life example against alternative models.
