Diagnostic in Poisson Regression Models
The Poisson regression model is one of the most frequently used statistical methods for data analysis in many fields. Our focus in this paper is on the identification of outliers; we mainly discuss the deviance and Pearson statistics as diagnostics for this purpose. A simulation study and real data are presented to assess the performance of these diagnostic statistics.
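As a concrete illustration of these diagnostics, the sketch below computes Pearson and deviance residuals for a fitted Poisson regression and flags observations whose absolute residual exceeds 2; the synthetic data, the statsmodels fit, and the cutoff are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch: Pearson and deviance residuals as outlier diagnostics for a
# Poisson regression. Data, fit, and the |residual| > 2 cutoff are illustrative.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
X = sm.add_constant(rng.normal(size=(n, 2)))
y = rng.poisson(np.exp(X @ np.array([0.5, 0.3, -0.2])))

mu = sm.GLM(y, X, family=sm.families.Poisson()).fit().fittedvalues

# Pearson residual: (y - mu) / sqrt(mu)
r_pearson = (y - mu) / np.sqrt(mu)

# Deviance residual: sign(y - mu) * sqrt(2 * (y*log(y/mu) - (y - mu))),
# with the convention y*log(y/mu) = 0 when y = 0.
term = np.where(y > 0, y * np.log(np.where(y > 0, y, 1) / mu), 0.0) - (y - mu)
r_dev = np.sign(y - mu) * np.sqrt(2 * np.clip(term, 0, None))

flagged = np.flatnonzero((np.abs(r_pearson) > 2) | (np.abs(r_dev) > 2))
print("potential outliers:", flagged)
```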
A New Ridge-Type Estimator for the Gamma Regression Model
The well-known linear regression model (LRM) is mostly used for modelling the QSAR relationship between the response variable (biological activity) and one or more physiochemical or structural properties, which serve as the explanatory variables, mainly when the distribution of the response variable is normal. The gamma regression model is often employed when the dependent variable is skewed. The parameters of both models are estimated using the maximum likelihood estimator (MLE). However, the MLE becomes unstable in the presence of multicollinearity in both models. In this study, we propose a new estimator and suggest some biasing parameters to estimate the regression parameters of the gamma regression model when there is multicollinearity. A simulation study and a real-life application were performed to evaluate the estimators' performance via the mean squared error criterion. The results from the simulation and the real-life application revealed that the proposed gamma estimator produced lower MSE values than the other estimators considered.
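For intuition about how a ridge-type correction of the gamma MLE can look, here is a minimal sketch assuming a log-link gamma GLM fitted with statsmodels and an illustrative biasing parameter k; it shows the generic ridge-type form, not the specific new estimator or biasing parameters proposed in the paper.

```python
# Sketch of a generic ridge-type adjustment of the gamma regression MLE under
# a log link. The data, the value of k, and the use of statsmodels are
# illustrative assumptions, not the paper's proposed estimator.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, p = 100, 4
X = sm.add_constant(rng.normal(size=(n, p)))
eta = X @ np.array([0.5, 0.3, -0.2, 0.1, 0.4])
y = rng.gamma(shape=2.0, scale=np.exp(eta) / 2.0)   # gamma response with mean exp(eta)

fit = sm.GLM(y, X, family=sm.families.Gamma(link=sm.families.links.Log())).fit()
beta_mle = fit.params

# Under the log link the IRLS weights are constant (1/dispersion), so up to a
# rescaling of k the ridge-type estimator takes the form (X'X + kI)^{-1} X'X beta_MLE.
k = 0.1                                             # illustrative biasing parameter
XtX = X.T @ X
beta_ridge = np.linalg.solve(XtX + k * np.eye(XtX.shape[0]), XtX @ beta_mle)
print(beta_mle, beta_ridge, sep="\n")
```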
Reliability Estimation of Three Parameters Weibull Distribution based on Particle Swarm Optimization
The three-parameter Weibull distribution is a continuous distribution widely used in the study of reliability and life data. Estimating the distribution's parameters is an important problem that has received much attention from researchers because of its effect on several reliability measures. In this research, we propose a particle swarm optimization (PSO) approach to estimate the parameters of the three-parameter Weibull distribution and then to estimate the reliability and hazard functions. The real-data results indicate that our proposed estimation method performs consistently well compared to the maximum likelihood method in terms of log-likelihood and mean time to failure (MTTF).
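The sketch below shows one plain way a particle swarm can maximize the three-parameter Weibull log-likelihood on synthetic lifetimes; the swarm settings, bounds, and data are illustrative assumptions rather than the paper's implementation.

```python
# Basic PSO maximizing the three-parameter Weibull log-likelihood; all swarm
# settings (w, c1, c2, swarm size, iterations, bounds) are illustrative.
import math
import numpy as np

rng = np.random.default_rng(1)
data = rng.weibull(1.8, size=200) * 50.0 + 10.0      # synthetic lifetimes: shape 1.8, scale 50, location 10

def log_lik(params, x):
    k, lam, theta = params                            # shape, scale, location
    if k <= 0 or lam <= 0 or theta >= x.min():
        return -np.inf                                # outside the admissible region
    z = (x - theta) / lam
    return np.sum(np.log(k / lam) + (k - 1) * np.log(z) - z ** k)

n_particles, n_iter = 40, 300
lo = np.array([0.05, 1e-3, 0.0])
hi = np.array([10.0, 200.0, data.min() - 1e-6])
pos = rng.uniform(lo, hi, size=(n_particles, 3))
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), np.array([log_lik(p, data) for p in pos])
gbest = pbest[np.argmax(pbest_val)].copy()

w, c1, c2 = 0.7, 1.5, 1.5                             # inertia, cognitive, social weights
for _ in range(n_iter):
    r1, r2 = rng.random((n_particles, 3)), rng.random((n_particles, 3))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, lo, hi)
    vals = np.array([log_lik(p, data) for p in pos])
    improved = vals > pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[np.argmax(pbest_val)].copy()

k, lam, theta = gbest
mttf = theta + lam * math.gamma(1 + 1 / k)            # mean time to failure
print("estimates:", gbest, "MTTF:", mttf)
```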
Regularized logistic regression with adjusted adaptive elastic net for gene selection in high dimensional cancer classification
Cancer classification and gene selection in high-dimensional data have been popular research topics in genetics and molecular biology. Recently, adaptive regularized logistic regression using the elastic net regularization, called the adaptive elastic net, has been successfully applied in high-dimensional cancer classification to estimate the gene coefficients and perform gene selection simultaneously. The adaptive elastic net originally used elastic net estimates as the initial weight; however, this weight may not be preferable for two reasons: first, the elastic net estimator is biased in selecting genes; second, it does not perform well when the pairwise correlations between variables are not high. An adjusted adaptive regularized logistic regression (AAElastic) is proposed to address these issues and to encourage grouping effects simultaneously. The real-data results indicate that AAElastic is more consistent in selecting genes than the three competing regularization methods. Additionally, the classification performance of AAElastic is comparable to the adaptive elastic net and better than the other regularization methods. Thus, we conclude that AAElastic is a reliable adaptive regularized logistic regression method for high-dimensional cancer classification.
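The following sketch illustrates the general two-stage idea behind adaptive elastic-net logistic regression (an initial elastic-net fit supplies per-gene weights, then a weighted fit is obtained via column rescaling); it uses scikit-learn with illustrative settings and is not the AAElastic algorithm itself.

```python
# Two-stage adaptive elastic-net logistic regression, sketched with scikit-learn.
# The weight exponent gamma, the penalties, and the data are illustrative choices.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=120, n_features=500, n_informative=10, random_state=0)

# Stage 1: initial elastic-net fit to obtain per-gene adaptive weights.
init = LogisticRegression(penalty="elasticnet", solver="saga", l1_ratio=0.5,
                          C=1.0, max_iter=5000).fit(X, y)
gamma = 1.0
w = 1.0 / (np.abs(init.coef_.ravel()) + 1e-6) ** gamma    # adaptive weights

# Stage 2: adaptive fit via column rescaling X_j -> X_j / w_j, then back-transform.
# (Rescaling weights both the L1 and L2 terms; a common simplification.)
Xw = X / w
adapt = LogisticRegression(penalty="elasticnet", solver="saga", l1_ratio=0.5,
                           C=1.0, max_iter=5000).fit(Xw, y)
beta = adapt.coef_.ravel() / w                            # coefficients on the original scale
selected = np.flatnonzero(beta != 0)                      # genes retained by the model
print(len(selected), "genes selected")
```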
A new Jackknifing ridge estimator for logistic regression model
Ridge regression has consistently been demonstrated to be an attractive shrinkage method for reducing the effects of multicollinearity. The logistic regression model is a well-known model used when the response variable is binary. However, multicollinearity is known to inflate the variance of the maximum likelihood estimator of the logistic regression coefficients. To address this problem, a logistic ridge estimator has been proposed by numerous researchers. In this paper, a new Jackknifing logistic ridge estimator (NLRE) is proposed and derived. The idea behind the NLRE is to obtain a diagonal matrix with small diagonal elements, which decreases the shrinkage parameter so that the resulting estimator performs better with only a small amount of bias. Our Monte Carlo simulation results suggest that the NLRE can bring significant improvement relative to other existing estimators. In addition, the real-application results demonstrate that the NLRE outperforms both the logistic ridge estimator and the maximum likelihood estimator in terms of predictive performance.
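For reference, here is a sketch of the classical jackknifed logistic ridge construction that this line of work builds on, with the MLE taken from statsmodels and an illustrative shrinkage parameter k; it is not the NLRE derived in the paper.

```python
# Classical logistic ridge and jackknifed logistic ridge estimators, sketched on
# synthetic collinear data. The shrinkage parameter k is an illustrative choice.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n, p = 200, 4
Z = rng.normal(size=(n, 1))
X = sm.add_constant(Z + 0.1 * rng.normal(size=(n, p)))    # strongly collinear predictors
eta = X @ np.array([0.3, 0.8, -0.5, 0.4, 0.6])
y = rng.binomial(1, 1 / (1 + np.exp(-eta)))

beta_mle = sm.Logit(y, X).fit(disp=0).params
pi = 1 / (1 + np.exp(-(X @ beta_mle)))
W = np.diag(pi * (1 - pi))                                 # IRLS weight matrix at the MLE
A = X.T @ W @ X
k = 0.5                                                    # illustrative shrinkage parameter
Ak_inv = np.linalg.inv(A + k * np.eye(A.shape[0]))

beta_ridge = Ak_inv @ A @ beta_mle                         # logistic ridge estimator
beta_jack = (np.eye(A.shape[0]) + k * Ak_inv) @ beta_ridge # jackknifed (bias-reduced) ridge
print(beta_mle, beta_ridge, beta_jack, sep="\n")
```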
High Dimensional Logistic Regression Model using Adjusted Elastic Net Penalty
Reducing high-dimensional binary classification data using penalized logistic regression is challenging when the explanatory variables are correlated. To estimate the coefficients and perform variable selection simultaneously, the elastic net penalty has been successfully applied in high-dimensional binary classification. However, the elastic net has two major limitations: first, it does not encourage grouping effects when the correlations are not high; second, it is not consistent in variable selection. To address these issues, an adjusted elastic net (AEN) and its adaptive version, the adaptive adjusted elastic net (AAEN), are proposed to take into account small and medium correlations between explanatory variables and to provide consistent variable selection simultaneously. Our simulation and real-data results show that AEN and AAEN have an advantage with small, medium, and extremely correlated variables in terms of both prediction and variable-selection consistency compared with other existing penalized methods.
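To make the grouping-effect notion concrete, the small sketch below contrasts an L1-penalized and an elastic-net-penalized logistic fit on a block of nearly identical predictors; it illustrates the baseline behaviour that the adjusted penalties aim to improve, not the AEN/AAEN methods themselves.

```python
# Grouping effect illustration: with an elastic-net penalty, highly correlated
# predictors tend to receive similar coefficients, whereas a pure L1 penalty
# tends to pick one and drop the rest. Data and penalty settings are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 200
g = rng.normal(size=n)
X = np.column_stack([g + 0.01 * rng.normal(size=n) for _ in range(3)]   # a correlated group
                    + [rng.normal(size=n) for _ in range(5)])           # independent noise
y = (g + 0.3 * rng.normal(size=n) > 0).astype(int)

lasso = LogisticRegression(penalty="l1", solver="saga", C=0.5, max_iter=5000).fit(X, y)
enet = LogisticRegression(penalty="elasticnet", solver="saga", l1_ratio=0.5,
                          C=0.5, max_iter=5000).fit(X, y)
print("L1 coefficients:         ", np.round(lasso.coef_.ravel(), 2))
print("Elastic-net coefficients:", np.round(enet.coef_.ravel(), 2))
```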
QSAR model for predicting neuraminidase inhibitors of influenza A viruses (H1N1) based on adaptive grasshopper optimization algorithm
High dimensionality is one of the major problems that affect the quality of quantitative structure–activity relationship (QSAR) modelling. Obtaining a reliable QSAR model with few descriptors is an essential procedure in chemometrics. The binary grasshopper optimization algorithm (BGOA) is a new meta-heuristic optimization algorithm that has been used successfully for feature selection. In this paper, four new transfer functions were adapted to improve the exploration and exploitation capability of the BGOA in QSAR modelling of influenza A viruses (H1N1). The QSAR model with these new quadratic transfer functions was internally and externally validated based on MSEtrain, the Y-randomization test, MSEtest, and the applicability domain (AD). The validation results indicate that the model is robust and not due to chance correlation. In addition, the results indicate that the descriptor selection and prediction performance of the QSAR model on the training dataset outperform those obtained with the other S-shaped and V-shaped transfer functions; the QSAR model using the quadratic transfer function shows the lowest MSEtrain. For the test dataset, the proposed QSAR model shows a lower MSEtest than the other methods, indicating its higher predictive ability. In conclusion, the results reveal that the proposed approach is efficient for building high-dimensional QSAR models and is useful for estimating the IC50 values of neuraminidase inhibitors that have not been experimentally tested.
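As an illustration of where a transfer function enters binary feature selection, the sketch below maps a continuous position through a hypothetical quadratic transfer function to a binary descriptor mask and scores it by training-set MSE; the transfer function, fitness, and data are assumptions for illustration, not the paper's exact forms, and the grasshopper position-update step is omitted.

```python
# Transfer-function step in binary metaheuristic descriptor selection: a
# continuous position is mapped to selection probabilities, thresholded into a
# binary mask, and scored. All specifics here are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(4)
n, d = 80, 300                                   # compounds x molecular descriptors
X = rng.normal(size=(n, d))
y = X[:, :5] @ np.array([1.0, -0.8, 0.6, 0.5, -0.4]) + 0.1 * rng.normal(size=n)

def quadratic_transfer(v, v_max=4.0):
    """Map a continuous position component to a selection probability in [0, 1]."""
    return np.minimum((v / v_max) ** 2, 1.0)

def fitness(mask, X, y):
    """Training MSE of a linear model on the selected descriptors (lower is better)."""
    if mask.sum() == 0:
        return np.inf
    model = LinearRegression().fit(X[:, mask], y)
    return mean_squared_error(y, model.predict(X[:, mask]))

position = rng.uniform(-2, 2, size=d)            # one agent's continuous position (illustrative range)
probs = quadratic_transfer(position)
mask = rng.random(d) < probs                     # stochastic binarization of the position
print("descriptors selected:", mask.sum(), "  training MSE:", fitness(mask, X, y))
```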