Evaluation of Modified Non-Normal Process Capability Index and Its Bootstrap Confidence Intervals
Process capability indices (PCIs) are used to quantify process performance and have become an active area of research. A variability measure plays an important role in a PCI. The interquartile range (IQR) or the median absolute deviation (MAD) is commonly used as the variability measure when estimating a PCI for a process that follows a non-normal distribution. In this paper, the efficacy of the IQR- and MAD-based PCIs was evaluated under low, moderate, and high asymmetry of the Weibull distribution, using different sample sizes and three different bootstrap confidence intervals. The results reveal that MAD performs better than the IQR, because the former produces less bias and lower mean squared error. The percentile bootstrap confidence interval is recommended for use, because it has a smaller average width and higher coverage probability.
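The plug-in approach described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the Weibull shape, the specification limits, and the normal-consistency scaling constants (IQR/1.349 and 1.4826·MAD) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative assumptions: Weibull process data and specification limits.
data = rng.weibull(2.0, size=100)          # shape 2.0 ~ moderate asymmetry
LSL, USL = 0.0, 2.5                        # hypothetical spec limits

def cp_iqr(x):
    """Cp with sigma estimated from the interquartile range (IQR/1.349)."""
    q75, q25 = np.percentile(x, [75, 25])
    return (USL - LSL) / (6.0 * (q75 - q25) / 1.349)

def cp_mad(x):
    """Cp with sigma estimated from the median absolute deviation (1.4826*MAD)."""
    mad = np.median(np.abs(x - np.median(x)))
    return (USL - LSL) / (6.0 * 1.4826 * mad)

def percentile_bootstrap_ci(x, stat, n_boot=2000, alpha=0.05):
    """Percentile bootstrap: resample with replacement, take empirical quantiles."""
    boots = [stat(rng.choice(x, size=len(x), replace=True)) for _ in range(n_boot)]
    return np.percentile(boots, [100 * alpha / 2, 100 * (1 - alpha / 2)])

for name, stat in [("IQR", cp_iqr), ("MAD", cp_mad)]:
    lo, hi = percentile_bootstrap_ci(data, stat)
    print(f"Cp ({name}-based) = {stat(data):.3f}, 95% percentile CI = [{lo:.3f}, {hi:.3f}]")
```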
Capability Indices for Non-Normal Distribution using Gini’s Mean Difference as Measure of Variability
This paper investigates the efficiency of Gini's mean difference (GMD) as a measure of variability in two commonly used process capability indices (PCIs), Cp and Cpk. A comparison was carried out to evaluate the performance of GMD-based PCIs against the Pearn and Chen quantile-based PCIs under low, moderate, and high asymmetry using the Weibull distribution. The simulation results indicate that, under low and moderate asymmetry, the GMD-based PCIs are closer to the target values than the quantile approach. Besides point estimation, nonparametric bootstrap confidence intervals (standard, percentile, and bias-corrected percentile) and their coverage probabilities were also calculated. Under the quantile approach, the bias-corrected percentile bootstrap (BCPB) method is more effective for both Cp and Cpk, whereas for GMD, the BCPB and percentile bootstrap methods can be used to estimate the confidence intervals of Cp and Cpk, respectively.
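A minimal Python sketch of a GMD-based plug-in index follows. The Weibull sample, the specification limits, and the normal-consistency scaling GMD·sqrt(pi)/2 are illustrative assumptions rather than the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.weibull(1.5, size=80)              # hypothetical Weibull sample
LSL, USL = 0.0, 3.0                        # illustrative spec limits

def gini_mean_difference(x):
    """GMD = average absolute difference over all distinct pairs."""
    diffs = np.abs(x[:, None] - x[None, :])
    n = len(x)
    return diffs.sum() / (n * (n - 1))     # diagonal terms are zero

# For a normal distribution GMD = 2*sigma/sqrt(pi), so one plug-in
# sigma estimate is sigma_hat = GMD * sqrt(pi) / 2.
sigma_hat = gini_mean_difference(x) * np.sqrt(np.pi) / 2.0
mu_hat = np.mean(x)

cp = (USL - LSL) / (6.0 * sigma_hat)
cpk = min(USL - mu_hat, mu_hat - LSL) / (3.0 * sigma_hat)
print(f"GMD-based Cp = {cp:.3f}, Cpk = {cpk:.3f}")
```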
An evaluation of intrusive instrumental intelligibility metrics
Instrumental intelligibility metrics are commonly used as an alternative to listening tests. This paper evaluates 12 monaural intrusive intelligibility metrics: SII, HEGP, CSII, HASPI, NCM, QSTI, STOI, ESTOI, MIKNN, SIMI, SIIB, and sEPSM^corr. In addition, this paper investigates the ability of intelligibility metrics to generalize to new types of distortions and analyzes why the top-performing metrics have high performance. The intelligibility data were obtained from 11 listening tests described in the literature. The stimuli included Dutch, Danish, and English speech that was distorted by additive noise, reverberation, competing talkers, pre-processing enhancement, and post-processing enhancement. SIIB and HASPI had the highest performance, achieving the strongest average correlations with listening test scores. The high performance of SIIB may, in part, be the result of SIIB's developers having access to all the intelligibility data considered in the evaluation. The results show that intelligibility metrics tend to perform poorly on data sets that were not used during their development. By modifying the original implementations of SIIB and STOI, the advantage of reducing statistical dependencies between input features is demonstrated. Additionally, the paper presents a new version of SIIB called SIIB^Gauss, which has similar performance to SIIB and HASPI but is two orders of magnitude faster to compute. (Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2018.)
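One common way to score such metrics, sketched below under assumptions of ours rather than taken from the paper, is to fit a monotone logistic mapping from metric scores to intelligibility per dataset and then report the correlation with listener scores; the data here are synthetic.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import pearsonr

def logistic(x, a, b):
    """Monotone logistic mapping from metric score to intelligibility (%)."""
    return 100.0 / (1.0 + np.exp(a * (x - b)))

def evaluate_metric(metric_scores, listening_scores):
    """Fit the logistic per dataset, then correlate mapped scores with listeners."""
    (a, b), _ = curve_fit(logistic, metric_scores, listening_scores,
                          p0=[-1.0, np.median(metric_scores)], maxfev=10000)
    mapped = logistic(metric_scores, a, b)
    r, _ = pearsonr(mapped, listening_scores)
    return r

# Hypothetical data: one dataset of 30 distortion conditions.
rng = np.random.default_rng(1)
metric = rng.uniform(0, 10, size=30)                     # e.g. SIIB-like scores
listeners = logistic(metric, -0.8, 5.0) + rng.normal(0, 5, size=30)
print(f"per-dataset correlation: {evaluate_metric(metric, listeners):.3f}")
```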
Modeling Binary Time Series Using Gaussian Processes with Application to Predicting Sleep States
Motivated by the problem of predicting sleep states, we develop a mixed
effects model for binary time series with a stochastic component represented by
a Gaussian process. The fixed component captures the effects of covariates on
the binary-valued response. The Gaussian process captures the residual
variations in the binary response that are not explained by covariates and past
realizations. We develop a frequentist modeling framework that provides
efficient inference and more accurate predictions. Results demonstrate the
advantages of improved prediction rates over existing approaches such as
logistic regression, generalized additive mixed models, models for ordinal data,
gradient boosting, decision trees, and random forests. Using our proposed model,
we show that previous sleep state and heart rates are significant predictors
for future sleep states. Simulation studies also show that our proposed method
is promising and robust. To handle computational complexity, we utilize Laplace
approximation, golden section search and successive parabolic interpolation.
With this paper, we also submit an R package (HIBITS) that implements the
proposed procedure. (Journal of Classification, 2018.)
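The generative model described above can be sketched as follows; the squared-exponential covariance, the logistic link, and all numerical settings are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

rng = np.random.default_rng(7)
T = 200
t = np.arange(T)

# Fixed effects: an intercept and one covariate (e.g. standardized heart rate).
X = np.column_stack([np.ones(T), rng.normal(size=T)])
beta = np.array([-0.5, 1.0])                 # illustrative coefficients

# Gaussian-process residual term with a squared-exponential covariance.
length_scale, amp = 20.0, 1.0
K = amp**2 * np.exp(-0.5 * ((t[:, None] - t[None, :]) / length_scale) ** 2)
f = rng.multivariate_normal(np.zeros(T), K + 1e-8 * np.eye(T))

# Binary series: P(y_t = 1) = logistic(x_t' beta + f_t).
p = 1.0 / (1.0 + np.exp(-(X @ beta + f)))
y = rng.binomial(1, p)
print(y[:20])
```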
Emulsion copolymerization of styrene and butyl acrylate in the presence of a chain transfer agent. Part 2: parameters estimability and confidence regions
Accurate estimation of the model parameters is required to obtain reliable predictions of a product's end-use properties. However, due to the mathematical model structure and/or a possible lack of measurements, the estimation of some parameters may be impossible. This paper focuses on the case where the main limitations on parameter estimability are a weak effect on the measured outputs or correlation between the effects of two or more parameters. The objective of the method developed in this paper is to determine the subset of the most influential parameters that can be estimated from the available experimental data when the complete set of model parameters cannot be estimated. The approach has been applied to a mathematical model of the emulsion copolymerization of styrene and butyl acrylate in the presence of n-dodecyl mercaptan as a chain transfer agent. In addition, a new approach is used to better assess the true confidence regions and to evaluate the accuracy of the parameter estimates in a more reliable way.
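Estimability analyses of this kind are often based on sequential orthogonalization of a scaled sensitivity matrix. The sketch below illustrates that general idea under our own assumptions (a synthetic sensitivity matrix and an arbitrary cutoff); it is not the authors' procedure.

```python
import numpy as np

def rank_estimable_parameters(S, cutoff=1e-2):
    """Sequentially select the parameter whose scaled sensitivity column
    has the largest component orthogonal to those already selected."""
    S = S.astype(float)
    remaining = list(range(S.shape[1]))
    selected = []
    R = S.copy()
    while remaining:
        norms = np.linalg.norm(R[:, remaining], axis=0)
        best = int(np.argmax(norms))
        if norms[best] < cutoff:           # effect too weak or too correlated
            break
        selected.append(remaining.pop(best))
        # Project all columns onto the complement of the selected subspace.
        Q, _ = np.linalg.qr(S[:, selected])
        R = S - Q @ (Q.T @ S)
    return selected

# Synthetic scaled sensitivity matrix: 5 parameters, the last nearly a
# linear combination of the first two (hence poorly estimable).
rng = np.random.default_rng(3)
S = rng.normal(size=(50, 5))
S[:, 4] = S[:, 0] + 0.99 * S[:, 1] + 1e-3 * rng.normal(size=50)
print("estimable parameter order:", rank_estimable_parameters(S))
```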
Statistical learning methods as a preprocessing step for survival analysis: evaluation of concept using lung cancer data
Background: Statistical learning (SL) techniques can address non-linear relationships and small datasets but do not provide an output that has an epidemiologic interpretation. Methods: A small set of clinical variables (CVs) for stage-1 non-small cell lung cancer patients was used to evaluate an approach for using SL methods as a preprocessing step for survival analysis. A stochastic method of training a probabilistic neural network (PNN) was used with differential evolution (DE) optimization. Survival scores were derived stochastically by combining CVs with the PNN. Patients (n = 151) were dichotomized into favorable (n = 92) and unfavorable (n = 59) survival outcome groups. These PNN-derived scores were used with logistic regression (LR) modeling to predict favorable survival outcome and were integrated into the survival analysis (i.e., Kaplan-Meier analysis and Cox regression). The hybrid modeling was compared with the respective modeling using raw CVs. The area under the receiver operating characteristic curve (Az) was used to compare model predictive capability. Odds ratios (ORs) and hazard ratios (HRs) were used to compare disease associations, with 95% confidence intervals (CIs). Results: The LR model with the best predictive capability gave Az = 0.703. While controlling for gender and tumor grade, the OR = 0.63 (CI: 0.43, 0.91) per standard deviation (SD) increase in age indicates that increasing age confers unfavorable outcome. The hybrid LR model gave Az = 0.778 by combining age and tumor grade with the PNN and controlling for gender. The PNN score and age translate inversely with respect to risk. The OR = 0.27 (CI: 0.14, 0.53) per SD increase in PNN score indicates that patients with decreased scores confer unfavorable outcome. The tumor-grade-adjusted hazard for patients above the median age compared with those below the median was HR = 1.78 (CI: 1.06, 3.02), whereas the hazard for patients below the median PNN score compared to those above the median was HR = 4.0 (CI: 2.13, 7.14). Conclusion: We have provided preliminary evidence showing that SL preprocessing may provide benefits in comparison with accepted approaches. The work will require further evaluation with varying datasets to confirm these findings.
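A Parzen-window PNN of the kind used here can be sketched in a few lines. In this illustration the smoothing parameter is fixed by hand rather than tuned with differential evolution, and the clinical variables are synthetic.

```python
import numpy as np

def pnn_score(X_train, y_train, X_new, sigma=0.5):
    """Parzen-window PNN: per-class Gaussian kernel density estimates;
    the score is the posterior probability of the favorable class."""
    scores = []
    for x in np.atleast_2d(X_new):
        d2 = np.sum((X_train - x) ** 2, axis=1)
        k = np.exp(-d2 / (2.0 * sigma ** 2))
        p1 = k[y_train == 1].mean()          # favorable-outcome class density
        p0 = k[y_train == 0].mean()
        scores.append(p1 / (p0 + p1 + 1e-12))
    return np.array(scores)

# Hypothetical standardized clinical variables (e.g. age, tumor grade).
rng = np.random.default_rng(5)
X = rng.normal(size=(151, 2))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 1, 151) > 0).astype(int)
score = pnn_score(X, y, X)
print("first five PNN survival scores:", np.round(score[:5], 3))
```

The continuous score can then be entered into logistic regression or Cox models in place of the raw covariates, which is the preprocessing idea the abstract describes.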
Validity Of Scoring Methods In The Presence Of Outliers
Rankings enjoy growing popularity in the economic sciences. Well-known institutions such as the World Economic Forum, the Heritage Foundation, and the OECD use rankings to exert competitive pressure on the ranked countries. To achieve any such desired effects, rankings need to be accepted and approved as a whole, in particular regarding the applied methodology. In order to appeal to wide sections of the population, scoring methods are applied to aggregate a composite indicator. Experience has shown that outliers have a distorting effect on the ranking order and therefore cause economically implausible results, which are a target for criticism. For these reasons, the choice of an adequate scoring method is of great importance. The applied technique should be able to mitigate the distorting effect of outliers without the need for an arbitrary elimination of data points. Although scoring methods have a strong influence on ranking results, scientific analysis is often more concerned with the optimal choice of indicators or the weighting scheme, whereas the impact of extreme values is not addressed. Accordingly, the present research addresses the question of which scoring method is the best choice in the presence of outliers. Evidence is given that logistic function methods have the ability to mitigate outlier distortion effects. The analysis approaches the issue from two angles: it derives the statistical strengths and weaknesses of scoring methods for the ranking process theoretically, and it highlights the bootstrap technique for assessing the validity of score results in the presence of outliers.
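The contrast between a linear min-max score and a logistic score can be illustrated as follows; the robust centering and scaling (median and MAD) used in the logistic variant are our illustrative choice, not necessarily the parameterization analyzed in the paper.

```python
import numpy as np

values = np.array([1.0, 1.2, 0.9, 1.1, 1.3, 25.0])   # last entry is an outlier

def minmax_score(x):
    """Linear min-max scoring: the outlier stretches the scale, so the
    regular observations become nearly indistinguishable."""
    return (x - x.min()) / (x.max() - x.min())

def logistic_score(x):
    """Logistic scoring with robust centering (median) and scaling (MAD):
    extreme values saturate near 0 or 1 instead of distorting the scale."""
    mad = np.median(np.abs(x - np.median(x)))
    z = (x - np.median(x)) / (1.4826 * mad)
    return 1.0 / (1.0 + np.exp(-z))

print("min-max :", np.round(minmax_score(values), 3))
print("logistic:", np.round(logistic_score(values), 3))
```

Under min-max scoring the five regular values collapse into a sliver of the unit interval, while under the logistic score they remain separated and the outlier simply saturates near 1.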
Wavelet Neural Networks: A Practical Guide
Wavelet networks (WNs) are a new class of networks that have been used with great success in a wide range of applications. However, a generally accepted framework for applying WNs is missing from the literature. In this study, we present a complete statistical model identification framework for applying WNs in various applications. The following subjects are thoroughly examined: the structure of a WN, training methods, initialization algorithms, variable significance and variable selection algorithms, model selection methods, and finally methods to construct confidence and prediction intervals. In addition, the complexity of each algorithm is discussed. The proposed framework was tested on two simulated cases, on a chaotic time series described by the Mackey-Glass equation, and on three real datasets: daily temperatures in Berlin, daily wind speeds in New York, and breast cancer classification. Our results show that the proposed algorithms produce stable and robust results, indicating that the proposed framework can be applied in various applications.
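A minimal wavelet network along these lines, with a single hidden layer of Mexican-hat wavelons on a grid and linear output weights fitted by least squares, is sketched below; the grid initialization and fixed dilations are simplifications of the framework described, chosen only to keep the example short.

```python
import numpy as np

def mexican_hat(u):
    """Mother wavelet psi(u) = (1 - u^2) * exp(-u^2 / 2)."""
    return (1.0 - u ** 2) * np.exp(-0.5 * u ** 2)

def wavelet_features(x, translations, dilations):
    """One hidden layer of wavelons: psi((x - t_j) / d_j)."""
    u = (x[:, None] - translations[None, :]) / dilations[None, :]
    return mexican_hat(u)

# Toy target on [0, 1]; wavelon translations on a grid (one of several
# initialization schemes a full framework would compare).
rng = np.random.default_rng(2)
x = np.linspace(0, 1, 200)
y = np.sin(6 * np.pi * x) * np.exp(-2 * x) + rng.normal(0, 0.05, 200)

translations = np.linspace(0, 1, 15)
dilations = np.full(15, 0.1)
Phi = np.column_stack([wavelet_features(x, translations, dilations),
                       np.ones_like(x)])     # wavelons + bias term
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # linear output weights
print("training RMSE:", np.sqrt(np.mean((Phi @ w - y) ** 2)).round(4))
```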