Influence of observations on the misclassification probability in quadratic discriminant analysis.
This paper analyzes how observations in the training sample affect the misclassification probability of a quadratic discriminant rule. An approach based on partial influence functions is followed, which allows one to quantify the effect of observations in the training sample on the quality of the associated classification rule. The focus is on the effect on the future misclassification rate rather than on the parameters of the quadratic discriminant rule. The expression for the influence function is then used to construct a diagnostic tool for detecting influential observations. Applications to real data sets are provided.

Keywords: Applications; Classification; Data; Diagnostics; Discriminant analysis; Functions; Influence function; Misclassification probability; Outliers; Partial influence functions; Probability; Quadratic discriminant analysis; Quality; Robust covariance estimation; Robust regression; Training
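The influence of a training observation on the misclassification rate can be illustrated empirically. The sketch below is not the paper's partial-influence-function approach; it simply fits a quadratic discriminant rule with numpy on simulated two-class Gaussian data and measures how deleting each training observation changes the test error (a leave-one-out proxy for influence):

```python
# Illustrative sketch only (empirical deletion diagnostic, not the paper's
# partial influence functions): how much does removing one training point
# change the test misclassification rate of a quadratic discriminant rule?
import numpy as np

rng = np.random.default_rng(0)

def fit_qda(X, y):
    """Estimate per-class mean, covariance, and prior."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), np.cov(Xc, rowvar=False), len(Xc) / len(X))
    return params

def predict_qda(params, X):
    """Assign each point to the class with the largest Gaussian log-density."""
    scores = []
    for c, (mu, Sigma, prior) in sorted(params.items()):
        diff = X - mu
        inv = np.linalg.inv(Sigma)
        logdet = np.linalg.slogdet(Sigma)[1]
        s = (-0.5 * np.einsum("ij,jk,ik->i", diff, inv, diff)
             - 0.5 * logdet + np.log(prior))
        scores.append(s)
    return np.argmax(np.stack(scores, axis=1), axis=1)

# Two Gaussian classes with unequal covariances (hence *quadratic* boundary).
X = np.vstack([rng.normal([0, 0], [1.0, 1.0], size=(60, 2)),
               rng.normal([2, 2], [0.5, 1.5], size=(60, 2))])
y = np.repeat([0, 1], 60)
Xtest = np.vstack([rng.normal([0, 0], [1.0, 1.0], size=(500, 2)),
                   rng.normal([2, 2], [0.5, 1.5], size=(500, 2))])
ytest = np.repeat([0, 1], 500)

base_err = np.mean(predict_qda(fit_qda(X, y), Xtest) != ytest)

# Empirical "influence" of point i: change in test error when i is deleted.
influence = np.array([
    np.mean(predict_qda(fit_qda(np.delete(X, i, 0), np.delete(y, i)),
                        Xtest) != ytest) - base_err
    for i in range(len(X))
])
print(f"baseline test error: {base_err:.3f}")
print("most harmful training point:", int(np.argmin(influence)))
```

Points whose deletion lowers the test error the most are flagged as influential, in the same spirit as the paper's diagnostic tool.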
A review of associative classification mining
Associative classification mining is a promising approach in data mining that utilizes the
association rule discovery techniques to construct classification systems, also known as
associative classifiers. In the last few years, a number of associative classification algorithms
have been proposed, e.g. CPAR, CMAR, MCAR, MMAC and others. These algorithms
employ several different rule discovery, rule ranking, rule pruning, rule prediction and rule
evaluation methods. This paper focuses on surveying and comparing the state-of-the-art associative
classification techniques with regard to the above criteria. Finally, future directions in associative
classification, such as incremental learning and mining low-quality data sets, are also
highlighted in this paper.
Active Clinical Trials for Personalized Medicine
Individualized treatment rules (ITRs) tailor treatments according to
individual patient characteristics. They can significantly improve patient care
and are thus becoming increasingly popular. The data collected during
randomized clinical trials are often used to estimate the optimal ITRs.
However, these trials are generally expensive to run, and, moreover, they are
not designed to efficiently estimate ITRs. In this paper, we propose a
cost-effective estimation method from an active learning perspective. In
particular, our method recruits only the "most informative" patients (in terms
of learning the optimal ITRs) from an ongoing clinical trial. Simulation
studies and real-data examples show that our active clinical trial method
significantly improves on competing methods. We derive risk bounds and show
that they support these observed empirical advantages.

Comment: 48 pages, 9 figures. To appear in JASA--T&
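The recruitment idea can be illustrated with a simple pool-based selection sketch. This is an uncertainty-sampling stand-in, not the authors' actual selection criterion: fit a working model with a treatment-by-covariate interaction, then recruit the candidate patient whose estimated individualized treatment effect is closest to zero, i.e. whose optimal treatment is least certain:

```python
# Illustrative sketch of "recruit the most informative patient"
# (uncertainty sampling on the estimated treatment effect; the paper's
# actual criterion and risk analysis are more involved).
import numpy as np

rng = np.random.default_rng(1)

def fit_effect_model(X, a, y):
    """Least-squares fit of outcome on (1, covariates, treatment, interactions)."""
    Z = np.column_stack([np.ones(len(X)), X, a, X * a[:, None]])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta

def estimated_effect(beta, X):
    """Estimated benefit of treatment a=1 over a=0 per patient."""
    d = X.shape[1]
    return beta[1 + d] + X @ beta[2 + d:]

# Enrolled patients: covariates X, randomized treatment a, outcome y.
n, d = 40, 2
X = rng.normal(size=(n, d))
a = rng.integers(0, 2, size=n)
true_effect = X[:, 0]                      # treatment helps when x0 > 0
y = X.sum(axis=1) + a * true_effect + rng.normal(scale=0.5, size=n)

beta = fit_effect_model(X, a, y)

# Candidate pool: recruit the patient whose optimal treatment is most ambiguous.
candidates = rng.normal(size=(100, d))
scores = np.abs(estimated_effect(beta, candidates))
print("recruit candidate index:", int(np.argmin(scores)))
```

Patients far from the decision boundary add little information about the optimal ITR, which is why selecting near-boundary patients can cut trial cost, in the spirit of the abstract.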