Action following the discovery of a global association between the whole genome and adverse event risk in a clinical drug-development programme

Authors: Breiman, Breslow, Dudbridge, Díaz-Uriarte, Efron, Fielding, Genkin, Graven, Hastie, Izmirlian, Kelly, Lindley, Park, Ripley, Roses, Tibshirani, Troyanskaya, Wille, Xiong, Yuan, Zaykin
Publication venue: 'Wiley'
Publication date: 01/01/2008
Observation of adverse drug reactions during drug development can cause closure of the whole programme. However, if an association between the genotype and the risk of an adverse event is discovered, then it might suffice to exclude patients of certain genotypes from future recruitment. Various sequential and non-sequential procedures are available to identify an association between the whole genome, or at least a portion of it, and the incidence of adverse events. In this paper we start with a suspected association between the genotype and the risk of an adverse event and suppose that the genetic subgroups with elevated risk can be identified. Our focus is on determining whether the patients identified as being at risk should be excluded from further studies of the drug. We propose using a utility function to determine the appropriate action, taking into account the relative costs of suffering an adverse reaction and of failing to alleviate the patient's disease. Two illustrative examples are presented, one comparing patients who suffer from an adverse event with contemporary patients who do not, and the other making use of a reference control group. We also illustrate two classification methods, LASSO and CART, for identifying patients at risk, but we stress that any appropriate classification method could be used in conjunction with the proposed utility function. Our emphasis is on determining the action to take rather than on providing definitive evidence of an association.
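The trade-off described in the abstract, between the cost of an adverse reaction and the cost of leaving the disease untreated, can be sketched as a simple expected-loss comparison. This is only an illustration under assumed inputs, not the utility function proposed in the paper: the function names, the probabilities, and the cost values are all hypothetical.

```python
def expected_loss_include(p_adverse, p_response, cost_adverse, cost_untreated):
    """Expected loss if an at-risk patient is treated (hypothetical model).

    The patient may suffer an adverse event, and may still fail to respond.
    """
    return p_adverse * cost_adverse + (1.0 - p_response) * cost_untreated


def expected_loss_exclude(cost_untreated):
    """Expected loss if the patient is excluded: the disease is not alleviated."""
    return cost_untreated


def should_exclude(p_adverse, p_response, cost_adverse, cost_untreated):
    """Exclude the subgroup when treating it has the larger expected loss."""
    include = expected_loss_include(p_adverse, p_response, cost_adverse, cost_untreated)
    exclude = expected_loss_exclude(cost_untreated)
    return include > exclude


# Hypothetical numbers: an adverse event is assumed ten times as costly
# as an unalleviated disease, and half of treated patients respond.
high_risk = should_exclude(p_adverse=0.30, p_response=0.5, cost_adverse=10.0, cost_untreated=1.0)
low_risk = should_exclude(p_adverse=0.01, p_response=0.5, cost_adverse=10.0, cost_untreated=1.0)
```

Under this assumed model the decision reduces to checking whether `p_adverse * cost_adverse` exceeds `p_response * cost_untreated`, so only the ratio of the two costs matters.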

Central Archive at the University of Reading

Crossref

Warwick Research Archives Portal Repository

Lancaster E-Prints

A rare event classification in the advanced manufacturing system: focused on imbalanced datasets

Author: Lee Wonjae
Publication venue: 'University of Missouri Libraries'

In many industrial applications, classification tasks involve training datasets with imbalanced class labels. Imbalanced datasets can severely affect the accuracy of class predictions, so they need appropriate preprocessing before analysis, since most machine learning techniques assume that the input data is balanced. When the imbalance problem arises in a high-dimensional space, feature extraction can be applied. In Chapter 2, we present two feature extraction techniques for time series data, called CL-LNN and RD-LNN, which are based on the nearest neighbor and combined with machine learning algorithms to detect a failure of paper manufacturing machinery before it occurs, using multi-stream system monitoring data. The nearest neighbor is applied to each feature separately rather than to all 61 features at once, to address the curse of dimensionality. The skewness between class labels can also be addressed by either oversampling the minority class or downsampling the majority class. In Chapter 3, we seek a better way of downsampling by selecting the most informative samples in the given imbalanced dataset through an active learning strategy, to mitigate the effect of imbalanced class labels. The data selection for downsampling is performed with a criterion used in optimal experimental design, under which the generalization error of the trained model is minimized sequentially, with penalized logistic regression as the classification model. We also suggest that performance is significantly improved, especially on highly imbalanced datasets (e.g., an imbalance ratio greater than ten), when hyper-parameter tuning and the cost-weight method are applied to the active downsampling technique. The research is further extended to cover nonlinearity using nonparametric logistic regression, and performance-based active learning (PBAL) is proposed to improve on existing criteria such as D-optimality and A-optimality.

Includes bibliographical references.
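The D-optimality and A-optimality criteria named in the abstract can be illustrated with a minimal sketch for a logistic model with an intercept and one scalar feature. This is not the thesis's algorithm: the function names, the fixed coefficients `beta0` and `beta1`, the small `ridge` term, and the candidate values are all assumptions made for illustration.

```python
import math


def fisher_weight(x, beta0, beta1):
    """Logistic-regression information weight p * (1 - p) at scalar feature x."""
    p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))
    return p * (1.0 - p)


def information(xs, beta0, beta1, ridge=1e-6):
    """Entries (m00, m01, m11) of the 2x2 Fisher information for rows (1, x).

    The ridge term (an assumption) keeps the matrix invertible for tiny designs.
    """
    m00 = m01 = m11 = 0.0
    for x in xs:
        w = fisher_weight(x, beta0, beta1)
        m00 += w
        m01 += w * x
        m11 += w * x * x
    return m00 + ridge, m01, m11 + ridge


def d_criterion(xs, beta0, beta1):
    """D-optimality: determinant of the information matrix (larger is better)."""
    m00, m01, m11 = information(xs, beta0, beta1)
    return m00 * m11 - m01 * m01


def a_criterion(xs, beta0, beta1):
    """A-optimality: trace of the inverse information (smaller is better)."""
    m00, m01, m11 = information(xs, beta0, beta1)
    return (m00 + m11) / (m00 * m11 - m01 * m01)


def pick_next(selected, candidates, beta0, beta1):
    """One greedy step: the candidate that maximizes the D-criterion."""
    return max(candidates, key=lambda x: d_criterion(selected + [x], beta0, beta1))


# Hypothetical pool of majority-class feature values to downsample from.
chosen = pick_next(selected=[0.0], candidates=[0.0, 0.1, 2.0], beta0=0.0, beta1=0.0)
```

Repeating `pick_next` and refitting the coefficients after each step would give a sequential selection loop. In this toy case the criterion favours the candidate farthest from the point already selected, because it adds the most information about the slope.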

University of Missouri: MOspace