66 research outputs found

    An AUC-based Permutation Variable Importance Measure for Random Forests

    The random forest (RF) method is a commonly used tool for classification with high dimensional data as well as for ranking candidate predictors based on the so-called random forest variable importance measures (VIMs). However, the classification performance of RF is known to be suboptimal in the case of strongly unbalanced data, i.e. data where response class sizes differ considerably. Suggestions were made to obtain better classification performance based either on sampling procedures or on cost sensitivity analyses. However, to our knowledge the performance of the VIMs has not yet been examined in the case of unbalanced response classes. In this paper we explore the performance of the permutation VIM for unbalanced data settings and introduce an alternative permutation VIM based on the area under the curve (AUC) that is expected to be more robust towards class imbalance. We investigated the performance of the standard permutation VIM and of our novel AUC-based permutation VIM for different class imbalance levels using simulated data and real data. The results suggest that, as class imbalance increases, the standard permutation VIM loses its ability to discriminate between predictors associated with the response and predictors not associated with it. It is outperformed by our new AUC-based permutation VIM for unbalanced data settings, while the performance of both VIMs is very similar in the case of balanced classes. The new AUC-based VIM is implemented in the R package party for the unbiased RF variant based on conditional inference trees. The codes implementing our study are available from the companion website: http://www.ibe.med.uni-muenchen.de/organisation/mitarbeiter/070_drittmittel/janitza/index.html
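    The idea of an AUC-based permutation importance can be sketched in a few lines of Python. This is a simplified illustration, not the implementation in the R package party: it scores one fixed held-out sample instead of the out-of-bag observations of each tree, and the names `auc`, `auc_permutation_importance`, `score` and the toy data are all assumptions made for the example.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney statistic (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one observation of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_permutation_importance(score_fn, X, y, feature, rng):
    """Drop in AUC after randomly shuffling the values of one predictor."""
    baseline = auc(y, [score_fn(row) for row in X])
    column = [row[feature] for row in X]
    rng.shuffle(column)
    permuted = [row[:feature] + [v] + row[feature + 1:]
                for row, v in zip(X, column)]
    return baseline - auc(y, [score_fn(row) for row in permuted])


def score(row):
    """Hypothetical fitted model: it only looks at predictor 0."""
    return row[0]


rng = random.Random(0)
# Hypothetical unbalanced sample: 3 positives among 20 observations.
y = [1] * 3 + [0] * 17
# Predictor 0 separates the classes, predictor 1 is pure noise.
X = [[label + rng.random(), rng.random()] for label in y]

imp_used = auc_permutation_importance(score, X, y, feature=0, rng=rng)
imp_unused = auc_permutation_importance(score, X, y, feature=1, rng=rng)
print(imp_used, imp_unused)
```

    Because the AUC compares every positive with every negative, it does not depend on the class proportions in the way the error rate does, which is the intuition behind using it under class imbalance.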

    $^{6}$He + $\alpha$ clustering in $^{10}$Be

    In a kinematically complete measurement of the $^{7}$Li($^{7}$Li,$\alpha$$^{6}$He)$^{4}$He reaction at $E_{i}$ = 8 MeV it was observed that the $^{10}$Be excited states at 9.6 and 10.2 MeV decay by $^{6}$He emission. The state at 10.2 MeV may be a member of a rotational band based on the 6.18 MeV 0$^+$ state. Comment: 9 pages, RevTex, 3 Postscript figures (tarred, gzipped and uuencoded) include

    Exploring synergetic effects of dimensionality reduction and resampling tools on hyperspectral imagery data classification

    The present paper addresses the problem of classifying hyperspectral images with multiple imbalanced classes and very high dimensionality. Class imbalance is handled by resampling the data set, whereas PCA and a supervised filter are applied to reduce the number of spectral bands. This is a preliminary study that aims to investigate the benefits of combining several techniques to tackle the imbalance and high dimensionality problems, and also to evaluate the order of application that leads to the best classification performance. Experimental results demonstrate the significance of using these two preprocessing tools together to improve the performance of hyperspectral imagery classification. Although it seems that the most effective order is to apply a resampling strategy first and then a feature selection (or extraction) algorithm, this is a question that still needs much more thorough investigation in the future. This work has partially been supported by the Spanish Ministry of Education and Science under grants CSD2007–00018, AYA2008–05965–0596 and TIN2009–14205, the Fundació Caixa Castelló–Bancaixa under grant P1–1B2009–04, and the Generalitat Valenciana under grant PROMETEO/2010/02

    Prediction of Preterm Deliveries from EHG Signals Using Machine Learning

    There has been some improvement in the treatment of preterm infants, which has helped to increase their chance of survival. However, the rate of premature births is still increasing globally. As a result, this group of infants is most at risk of developing severe medical conditions that can affect the respiratory, gastrointestinal, immune, central nervous, auditory and visual systems. In extreme cases, this can also lead to long-term conditions, such as cerebral palsy, mental retardation, learning difficulties, and poor health and growth. In the US alone, the societal and economic cost of preterm births was estimated in 2005 at $26.2 billion per annum. In the UK, this value was close to £2.95 billion in 2009. Many believe that a better understanding of why preterm births occur, and a strategic focus on prevention, will help to improve the health of children and reduce healthcare costs. At present, most methods of preterm birth prediction are subjective. However, a strong body of evidence suggests that the analysis of uterine electrical signals (Electrohysterography) could provide a viable way of diagnosing true labour and predicting preterm deliveries. Most Electrohysterography studies focus on true labour detection during the final seven days before labour. The challenge is to utilise Electrohysterography techniques to predict preterm delivery earlier in the pregnancy. This paper explores this idea further and presents a supervised machine learning approach that classifies term and preterm records, using an open source dataset containing 300 records (38 preterm and 262 term). The synthetic minority oversampling technique is used to oversample the minority preterm class, and cross-validation techniques are used to evaluate the dataset against other similar studies. Our approach shows an improvement on existing studies with 96% sensitivity, 90% specificity, and a 95% area under the curve value with 8% global error using the polynomial classifier.
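    The oversampling step mentioned in this abstract can be illustrated with a minimal Python sketch of the synthetic minority oversampling technique. It is written under stated assumptions rather than taken from the paper: the two-dimensional feature vectors are random placeholders, the neighbourhood size `k=5` is an arbitrary choice, and generating 262 - 38 = 224 synthetic records assumes the classes are fully balanced, which the abstract does not specify.

```python
import random


def smote(minority, n_synthetic, k, rng):
    """Create synthetic minority points by interpolating between a randomly
    chosen minority point and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


rng = random.Random(42)
# Placeholder features for 38 minority (preterm) records; not real EHG data.
preterm = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(38)]
# Assumption: oversample until the minority matches the 262 term records.
new_records = smote(preterm, n_synthetic=262 - 38, k=5, rng=rng)
print(len(preterm) + len(new_records))
```

    When combined with cross-validation, synthetic records are normally generated from the training folds only, so that no interpolated copy of a test record leaks into training.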

    Classification of Caesarean Section and Normal Vaginal Deliveries Using Foetal Heart Rate Signals and Advanced Machine Learning Algorithms

    ABSTRACT – Background: Visual inspection of Cardiotocography traces by obstetricians and midwives is the gold standard for monitoring the wellbeing of the foetus during antenatal care. However, inter- and intra-observer variability is high with only a 30% positive predictive value for the classification of pathological outcomes. This has a significant negative impact on the perinatal foetus and often results in cardio-pulmonary arrest, brain and vital organ damage, cerebral palsy, hearing, visual and cognitive defects and in severe cases, death. This paper shows that using machine learning and foetal heart rate signals provides direct information about the foetal state and helps to filter the subjective opinions of medical practitioners when used as a decision support tool. The primary aim is to provide a proof-of-concept that demonstrates how machine learning can be used to objectively determine when medical intervention, such as caesarean section, is required and help avoid preventable perinatal deaths. Methodology: This is evidenced using an open dataset that comprises 506 controls (normal vaginal deliveries) and 46 cases (caesarean due to pH ≤7.05 and pathological risk). Several machine-learning algorithms are trained, and validated, using binary classifier performance measures. Results: The findings show that deep learning classification achieves Sensitivity = 94%, Specificity = 91%, Area under the Curve = 99%, F-Score = 100%, and Mean Square Error = 1%. Conclusions: The results demonstrate that machine learning significantly improves the efficiency of detecting caesarean section and normal vaginal deliveries using foetal heart rate signals compared with obstetrician and midwife predictions and systems reported in previous studies.
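    For readers unfamiliar with the binary classifier performance measures named in this abstract, the short Python sketch below shows how sensitivity, specificity, positive predictive value and F-score follow from a confusion matrix. The counts are hypothetical, chosen only to match the class sizes of the dataset (46 cases, 506 controls); they are not the paper's results, and the function name `binary_metrics` is an assumption.

```python
def binary_metrics(tp, fn, tn, fp):
    """Standard binary classification measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of true cases that are detected
    specificity = tn / (tn + fp)   # share of controls correctly rejected
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    f_score = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "f_score": f_score,
    }


# Hypothetical counts: 46 cases (43 detected) and 506 controls (460 rejected).
metrics = binary_metrics(tp=43, fn=3, tn=460, fp=46)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    With classes this unbalanced, the positive predictive value and F-score depend strongly on the number of false positives among the controls, so they can differ markedly from sensitivity and specificity.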

    An insight into imbalanced Big Data classification: outcomes and challenges

    Big Data applications have been emerging in recent years, and researchers from many disciplines are aware of the considerable advantages of extracting knowledge from this type of problem. However, traditional learning approaches cannot be directly applied due to scalability issues. To overcome this issue, the MapReduce framework has arisen as a “de facto” solution. Basically, it carries out a “divide-and-conquer” distributed procedure in a fault-tolerant way that is adapted to commodity hardware. As this is still a recent discipline, little research has been conducted on imbalanced classification for Big Data. The reasons behind this are mainly the difficulties in adapting standard techniques to the MapReduce programming style. Additionally, intrinsic problems of imbalanced data, namely lack of data and small disjuncts, are accentuated when the data are partitioned to fit the MapReduce programming style. This paper is built on three main pillars. First, to present the first outcomes for imbalanced classification in Big Data problems, introducing the current research state of this area. Second, to analyze the behavior of standard pre-processing techniques in this particular framework. Finally, taking into account the experimental results obtained throughout this work, we will carry out a discussion on the challenges and future directions for the topic. This work has been partially supported by the Spanish Ministry of Science and Technology under Projects TIN2014-57251-P and TIN2015-68454-R, the Andalusian Research Plan P11-TIC-7765, the Foundation BBVA Project 75/2016 BigDaPTOOLS, and the National Science Foundation (NSF) Grant IIS-1447795

    Enhancement strategies for transdermal drug delivery systems: current trends and applications

    A 12-gene pharmacogenetic panel to prevent adverse drug reactions: an open-label, multicentre, controlled, cluster-randomised crossover implementation study

    Background The benefit of pharmacogenetic testing before starting drug therapy has been well documented for several single gene–drug combinations. However, the clinical utility of a pre-emptive genotyping strategy using a pharmacogenetic panel has not been rigorously assessed. Methods We conducted an open-label, multicentre, controlled, cluster-randomised, crossover implementation study of a 12-gene pharmacogenetic panel in 18 hospitals, nine community health centres, and 28 community pharmacies in seven European countries (Austria, Greece, Italy, the Netherlands, Slovenia, Spain, and the UK). Patients aged 18 years or older receiving a first prescription for a drug clinically recommended in the guidelines of the Dutch Pharmacogenetics Working Group (ie, the index drug) as part of routine care were eligible for inclusion. Exclusion criteria included previous genetic testing for a gene relevant to the index drug, a planned duration of treatment of less than 7 consecutive days, and severe renal or liver insufficiency. All patients gave written informed consent before taking part in the study. Participants were genotyped for 50 germline variants in 12 genes, and those with an actionable variant (ie, a drug–gene interaction test result for which the Dutch Pharmacogenetics Working Group [DPWG] recommended a change to standard-of-care drug treatment) were treated according to DPWG recommendations. Patients in the control group received standard treatment. To prepare clinicians for pre-emptive pharmacogenetic testing, local teams were educated during a site-initiation visit and online educational material was made available. The primary outcome was the occurrence of clinically relevant adverse drug reactions within the 12-week follow-up period. Analyses were irrespective of patient adherence to the DPWG guidelines. 
    The primary analysis was done using a gatekeeping analysis, in which outcomes in people with an actionable drug–gene interaction in the study group versus the control group were compared, and only if the difference was statistically significant was an analysis done that included all of the patients in the study. Outcomes were compared between the study and control groups, both for patients with an actionable drug–gene interaction test result (ie, a result for which the DPWG recommended a change to standard-of-care drug treatment) and for all patients who received at least one dose of index drug. The safety analysis included all participants who received at least one dose of a study drug. This study is registered with ClinicalTrials.gov, NCT03093818 and is closed to new participants. Findings Between March 7, 2017, and June 30, 2020, 41696 patients were assessed for eligibility and 6944 (51·4% female, 48·6% male; 97·7% self-reported European, Mediterranean, or Middle Eastern ethnicity) were enrolled and assigned to receive genotype-guided drug treatment (n=3342) or standard care (n=3602). 99 patients (52 [1·6%] of the study group and 47 [1·3%] of the control group) withdrew consent after group assignment. 652 participants (367 [11·0%] in the study group and 285 [7·9%] in the control group) were lost to follow-up. In patients with an actionable test result for the index drug (n=1558), a clinically relevant adverse drug reaction occurred in 152 (21·0%) of 725 patients in the study group and 231 (27·7%) of 833 patients in the control group (odds ratio [OR] 0·70 [95% CI 0·54–0·91]; p=0·0075), whereas for all patients, the incidence was 628 (21·5%) of 2923 patients in the study group and 934 (28·6%) of 3270 patients in the control group (OR 0·70 [95% CI 0·61–0·79]; p Horizon 2020 (H2020) Genetics of disease, diagnosis and treatmen
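    As a rough arithmetic check on the figures quoted in this abstract, the Python sketch below computes crude (unadjusted) odds ratios directly from the reported event counts. The helper `odds_ratio` is an assumption introduced for illustration. The crude values come out slightly below the reported 0·70, which is to be expected if the published estimates come from the study's own statistical model rather than from a raw 2×2 table.

```python
def odds_ratio(events_a, total_a, events_b, total_b):
    """Crude odds ratio of group A versus group B from event counts."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b


# Patients with an actionable test result: study group vs control group.
or_actionable = odds_ratio(152, 725, 231, 833)
# All patients: study group vs control group.
or_all = odds_ratio(628, 2923, 934, 3270)

print(f"actionable: {or_actionable:.3f}, all patients: {or_all:.3f}")
# Incidence percentages as quoted in the abstract.
print(f"{152 / 725:.1%} vs {231 / 833:.1%}; {628 / 2923:.1%} vs {934 / 3270:.1%}")
```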