66 research outputs found

An AUC-based Permutation Variable Importance Measure for Random Forests

Author: A Estabrooks
AL Boulesteix
Anne-Laure Boulesteix
C Chen
C Liu
C Strobl
Carolin Strobl
F Briggs
G Batista
J Chang
J Van Hulse
K Nicodemus
KK Nicodemus
L Breiman
M Calle
M Cummings
M Khalilia
M Kubat
M Pepe
N Japkowicz
R Blagus
Silke Janitza
T Fawcett
T Hothorn
T Khoshgoftaar
WJ Lin
Y Huang
Y Sun
Y Xie
Publication date: 01/11/2012

The random forest (RF) method is a commonly used tool for classification with high-dimensional data as well as for ranking candidate predictors based on the so-called random forest variable importance measures (VIMs). However, the classification performance of RF is known to be suboptimal in the case of strongly unbalanced data, i.e. data where response class sizes differ considerably. Suggestions were made to obtain better classification performance based either on sampling procedures or on cost sensitivity analyses. However, to our knowledge the performance of the VIMs has not yet been examined in the case of unbalanced response classes. In this paper we explore the performance of the permutation VIM for unbalanced data settings and introduce an alternative permutation VIM based on the area under the curve (AUC) that is expected to be more robust towards class imbalance. We investigated the performance of the standard permutation VIM and of our novel AUC-based permutation VIM for different class imbalance levels using simulated data and real data. The results suggest that, as class imbalance increases, the standard permutation VIM loses its ability to discriminate between predictors associated with the response and predictors not associated with it. It is outperformed by our new AUC-based permutation VIM for unbalanced data settings, while the performance of both VIMs is very similar in the case of balanced classes. The new AUC-based VIM is implemented in the R package party for the unbiased RF variant based on conditional inference trees. The code implementing our study is available from the companion website: http://www.ibe.med.uni-muenchen.de/organisation/mitarbeiter/070_drittmittel/janitza/index.html
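
The idea behind an AUC-based permutation importance can be sketched in plain Python. This is only an illustration under stated assumptions, not the implementation in the party package: the paper's measure is computed per tree within a forest, whereas here `score_fn` is a hypothetical stand-in for any fitted model's scoring function, and the names `auc` and `auc_permutation_importance` as well as the toy data are invented for the example.

```python
import random

def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney statistic (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one observation per class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def auc_permutation_importance(score_fn, rows, labels, feature, n_repeats=20, seed=0):
    """Mean drop in AUC after randomly permuting one feature column."""
    rng = random.Random(seed)
    baseline = auc(labels, [score_fn(r) for r in rows])
    drops = []
    for _ in range(n_repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        # Rebuild every row with the shuffled value in position `feature`.
        permuted = [r[:feature] + [v] + r[feature + 1:] for r, v in zip(rows, column)]
        drops.append(baseline - auc(labels, [score_fn(r) for r in permuted]))
    return sum(drops) / len(drops)

# Hypothetical unbalanced toy data: feature 0 carries signal, feature 1 is noise.
rng = random.Random(1)
labels = [1] * 10 + [0] * 90
rows = [[rng.gauss(2.0 * y, 1.0), rng.gauss(0.0, 1.0)] for y in labels]
score_fn = lambda r: r[0]  # stand-in "model" that only looks at feature 0

imp_signal = auc_permutation_importance(score_fn, rows, labels, feature=0)
imp_noise = auc_permutation_importance(score_fn, rows, labels, feature=1)
```

Because the AUC is built from comparisons between one positive and one negative observation, each class contributes equally regardless of its size, which is the intuition for why such a measure may be less sensitive to class imbalance than an error-rate-based one.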

CiteSeerX

Crossref

Springer - Publisher Connector

Open Access LMU (Ludwig-Maximilians-Univ. München)

PubMed Central

ZORA

$^{6}$He + $\alpha$ clustering in $^{10}$Be

Author: Bochkarev O. V.
C Spitaleri
D Miljanić
D Rendić
Gabr M.
Glukhov Yu. A.
M Bogovac
M Lattuada
M Milin
M Zadro
N Soić
Nishioka H.
S Blagus
S Fazinić
Soic N.
T Tadić
Publication venue: 'IOP Publishing'
Publication date: 12/09/1995

In a kinematically complete measurement of the $^{7}$Li($^{7}$Li, $\alpha$ $^{6}$He)$^{4}$He reaction at $E_{i}$ = 8 MeV it was observed that the $^{10}$Be excited states at 9.6 and 10.2 MeV decay by $^{6}$He emission. The state at 10.2 MeV may be a member of a rotational band based on the 6.18 MeV 0$^+$ state. Comment: 9 pages, RevTex, 3 Postscript figures (tarred, gzipped and uuencoded) include

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Exploring synergetic effects of dimensionality reduction and resampling tools on hyperspectral imagery data classification

Author: A. Martínez-Usó
D.A. Landgrebe
D.P. Williams
F. Melgani
H. He
I.T. Jolliffe
J.A. Richards
J.C. Platt
J.R. Quinlan
L. Breiman
L. Bruzzone
L.O. Jiménez
M. Hall
M. Kubat
M. Trebar
M. Wasikowski
N. Japkowicz
N.V. Chawla
P.H. Hsu
R. Blagus
S. García
T. Fawcett
V. Kecman
V.N. Vapnik
X. Chen
Z.H. Zhou
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

The present paper addresses the problem of classifying hyperspectral images with multiple imbalanced classes and very high dimensionality. Class imbalance is handled by resampling the data set, whereas PCA and a supervised filter are applied to reduce the number of spectral bands. This is a preliminary study that aims to investigate the benefits of combining several techniques to tackle the imbalance and high-dimensionality problems, and also to evaluate the order of application that leads to the best classification performance. Experimental results demonstrate the significance of using these two preprocessing tools together to improve the performance of hyperspectral imagery classification. Although it seems that the most effective order is to apply a resampling strategy first and then a feature selection (or extraction) algorithm, this question still needs a much more thorough investigation in the future. This work has partially been supported by the Spanish Ministry of Education and Science under grants CSD2007–00018, AYA2008–05965–0596 and TIN2009–14205, the Fundació Caixa Castelló–Bancaixa under grant P1–1B2009–04, and the Generalitat Valenciana under grant PROMETEO/2010/02

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositori Institucional de la Universitat Jaume I

Prediction of Preterm Deliveries from EHG Signals Using Machine Learning

Author: A Greenough
Abir Hussain
C Buhimschi
C Rabotti
Chelsea Dobbins
Dhiya Al-Jumeily
E Charniak
G Fele-Žorž
H Leman
I Verdenik
J Gondry
J Nahar
JS Richman
L Tong
LJ Mangham
LJ Muglia
M Doret
M Hassan
M Lucovnik
M McPheeters
MO Diab
MP Vinken
NV Chawla
P Carre
Paul Fergus
Pauline Cheung
R Blagus
R Rattihalli
RE Garfield
RL Goldenberg
Shamaila Iram
T Fawcett
T Sun
TA Lasko
W Lin
WJ Lammers
WL Maner
Y Wang
Zhi Wei
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 28/10/2013

There has been some improvement in the treatment of preterm infants, which has helped to increase their chance of survival. However, the rate of premature births is still increasing globally. As a result, this group of infants is most at risk of developing severe medical conditions that can affect the respiratory, gastrointestinal, immune, central nervous, auditory and visual systems. In extreme cases, this can also lead to long-term conditions, such as cerebral palsy, mental retardation and learning difficulties, as well as poor health and growth. In the US alone, the societal and economic cost of preterm births in 2005 was estimated to be $26.2 billion per annum. In the UK, this value was close to £2.95 billion in 2009. Many believe that a better understanding of why preterm births occur, and a strategic focus on prevention, will help to improve the health of children and reduce healthcare costs. At present, most methods of preterm birth prediction are subjective. However, a strong body of evidence suggests that the analysis of uterine electrical signals (Electrohysterography) could provide a viable way of diagnosing true labour and predicting preterm deliveries. Most Electrohysterography studies focus on true labour detection during the final seven days before labour. The challenge is to utilise Electrohysterography techniques to predict preterm delivery earlier in the pregnancy. This paper explores this idea further and presents a supervised machine learning approach that classifies term and preterm records, using an open source dataset containing 300 records (38 preterm and 262 term). The synthetic minority oversampling technique is used to oversample the minority preterm class, and cross-validation techniques are used to evaluate the dataset against other similar studies. Our approach shows an improvement on existing studies with 96% sensitivity, 90% specificity, and a 95% area under the curve value with 8% global error using the polynomial classifier.
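
The synthetic minority oversampling technique mentioned in this abstract can be sketched as follows. This is a simplified illustration, not the configuration used in the study: the function name `smote`, the parameters `n_new` and `k`, and the 2-D toy points are assumptions made for the example, and real EHG feature vectors would take the place of the toy data.

```python
import random

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by squared distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical 2-D minority class (four points on the unit square).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_new=6, k=2)
```

Each synthetic point lies on a line segment between two existing minority samples, so the oversampled class stays inside the region already occupied by the minority data.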

LJMU Research Online (Liverpool John Moores University)

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

University of Queensland eSpace

Classification of Caesarean Section and Normal Vaginal Deliveries Using Foetal Heart Rate Signals and Advanced Machine Learning Algorithms

Author: A Georgieva
A Pinas
A Sola
A Ugwumadu
Abir Hussain
AL Goldberger
AR Webb
B Chudacek
CK Karmakar
D Silver
De-Shuang Huang
Dhiya Al-Jumeily
DP Williams
E Kreyszig
F Tetschke
G Koop
H Ocak
J Camm
J Hand
J Kessler
J Nahar
J Spilka
JB Warren
L Omo-Aghoja
L Tong
LM Taft
ME Menai
MG Signorini
N Sarkar
N Srivastava
Nizar Bouguila
NV Chawla
P Fergus
P Pinto
PA Warrick
Paul Fergus
PD Welch
PM Granitto
R Blagus
R Brown
R Czabanski
R Mantel
R Vressler
S Schiermeier
T Sun
TM Khoshgoftaar
V Lopez
W Lin
WL Maner
Y Wang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/07/2017

ABSTRACT – Background: Visual inspection of Cardiotocography traces by obstetricians and midwives is the gold standard for monitoring the wellbeing of the foetus during antenatal care. However, inter- and intra-observer variability is high, with only a 30% positive predictive value for the classification of pathological outcomes. This has a significant negative impact on the perinatal foetus and often results in cardio-pulmonary arrest, brain and vital organ damage, cerebral palsy, hearing, visual and cognitive defects and, in severe cases, death. This paper shows that using machine learning and foetal heart rate signals provides direct information about the foetal state and helps to filter the subjective opinions of medical practitioners when used as a decision support tool. The primary aim is to provide a proof-of-concept that demonstrates how machine learning can be used to objectively determine when medical intervention, such as caesarean section, is required and help avoid preventable perinatal deaths. Methodology: This is evidenced using an open dataset that comprises 506 controls (normal vaginal deliveries) and 46 cases (caesarean due to pH ≤7.05 and pathological risk). Several machine-learning algorithms are trained and validated using binary classifier performance measures. Results: The findings show that deep learning classification achieves Sensitivity = 94%, Specificity = 91%, Area under the Curve = 99%, F-Score = 100%, and Mean Square Error = 1%. Conclusions: The results demonstrate that machine learning significantly improves the efficiency for the detection of caesarean section and normal vaginal deliveries using foetal heart rate signals compared with obstetrician and midwife predictions and systems reported in previous studies.
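
The binary classifier performance measures named in this abstract are all derived from the four cells of a confusion matrix. The sketch below shows the standard definitions; the function name `binary_metrics` is an assumption, and the counts are hypothetical values chosen only to match the class sizes (46 cases, 506 controls), not results from the paper.

```python
def binary_metrics(tp, fn, tn, fp):
    """Standard measures computed from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of true cases that are detected
    specificity = tn / (tn + fp)   # share of true controls correctly cleared
    precision = tp / (tp + fp)     # positive predictive value
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
    }

# Hypothetical counts for 46 cases and 506 controls; NOT taken from the paper.
metrics = binary_metrics(tp=43, fn=3, tn=460, fp=46)
```

With unbalanced classes, precision and the F-score depend strongly on the number of false positives among the much larger control group, which is why they are usually reported alongside sensitivity and specificity.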

LJMU Research Online (Liverpool John Moores University)

Crossref

Directory of Open Access Journals

An insight into imbalanced Big Data classification: outcomes and challenges

Author: A Fernández
A Thusoo
B Krawczyk
C Bunkhumpornpat
CP Chen
D Lyubimov
E Elsebakhi
E Ramentol
F Hu
G Haixiang
GEAPA Batista
GM Weiss
H He
H Yu
I Triguero
J Alcalá-Fdez
J Dean
J Huang
J Li
JA Sáez
JM Tomczak
K Kambatla
L Rokach
M Galar
M Wasikowski
NV Chawla
PC Zikopoulos
R Baeza-Yates
R Barandela
R Blagus
RC Prati
S Alshomrani
S Barua
S Elhag
S Kamal
S Owen
S Río
S-H Park
T Jo
T White
V García
V López
X Meng
X Wu
Y Guo
Y Sun
Y-S Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Big Data applications have emerged in recent years, and researchers from many disciplines are aware of the considerable advantages of extracting knowledge from this type of problem. However, traditional learning approaches cannot be directly applied due to scalability issues. To overcome this issue, the MapReduce framework has arisen as a “de facto” solution. Basically, it carries out a “divide-and-conquer” distributed procedure in a fault-tolerant way suited to commodity hardware. As this is still a recent discipline, little research has been conducted on imbalanced classification for Big Data. The reasons behind this are mainly the difficulties in adapting standard techniques to the MapReduce programming style. Additionally, intrinsic problems of imbalanced data, namely lack of data and small disjuncts, are accentuated when the data are partitioned to fit the MapReduce programming style. This paper rests on three main pillars. First, to present the first outcomes for imbalanced classification in Big Data problems, introducing the current research state of this area. Second, to analyze the behavior of standard pre-processing techniques in this particular framework. Finally, taking into account the experimental results obtained throughout this work, we will carry out a discussion on the challenges and future directions for the topic. This work has been partially supported by the Spanish Ministry of Science and Technology under Projects TIN2014-57251-P and TIN2015-68454-R, the Andalusian Research Plan P11-TIC-7765, the Foundation BBVA Project 75/2016 BigDaPTOOLS, and the National Science Foundation (NSF) Grant IIS-1447795
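
The “divide-and-conquer” procedure and the lack-of-data problem described in this abstract can be illustrated with a single-process sketch. It imitates the map and reduce steps with ordinary Python functions and does not use any Hadoop or Spark API; the helper names (`split`, `map_partition`, `reduce_counts`), the round-robin split, and the synthetic records are all assumptions made for the example.

```python
from collections import Counter
from functools import reduce

def split(records, n_partitions):
    """Round-robin split: record j goes to partition j mod n_partitions."""
    return [records[i::n_partitions] for i in range(n_partitions)]

def map_partition(partition):
    """Map step: count class labels inside a single partition."""
    return Counter(label for _, label in partition)

def reduce_counts(a, b):
    """Reduce step: merge the counts of two partitions."""
    return a + b

# Hypothetical data: 1000 records, 20 of them (2%) in the minority class 1.
records = [(i, 1 if i % 50 == 0 else 0) for i in range(1000)]

partitions = split(records, 8)
per_partition = [map_partition(p) for p in partitions]
total = reduce(reduce_counts, per_partition, Counter())

# Minority examples seen by each mapper.
minority_per_partition = [counts[1] for counts in per_partition]
```

In this toy split, four of the eight partitions receive five minority examples each and the other four receive none, so a learner or resampling step run independently on each partition has very little minority data to work with, even though the global counts are unchanged.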

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Springer - Publisher Connector

Repositorio Institucional Universidad de Granada

Multi-class and feature selection extensions of Roughly Balanced Bagging for imbalanced data

Author: A Fernandez
B Krawczyk
C Chen
D Wilson
E Tang
G Pio
GM Weiss
H He
J Błaszczyński
J Jelonek
J Seaz
Jerzy Stefanowski
K Napierala
L Breiman
M Galar
Mateusz Lango
N Chawla
P Branco
R Blagus
S Hido
S Rio
S Wang
T Ho
T Jo
V Lopez
W Lin
Y Sun
Publication venue: 'Springer Science and Business Media LLC'

Crossref

SMOTE for high-dimensional class-imbalanced data

Author: A Fallahi
A Hinneburg
B Wallace
C Bunkhumpornpat
C Cortes
C Drummond
C Sotiriou
CM Bishop
DA Cieslak
E Fix
H Han
H He
J Pittman
J Wang
J Xiao
J Zhu
JV Hulse
K Beyer
KD MacIsaac
L Breiman
Lara Lusa
LD Miller
MA Shipp
N Iizuka
NV Chawla
P Radivojac
Q Gu
R Batuwita
R Blagus
R Development Core Team
R Johnson
R Tibshirani
RM Simon
Rok Blagus
S Daskalaki
S Doyle
S Dudoit
S Ramaswamy
SE Ertekin
T Fawcett
TP Speed
Y Guo
Y Liu
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Enhancement strategies for transdermal drug delivery systems: current trends and applications

Author: A Ahad
A Akbarzadeh
A Azagury
A Brunie
A Böhling
A Dahan
A Haq
A Hartmann
A Herwadkar
A Jadoul
A Joshi
A Kumar
A Manosroi
A Mazumder
A Melero
A Najjar
A Nokhodchi
A Pappas
A Patel
A Puri
A Sen
A Sharma
A Verma
A Zhang
AA Romanovsky
AC Fischer
AC Santos
AC Watkinson
AC Williams
AD Bangham
AD Permana
AF Moreira
AH Al Shuwaili
AH Elshafeey
AJ Courtenay
AK Jain
ALM Ruela
AM Abbas
AM Barbero
AM Brown
AM Goldstein
AM Rodgers
AN Nyaku
AP Morris
AR Denet
AR Kim
AS Michaels
AS Torky
AV Nguyen
AW Lim
AZ Alkilani
B Cai
B Kim
B Yavuz
B Zorec
BD Wilson
BM Magnusson
BN Nalluri
BN Singh
BW Barry
C Amnuaikit
C Audiger
C Dillon
C Liu
C Vijaya
CE Serna-Jiménez
CK Song
CR Safinya
D Ameen
D Chakhalian
D Pando
D Park
D Ramadon
DD N’Da
DG Kassan
DH Shim
DIJ Morrow
DV McAllister
E Caffarel-Salvador
E Espey
E Kim
E Larrañeta
E Nuxoll
E Touitou
E Vranić
E-M Holmes
EA Shirshin
EA Tansey
EM Cahill
EM Migdadi
EM Saurer
EM Vicente-Perez
EM Vicente-Pérez
F Fratini
F Maestrelli
FF Sahle
FJ Verbaan
G Bozzuto
G Cevc
G Li
G Tan
GJ Molloy
H Abdelkader
H Cui
H Marwah
H Mirzaei
H Vallhov
H Zhai
H Zhao
HAE Benson
HI Maibach
HL Chen
HL Quinn
HN Cole
HP Ju
HS Gill
HX Nguyen
I Abiandu
I Mansoor
I Singh
I Som
IB Pathan
IK Ramöller
J Abraham
J Chen
J Choi
J Cázares-Delgadillo
J Drustrup
J Hadgraft
J Hannavy
J Hao
J Jaiswal
J Klimentová
J Li
J Mueller
J Piret
J Proctor
J Sandby-Møller
J Stahl
J Wang
JA Jona
JC McElnay
JD Zahn
JG Betts
JG Hardy
JH Park
JJ Escobar-Chávez
JM Sorrell
JW Judy
JY Fang
K Bavaskar
K Clayton
K Ita
K Jones
K Khatoon
K Menon
K Moffatt
K Tsuchiya
KL Yung
KS Paudel
L Djekic
L Eckhart
L Karpagavalli
L Machet
L Niu
L Tavano
LB Baker
LK Vora
LM Russell
LN Shen
LP Gangarosa Sr
M Argenziano
M Boer
M Cichorek
M Cormier
M He
M Hetta
M Leone
M Lovászi
M Lu
M Roustit
M Sacha
M Shikida
M Sznitowska
MA Calatayud-Pascual
MB Brown
MC Kearney
MD Ansari
MD Shin
MF Peralta
MK Marschütz
ML González-Rodríguez
MN Pastore
MR Prausnitz
MS Gerstel
MT Hoang
MTC McCrudden
N Akhtar
N Chandrashekar
N Grimaldi
N Maluf
NA Charoo
NC Dlova
NX Wen
O Pillai
OV Krylova
P Bhardwaj
P Gazerani
P Jurĉíĉek
P Karande
P Sakdiset
P Verma
P Zheng
PAJ Kolarsick
PJ Lee
PK Kiptoo
PM Wang
PS Giffen
PW Stott
Q Hu
Q Li
QY Li
R Al-Kasasbeh
R Driskell
R Hajar
R Muzzalupo
R Rao
R Vanbever
R Wong
R Wölfel
RF Donnelly
RKR Rajoli
RR Wickett
S Baji
S Björklund
S Bystrova
S Hamanaka
S Henry
S Jain
S Jaitley
S Kommareddy
S Liu
S Ma
S Mitragotri
S Moghassemi
S Rai
S Ramkanth
S Ross
S Saluja
S Singh
S Valiveti
S Wiedersberg
SH Abd El-Alim
SH Baek
SH Nam
SJ Moon
SN Murthy
SP Sullivan
SS Ajazuddin
SS Zhou
STK Narishetty
T Blagus
T Haque
T Ilić
T Matsui
T Moolakkadath
T Ogiso
T Waghule
TA McDonald
TJ Lin
TM Adams
TM Rawson
TO Herndon
TR Yerramreddy
V Dhote
V Garg
V Ramezani
VM Meidan
W Chen
W Li
W Liu
W Martanto
W Sun
W Wang
W Yu
W Zhou
X Chen
X Ge
X Li
X Yu
Y Hiraishi
Y Ito
Y Kapoor
Y Li
Y Wu
Y Zhang
YK Demir
YL Deng
YT Zhang
Z Jing
Z Nemes
Z Rukavina
ZL Van
ZX Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 20/01/2021

Queen's University Belfast Research Portal

Crossref

Ulster University's Research Portal

A 12-gene pharmacogenetic panel to prevent adverse drug reactions: an open-label, multicentre, controlled, cluster-randomised crossover implementation study

Background The benefit of pharmacogenetic testing before starting drug therapy has been well documented for several single gene–drug combinations. However, the clinical utility of a pre-emptive genotyping strategy using a pharmacogenetic panel has not been rigorously assessed. Methods We conducted an open-label, multicentre, controlled, cluster-randomised, crossover implementation study of a 12-gene pharmacogenetic panel in 18 hospitals, nine community health centres, and 28 community pharmacies in seven European countries (Austria, Greece, Italy, the Netherlands, Slovenia, Spain, and the UK). Patients aged 18 years or older receiving a first prescription for a drug clinically recommended in the guidelines of the Dutch Pharmacogenetics Working Group (ie, the index drug) as part of routine care were eligible for inclusion. Exclusion criteria included previous genetic testing for a gene relevant to the index drug, a planned duration of treatment of less than 7 consecutive days, and severe renal or liver insufficiency. All patients gave written informed consent before taking part in the study. Participants were genotyped for 50 germline variants in 12 genes, and those with an actionable variant (ie, a drug–gene interaction test result for which the Dutch Pharmacogenetics Working Group [DPWG] recommended a change to standard-of-care drug treatment) were treated according to DPWG recommendations. Patients in the control group received standard treatment. To prepare clinicians for pre-emptive pharmacogenetic testing, local teams were educated during a site-initiation visit and online educational material was made available. The primary outcome was the occurrence of clinically relevant adverse drug reactions within the 12-week follow-up period. Analyses were irrespective of patient adherence to the DPWG guidelines. 
The primary analysis was done using a gatekeeping analysis, in which outcomes in people with an actionable drug–gene interaction in the study group versus the control group were compared, and only if the difference was statistically significant was an analysis done that included all of the patients in the study. Outcomes were compared between the study and control groups, both for patients with an actionable drug–gene interaction test result (ie, a result for which the DPWG recommended a change to standard-of-care drug treatment) and for all patients who received at least one dose of index drug. The safety analysis included all participants who received at least one dose of a study drug. This study is registered with ClinicalTrials.gov, NCT03093818 and is closed to new participants. Findings Between March 7, 2017, and June 30, 2020, 41696 patients were assessed for eligibility and 6944 (51·4% female, 48·6% male; 97·7% self-reported European, Mediterranean, or Middle Eastern ethnicity) were enrolled and assigned to receive genotype-guided drug treatment (n=3342) or standard care (n=3602). 99 patients (52 [1·6%] of the study group and 47 [1·3%] of the control group) withdrew consent after group assignment. 652 participants (367 [11·0%] in the study group and 285 [7·9%] in the control group) were lost to follow-up. In patients with an actionable test result for the index drug (n=1558), a clinically relevant adverse drug reaction occurred in 152 (21·0%) of 725 patients in the study group and 231 (27·7%) of 833 patients in the control group (odds ratio [OR] 0·70 [95% CI 0·54–0·91]; p=0·0075), whereas for all patients, the incidence was 628 (21·5%) of 2923 patients in the study group and 934 (28·6%) of 3270 patients in the control group (OR 0·70 [95% CI 0·61–0·79]; p Horizon 2020 (H2020) Genetics of disease, diagnosis and treatmen
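
As a rough arithmetic check on the event counts quoted in this abstract, a crude (unadjusted) odds ratio can be computed directly from them. This is a sketch under the assumption that a simple two-by-two calculation is informative; the reported odds ratios of 0·70 presumably come from the study's own statistical model, so the crude values are expected to be close but not identical. The function name `crude_odds_ratio` is invented for the example.

```python
def crude_odds_ratio(events_a, total_a, events_b, total_b):
    """Unadjusted odds ratio of group A versus group B from event counts."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b

# Counts quoted in the abstract: study group first, control group second.
actionable = crude_odds_ratio(152, 725, 231, 833)      # actionable test result
all_patients = crude_odds_ratio(628, 2923, 934, 3270)  # all patients

# Event proportions, to compare with the quoted percentages.
proportions = [152 / 725, 231 / 833, 628 / 2923, 934 / 3270]
```

The crude values come out at roughly 0.69 and 0.68, and the four proportions round to the quoted 21·0%, 27·7%, 21·5% and 28·6%.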

Leiden University Scholarly Publications