235 research outputs found

Probability calibration trees

Author: Frank Eibe
Holmes Geoffrey
Leathart Tim
Pfahringer Bernhard
Publication date: 01/01/2017

Obtaining accurate and well calibrated probability estimates from classifiers is useful in many applications, for example, when minimising the expected cost of classifications. Existing methods of calibrating probability estimates are applied globally, ignoring the potential for improvements by applying a more fine-grained model. We propose probability calibration trees, a modification of logistic model trees that identifies regions of the input space in which different probability calibration models are learned to improve performance. We compare probability calibration trees to two widely used calibration methods—isotonic regression and Platt scaling—and show that our method results in lower root mean squared error on average than both methods, for estimates produced by a variety of base learners
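For orientation, isotonic regression (one of the two baseline methods the abstract compares against) can be sketched with the pool-adjacent-violators procedure. This is a minimal illustration, not the authors' implementation and not the proposed calibration trees; the function names `pav_calibrate` and `rmse` and the toy scores and labels are assumptions made up for the example.

```python
import math

def pav_calibrate(scores, labels):
    """Fit isotonic regression with pool-adjacent-violators (illustrative sketch).

    Returns the indices sorted by score and one non-decreasing calibrated
    value per sorted instance.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    blocks = []  # each block holds [sum of labels, number of instances]
    for i in order:
        blocks.append([float(labels[i]), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return order, fitted

def rmse(probs, labels):
    """Root mean squared error between probability estimates and 0/1 labels."""
    return math.sqrt(sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels))

# Hypothetical uncalibrated scores and true binary labels.
scores = [0.1, 0.4, 0.35, 0.8, 0.9]
labels = [0, 0, 1, 1, 1]

order, fitted = pav_calibrate(scores, labels)
sorted_scores = [scores[i] for i in order]
sorted_labels = [labels[i] for i in order]
rmse_raw = rmse(sorted_scores, sorted_labels)
rmse_calibrated = rmse(fitted, sorted_labels)
```

Here the comparison is made on the same data used for fitting, so it only shows the mechanics. A global map like this is applied to every instance alike, which is the limitation the abstract's region-specific models are meant to address.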

arXiv.org e-Print Archive

Research Commons@Waikato

Respiratory symptoms and changes of pulmonary function associated with radiographic evidence of antimony oxide dust retention

Author: G.L. Leathart
Publication venue: Institute for Medical Research and Occupational Health
Publication date: 01/01/1979

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Tree-structured multiclass probability estimators

Author: Leathart Timothy Matthew
Publication venue: The University of Waikato
Publication date: 10/09/2019

Nested dichotomies are used as a method of transforming a multiclass classification problem into a series of binary problems. A binary tree structure is constructed over the label space that recursively splits the set of classes into subsets, and a binary classification model learns to discriminate between the two subsets of classes at each node. Several distinct nested dichotomy structures can be built in an ensemble for superior performance. In this thesis, we introduce two new methods for constructing more accurate nested dichotomies. Random-pair selection is a subset selection method that aims to group similar classes together in a non-deterministic fashion to easily enable the construction of accurate ensembles. Multiple subset evaluation takes this, and other subset selection methods, further by evaluating several different splits and choosing the best performing one. Finally, we also discuss the calibration of the probability estimates produced by nested dichotomies. We observe that nested dichotomies systematically produce under-confident predictions, even if the binary classifiers are well calibrated, and especially when the number of classes is high. Furthermore, substantial performance gains can be made when probability calibration methods are also applied to the internal models
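As a rough sketch of how a nested dichotomy turns binary estimates into multiclass probabilities: a class's probability is the product of the branch probabilities on the path from the root to its leaf. The tree encoding, the function name `nd_class_probs`, and the numbers below are hypothetical and chosen only for illustration; they are not taken from the thesis.

```python
def nd_class_probs(node, branch_prob):
    """Class probabilities from a nested dichotomy (illustrative sketch).

    node: a class label (leaf) or a tuple (left_subtree, right_subtree, node_id).
    branch_prob: maps node_id -> P(left subset | instance reached this node),
                 i.e. the output of that node's binary model for one instance.
    """
    if not isinstance(node, tuple):
        return {node: 1.0}
    left, right, node_id = node
    p_left = branch_prob[node_id]
    probs = {}
    # Multiply each subtree's probabilities by the probability of its branch.
    for label, p in nd_class_probs(left, branch_prob).items():
        probs[label] = p_left * p
    for label, p in nd_class_probs(right, branch_prob).items():
        probs[label] = (1.0 - p_left) * p
    return probs

# Hypothetical three-class tree: the root splits {a, b} from {c}; n1 splits a from b.
tree = (("a", "b", "n1"), "c", "root")
# Hypothetical binary-model outputs for a single instance.
branch_prob = {"root": 0.9, "n1": 0.9}

probs = nd_class_probs(tree, branch_prob)
```

In this toy case both binary models favour their left branch with probability 0.9, yet class `a` receives only about 0.81. Deeper paths multiply more factors, which gives one intuition for the under-confidence the abstract reports when the number of classes is high.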

Research Commons@Waikato

The SOD2 C47T polymorphism influences NAFLD fibrosis severity: evidence from case-control and intra-familial allele association studies.

Author: Al Serri Ahmad
Anstee Quentin M
Daly Ann K
Day Christopher P
Dongiovanni Paola
Fargion Silvia
Fracanzani Anna
Leathart Julian B S
Nobili Valerio
Patch Julia
Valenti Luca
Publication date: 12/07/2011

AIMS: Non-alcoholic fatty liver disease (NAFLD) is a complex disease trait where genetic variations and environment interact to determine disease progression. The association of PNPLA3 with advanced disease has been consistently demonstrated but many other modifier genes remain unidentified. In NAFLD, increased fatty acid oxidation produces high levels of reactive oxygen species. Manganese-dependent superoxide dismutase (MnSOD), encoded by the SOD2 gene, plays an important role in protecting cells from oxidative stress. A common non-synonymous polymorphism in SOD2 (C47T; rs4880) is associated with decreased MnSOD mitochondrial targeting and activity making it a good candidate modifier of NAFLD severity. METHODS: The relevance of the SOD2 C47T polymorphism to fibrotic NAFLD was assessed by two complementary approaches: we sought preferential transmission of alleles from parents to affected children in 71 family trios and adopted a case-control approach to compare genotype frequencies in a cohort of 502 European NAFLD patients. RESULTS: In the family study, 55 families were informative. The T allele was transmitted on 47/76 (62%) possible occasions whereas the C allele was transmitted on only 29/76 (38%) occasions, p=0.038. In the case control study, the presence of advanced fibrosis (stage>1) increased with the number of T alleles, p=0.008 for trend. Multivariate analysis showed susceptibility to advanced fibrotic disease was determined by SOD2 genotype (OR 1.56 (95% CI 1.09-2.25), p=0.014), PNPLA3 genotype (p=0.041), type 2 diabetes mellitus (p=0.009) and histological severity of NASH (p=2.0×10^-16). CONCLUSIONS: Carriage of the SOD2 C47T polymorphism is associated with more advanced fibrosis in NASH.
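The family-study counts can be checked with the standard McNemar-type transmission disequilibrium statistic. The abstract does not say which test the authors used, so treating it as this chi-square test with one degree of freedom is an assumption; the function name `tdt_chi_square` is likewise made up for this sketch.

```python
import math

def tdt_chi_square(transmitted, untransmitted):
    """McNemar-type transmission disequilibrium statistic (assumed test).

    chi2 = (b - c)^2 / (b + c), compared against a chi-square
    distribution with one degree of freedom.
    """
    chi2 = (transmitted - untransmitted) ** 2 / (transmitted + untransmitted)
    # Survival function of chi-square with 1 df: P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Counts from the abstract: T allele transmitted 47 times, C allele 29 times.
chi2, p_value = tdt_chi_square(47, 29)
```

Under that assumption the statistic is 324/76, roughly 4.26, and the resulting p-value lands close to the reported p=0.038.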

ZENODO

Ensembles of nested dichotomies with multiple subset evaluation

Author: A Beygelzimer
G Brier
HL Harter
J Demšar
J Fox
J Fürnkranz
J Royston
JJ Rodríguez
L Breiman
L Dong
LI Kuncheva
M Hall
M Meilă
MM Duarte-Villaseñor
R Rifkin
T Hastie
T Leathart
TG Dietterich
V Melnikov
Y LeCun
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/09/2018

A system of nested dichotomies (NDs) is a method of decomposing a multiclass problem into a collection of binary problems. Such a system recursively applies binary splits to divide the set of classes into two subsets, and trains a binary classifier for each split. Many methods have been proposed to perform this split, each with various advantages and disadvantages. In this paper, we present a simple, general method for improving the predictive performance of NDs produced by any subset selection techniques that employ randomness to construct the subsets. We provide a theoretical expectation for performance improvements, as well as empirical results showing that our method improves the root mean squared error of NDs, regardless of whether they are employed as an individual model or in an ensemble setting

arXiv.org e-Print Archive

Crossref

Research Commons@Waikato

On calibration of nested dichotomies

Author: A Beygelzimer
A Kumar
AH Murphy
CC Chang
F Pedregosa
J Fox
J Platt
K Dembczyński
L Dong
O Russakovsky
P Mahé
R Rifkin
T Hastie
T Leathart
TG Dietterich
V Melnikov
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

Nested dichotomies (NDs) are used as a method of transforming a multiclass classification problem into a series of binary problems. A tree structure is induced that recursively splits the set of classes into subsets, and a binary classification model learns to discriminate between the two subsets of classes at each node. In this paper, we demonstrate that these NDs typically exhibit poor probability calibration, even when the binary base models are well-calibrated. We also show that this problem is exacerbated when the binary models are poorly calibrated. We discuss the effectiveness of different calibration strategies and show that accuracy and log-loss can be significantly improved by calibrating both the internal base models and the full ND structure, especially when the number of classes is high

Crossref

Research Commons@Waikato

Protein Translation and Cell Death: The Role of Rare tRNAs in Biofilm Formation and in Activating Dormant Phage Killer Genes

Author: A Blanco
A Heydorn
A Mai-Prochnow
A Reisner
A Ritter
A Vivero
AF Barrios
AF González Barrios
AV Mikulskis
BM Prüß
C Balsalobre
C Beloin
C Frumerie
C Madrid
Christophe Herman
CJ Dorman
D Mirelman
D Ren
D Shah
DL Gally
DL Swenson
DW Jackson
I Connell
J Domka
J Lee
J Sambrook
JB Leathart
JE Kirby
JK Tinker
JM Abraham
JM Brown
JM Nieto
JS Webb
K Kohno
K Sauer
KS Yeh
KV Srividhya
L Delgado-Olivares
LA Pratt
M Aviv
M Carmona
M Gjermansen
M Herzberg
M Kitagawa
M Mouriño
MA Schembri
MC Hansen
MH Saier Jr
N Forns
N Godessart
N Goosen
PP Cherepanov
Q Tu
R Edgar
RD Magnuson
Rodolfo García-Contreras
SK Christensen
T Atlung
T Baba
T Bansal
T Maeda
TF Fahlen
TH Kawula
Thomas K. Wood
TK Wood
U Dobrindt
V de Lorenzo
VK Sharma
WG Miller
XS Zhang
Xue-Song Zhang
Y Benjamini
Y Tamimi
Younghoon Kim
Publication venue: Public Library of Science
Publication date: 01/01/2008

We discovered previously that the small Escherichia coli proteins Hha (hemolysin expression modulating protein) and the adjacent, poorly-characterized YbaJ are important for biofilm formation; however, their roles have been nebulous. Biofilms are intricate communities in which cell signaling often converts single cells into primitive tissues. Here we show that Hha decreases biofilm formation dramatically by repressing the transcription of rare codon tRNAs which serves to inhibit fimbriae production and by repressing to some extent transcription of fimbrial genes fimA and ihfA. In vivo binding studies show Hha binds to the rare codon tRNAs argU, ileX, ileY, and proL and to two prophage clusters D1P12 and CP4-57. Real-time PCR corroborated that Hha represses argU and proL, and Hha type I fimbriae repression is abolished by the addition of extra copies of argU, ileY, and proL. The repression of transcription of rare codon tRNAs by Hha also leads to cell lysis and biofilm dispersal due to activation of prophage lytic genes rzpD, yfjZ, appY, and alpA and due to induction of ClpP/ClpX proteases which activate toxins by degrading antitoxins. YbaJ serves to mediate the toxicity of Hha. Hence, we have identified that a single protein (Hha) can control biofilm formation by limiting fimbriae production as well as by controlling cell death. The mechanism used by Hha is the control of translation via the availability of rare codon tRNAs which reduces fimbriae production and activates prophage lytic genes. Therefore, Hha acts as a toxin in conjunction with co-transcribed YbaJ (TomB) that attenuates Hha toxicity

Crossref

Directory of Open Access Journals

PubMed Central

OAKTrust Digital Repository (Texas A&M Univ)

TM6SF2 rs58542926 influences hepatic fibrosis progression in patients with non-alcoholic fatty liver disease.

Author: Liu Yang-Lin
Reeves Helen L
Burt Alastair D
Tiniakos Dina
McPherson Stuart
Leathart Julian B S
Allison Michael E D
Alexander Graeme J
Piguet Anne Christine
Anty Rodolphe
Donaldson Peter
Aithal Guruprasad P
Francque Sven
Van Gaal Luc
Clement Karine
Ratziu Vlad
Dufour Jean-François
Day Christopher P
Daly Ann K
Anstee Quentin M
Publication venue: Nature Publishing Group
Publication date: 01/01/2014

Non-alcoholic fatty liver disease (NAFLD) is an increasingly common condition, strongly associated with the metabolic syndrome, that can lead to progressive hepatic fibrosis, cirrhosis and hepatic failure. Subtle inter-patient genetic variation and environmental factors combine to determine variation in disease progression. A common non-synonymous polymorphism in TM6SF2 (rs58542926 c.449 C>T, p.Glu167Lys) was recently associated with increased hepatic triglyceride content, but whether this variant promotes clinically relevant hepatic fibrosis is unknown. Here we confirm that TM6SF2 minor allele carriage is associated with NAFLD and is causally related to a previously reported chromosome 19 GWAS signal that was ascribed to the gene NCAN. Furthermore, using two histologically characterized cohorts encompassing steatosis, steatohepatitis, fibrosis and cirrhosis (combined n=1,074), we demonstrate a new association, independent of potential confounding factors (age, BMI, type 2 diabetes mellitus and PNPLA3 rs738409 genotype), with advanced hepatic fibrosis/cirrhosis. These findings establish new and important clinical relevance to TM6SF2 in NAFLD

Crossref

Elsevier - Publisher Connector

Southampton (e-Prints Soton)

Caltech Authors

Bern Open Repository and Information System (BORIS)

King's Research Portal