    Dimensionality reduction of clustered data sets

    We present a novel probabilistic latent variable model to perform linear dimensionality reduction on data sets which contain clusters. We prove that the maximum likelihood solution of the model is an unsupervised generalisation of linear discriminant analysis. This provides a completely new approach to one of the most established and widely used classification algorithms. The performance of the model is then demonstrated on a number of real and artificial data sets.
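
    The sketch below is not the paper's model; it is a rough, hedged illustration of the underlying idea, in which cluster structure found without labels (here via a Gaussian mixture, an assumption) stands in for class labels in a Fisher discriminant projection. The function name `unsupervised_lda` and all parameters are hypothetical.

    ```python
    import numpy as np
    from scipy.linalg import eigh
    from sklearn.mixture import GaussianMixture

    def unsupervised_lda(X, n_clusters=3, n_components=2):
        # Fit an unsupervised mixture; its soft responsibilities replace class labels.
        gmm = GaussianMixture(n_components=n_clusters, covariance_type="full").fit(X)
        R = gmm.predict_proba(X)                      # responsibilities, shape (n, K)
        mean = X.mean(axis=0)
        d = X.shape[1]
        Sb = np.zeros((d, d))                         # between-cluster scatter
        Sw = np.zeros((d, d))                         # within-cluster scatter
        for k in range(n_clusters):
            w = R[:, k]
            mk = (w[:, None] * X).sum(axis=0) / w.sum()
            Sb += w.sum() * np.outer(mk - mean, mk - mean)
            D = X - mk
            Sw += (w[:, None] * D).T @ D
        # Fisher criterion: generalised eigenproblem Sb v = lambda Sw v.
        vals, vecs = eigh(Sb, Sw + 1e-6 * np.eye(d))
        W = vecs[:, ::-1][:, :n_components]           # leading discriminant directions
        return X @ W
    ```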

    Probabilistic classification of acute myocardial infarction from multiple cardiac markers

    Logistic regression and Gaussian mixture model (GMM) classifiers have been trained to estimate the probability of acute myocardial infarction (AMI) in patients based upon the concentrations of a panel of cardiac markers. The panel consists of two new markers, fatty acid binding protein (FABP) and glycogen phosphorylase BB (GPBB), in addition to the traditional cardiac troponin I (cTnI), creatine kinase MB (CKMB) and myoglobin. The effect of using principal component analysis (PCA) and Fisher discriminant analysis (FDA) to preprocess the marker concentrations was also investigated. The need for classifiers to give an accurate estimate of the probability of AMI is argued and three categories of performance measure are described, namely discriminatory ability, sharpness, and reliability. Numerical performance measures for each category are given and applied. The optimum classifier, based solely upon the samples taken on admission, was the logistic regression classifier using FDA preprocessing. This gave an accuracy of 0.85 (95% confidence interval: 0.78–0.91) and a normalised Brier score of 0.89. When samples at both admission and a further time, 1–6 h later, were included, the performance increased significantly, showing that logistic regression classifiers can indeed use the information from the five cardiac markers to accurately and reliably estimate the probability of AMI.
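
    As a minimal sketch of this kind of pipeline (not the study's actual code or data), the snippet below trains a logistic regression classifier on FDA-projected marker concentrations and reports a simple accuracy and Brier score. The inputs `markers` (an n-patients-by-5 array of concentrations) and `ami` (0/1 AMI labels) are assumed, and the split and thresholds are illustrative.

    ```python
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import brier_score_loss
    from sklearn.model_selection import train_test_split

    def fit_and_score(markers, ami):
        X_tr, X_te, y_tr, y_te = train_test_split(
            markers, ami, test_size=0.3, random_state=0, stratify=ami)
        # Fisher discriminant analysis as a supervised preprocessing projection.
        fda = LinearDiscriminantAnalysis().fit(X_tr, y_tr)
        clf = LogisticRegression().fit(fda.transform(X_tr), y_tr)
        p = clf.predict_proba(fda.transform(X_te))[:, 1]   # estimated P(AMI)
        acc = float(((p >= 0.5) == y_te).mean())            # discriminatory ability
        brier = brier_score_loss(y_te, p)                   # probability reliability
        return acc, brier
    ```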

    A Review of Kernel Methods for Feature Extraction in Nonlinear Process Monitoring

    Kernel methods are a class of learning machines for the fast recognition of nonlinear patterns in any data set. In this paper, the applications of kernel methods for feature extraction in industrial process monitoring are systematically reviewed. First, we describe the reasons for using kernel methods and contextualize them among other machine learning tools. Second, by reviewing a total of 230 papers, this work identifies 12 major issues surrounding the use of kernel methods for nonlinear feature extraction. Each issue is discussed in terms of why it is important and how it has been addressed over the years by researchers. We also present a breakdown of the commonly used kernel functions, parameter selection routes, and case studies. Lastly, this review provides an outlook into the future of kernel-based process monitoring, which can hopefully instigate more advanced yet practical solutions in the process industries.
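
    As one concrete example of the kind of kernel feature extraction the review surveys, the sketch below uses kernel PCA fitted on normal-operation data and computes a simple T²-style statistic on the extracted features for new samples. The inputs `X_normal` and `X_new`, the RBF kernel choice, and the `gamma` value are assumptions for illustration only.

    ```python
    import numpy as np
    from sklearn.decomposition import KernelPCA
    from sklearn.preprocessing import StandardScaler

    def kpca_monitoring_features(X_normal, X_new, n_components=5, gamma=0.1):
        # Scale using normal-operation statistics, then learn nonlinear components.
        scaler = StandardScaler().fit(X_normal)
        kpca = KernelPCA(n_components=n_components, kernel="rbf", gamma=gamma)
        train_scores = kpca.fit_transform(scaler.transform(X_normal))
        new_scores = kpca.transform(scaler.transform(X_new))
        # Illustrative T^2-style statistic: squared scores weighted by training variance.
        var = train_scores.var(axis=0)
        t2 = np.sum(new_scores**2 / var, axis=1)
        return new_scores, t2
    ```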