282 research outputs found

    Kernel Methods and Measures for Classification with Transparency, Interpretability and Accuracy in Health Care

    Support vector machines are a popular method in machine learning. They learn from data about a subject, for example, lung tumors in a set of patients, to classify new data, such as a new patient’s tumor. The new tumor is classified as either cancerous or benign, depending on how similar it is to the tumors of other patients in those two classes, where similarity is judged by a kernel. The adoption and use of support vector machines in health care, however, is inhibited by a perceived and actual lack of rationale, understanding and transparency for how they work and how to interpret information and results from them. For example, a user must select the kernel, or similarity function, to be used, and there are many kernels to choose from but little to no useful guidance on choosing one. The primary goal of this thesis is to create accurate, transparent and interpretable kernels, with rationale for selecting them, for classification in health care using SVM, and to do so within a theoretical framework that advances rationale, understanding and transparency for kernel/model selection with atomic data types. The kernels and framework necessarily co-exist. The secondary goal of this thesis is to quantitatively measure model interpretability for kernel/model selection and to identify the types of interpretable information that are available from different models. Testing my framework and transparent kernels with empirical data, I achieve classification accuracy that is better than or equivalent to that of Gaussian RBF kernels. I also validate some of the model interpretability measures I propose.
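The idea that a kernel acts as a similarity function can be illustrated with the Gaussian RBF kernel that the abstract uses as its baseline. This is a minimal sketch, not the thesis's own kernels: the function name `rbf_kernel`, the `gamma` parameter value, and the two-feature tumor vectors are assumptions chosen for illustration.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian RBF similarity: exp(-gamma * squared Euclidean distance).

    Returns 1.0 for identical vectors and decays toward 0.0 as they move apart.
    """
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical two-feature tumor descriptions (illustrative values only).
known_tumor = [2.0, 1.0]
new_tumor = [2.5, 1.5]
print(rbf_kernel(known_tumor, new_tumor, gamma=0.5))
```

A larger `gamma` makes the similarity fall off faster with distance, which is one of the choices a user has to make without much guidance, as the abstract notes.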

    Partial discharge feature extraction based on ensemble empirical mode decomposition and sample entropy

    Partial Discharge (PD) pattern recognition plays an important part in electrical equipment fault diagnosis and maintenance. Feature extraction can greatly affect recognition results. Traditional PD feature extraction methods suffer from high-dimension calculation and signal attenuation. In this study, a novel feature extraction method based on Ensemble Empirical Mode Decomposition (EEMD) and Sample Entropy (SamEn) is proposed. In order to reduce the influence of noise, a wavelet method is applied to PD de-noising. Noise Rejection Ratio (NRR) and Mean Square Error (MSE) are adopted as the de-noising indexes. With EEMD, the de-noised signal is decomposed into a finite number of Intrinsic Mode Functions (IMFs). The IMFs that contain the dominant information of PD are selected using a correlation coefficient method. The SamEn of each selected IMF is then extracted as a PD feature. Finally, a Relevance Vector Machine (RVM) is used for pattern recognition on the extracted features. Experimental results demonstrate that the proposed method combines excellent properties of both EEMD and SamEn. The recognition results are encouraging, with satisfactory accuracy.
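The sample entropy step can be sketched as follows. This is a minimal illustration of the standard SampEn(m, r) definition, not the paper's implementation: the function name `sample_entropy`, the default `m` and `r`, and the use of an absolute tolerance `r` (rather than a fraction of the signal's standard deviation) are assumptions. The EEMD decomposition and IMF selection are not shown; the input here stands in for one selected IMF.

```python
import math


def sample_entropy(series, m=2, r=0.2):
    """Sample entropy: -ln(A / B).

    B counts pairs of length-m templates within tolerance r (Chebyshev
    distance); A counts the same for length m + 1. Self-matches are excluded.
    Returns math.inf when no matches are found.
    """
    n = len(series)
    if n <= m + 1:
        raise ValueError("series is too short for the chosen m")

    def count_matches(length):
        # Use the same number of templates (n - m) for both lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                distance = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if distance <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return math.inf
    return -math.log(a / b)


# A perfectly regular signal has zero sample entropy.
regular = [0.0, 1.0] * 5
print(sample_entropy(regular, m=2, r=0.1))
```

Lower values indicate a more regular signal, which is why the measure can serve as a compact feature per IMF.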

    Deep ROC Analysis and AUC as Balanced Average Accuracy to Improve Model Selection, Understanding and Interpretation

    Optimal performance is critical for decision-making tasks from medicine to autonomous driving; however, common performance measures may be too general or too specific. For binary classifiers, diagnostic tests or prognosis at a timepoint, measures such as the area under the receiver operating characteristic curve, or the area under the precision recall curve, are too general because they include unrealistic decision thresholds. On the other hand, measures such as accuracy, sensitivity or the F1 score are measures at a single threshold that reflect a single individual probability or predicted risk, rather than a range of individuals or risks. We propose a method in between, deep ROC analysis, that examines groups of probabilities or predicted risks for more insightful analysis. We translate esoteric measures into familiar terms: AUC and the normalized concordant partial AUC are balanced average accuracy (a new finding); the normalized partial AUC is average sensitivity; and the normalized horizontal partial AUC is average specificity. Along with post-test measures, we provide a method that can improve model selection in some cases and provide interpretation and assurance for patients in each risk group. We demonstrate deep ROC analysis in two case studies and provide a toolkit in Python. Comment: 14 pages, 6 figures, submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), currently under review
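The statement that the normalized partial AUC is average sensitivity can be illustrated with a small calculation: the area under the ROC curve over a false-positive-rate range, divided by the width of that range, is the mean true positive rate over the range. This is a minimal sketch and not the authors' Python toolkit. The function name `average_sensitivity` and the example ROC points are hypothetical, and the sketch assumes the ROC curve is piecewise linear with strictly increasing FPR values.

```python
def average_sensitivity(fpr, tpr, lo, hi):
    """Normalized partial AUC over the FPR range [lo, hi].

    Assumes fpr is strictly increasing, covers [lo, hi], and that the ROC
    curve is linear between the given points.
    """
    if not (fpr[0] <= lo < hi <= fpr[-1]):
        raise ValueError("the range [lo, hi] must lie inside the FPR values")

    def tpr_at(x):
        # Linear interpolation between the two ROC points that bracket x.
        for i in range(len(fpr) - 1):
            x0, x1 = fpr[i], fpr[i + 1]
            if x0 <= x <= x1:
                return tpr[i] + (tpr[i + 1] - tpr[i]) * (x - x0) / (x1 - x0)
        raise ValueError("x is outside the ROC curve")

    # Clip the curve to [lo, hi], then integrate with the trapezoidal rule.
    xs = [lo] + [x for x in fpr if lo < x < hi] + [hi]
    ys = [tpr_at(x) for x in xs]
    area = sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2 for i in range(len(xs) - 1))
    return area / (hi - lo)


# Hypothetical three-point ROC curve (illustrative values only).
roc_fpr = [0.0, 0.5, 1.0]
roc_tpr = [0.0, 0.8, 1.0]
print(average_sensitivity(roc_fpr, roc_tpr, 0.0, 0.5))
```

Taking the full range `[0, 1]` recovers the ordinary AUC, while narrower ranges describe sensitivity for one group of decision thresholds at a time.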

    A Novel Fuzzy Multilayer Perceptron (F-MLP) for the Detection of Irregularity in Skin Lesion Border Using Dermoscopic Images

    Skin lesion border irregularity, which represents the B feature in the ABCD rule, is considered one of the most significant factors in melanoma diagnosis. Since the signs that clinicians rely on in melanoma diagnosis, including visual signs such as border irregularity, involve subjective judgment, an objective approach to finding border irregularity is needed. Research in neural networks has increased in recent years, driven mainly by advances in deep learning. Artificial neural networks (ANNs) or multilayer perceptrons have been shown to perform well in supervised learning tasks. However, such networks usually do not incorporate information pertaining to the ambiguity of the inputs when training the network, which in turn could affect how the weights are updated in the learning process and eventually degrade the performance of the network when applied to test data. In this paper, we propose a fuzzy multilayer perceptron (F-MLP) that takes the ambiguity of the inputs into consideration and subsequently reduces the effects of ambiguous inputs on the learning process. A new optimization function, the fuzzy gradient descent, is proposed to reflect those changes. Moreover, a type-II fuzzy sigmoid activation function is also proposed, which enables finding the range of performance the fuzzy neural network is able to attain. The fuzzy neural network was used to predict skin lesion border irregularity: the lesion was first segmented from the skin, the lesion border was extracted, border irregularity was measured using a proposed measure vector, and the extracted border irregularity measures were used to train the neural network. The proposed approach outperformed most of the state-of-the-art classification methods in general and its standard neural network counterpart in particular. However, the proposed fuzzy neural network was more time-consuming to train.

    A Comparative Study on Support Vector Machines

    In this thesis, we study Support Vector Machines (SVMs) for binary classification. We review literature on SVMs and other classification methods. We perform simulations to compare kernel functions found in selected R packages and also investigate the variable selection property of penalized SVMs. We consider a mostly linearly separable data set, a mostly linearly non-separable data set, and a linearly non-separable data set requiring nonlinear SVMs. In addition, traditional classification methods, including Linear Discriminant Analysis, Quadratic Discriminant Analysis, K-Nearest Neighbors, and Logistic Regression, are also fit to the data sets and compared to the SVM models. The results of the simulation indicate that choosing a kernel function is key to obtaining a good fit to a particular data set. Moreover, in situations where nonlinear SVMs are not required (such as the linearly separable data set), fitting nonlinear SVMs to a data set is likely to result in overfitting. Finally, we apply SVMs and other classification techniques to Alzheimer's disease data.

    Semantic Classification of Scientific Sentence Pair Using Recurrent Neural Network

    One development in Natural Language Processing is the semantic classification of sentences and documents. The challenge is finding relationships between words and between documents through a computational model. The development of machine learning makes it possible to try out various approaches that provide classification capabilities. This paper proposes the semantic classification of sentence pairs using Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM). Each pair of sentences is turned into vectors using Word2Vec. Experiments were carried out using CBOW and Skip-Gram to find the best combination. The results show that word embedding using CBOW performs better than Skip-Gram, although the difference is only around 5%. However, CBOW is slightly slower at the beginning of iteration but is stable towards convergence. Classification covers all six classes, namely Equivalent, Similar, Specific, No Alignment, Related, and Opposite. Because the data set is unbalanced, retraining was conducted after removing a few class members from the data set, giving an accuracy of 73% on non-training data. The results showed that the Adam model converged faster at the start of training than the SGD model, and that the AdaDelta model gave better accuracy (75%) with an F1-score of 67%.

    Estimation of wrist angle from sonomyography using support vector machine and artificial neural network models

    2008-2009 > Academic research: refereed > Publication in refereed journal; Accepted Manuscript; Published