    Knowledge Initialisation for Support Vector Machines

    Since their introduction more than a decade ago, support vector machines (SVMs) have shown good performance in a number of application areas, including text classification, pattern recognition and bioinformatics. However, the success of SVMs comes at a cost: there is no way to utilise prior knowledge. SVMs are purely inductive learning machines. In this paper, a novel approach to rule initialisation for support vector machines is presented. The application domain is medical diagnosis. The approach uses domain knowledge in the form of propositional rules to create a virtual data set that biases an SVM. The virtual data set is combined with real data for SVM learning. Knowledge initialisation results in better classification accuracy and enhanced rule quality compared with purely inductive learning.
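
    The idea of turning propositional rules into a virtual data set can be sketched roughly as follows. This is a minimal illustration, not the paper's procedure: the rule format, the feature names (`glucose`, `bmi`), the value ranges and the sample records are all hypothetical, and uniform sampling inside each rule's region is an assumption.

```python
import random

# Hypothetical propositional rules, e.g.
# IF glucose in [140, 300] AND bmi in [30, 50] THEN class 1.
# Feature names, ranges and the rule format are illustrative assumptions.
FEATURES = ["glucose", "bmi"]
RULES = [
    {"conditions": {"glucose": (140.0, 300.0), "bmi": (30.0, 50.0)}, "label": 1},
    {"conditions": {"glucose": (60.0, 100.0), "bmi": (18.0, 25.0)}, "label": 0},
]


def virtual_samples(rule, n, rng):
    """Draw n feature vectors uniformly from the region a rule covers."""
    samples = []
    for _ in range(n):
        x = [rng.uniform(*rule["conditions"][name]) for name in FEATURES]
        samples.append((x, rule["label"]))
    return samples


def build_training_set(real_data, rules, n_per_rule, seed=0):
    """Concatenate real records with virtual records generated from the rules."""
    rng = random.Random(seed)
    virtual = [s for rule in rules for s in virtual_samples(rule, n_per_rule, rng)]
    return real_data + virtual


# Made-up patient records, for illustration only.
real_data = [([155.0, 33.1], 1), ([88.0, 22.4], 0)]
training_set = build_training_set(real_data, RULES, n_per_rule=5)
```

    The combined `training_set` would then be passed to whatever SVM trainer is in use; the ratio of virtual to real samples controls how strongly the rules bias the learned model.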

    Dimensionality Reduction, Classification and Reconstruction Problems in Statistical Learning Approaches

    Statistical learning theory explores ways of estimating functional dependency from a given collection of data. The specific sub-area of supervised statistical learning covers important models like Perceptron, Support Vector Machines (SVM) and Linear Discriminant Analysis (LDA). In this paper we review the theory of such models and compare their separating hypersurfaces for extracting group-differences between samples. Classification and reconstruction are the main goals of this comparison. We show recent advances in this topic of research illustrating their application on face and medical image databases.

    Survey: Data Mining Techniques in Medical Data Field

    Much current research applies data mining techniques to medical data. Knowledge discovery and data mining have found numerous applications in business and scientific domains. Valuable knowledge can be discovered by applying data mining techniques in healthcare systems. In this study, we briefly examine the potential use of classification-based data mining techniques such as rule-based methods, decision trees, machine learning algorithms such as Support Vector Machines and Principal Component Analysis, Rough Set Theory and fuzzy logic. In particular, we consider a case study using classification techniques on a medical data set of diabetic patients.

    Learning-based Rule-Extraction from Support Vector Machines

    In recent years, support vector machines (SVMs) have shown good performance in a number of application areas, including text classification. However, the success of SVMs comes at a cost: an inability to explain the process by which a learning result was reached and why a decision is being made. Rule-extraction from SVMs is important for the acceptance of this machine learning technology, especially for applications such as medical diagnosis, where it is crucial for users to understand how the system makes a decision. In this paper, a novel approach for rule-extraction from support vector machines is presented. This approach handles rule-extraction as a learning task, which proceeds in two steps. The first is to use the labeled patterns from a data set to train an SVM. The second is to use the generated model to predict the label (class) for an extended data set or a different, unlabeled data set. The resulting patterns are then used to train a decision tree learning system and to extract the corresponding rule sets. The output rule sets are verified against available knowledge for the domain problem (e.g. a medical expert), and against other classification techniques, to assure correctness and validity of the rules.
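
    The steps described above (train an SVM, let it label a second data set, fit a decision tree to those labels, read off rules) can be sketched with scikit-learn. This is only an illustration under stated assumptions: the synthetic data, the RBF kernel, the tree depth and the feature names `x0` and `x1` are all hypothetical choices, not details taken from the paper.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)

# Hypothetical labelled data: two features, class 1 when their sum exceeds 1.
X_labeled = rng.uniform(0.0, 1.0, size=(200, 2))
y_labeled = (X_labeled.sum(axis=1) > 1.0).astype(int)

# Step 1: train the SVM on the labelled patterns.
svm = SVC(kernel="rbf", gamma="scale").fit(X_labeled, y_labeled)

# Step 2: let the trained SVM label a different, unlabelled data set.
X_unlabeled = rng.uniform(0.0, 1.0, size=(1000, 2))
y_svm = svm.predict(X_unlabeled)

# Then fit a shallow decision tree to the SVM's labels and print it as rules.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_unlabeled, y_svm)
rules = export_text(tree, feature_names=["x0", "x1"])
print(rules)

# Fidelity: how often the tree agrees with the SVM on the data it was fitted to.
fidelity = float((tree.predict(X_unlabeled) == y_svm).mean())
print(f"fidelity to SVM: {fidelity:.3f}")
```

    Fidelity measures agreement with the SVM, not accuracy against ground truth; the verification against domain knowledge that the abstract mentions is a separate, manual step.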

    Medical Image Classification via SVM using LBP Features from Saliency-Based Folded Data

    Good results on image classification and retrieval using support vector machines (SVM) with local binary patterns (LBPs) as features have been extensively reported in the literature where an entire image is retrieved or classified. In contrast, in medical imaging, not all parts of the image may be equally significant or relevant to the image retrieval application at hand. For instance, in a lung x-ray image, the lung region may contain a tumour and hence be highly significant, whereas the surrounding area does not contain significant information from a medical diagnosis perspective. In this paper, we propose to detect salient regions of images during training and fold the data to reduce the effect of irrelevant regions. As a result, smaller image areas will be used for LBP feature calculation and consequently for classification by SVM. We use the IRMA 2009 dataset with 14,410 x-ray images to verify the performance of the proposed approach. The results demonstrate the benefits of the saliency-based folding approach, which delivers classification accuracies comparable with the state of the art but exhibits lower computational cost and storage requirements, factors highly important for big data analytics. Comment: To appear in proceedings of The 14th International Conference on Machine Learning and Applications (IEEE ICMLA 2015), Miami, Florida, USA, 201
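
    For readers unfamiliar with LBP features, the basic 8-neighbour operator can be sketched in plain Python as below. This shows only the textbook LBP code and its histogram, which would serve as the feature vector for an SVM; it does not implement the saliency detection or folding proposed in the paper. The neighbour ordering and the `>=` threshold are common conventions assumed here, and the 3x3 `patch` is made up.

```python
def lbp_code(image, r, c):
    """8-neighbour LBP code of pixel (r, c): one bit per neighbour >= centre."""
    center = image[r][c]
    # Neighbours visited clockwise from the top-left; the bit order is a convention.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


# Made-up grey-level patch; only the centre pixel has a full neighbourhood.
patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 3, 8],
]
print(lbp_code(patch, 1, 1))  # neighbours 9, 7 and 8 are >= 6, giving bits 1, 3, 4
```

    Restricting `lbp_histogram` to a smaller region of the image is what reduces computation when irrelevant areas are excluded.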

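    Aplikasi Metode Fuzzy Kernel K-Medoids untuk Klasifikasi Kanker berdasarkan Konsentrasi Logam di dalam Darah [Application of the Fuzzy Kernel K-Medoids Method for Cancer Classification Based on Metal Concentrations in Blood]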

    Classification techniques have been applied widely to medical data. One application is the classification of cancer. The accuracy of these techniques depends heavily on the type of data to be processed (whether or not the data are linearly separable) and on the dissimilarity function used. To overcome these two obstacles and improve classification accuracy, a method named Fuzzy Kernel K-Medoids (FKKM) was developed. The method can be used for both separable and non-separable data. This paper discusses an application of the FKKM method to concentration data of Zn, Ba, Mg, Ca, Cu, and Se in blood samples for the diagnosis of cancer, and compares it with the Support Vector Machines method. Results showed that the FKKM method produced a better result than the Support Vector Machines method.

    Pain Level Detection From Facial Image Captured by Smartphone

    Accurate, regular reporting of cancer patients' symptoms is of great concern to medical service providers for clinical decision making, such as adjusting medication. Since patients are limited in their ability to self-report symptoms, we investigated how a mobile phone application can help. We used facial images captured by smartphone to detect pain level accurately. The pain detection process reuses existing algorithms and infrastructure for cancer patients to keep costs low and the system user-friendly. To the best of our knowledge, this pain management solution is the first mobile-based study of its kind. The proposed algorithm classifies faces, each represented as a weighted combination of Eigenfaces, using angular distance and support vector machines (SVMs). Longitudinal data were collected for six months in Bangladesh, and cross-sectional pain images were collected from three countries: Bangladesh, Nepal and the United States. We found that a personalized model performs better for automatic pain assessment, and that the training set should contain varying levels of pain in each group: low, medium and high.

    Kernel-based methods for persistent homology and their applications to Alzheimer's Disease

    Kernel-based methods are powerful tools that are widely applied in many applications and fields of research. In recent years, methods from computational topology have emerged for characterizing the intrinsic geometry of data. Persistent homology is a central tool in topological data analysis, which allows one to capture the evolution of topological features of the data. Persistence diagrams represent a natural way to summarize these features, but they cannot be used directly in machine learning algorithms. To deal with them, we first analyse various recently developed kernel-based methods, then we propose and apply Variable Scaled Kernels (VSKs) to the persistence diagrams framework. We then discuss the application of these kernels in medical imaging in the context of Alzheimer’s Disease classification. Taking into account the cortical thickness measures on the cortical surface, we build the persistence diagrams for different MRI subjects and perform some classification tests using the support vector machine classifier.

    Automated Identification of Unhealthy Drinking Using Routinely Collected Data: A Machine Learning Approach

    Background: Unhealthy drinking is prevalent in the United States and can lead to serious health and social consequences, yet it is under-diagnosed and under-treated. Identifying unhealthy drinkers can be time-consuming for primary care providers. An automated tool for identification would allow attention to be focused on patients most likely to need care and therefore increase efficiency and effectiveness. Objectives: To build a clinical prediction tool for unhealthy drinking based solely on routinely collected demographic and laboratory data. Methods: We obtained demographic and laboratory data on 89,325 adults seen at the University of Vermont Medical Center from 2011-2017. Logistic regression, support vector machines (SVM), k-nearest neighbor, and random forests were each used to build clinical prediction models. The model with the largest area under the receiver operating characteristic curve (AUC) was selected. Results: SVM with polynomials of degree 3 produced the largest AUC. The most influential predictors were alkaline phosphatase, gender, glucose, and serum bicarbonate. The optimum operating point had sensitivity 31.1%, specificity 91.2%, positive predictive value 50.4%, and negative predictive value 82.1%. Application of the tool increased the prevalence of unhealthy drinking from 18.3% to 32.4%, while reducing the target population by 22%. Limitations: Universal screening was not used during the period in which the data were collected. The prevalence of unhealthy drinking among those screened was 60%, suggesting the AUDIT-C was administered to confirm rather than screen for unhealthy drinking. Conclusion: An automated tool, using commonly available data, can identify a subset of patients who appear to warrant clinical attention for unhealthy drinking.
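
    The operating-point measures quoted in the Results (sensitivity, specificity, positive and negative predictive value) all derive from the four cells of a confusion matrix. The sketch below shows the standard definitions; the counts passed in are hypothetical and are not the study's data.

```python
def screening_metrics(tp, fp, tn, fn):
    """Standard screening measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases that are flagged
        "specificity": tn / (tn + fp),  # share of non-cases that are not flagged
        "ppv": tp / (tp + fp),          # share of flagged patients who are cases
        "npv": tn / (tn + fn),          # share of unflagged patients who are non-cases
        "prevalence": (tp + fn) / (tp + fp + tn + fn),
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
m = screening_metrics(tp=30, fp=20, tn=180, fn=70)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

    Because PPV and NPV depend on prevalence while sensitivity and specificity do not, the same classifier can show different predictive values in populations with different rates of unhealthy drinking.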

    Classification of Resting-State fMRI using Evolutionary Algorithms: Towards a Brain Imaging Biomarker for Parkinson’s Disease

    It is commonly accepted that accurate early diagnosis and monitoring of neurodegenerative conditions is essential for effective disease management and delivery of medication and treatment. This research develops automatic methods for detecting brain imaging preclinical biomarkers for Parkinson’s disease (PD) by considering the novel application of evolutionary algorithms. An additional novel element of this work is the use of evolutionary algorithms to both map and predict the functional connectivity in patients using rs-fMRI data. Specifically, Cartesian Genetic Programming was used to classify dynamic causal modelling data as well as timeseries data. The findings were validated using two other commonly used classification methods (Artificial Neural Networks and Support Vector Machines) and by employing k-fold cross-validation. Across dynamic causal modelling and timeseries analyses, findings revealed maximum accuracies of 75.21% for early stage (prodromal) PD patients in which patients reveal no motor symptoms versus healthy controls, 85.87% for PD patients versus prodromal PD patients, and 92.09% for PD patients versus healthy controls. Prodromal PD patients were classified from healthy controls with high accuracy – this is notable and represents the key finding since current methods of diagnosing prodromal PD have low reliability and low accuracy. Furthermore, Cartesian Genetic Programming provided comparable performance accuracy relative to Artificial Neural Networks and Support Vector Machines. Nevertheless, evolutionary algorithms enable us to decode the classifier in terms of understanding the data inputs that are used, more easily than in Artificial Neural Networks and Support Vector Machines. 
    Hence, these findings underscore the relevance of both dynamic causal modelling analyses for classification and Cartesian Genetic Programming as a novel classification tool for brain imaging data with medical implications for disease diagnosis, particularly in early stages 5-20 years prior to motor symptoms.
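
    The k-fold cross-validation used to validate the classifiers can be illustrated with a small index splitter. This is a generic sketch, assuming contiguous, unshuffled folds; the study's actual value of k and its splitting strategy are not given in the abstract.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    # The first n_samples % k folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Example: 10 samples split into 3 folds.
folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

    Each sample appears in exactly one test fold, so averaging the k test accuracies uses every sample for evaluation once.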