1,699 research outputs found

    Automated design of robust discriminant analysis classifier for foot pressure lesions using kinematic data

    Get PDF
    In the recent years, the use of motion tracking systems for acquisition of functional biomechanical gait data, has received increasing interest due to the richness and accuracy of the measured kinematic information. However, costs frequently restrict the number of subjects employed, and this makes the dimensionality of the collected data far higher than the available samples. This paper applies discriminant analysis algorithms to the classification of patients with different types of foot lesions, in order to establish an association between foot motion and lesion formation. With primary attention to small sample size situations, we compare different types of Bayesian classifiers and evaluate their performance with various dimensionality reduction techniques for feature extraction, as well as search methods for selection of raw kinematic variables. Finally, we propose a novel integrated method which fine-tunes the classifier parameters and selects the most relevant kinematic variables simultaneously. Performance comparisons are using robust resampling techniques such as Bootstrap632+632+and k-fold cross-validation. Results from experimentations with lesion subjects suffering from pathological plantar hyperkeratosis, show that the proposed method can lead tosim96sim 96%correct classification rates with less than 10% of the original features

    A survey of cost-sensitive decision tree induction algorithms

    Get PDF
    The past decade has seen a significant interest on the problem of inducing decision trees that take account of costs of misclassification and costs of acquiring the features used for decision making. This survey identifies over 50 algorithms including approaches that are direct adaptations of accuracy based methods, use genetic algorithms, use anytime methods and utilize boosting and bagging. The survey brings together these different studies and novel approaches to cost-sensitive decision tree learning, provides a useful taxonomy, a historical timeline of how the field has developed and should provide a useful reference point for future research in this field

    A New Computer-Aided Diagnosis System with Modified Genetic Feature Selection for BI-RADS Classification of Breast Masses in Mammograms

    Full text link
    Mammography remains the most prevalent imaging tool for early breast cancer screening. The language used to describe abnormalities in mammographic reports is based on the breast Imaging Reporting and Data System (BI-RADS). Assigning a correct BI-RADS category to each examined mammogram is a strenuous and challenging task for even experts. This paper proposes a new and effective computer-aided diagnosis (CAD) system to classify mammographic masses into four assessment categories in BI-RADS. The mass regions are first enhanced by means of histogram equalization and then semiautomatically segmented based on the region growing technique. A total of 130 handcrafted BI-RADS features are then extrcated from the shape, margin, and density of each mass, together with the mass size and the patient's age, as mentioned in BI-RADS mammography. Then, a modified feature selection method based on the genetic algorithm (GA) is proposed to select the most clinically significant BI-RADS features. Finally, a back-propagation neural network (BPN) is employed for classification, and its accuracy is used as the fitness in GA. A set of 500 mammogram images from the digital database of screening mammography (DDSM) is used for evaluation. Our system achieves classification accuracy, positive predictive value, negative predictive value, and Matthews correlation coefficient of 84.5%, 84.4%, 94.8%, and 79.3%, respectively. To our best knowledge, this is the best current result for BI-RADS classification of breast masses in mammography, which makes the proposed system promising to support radiologists for deciding proper patient management based on the automatically assigned BI-RADS categories

    A Survey of Parallel Data Mining

    Get PDF
    With the fast, continuous increase in the number and size of databases, parallel data mining is a natural and cost-effective approach to tackle the problem of scalability in data mining. Recently there has been a considerable research on parallel data mining. However, most projects focus on the parallelization of a single kind of data mining algorithm/paradigm. This paper surveys parallel data mining with a broader perspective. More precisely, we discuss the parallelization of data mining algorithms of four knowledge discovery paradigms, namely rule induction, instance-based learning, genetic algorithms and neural networks. Using the lessons learned from this discussion, we also derive a set of heuristic principles for designing efficient parallel data mining algorithms

    Improving activity recognition using a wearable barometric pressure sensor in mobility-impaired stroke patients.

    Get PDF
    © 2015 Massé et al.Background: Stroke survivors often suffer from mobility deficits. Current clinical evaluation methods, including questionnaires and motor function tests, cannot provide an objective measure of the patients mobility in daily life. Physical activity performance in daily-life can be assessed using unobtrusive monitoring, for example with a single sensor module fixed on the trunk. Existing approaches based on inertial sensors have limited performance, particularly in detecting transitions between different activities and postures, due to the inherent inter-patient variability of kinematic patterns. To overcome these limitations, one possibility is to use additional information from a barometric pressure (BP) sensor. Methods: Our study aims at integrating BP and inertial sensor data into an activity classifier in order to improve the activity (sitting, standing, walking, lying) recognition and the corresponding body elevation (during climbing stairs or when taking an elevator). Taking into account the trunk elevation changes during postural transitions (sit-to-stand, stand-to-sit), we devised an event-driven activity classifier based on fuzzy-logic. Data were acquired from 12 stroke patients with impaired mobility, using a trunk-worn inertial and BP sensor. Events, including walking and lying periods and potential postural transitions, were first extracted. These events were then fed into a double-stage hierarchical Fuzzy Inference System (H-FIS). The first stage processed the events to infer activities and the second stage improved activity recognition by applying behavioral constraints. Finally, the body elevation was estimated using a pattern-enhancing algorithm applied on BP. The patients were videotaped for reference. The performance of the algorithm was estimated using the Correct Classification Rate (CCR) and F-score. The BP-based classification approach was benchmarked against a previously-published fuzzy-logic classifier (FIS-IMU) and a conventional epoch-based classifier (EPOCH). Results: The algorithm performance for posture/activity detection, in terms of CCR was 90.4 %, with 3.3 % and 5.6 % improvements against FIS-IMU and EPOCH, respectively. The proposed classifier essentially benefits from a better recognition of standing activity (70.3 % versus 61.5 % [FIS-IMU] and 42.5 % [EPOCH]) with 98.2 % CCR for body elevation estimation. Conclusion: The monitoring and recognition of daily activities in mobility-impaired stoke patients can be significantly improved using a trunk-fixed sensor that integrates BP, inertial sensors, and an event-based activity classifier

    Optimization of Computer Aided Detection systems: an evolutionary approach

    Get PDF
    Computer Aided Diagnosis (CAD) systems are designed to aid the radiologist in interpreting medical images. They are usually based on lesion detection and segmentation algorithms whose performance depends on a large number of parameters. While time consuming and sub-optimal, manual adjustment is still widely used to adjust parameter values. Genetic or evolutionary algorithms (GA) are effective optimization methods that mimic biological evolution. Genetic algorithms have been shown to efficiently manage complex search spaces, and can be applied to all kinds of objective functions, including discontinuous, nondifferentiable, or highly nonlinear ones. In this study, we have adopted an evolutionary approach to the problem of parameter optimization. We show that the genetic algorithm is able to effectively converge to a better solution than manual optimization on a case study for digital breast tomosynthesis CAD. Parameter optimization was framed as a constrained optimization problem, where the function to be maximized was defined as weighted sum of sensitivity, false positive rate and segmentation accuracy. A modified Dice coefficient was defined to assess the segmentation quality of individual lesions. Finally, all viable solutions evaluated by the GA were studied by means of exploratory data analysis techniques, such as association rules, to gain useful insight on the strength of the influence of each parameter on overall algorithm performance. We showed that this combination was able to identify multiple ranges of viable solutions with good segmentation accuracy

    Feature Selection and Molecular Classification of Cancer Using Genetic Programming

    Get PDF
    AbstractDespite important advances in microarray-based molecular classification of tumors, its application in clinical settings remains formidable. This is in part due to the limitation of current analysis programs in discovering robust biomarkers and developing classifiers with a practical set of genes. Genetic programming (GP) is a type of machine learning technique that uses evolutionary algorithm to simulate natural selection as well as population dynamics, hence leading to simple and comprehensible classifiers. Here we applied GP to cancer expression profiling data to select feature genes and build molecular classifiers by mathematical integration of these genes. Analysis of thousands of GP classifiers generated for a prostate cancer data set revealed repetitive use of a set of highly discriminative feature genes, many of which are known to be disease associated. GP classifiers often comprise five or less genes and successfully predict cancer types and subtypes. More importantly, GP classifiers generated in one study are able to predict samples from an independent study, which may have used different microarray platforms. In addition, GP yielded classification accuracy better than or similar to conventional classification methods. Furthermore, the mathematical expression of GP classifiers provides insights into relationships between classifier genes. Taken together, our results demonstrate that GP may be valuable for generating effective classifiers containing a practical set of genes for diagnostic/ prognostic cancer classification
    • …
    corecore