
    Automated design of robust discriminant analysis classifier for foot pressure lesions using kinematic data

    In recent years, the use of motion tracking systems for the acquisition of functional biomechanical gait data has received increasing interest, due to the richness and accuracy of the measured kinematic information. However, costs frequently restrict the number of subjects employed, and this makes the dimensionality of the collected data far higher than the number of available samples. This paper applies discriminant analysis algorithms to the classification of patients with different types of foot lesions, in order to establish an association between foot motion and lesion formation. With primary attention to small sample size situations, we compare different types of Bayesian classifiers and evaluate their performance with various dimensionality reduction techniques for feature extraction, as well as search methods for the selection of raw kinematic variables. Finally, we propose a novel integrated method which fine-tunes the classifier parameters and selects the most relevant kinematic variables simultaneously. Performance comparisons are made using robust resampling techniques such as Bootstrap .632+ and k-fold cross-validation. Results from experiments with subjects suffering from pathological plantar hyperkeratosis show that the proposed method can lead to ~96% correct classification rates with less than 10% of the original features.
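To illustrate the resampling side of such an evaluation, the sketch below implements plain k-fold cross-validation around a nearest-centroid rule. This is a deliberately minimal stand-in for the Bayesian classifiers compared in the paper; the classifier, fold count and synthetic data are assumptions for demonstration, not the paper's actual setup.

```python
import numpy as np

def nearest_centroid_fit(X, y):
    """Fit one centroid per class (a minimal stand-in for a Bayesian classifier)."""
    classes = np.unique(y)
    centroids = np.array([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def nearest_centroid_predict(model, X):
    """Assign each sample to the class with the closest centroid."""
    classes, centroids = model
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[d.argmin(axis=1)]

def kfold_accuracy(X, y, k=5, seed=0):
    """Estimate accuracy with k-fold cross-validation: each fold is held out
    once while the model is trained on the remaining folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = nearest_centroid_fit(X[train], y[train])
        pred = nearest_centroid_predict(model, X[test])
        accs.append((pred == y[test]).mean())
    return float(np.mean(accs))
```

On well-separated synthetic classes this estimator reports near-perfect accuracy; in a small-sample regime such as the paper's, the per-fold spread of `accs` is itself informative.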

    Identifying Optimal Parameters And Their Impact For Predicting Credit Card Defaulters Using Machine-Learning Algorithms

    Data mining and machine learning are emerging technologies that are rapidly spreading across every field due to their beneficial aspects, and the financial sector also makes use of them. Many research studies on banking data analysis have been performed using machine learning techniques. These studies have several shortcomings: their main focus was to achieve high accuracy, and some only perform a comparative analysis of different classifiers' performance. Another major drawback is that they do not identify any optimal parameters or their impact. In this research, we identify optimal parameters that are valuable for the credit scoring process and might also be used to predict credit card defaulters, and we determine their impact on the results. We use feature selection and classification techniques to identify the optimal parameters and their impact on credit card defaulter identification. We employ three classifiers, KStar, SMO and multilayer perceptron, and repeat the process of feature selection and classification for each. First, we apply feature selection techniques to the dataset with each classifier to find candidate optimal parameters; in the next phase, we use classification to determine the impact of these candidates and to validate our findings. In each round of classification we include and exclude different parameters available in the dataset, record the results of each run with each classifier, and in this way identify the optimal parameters and their impact on the results, while also analysing the performance of the classifiers. The study uses the "credit card defaults" dataset obtained from the UCI Machine Learning online repository.
We use two feature selection techniques, the ranker approach and the evolutionary search method, and then apply classification techniques to the dataset. This research can help to reduce the complexity of the credit scoring process. Through this study, we identify up to six optimal parameters and find their impact on the performance of the classifiers. We also find that the multilayer perceptron was the best-performing of the three classifiers. This work can be extended to other fields in the future, where the same mechanism for finding optimal parameters and their impact can help to predict results.
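The ranker idea above can be sketched as follows: score each attribute individually against the label, keep the top-ranked subset, and check how a classifier fares on it. This is a simplified analogue, using point-biserial correlation and a nearest-centroid rule rather than the paper's KStar/SMO/MLP setup, and the synthetic data is an assumption for illustration.

```python
import numpy as np

def rank_features(X, y):
    """Rank features by absolute Pearson correlation with a binary label
    (a minimal analogue of a 'ranker' feature-selection approach)."""
    yc = y - y.mean()
    Xc = X - X.mean(axis=0)
    corr = (Xc * yc[:, None]).sum(axis=0) / (
        np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12)
    return np.argsort(-np.abs(corr))  # most correlated feature first

def evaluate_subset(X, y, features):
    """Score a candidate feature subset with a nearest-centroid classifier
    (resubstitution accuracy, for illustration only)."""
    Xs = X[:, features]
    c0, c1 = Xs[y == 0].mean(axis=0), Xs[y == 1].mean(axis=0)
    pred = (np.linalg.norm(Xs - c1, axis=1) <
            np.linalg.norm(Xs - c0, axis=1)).astype(int)
    return (pred == y).mean()
```

Repeating `evaluate_subset` while including and excluding ranked features mirrors the include/exclude rounds described in the abstract.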

    Artificial Intelligence Based Classification for Urban Surface Water Modelling

    Estimations and predictions of surface water runoff can provide very useful insights regarding flood risks in urban areas. To automatically predict the flow behaviour of rainfall-runoff water in real-world satellite images, it is important to precisely identify permeable and impermeable areas. This identification helps to calculate the amount of surface water, by taking into account the amount of water absorbed in a permeable area and the amount that remains on an impermeable area. In this research, a model of surface water has been established to predict the behavioural flow of rainfall-runoff water. The study employs a combination of image processing, artificial intelligence and machine learning techniques for automatic segmentation and classification of permeable and impermeable areas in satellite images. These techniques investigate image classification approaches for classifying three land-use categories (roofs, roads, and pervious areas) commonly found in satellite images of the earth's surface. Three different classification scenarios are investigated in order to select the best classification model. The first scenario involves pixel-by-pixel classification of images, using Classification Tree and Random Forest classification techniques, in two different settings of sequential and parallel execution of algorithms. In the second classification scenario, the image is divided into objects using the Superpixels (SLIC) segmentation method, while three kinds of feature sets are extracted from the segmented objects. The performance of eight different supervised machine learning classifiers is probed using 5-fold cross-validation for multiple SLIC values, and detailed performance comparisons lead to conclusions about Object-based versus Pixel-based classification schemes.
Pareto analysis and knee point selection are used to choose the SLIC value and the more suitable of the two classification types. Furthermore, a new diversity- and weighted-sum-based ensemble classification model, called ParetoEnsemble, is proposed in this classification scenario. Weights are applied to the selected component classifiers of an ensemble, creating a strong classifier in which classification is based on multiple votes from the candidate classifiers, as opposed to an individual classifier, where classification is based on a single vote. Classification results on unbalanced and balanced data are also evaluated, to determine the most suitable mode for satellite image classification in this study. Convolutional Neural Networks based on semantic segmentation are also employed in the classification phase, as a third scenario, to evaluate the strength of the deep learning model SegNet in the classification of satellite imagery. The best results from the three classification scenarios are compared, and the best classification method among the three is used in the next phase of water modelling, with the InfoWorks ICM software, to explore the potential of a partially automated surface water network modelling process. Using these parameter settings, with a specified amount of simulated rain falling onto the imaged area, the amount of surface water flow is estimated, to obtain predictions about runoff situations in urban areas, since runoff in such situations can be high enough to pose a dangerous flood risk. The area of Feock, in Cornwall, is used as the simulation area of study in this research, where some promising results have been derived regarding classification and modelling of runoff. The estimation of the correlation coefficient between classification accuracy and runoff accuracy provides useful insight into the dependence of runoff performance on classification performance.
The trained system was also tested on images of unknown areas, demonstrating reasonable performance considering the training and classification limitations and conditions. Furthermore, in these unknown area images, reasonable estimations were derived regarding surface water runoff. An analysis of unbalanced and balanced data-based classification and runoff estimations, for multiple parameter configurations, aids the selection of classification and modelling parameter values to be used in future predictions on unknown data. This research is founded on the incorporation of satellite imaging into water modelling, using selective images for analysis and assessment of results. The system can be further improved, and runoff predictions of higher precision achieved, by adding more high-resolution images to the classifiers' training. The added variety in the trained model can lead to even better classification of unknown images, which could eventually provide better modelling and better insights into surface water modelling. Moreover, the modelling phase can be extended in future research to deal with real-time parameters, by calibrating the model after the classification phase, in order to observe the impact of classification on the actual calibration.
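The weighted-voting idea behind an ensemble such as the ParetoEnsemble described above can be sketched in a few lines: each component classifier contributes its weight to the class it predicts, and the class with the largest summed weight wins. The weights and predictions below are illustrative assumptions, not values from the study.

```python
import numpy as np

def weighted_vote(predictions, weights, n_classes):
    """Combine per-classifier label predictions by weighted voting.

    predictions: (n_classifiers, n_samples) array of integer class labels.
    weights:     one non-negative weight per classifier.
    Returns the winning class label per sample."""
    n_samples = predictions.shape[1]
    scores = np.zeros((n_samples, n_classes))
    for pred, w in zip(predictions, weights):
        scores[np.arange(n_samples), pred] += w  # add this classifier's weight
    return scores.argmax(axis=1)
```

With weights derived from each component's validation performance, two weaker classifiers that agree can outvote a single stronger one, which is the intended behaviour of a weighted-sum ensemble.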

    Clinical Dengue Data Analysis and Prediction using Multiple Classifiers: An Ensemble Approach

    The dengue infection is caused by the mosquito Aedes aegypti. According to the WHO, 50 to 100 million dengue infections occur every year. Data-mining techniques extract information from raw data. Dengue symptoms include fever, severe headache, body pain, vomiting, diarrhoea, cough and pain in the abdomen. The research work is carried out on real data; the patient data were collected from the Department of General Medicine, PESIMSR, Kuppam, Andhra Pradesh. The dataset consists of 18 attributes and one target value. Binary classification was performed to classify dengue positive (DF) and dengue negative (NDF) cases using different ML techniques. The proposed work demonstrates that ensemble techniques (bagging, boosting and stacking) give better results than other models. Extreme Gradient Boost (XGB), Random Forest by majority voting, and stacking with different meta-classifiers are the ensemble techniques used for the binary classification. The dataset is divided into an 80% training and a 20% testing set. Performance parameters used for the analysis are accuracy, precision, recall and F1 score, and the proposed model is compared with other ML models. The experimental results show that the accuracy of Extreme Gradient Boost, Random Forest and stacking is 98%, 99% and 99% on the training dataset and 97%, 94% and 98% on the testing dataset, respectively. The extended metrics, ROC, Precision-Recall curve and AUC, enable better analysis.
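The four performance parameters named above all derive from the binary confusion matrix; a minimal sketch (with label 1 standing for a dengue-positive case, an assumption for illustration) is:

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 score for binary labels,
    computed from the confusion-matrix counts."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives
    tn = np.sum((y_true == 0) & (y_pred == 0))  # true negatives
    fp = np.sum((y_true == 0) & (y_pred == 1))  # false positives
    fn = np.sum((y_true == 1) & (y_pred == 0))  # false negatives
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

For a screening task like dengue detection, recall (sensitivity) is usually the metric to watch, since a false negative is costlier than a false positive.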

    Target recognition techniques for multifunction phased array radar

    This thesis, submitted for the degree of Doctor of Philosophy at University College London, is a discussion and analysis of combined stepped-frequency and pulse-Doppler target recognition methods which enable a multifunction phased array radar designed for automatic surveillance and multi-target tracking to offer a Non-Cooperative Target Recognition (NCTR) capability. The primary challenge is to investigate the feasibility of NCTR via the use of high range resolution profiles. Given that stepped frequency waveforms effectively trade time for enhanced bandwidth, and thus resolution, attention is paid to the design of a compromise between resolution and dwell time. A secondary challenge is to investigate the additional benefits to overall target classification when the number of coherent pulses within an NCTR waveform is expanded to enable the extraction of spectral features which can help to differentiate particular classes of target. As with increased range resolution, the price for this extra information is a further increase in dwell time. The response to these two challenges has involved the development of a number of novel techniques, summarized below:
    • Design and execution of a series of experiments to further the understanding of multifunction phased array radar NCTR techniques
    • Development of a 'Hybrid' stepped frequency technique which enables a significant extension of range profiles without the proportional trade in resolution experienced with 'Classical' techniques
    • Development of an 'end to end' NCTR processing and visualization pipeline
    • Use of 'Doppler fraction' spectral features to enable aircraft target classification via propulsion mechanism, and combination of Doppler fraction and physical length features to enable broad aircraft type classification
    • Optimization of NCTR method classification performance as a function of feature and waveform parameters
    • Generic waveform design tools to enable delivery of time-costly NCTR waveforms within operational constraints
The thesis is largely based upon an analysis of experimental results obtained using the multifunction phased array radar MESAR2, based at BAE Systems on the Isle of Wight. The NCTR mode of MESAR2 consists of the transmission and reception of successive multi-pulse coherent bursts upon each target being tracked. Each burst is stepped in frequency, resulting in an overall bandwidth sufficient to provide sub-metre range resolution. A sequence of experiments (static trials, moving point target trials and full aircraft trials) is described, and an analysis of the robustness of target length and Doppler spectra feature measurements from NCTR mode data recordings is presented. A recorded data archive of 1498 NCTR looks upon 17 different trials aircraft, using five different varieties of stepped frequency waveform, is used to determine classification performance as a function of various signal processing parameters and the extent (number of pulses) of the data used. From analysis of the trials data, recommendations are made with regard to the design of an NCTR mode for an operational system that uses stepped frequency techniques by design choice.
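The stepped-frequency principle described above can be illustrated numerically: returns are collected at N frequency steps of spacing df, and an inverse FFT across the steps forms a high range resolution profile with resolution c/(2·N·df) over an unambiguous window of c/(2·df). The carrier, step size and scatterer model below are assumptions for demonstration, not MESAR2 parameters.

```python
import numpy as np

C = 3e8  # speed of light, m/s

def range_profile(scatterer_ranges, n_steps=64, f0=9e9, df=5e6):
    """Synthesize stepped-frequency returns from ideal point scatterers and
    form a high range resolution (HRR) profile via an inverse FFT.
    Returns (profile magnitude, range axis in metres)."""
    freqs = f0 + np.arange(n_steps) * df
    echo = np.zeros(n_steps, dtype=complex)
    for r in scatterer_ranges:
        # two-way phase of an ideal point scatterer at range r
        echo += np.exp(-1j * 4 * np.pi * freqs * r / C)
    profile = np.abs(np.fft.ifft(echo))
    # resolution c/(2*n_steps*df); unambiguous window c/(2*df)
    ranges = np.arange(n_steps) * C / (2 * n_steps * df)
    return profile, ranges
```

With the values assumed here, the profile resolves scatterers about 0.47 m apart within a 30 m window, which is the "trade time for bandwidth" compromise the thesis examines.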

    Brain Lesion Segmentation And Classification Using Diffusion-Weighted Imaging (DWI)

    Research and development of brain detection and diagnosis systems for brain disorders based on Magnetic Resonance Imaging (MRI) has become one of the most common interests in the past few years. Among the various MRI techniques, Diffusion-Weighted Imaging (DWI) remains the most accurate for early detection and discrimination of several brain lesions such as stroke. This study proposes an image analysis technique for automatically segmenting and classifying abnormal lesion structures from DWI. Four lesions, namely acute stroke, chronic stroke, solid tumour and necrosis, were analyzed. The proposed analysis framework comprises pre-processing, segmentation, feature extraction and classification. Four different segmentation techniques were proposed, based on Thresholding with Morphological Operation (TMO), Fuzzy C-Means (FCM), Fuzzy C-Means with Active Contour (FCMAC) and Fuzzy C-Means with Correlation Template (FCMCT), to segment the lesion's region. Next, statistical parameters from the spatial and wavelet domains were extracted from the Region of Interest (ROI) as features. These features were classified using a rule-based classifier for automatic classification. The results indicate that FCMCT offered the best performance, with a Jaccard Index, Dice Index, False Positive Rate and False Negative Rate of 0.6, 0.73, 0.19 and 0.2 respectively. The overall accuracy, sensitivity and specificity for the classification were 89%, 86% and 96%. In conclusion, the proposed hybrid analysis has the potential to be explored as a computer-aided tool to detect and diagnose human brain lesions.
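The Jaccard and Dice indices quoted above measure the overlap between a predicted lesion mask and a ground-truth mask; a minimal sketch on binary arrays (the masks below are toy inputs, not DWI data) is:

```python
import numpy as np

def jaccard_dice(seg, gt):
    """Jaccard and Dice overlap indices between a predicted binary lesion
    mask `seg` and a ground-truth mask `gt` (same shape, values 0/1)."""
    seg, gt = np.asarray(seg, bool), np.asarray(gt, bool)
    inter = np.logical_and(seg, gt).sum()   # |A ∩ B|
    union = np.logical_or(seg, gt).sum()    # |A ∪ B|
    jaccard = inter / union if union else 1.0
    dice = 2 * inter / (seg.sum() + gt.sum()) if (seg.sum() + gt.sum()) else 1.0
    return float(jaccard), float(dice)
```

Dice is always at least as large as Jaccard for the same pair of masks (D = 2J/(1+J)), which is consistent with the 0.73 versus 0.6 figures reported for FCMCT.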

    Multi-stage Wireless Signal Identification for Blind Interception Receiver Design

    Protection of critical wireless infrastructure from malicious attacks has become increasingly important in recent years, with the widespread deployment of various wireless technologies and dramatic growth in user populations. This brings substantial technical challenges to the design of interception receivers that sense and identify wireless signals using different transmission technologies. The key requirements for the receiver design include estimation of the signal parameters/features and classification of the modulation scheme. With proper identification results, corresponding signal interception techniques can be developed, which can be further employed to enhance network behaviour analysis and intrusion detection. The initial stage of the blind interception receiver design is to identify the signal parameters. In this thesis, two low-complexity approaches are provided to realize the parameter estimation, based on iterative cyclostationary analysis and envelope spectrum estimation, respectively. With the estimated signal parameters, automatic modulation classification (AMC) is performed to automatically identify the modulation schemes of the transmitted signals. A novel approach based on Gaussian Mixture Models (GMM) is presented in Chapter 4; the approach is capable of mitigating the negative effect of the multipath fading channel. To validate the proposed design, its performance is evaluated in an experimental propagation environment. The results show that the proposed design is capable of performing blind parameter estimation, realizing timing and frequency synchronization, and classifying the modulation schemes with improved performance.
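The GMM component of such an AMC scheme can be illustrated with a tiny expectation-maximization fit: a two-component 1-D mixture is fitted to a scalar signal feature, and the recovered component parameters then serve as the classification features. This is a simplified sketch of the GMM idea only, with deterministic initialization at the data extremes; the thesis's actual feature set and channel handling are richer.

```python
import numpy as np

def fit_gmm_1d(x, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture to samples x by EM.
    Returns (mixing weights, means, variances)."""
    mu = np.array([x.min(), x.max()], dtype=float)  # init at the extremes
    var = np.full(2, x.var() + 1e-6)
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample
        p = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = p / p.sum(axis=1, keepdims=True)
        # M-step: re-estimate mixing weights, means and variances
        nk = r.sum(axis=0)
        w = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-9
    return w, mu, var
```

A constant-envelope modulation would concentrate such a feature in one component, while a multi-level modulation spreads it across several, which is what makes the fitted mixture parameters discriminative.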

    The characterisation and automatic classification of transmission line faults

    A country's ability to sustain and grow its industrial and commercial activities is highly dependent on a reliable electricity supply. Electrical faults on transmission lines are a cause of both interruptions to supply and voltage dips; these are the most common events impacting electricity users and also have the largest financial impact on them. This research focuses on understanding the causes of transmission line faults and developing methods to automatically identify these causes. Records of faults occurring on the South African power transmission system over a 16-year period have been collected and analysed to find statistical relationships between the local climate, key design parameters of the overhead lines and the main causes of power system faults. The results characterize the performance of the South African transmission system on a probabilistic basis and illustrate differences in fault cause statistics between the summer and winter rainfall areas of South Africa, and between different times of the year and day. This analysis lays a foundation for reliability analysis and fault pattern recognition taking environmental features such as local geography, climate and power system parameters into account. A key aspect of using pattern recognition techniques is selecting appropriate classifying features. Transmission line fault waveforms are characterised by instantaneous symmetrical component analysis to describe the transient and steady-state fault conditions. The waveform and environmental features are used to develop single nearest neighbour classifiers to identify the underlying cause of transmission line faults. A classification accuracy of 86% is achieved using a single nearest neighbour classifier. This classification performance is found to be superior to that of decision tree, artificial neural network and naïve Bayes classifiers.
The results achieved demonstrate that transmission line faults can be automatically classified according to cause.
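A single nearest neighbour classifier, as used above, is short enough to sketch in full: each test sample takes the label of the closest training sample under Euclidean distance. The feature vectors below are toy stand-ins, not the waveform and environmental features of the study.

```python
import numpy as np

def nn1_predict(X_train, y_train, X_test):
    """Single nearest neighbour (1-NN) classification: each test sample
    receives the label of its closest training sample (Euclidean distance)."""
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    return y_train[d.argmin(axis=1)]
```

In practice, heterogeneous features such as climate indices and symmetrical-component magnitudes would need normalising to a common scale first, since 1-NN is sensitive to the relative scaling of its input dimensions.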