3 research outputs found

    Feature Selection and Classification of Microarray Data using MapReduce based ANOVA and K-Nearest Neighbor

    The major drawback of microarray data is the 'curse of dimensionality', which obscures the useful information in a dataset and leads to computational instability. Selecting relevant genes is therefore imperative in microarray data analysis. Most existing schemes employ a two-phase process: feature selection/extraction followed by classification. In this paper, a MapReduce-based statistical test, ANOVA, is proposed to select the relevant features. After feature selection, a MapReduce-based K-Nearest Neighbor (K-NN) classifier is proposed to classify the microarray data. These algorithms are implemented on the Hadoop framework, and a comparative analysis is carried out using various datasets.
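As a rough illustration of the two-phase pipeline this abstract describes, the sketch below scores each gene with a one-way ANOVA F-statistic (the per-gene "map"), keeps the highest-scoring genes (the "reduce"), and then classifies a query sample with k-NN over the selected genes. It is a single-process Python sketch, not the paper's Hadoop implementation. The function names (`anova_f`, `select_genes`, `knn_predict`) and the toy expression values are made up for illustration, and it assumes at least two classes and more samples than classes.

```python
import heapq
from collections import Counter, defaultdict


def anova_f(values, labels):
    """One-way ANOVA F-statistic for a single gene across class labels."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[y].append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = {y: sum(g) / len(g) for y, g in groups.items()}
    ss_between = sum(len(g) * (means[y] - grand_mean) ** 2 for y, g in groups.items())
    ss_within = sum((v - means[y]) ** 2 for y, g in groups.items() for v in g)
    if ss_within == 0:
        # Perfectly separated (or constant) gene: avoid division by zero.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_genes(expr, labels, top):
    """'Map': score every gene independently; 'reduce': keep the top scorers."""
    scored = [(anova_f(row, labels), gene) for gene, row in expr.items()]
    return [gene for _, gene in heapq.nlargest(top, scored)]


def knn_predict(expr, labels, query, genes, k=3):
    """'Map': squared distance to each training sample; 'reduce': majority vote."""
    dists = []
    for i, y in enumerate(labels):
        d = sum((expr[g][i] - query[g]) ** 2 for g in genes)
        dists.append((d, y))
    nearest = heapq.nsmallest(k, dists)
    return Counter(y for _, y in nearest).most_common(1)[0][0]


# Toy data (illustrative only): three genes measured on six samples.
expr = {
    "g1": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],  # separates the classes well
    "g2": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],  # same mean in both classes
    "g3": [0.5, 3.0, 1.5, 0.7, 2.9, 1.6],  # noisy, weakly informative
}
labels = ["A", "A", "A", "B", "B", "B"]

selected = select_genes(expr, labels, top=1)
prediction = knn_predict(expr, labels, {"g1": 4.9, "g2": 2.0, "g3": 1.0}, selected)
```

Because each gene's F-statistic and each training sample's distance depend only on that gene or sample, both steps split naturally into independent map tasks followed by a small reduce, which is presumably what makes the approach a fit for MapReduce.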

    Knowledge management overview of feature selection problem in high-dimensional financial data: Cooperative co-evolution and Map Reduce perspectives

    The term big data characterizes the massive amounts of data generated by advanced technologies in different domains using the 4Vs (volume, velocity, variety, and veracity) to indicate the amount of data that can only be processed via computationally intensive analysis, the speed of their creation, the different types of data, and their accuracy. High-dimensional financial data, such as time-series and space-time data, contain a large number of features (variables) but a small number of samples, and are used to measure various real-time business situations for financial organizations. Such datasets are normally noisy, and complex correlations may exist between their features; many domains, including finance, lack the analytic tools to mine the data for knowledge discovery because of the high dimensionality. Feature selection is an optimization problem: find a minimal subset of relevant features that maximizes classification accuracy and reduces computation. Traditional statistical feature selection approaches are not adequate to deal with the curse of dimensionality associated with big data. Cooperative co-evolution, a meta-heuristic algorithm and a divide-and-conquer approach, decomposes high-dimensional problems into smaller sub-problems. Further, MapReduce, a programming model, offers a ready-to-use, distributed, scalable, and fault-tolerant infrastructure for parallelizing the developed algorithm. This article presents a knowledge management overview of evolutionary feature selection approaches, state-of-the-art cooperative co-evolution and MapReduce-based feature selection techniques, and future research directions.

    Severity scoring approach using modified optical flow method and lesion identification for facial nerve paralysis assessment

    The facial nerve controls facial movement and expression. Hence, a patient with facial nerve paralysis will experience impaired social interactions, psychological distress, and low self-esteem. At first presentation, it is crucial to determine the severity level of the paralysis and to rule out stroke or any other serious cause by recognising the type of lesion, in order to prevent any mistreatment of the patient. Clinically, the facial nerve is assessed subjectively by observing voluntary facial movement and assigning a score based on the deductions made by the clinician. However, the results are not uniform among different examiners evaluating the same patients. This is extremely undesirable for both diagnostic and treatment considerations. Acknowledging the importance of this assessment, this research was conducted to develop a facial nerve assessment that can classify both the severity level of facial nerve function and the type of facial lesion, Upper Motor Neuron (UMN) or Lower Motor Neuron (LMN), in a regional assessment and a lesion assessment, respectively. For the regional assessment, two optical flow techniques, Kanade-Lucas-Tomasi (KLT) and Horn-Schunck (HS), were used to determine the local and global motion information of facial features. However, the original KLT has a problem: its Eigen features cannot distinguish normal subjects from patients. Thus, the KLT method was modified by introducing polygonal measurements, with landmarks placed on each facial region. Similarly, for the HS method, a multiple-frame evaluation was proposed in place of the single-frame evaluation of the original HS method, to avoid the differences between frames becoming too small.
The features of these modified methods, Modified Local Sparse (MLS) and Modified Global Dense (MGD), were combined into the Combined Modified Local-Global (CMLG) features to capture both local (certain region) and global (entire image) flow features. These features served as the input to the k-NN classifier to assess the performance of each of them in determining the severity level of paralysis. For the lesion assessment, the Gabor filter method was used to extract forehead wrinkle features. Thereafter, the Gabor features were combined with the CMLG features, focusing only on the forehead region, to evaluate both the wrinkle and motion information of the facial features. This is because, in an LMN lesion, the patient cannot move the forehead symmetrically when raising the eyebrows and cannot wrinkle the forehead, owing to the damaged frontalis muscle. In contrast, a patient with a UMN lesion presents like a normal subject in this respect: the forehead is spared and can be lifted symmetrically. The CMLG technique in the regional assessment showed the best performance in distinguishing between patient and normal subjects, with an accuracy of 92.26% compared with 88.38% and 90.32% for MLS and MGD, respectively. From the results, several assessment tools were developed in this study, namely an individual score, a total score and a paralysis score chart, which were correlated with the House-Brackmann score and validated by a medical professional with an accuracy of 91.30%. In the lesion assessment, the combined Gabor and CMLG features on the forehead region performed better in distinguishing UMN from LMN lesions, with an accuracy of 89.03% compared with 78.07% for Gabor alone.
In conclusion, the proposed facial nerve assessment approach, consisting of both regional assessment and lesion assessment, is capable of determining the severity level of facial paralysis and recognising the type of facial lesion, whether UMN or LMN.