491 research outputs found

    An initial state of design and development of intelligent knowledge discovery system for stock exchange database

    Data mining has been a challenging research area in recent years, and researchers apply a variety of techniques to it. This paper discusses the initial state of the design and development of an intelligent knowledge discovery system for stock exchange (SE) databases. We divide the problem into two modules. In the first module we define a fuzzy rule-based system to handle vague information in stock exchange databases. After normalizing the massive amount of data, we will apply our proposed approach, Mining Frequent Patterns with Neural Networks. Factors that drive future movements (e.g., political conditions, corporate factors, macroeconomic factors, and the psychological factors of investors) play an important role in the stock exchange, so our prediction model is intended to account for them and thereby predict results more precisely. In the second module we will develop a clustering algorithm consisting of two steps, training and running. The training step generates the neural network knowledge base from clustering. In the running step, the neural network knowledge base supports the module in generating learned complete data, transformed data and interesting clusters, which in turn help to generate interesting rules.

    On Predicting Learning Styles in Conversational Intelligent Tutoring Systems using Fuzzy Classification Trees

    Oscar is a conversational intelligent tutoring system (CITS) which dynamically predicts and adapts to a student's learning style throughout the tutoring conversation. Oscar aims to mimic a human tutor to improve the effectiveness of the learning experience by leading a natural language tutorial and adapting material to suit an individual's learning style. Prediction of learning style is undertaken by capturing independent variables during the conversation. The variable with the highest value determines the individual's learning style. This paper proposes a new method which uses a fuzzy classification tree to build a fuzzy predictive model from these variables, which are captured through natural language dialogue. Experiments have been undertaken on two of the learning style dimensions: perception (sensory-intuitive) and understanding (sequential-global). Early results show that the model has substantially increased the predictive accuracy of the Oscar CITS and discovered some interesting relationships amongst these variables.

    Scalable approximate FRNN-OWA classification

    Fuzzy Rough Nearest Neighbour classification with Ordered Weighted Averaging operators (FRNN-OWA) is an algorithm that classifies unseen instances according to their membership in the fuzzy upper and lower approximations of the decision classes. Previous research has shown that the use of OWA operators increases the robustness of this model. However, calculating membership in an approximation requires a nearest neighbour search. In practice, the query time complexity of exact nearest neighbour search algorithms in more than a handful of dimensions is near-linear, which limits the scalability of FRNN-OWA. Therefore, we propose approximate FRNN-OWA, a modified model that calculates upper and lower approximations of decision classes using the approximate nearest neighbours returned by Hierarchical Navigable Small Worlds (HNSW), a recent approximative nearest neighbour search algorithm with logarithmic query time complexity at constant near-100% accuracy. We demonstrate that approximate FRNN-OWA is sufficiently robust to match the classification accuracy of exact FRNN-OWA while scaling much more efficiently. We test four parameter configurations of HNSW, and evaluate their performance by measuring classification accuracy and construction and query times for samples of various sizes from three large datasets. We find that with two of the parameter configurations, approximate FRNN-OWA achieves near-identical accuracy to exact FRNN-OWA for most sample sizes within query times that are up to several orders of magnitude faster
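
    The approximation-based scoring described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the linearly decreasing OWA weights, the similarity measure (one minus the mean absolute difference of features scaled to [0, 1]), the averaging of upper and lower approximations, and all function names are assumptions made for the example, and a brute-force search stands in for HNSW.

```python
def linear_weights(k):
    """Linearly decreasing OWA weights that sum to 1 (an assumed weight scheme)."""
    total = k * (k + 1) / 2
    return [(k - i) / total for i in range(k)]


def soft_max(values, k):
    """OWA soft maximum: weighted sum of the k largest values, largest first."""
    top = sorted(values, reverse=True)[:k]
    return sum(w * v for w, v in zip(linear_weights(len(top)), top))


def soft_min(values, k):
    """OWA soft minimum: weighted sum of the k smallest values, smallest first."""
    bottom = sorted(values)[:k]
    return sum(w * v for w, v in zip(linear_weights(len(bottom)), bottom))


def similarity(a, b):
    """Assumed similarity for features already scaled to [0, 1]."""
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def classify(train, labels, query, k=2):
    """Score each class by the mean of its upper and lower approximation memberships."""
    scores = {}
    for c in sorted(set(labels)):
        # Upper approximation: soft maximum of similarity to instances of class c.
        inside = [similarity(query, x) for x, lab in zip(train, labels) if lab == c]
        # Lower approximation: soft minimum of dissimilarity to instances of other classes.
        outside = [1.0 - similarity(query, x) for x, lab in zip(train, labels) if lab != c]
        upper = soft_max(inside, k)
        lower = soft_min(outside, k) if outside else 1.0
        scores[c] = (upper + lower) / 2
    return max(scores, key=scores.get), scores


train = [(0.0, 0.0), (0.1, 0.2), (0.9, 0.8), (1.0, 1.0)]
labels = ["a", "a", "b", "b"]
predicted, scores = classify(train, labels, (0.2, 0.1), k=2)
print(predicted, scores)
```

    Because only the k nearest neighbours inside and outside each class receive weight, the sorting step is where a nearest neighbour search is needed, which is the step the abstract proposes to approximate.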

    CFLCA: High Performance based Heart disease Prediction System using Fuzzy Learning with Neural Networks

    Human diseases are increasing rapidly in the current generation, mainly because of lifestyle factors such as poor diet, lack of exercise, and drug and alcohol consumption. The most widespread of these is heart disease, which is reported to be directly or indirectly responsible for around 80% of deaths. In the future (in approximately 10 years), the largest number of deaths may be caused by heart disease. For these reasons, many researchers have proposed remedies and data analysis techniques for diagnosing heart disease using the large volume of related medical data. The field of medicine regularly receives a very wide range of data in the form of text, images, audio, video, signal packets, etc. The resulting databases contain raw datasets with inconsistent and redundant data. The health care system is undoubtedly rich in stored data but poor at extracting knowledge from it. Data mining (DM) methods can help extract valuable knowledge by applying techniques such as clustering, regression, segmentation and classification. Once the collected dataset becomes large and complex, data mining and clustering algorithms (decision trees, neural networks, K-means, etc.) are used. To improve accuracy and precision, this paper proposes the Cognitive Fuzzy Learning based Clustering Algorithm (CFLCA). CFLCA creates advanced meta-indexing for n-dimensional unstructured data. The heart disease dataset, used after data enrichment and feature engineering with the UCI machine learning algorithm, attains a high accuracy and prediction rate. The proposed CFLCA algorithm achieves high accuracy, precision and recall in data analysis for heart disease detection.

    A novel rule induction algorithm with improved handling of continuous valued attributes

    Machine learning programs can automatically learn to recognise complex patterns and make intelligent decisions based on data. Machine learning has become a powerful tool for data mining. A great deal of research in machine learning has focused on concept learning or classification learning. Among the various machine learning approaches that have been developed for classification, inductive learning from examples is the most commonly adopted in real-life applications. Due to non-uniform data formats and the huge volume of data, it is a challenge for scientists across different disciplines to optimise the process of knowledge acquisition from data with naïve inductive learning techniques. The overarching purpose of this research is to develop a novel and efficient rule induction algorithm (a learning algorithm for inducing general rules from specific examples) that can deal with both discrete and continuous variables without the need for data pre-processing. This thesis presents a novel rule induction algorithm known as RULES-8, which utilises guidelines for the selection of seed examples together with a simple method to form rules. The research also aims to improve current pruning methods for handling noisy examples. Another major concern of the work is designing a new heuristic for controlling the rule formation and selection processes. Finally, it concentrates on developing a new efficient learning algorithm for continuous output using fuzzy logic theory. The proposed algorithm allows automatic creation of membership functions and produces accurate as well as compact fuzzy sets.

    Machine learning based data pre-processing for the purpose of medical data mining and decision support

    Building an accurate and reliable prediction model for different application domains is one of the most significant challenges in knowledge discovery and data mining. Sometimes improved data quality is itself the goal of the analysis, usually to improve processes in a production database and the design of decision support. As medicine moves forward, there is a need for sophisticated decision support systems that make use of data mining to support more orthodox knowledge engineering and health informatics practice. However, real-life medical data rarely complies with the requirements of the various data mining tools. It is often inconsistent, noisy, in an unsuitable format, contains redundant attributes and missing values, and is imbalanced with regard to the outcome class label. Many real-life data sets are incomplete, with missing values. In medical data mining the problem of missing values has become a challenging issue. In many clinical trials, the medical report pro-forma allows some attributes to be left blank, because they are inappropriate for some class of illness or because the person providing the information feels that it is not appropriate to record the values of some attributes. The research reported in this thesis has explored the use of machine learning techniques as missing value imputation methods. The thesis also proposes a new way of imputing missing values by supervised learning. A classifier was used to learn the data patterns from a complete data subset, and the model was later used to predict the missing values for the full dataset. The proposed machine learning based missing value imputation was applied to the thesis data and the results were compared with traditional mean/mode imputation. Experimental results show that all the machine learning methods explored outperformed the statistical method (mean/mode). The class imbalance problem has been found to hinder the performance of learning systems. In fact, most medical datasets are found to be highly imbalanced in their class label. The solution to this problem is to reduce the gap between the minority class samples and the majority class samples. Over-sampling can be applied to increase the number of minority class samples to balance the data. The alternative to over-sampling is under-sampling, where the size of the majority class sample is reduced. The thesis proposes a cluster-based under-sampling technique to reduce the gap between the majority and minority samples. Different under-sampling and over-sampling techniques were explored as ways to balance the data. The experimental results show that, for the thesis data, the newly proposed modified cluster-based under-sampling technique performed better than other class balancing techniques. Further research found that the class imbalance problem not only affects classification performance but also has an adverse effect on feature selection. The thesis proposes a new framework for feature selection for class-imbalanced datasets. The research found that, using the proposed framework, the classifier needs fewer attributes to show high accuracy, and more attributes are needed if the data is highly imbalanced. The research described in the thesis contains the following four novel main contributions: (a) an improved data mining methodology for mining medical data; (b) a machine learning based missing value imputation method; (c) a cluster-based semi-supervised class balancing method; (d) a feature selection framework for class-imbalanced datasets. The performance analysis and comparative study show that the proposed methods of missing value imputation, class balancing and feature selection can provide an effective approach to data preparation for building medical decision support.
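
    The contrast between mode imputation and classifier-based imputation described in this abstract can be illustrated with a small sketch. The abstract does not name the classifiers used, so the 1-nearest-neighbour predictor here is an assumption chosen for brevity, as are the function names, the toy data, and the simplification that only one column has missing values (marked `None`) while the remaining columns are numeric.

```python
from collections import Counter


def mode_impute(rows, col):
    """Baseline: fill missing entries in `col` with the most common observed value."""
    observed = [r[col] for r in rows if r[col] is not None]
    fill = Counter(observed).most_common(1)[0][0]
    return [r if r[col] is not None else r[:col] + (fill,) + r[col + 1:] for r in rows]


def knn_impute(rows, col):
    """Supervised imputation: learn from complete rows, then predict `col` for the rest."""
    complete = [r for r in rows if r[col] is not None]

    def features(r):
        # Every column except the one being imputed (assumed numeric).
        return [v for i, v in enumerate(r) if i != col]

    def predict(r):
        # 1-nearest neighbour by squared Euclidean distance (an assumed classifier).
        nearest = min(
            complete,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(features(c), features(r))),
        )
        return nearest[col]

    return [r if r[col] is not None else r[:col] + (predict(r),) + r[col + 1:] for r in rows]


# Hypothetical toy data: two numeric attributes and a categorical attribute with one gap.
rows = [
    (1.0, 1.0, "low"),
    (1.2, 0.9, "low"),
    (1.1, 1.1, "low"),
    (5.0, 5.2, "high"),
    (5.1, 4.9, None),
]
by_mode = mode_impute(rows, 2)
by_knn = knn_impute(rows, 2)
print(by_mode[-1], by_knn[-1])
```

    On this toy data the mode fills the gap with the majority value regardless of the other attributes, while the learned predictor uses the complete rows to pick the value of the most similar record.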

    Combining rough and fuzzy sets for feature selection

    Decision tree learning for intelligent mobile robot navigation

    The replication of human intelligence, learning and reasoning by means of computer algorithms is termed Artificial Intelligence (AI), and the interaction of such algorithms with the physical world can be achieved using robotics. The work described in this thesis investigates the application of concept learning (an approach which takes its inspiration from biological motivations and from survival instincts in particular) to robot control and path planning. The methodology of concept learning has been applied using learning decision trees (DTs), which induce domain knowledge from a finite set of training vectors. These vectors systematically describe a physical entity and are used to train a robot to learn new concepts and to adapt its behaviour. To achieve behaviour learning, this work introduces the novel approach of hierarchical learning and knowledge decomposition to the frame of the reactive robot architecture. Following the analogy with survival instincts, the robot is first taught how to survive in very simple and homogeneous environments, namely a world without any disturbances or any kind of "hostility". Once this simple behaviour, named a primitive, has been established, the robot is trained to adapt new knowledge to cope with increasingly complex environments by adding further worlds to its existing knowledge. The repertoire of robot behaviours, in the form of symbolic knowledge, is retained in a hierarchy of clustered decision trees accommodating a number of primitives. To classify robot perceptions, control rules are synthesised using symbolic knowledge derived from searching the hierarchy of DTs. A second novel concept is introduced, namely that of multi-dimensional fuzzy associative memories (MDFAMs). These are clustered fuzzy decision trees (FDTs) which are trained locally and accommodate specific perceptual knowledge. Fuzzy logic is incorporated to deal with inherent noise in sensory data and to merge conflicting behaviours of the DTs. The feasibility of the developed techniques is illustrated in robot applications, and their benefits and drawbacks are discussed.