613 research outputs found

    An ensemble of intelligent water drop algorithm for feature selection optimization problem

    Master River Multiple Creeks Intelligent Water Drops (MRMC-IWD) is an ensemble model of the Intelligent Water Drops (IWD) algorithm in which a divide-and-conquer strategy is used to improve the search process. In this paper, the potential of the MRMC-IWD is assessed on real-world optimization problems related to feature selection and classification tasks. An experimental study is conducted on a number of publicly available benchmark data sets and two real-world problems, namely human motion detection and motor fault detection. Comparative studies of feature reduction and classification accuracy are carried out using different evaluation techniques (consistency-based, CFS, and FRFS) and classifiers (i.e., C4.5, VQNN, and SVM). The results confirm the effectiveness of the MRMC-IWD in improving the performance of the original IWD algorithm as well as in tackling real-world optimization problems.

    Discovering Knowledge through Highly Interactive Information Based Systems

    The new Internet era has increased the production of digital data. Never before has mankind had such easy access to knowledge, but at the same time the rapidly increasing rate of new data, the ease of duplicating and transmitting these data across the Net, the new channels available for information dissemination, the large amounts of historical data, and the questionable quality of existing data all contribute to an information overload that makes it more difficult to take decisions using the right data. Soft-computing techniques for decision support systems and business intelligence systems offer interesting and necessary solutions for data management and for supporting decision-making processes, but the last step in the decision chain is usually carried out by a human agent who has to process the system outcomes in the form of reports or visualizations. These representations alone are not enough for decision making, because hidden behind them may be information patterns that are not obvious to automatic data processing; humans must interact with these data representations in order to discover knowledge. Accordingly, this special issue presents nine experiences that combine visualization and visual analytics techniques, data mining methods, intelligent recommendation agents, user-centered evaluation, usability patterns, and related approaches in interactive systems as a key issue for knowledge discovery in advanced and emerging information systems.

    Generated rules for AIDS and e-learning classifier using rough set approach

    The emergence and growth of internet usage have accumulated an extensive amount of data. These data contain a wealth of undiscovered valuable information, and incomplete data sets may lead to observation errors. This research explores a technique for analyzing data that transforms meaningless data into meaningful information. The work focuses on Rough Set (RS) theory to deal with incomplete data and rule derivation. Rules with high and low left-hand-side (LHS) support values generated by RS were used as query statements to form clusters of data. The model was tested on an AIDS blog data set consisting of 146 bloggers and an E-Learning@UTM (EL) log data set comprising 23,105 URLs. 5-fold and 10-fold cross-validation were used to split the data. The Naïve and Boolean algorithms as discretization techniques, and Johnson's algorithm (Johnson) and the Genetic Algorithm (GA) as reduction techniques, were employed to compare the results. 5-fold cross-validation tended to suit the AIDS data well, while 10-fold cross-validation was best for the EL data set. Johnson and GA yielded the same number of rules for both data sets. These findings are significant as evidence of the accuracy achieved using the proposed model.
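The k-fold cross-validation splitting mentioned above (5-fold and 10-fold) can be sketched as follows. This is a generic illustration of the technique, not the authors' implementation; the function name and parameters are our own.

```python
import random

def k_fold_split(items, k, seed=0):
    """Shuffle the items and partition them into k roughly equal folds.

    Each fold serves once as the test set while the remaining folds
    together form the training set (k = 5 or 10 in the abstract above).
    """
    rng = random.Random(seed)
    shuffled = items[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # round-robin partition
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test

# Example: a 5-fold split of 10 samples yields 5 (train, test) pairs
splits = list(k_fold_split(list(range(10)), k=5))
```

A smaller k (e.g. 5) leaves more data in each test fold, which can suit smaller data sets such as the 146-blogger AIDS set, consistent with the authors' observation.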


    Crow search algorithm with time varying flight length strategies for feature selection

    Feature Selection (FS) is an efficient technique used to get rid of irrelevant, redundant, and noisy attributes in high-dimensional datasets while increasing the efficacy of machine learning classification. The Crow Search Algorithm (CSA) is a simple and efficient metaheuristic algorithm which has been used to overcome several FS issues. The flight length (fl) parameter in CSA governs the crows' search ability. In CSA, fl is set to a fixed value; as a result, the CSA is plagued by the problem of being trapped in local minima. This article suggests a remedy to this issue by introducing five new concepts of time-varying fl in CSA for feature selection: linearly decreasing flight length, sigmoid decreasing flight length, chaotic decreasing flight length, simulated annealing decreasing flight length, and logarithm decreasing flight length. The proposed approaches' performance is assessed using 13 standard UCI datasets. The simulation results show that the suggested feature selection approaches overtake the original CSA, with the chaotic-CSA approach beating the original CSA and the other four proposed approaches for the FS task.
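Three of the time-varying flight length strategies named above can be sketched as simple schedules over the iteration counter. The bounds (fl_max = 2.0, fl_min = 0.1), the sigmoid steepness, and the logistic-map modulation are illustrative assumptions, not the paper's actual parameter settings.

```python
import math

def fl_linear(t, t_max, fl_max=2.0, fl_min=0.1):
    """Linearly decreasing flight length over iterations t = 0..t_max."""
    return fl_max - (fl_max - fl_min) * t / t_max

def fl_sigmoid(t, t_max, fl_max=2.0, fl_min=0.1):
    """Sigmoid decrease: stays near fl_max early, drops around the
    midpoint, and settles near fl_min late in the run."""
    return fl_min + (fl_max - fl_min) / (1.0 + math.exp(10.0 * (t / t_max - 0.5)))

def fl_chaotic(t, t_max, fl_max=2.0, fl_min=0.1, x0=0.7):
    """Chaotic decrease: a logistic map modulates a linear decay,
    injecting irregular jumps that help escape local minima."""
    x = x0
    for _ in range(t):            # iterate the logistic map t times
        x = 4.0 * x * (1.0 - x)
    return fl_min + (fl_max - fl_min) * (1.0 - t / t_max) * x
```

A large fl early in the run favors global exploration, while a small fl late in the run favors local exploitation around good solutions, which is the rationale the abstract gives for replacing the fixed fl.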

    An Adaptive Flex-Deluge Approach to University Exam Timetabling


    Binary Black Widow Optimization Algorithm for Feature Selection Problems

    This thesis addresses feature selection (FS) problems, a primary stage in data mining. FS is a significant pre-processing stage that enhances performance with regard to computation cost and accuracy, offering a better comprehension of stored data by removing unnecessary and irrelevant features from the basic dataset. However, because of the size of the problem, FS is known to be very challenging and has been classified as an NP-hard problem. Traditional methods can only be used to solve small problems; therefore, metaheuristic algorithms (MAs) are becoming powerful methods for addressing FS problems. Recently, a new metaheuristic algorithm known as the Black Widow Optimization (BWO) algorithm achieved great results when applied to a range of daunting design problems in the field of engineering, but it had not yet been applied to FS problems. In this thesis, we propose a modified Binary Black Widow Optimization (BBWO) algorithm to solve FS problems. The FS evaluation method used in this study is the wrapper method, designed to keep a degree of balance between two significant objectives: (i) minimizing the number of selected features and (ii) maintaining a high level of accuracy. To achieve this, we use the k-nearest-neighbor (KNN) machine learning algorithm in the learning stage to evaluate the accuracy of the solutions generated by the BBWO. The proposed method is applied to twenty-eight public datasets provided by UCI, and the results are compared with up-to-date FS algorithms. Our results show that the BBWO performs as well as, or in some cases better than, those FS algorithms. However, the results also show that the BBWO suffers from slow convergence due to the use of a population of solutions and the lack of local exploitation.
    To further improve the exploitation process and enhance the BBWO's performance, we propose an improvement to the BBWO algorithm that combines it with a local metaheuristic based on the hill-climbing algorithm (HCA). This improved method (IBBWO) is also tested on the twenty-eight UCI datasets, and the results are compared with the basic BBWO and the up-to-date FS algorithms. The results show that IBBWO produces better results in most cases when compared to the basic BBWO, and that IBBWO outperforms the best-known FS algorithms in many cases.
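The wrapper evaluation described above, balancing KNN accuracy against the number of selected features, can be sketched as a fitness function over a binary feature mask. This is a minimal illustration in pure Python (the thesis itself does not specify this exact weighting); the names, the 1-NN simplification, and the weight alpha are our assumptions.

```python
def knn_accuracy(train, test, mask, k=1):
    """Accuracy of a k-NN classifier restricted to the features
    selected by the binary mask (1 = feature kept, 0 = dropped).
    Each sample is a (feature_tuple, label) pair."""
    def dist(a, b):
        # squared Euclidean distance over the selected features only
        return sum((x - y) ** 2 for x, y, m in zip(a, b, mask) if m)
    correct = 0
    for xt, yt in test:
        neighbours = sorted(train, key=lambda p: dist(p[0], xt))[:k]
        labels = [y for _, y in neighbours]
        if max(set(labels), key=labels.count) == yt:  # majority vote
            correct += 1
    return correct / len(test)

def fitness(train, test, mask, alpha=0.99):
    """Wrapper fitness to be minimized: a weighted sum of the
    classification error and the fraction of features kept."""
    err = 1.0 - knn_accuracy(train, test, mask)
    ratio = sum(mask) / len(mask)
    return alpha * err + (1.0 - alpha) * ratio
```

Each candidate solution produced by the BBWO population is a mask like `(1, 0, 1, ...)`; lower fitness means fewer features with little or no loss of accuracy.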

    An investigation of Monte Carlo tree search and local search for course timetabling problems

    The work presented in this thesis focuses on solving course timetabling problems, a variant of educational timetabling. Automated timetabling is a popular topic among researchers and practitioners because manual timetable construction is impractical, if not impossible, as the problem is known to be NP-hard. A two-stage approach is investigated. The first stage involves finding feasible solutions; Monte Carlo Tree Search (MCTS) is utilized in this stage and, as far as we are aware, is used for the first time in addressing the timetabling problem. MCTS is a relatively new search method that has achieved breakthroughs in the domain of games, particularly Go. Several enhancements to MCTS are attempted, such as heuristic-based simulations and pruning. We also compare the effectiveness of MCTS with Graph Coloring Heuristic (GCH) and Tabu Search (TS) based methods. Initial findings show that a TS-based method is more promising, so we focus on improving TS. We propose an algorithm called Tabu Search with Sampling and Perturbation (TSSP); among the enhancements we introduce are event sampling, a novel cost function, and perturbation. Furthermore, we hybridize TSSP with Iterated Local Search (ILS). The second stage focuses on improving the quality of feasible solutions. We propose a variant of Simulated Annealing called Simulated Annealing with Reheating (SAR). SAR has three features: a novel neighborhood examination scheme, a new way of estimating local optima, and a reheating scheme. The rigorous setting of initial and end temperatures in conventional SA is bypassed in SAR: reheating and cooling are applied at the right time and level, saving time and allowing the search to be performed efficiently. One drawback of SAR is having to preset the composition of neighborhood structures for the datasets. We therefore present an enhanced variant of the SAR algorithm called Simulated Annealing with Improved Reheating and Learning (SAIRL).
    We propose a reinforcement learning based method to obtain a suitable neighborhood structure composition for the search to operate effectively, and we incorporate the average cost changes into the reheated temperature function. SAIRL eliminates the need for tuning parameters in conventional SA as well as the neighborhood structure composition in SAR. Experiments were conducted on four publicly available datasets, namely Socha, International Timetabling Competition 2002 (ITC02), International Timetabling Competition 2007 (ITC07), and Hard. Our results are better than or competitive with other state-of-the-art methods, and new best results are obtained for many instances.
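The reheating idea behind SAR can be sketched as a plain SA loop whose temperature is raised, rather than cooled, when the search appears stuck. This is a generic illustration of the reheating mechanism, not the thesis's SAR algorithm: the stall counter used here as a crude estimate of a local optimum, and all parameter values, are our assumptions.

```python
import math
import random

def sar_minimise(cost, neighbour, x0, t0=10.0, cooling=0.95,
                 stall_limit=20, reheat_factor=2.0, iters=2000, seed=0):
    """Simulated Annealing with a simple reheating rule: when no new
    best solution has been found for `stall_limit` iterations (taken
    as a crude sign of a local optimum), the temperature is multiplied
    by `reheat_factor` instead of being cooled further."""
    rng = random.Random(seed)
    x, best = x0, x0
    t, stall = t0, 0
    for _ in range(iters):
        cand = neighbour(x, rng)
        delta = cost(cand) - cost(x)
        # Metropolis acceptance: always take improvements, sometimes worse moves
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x = cand
        if cost(x) < cost(best):
            best, stall = x, 0
        else:
            stall += 1
        if stall >= stall_limit:      # stuck: reheat to escape
            t *= reheat_factor
            stall = 0
        else:
            t *= cooling              # otherwise keep cooling
    return best

# Toy usage: minimise f(x) = (x - 3)^2 with uniform random steps
f = lambda x: (x - 3.0) ** 2
step = lambda x, rng: x + rng.uniform(-1.0, 1.0)
best = sar_minimise(f, step, x0=0.0)
```

The appeal of this scheme, as the abstract notes, is that a carefully chosen end temperature is unnecessary: the temperature is driven up and down by the search's own progress.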