11,930 research outputs found

    Temporal Feature Selection with Symbolic Regression

    Get PDF
    Building and discovering useful features when constructing machine learning models is the central task for the machine learning practitioner. Good features are useful not only in increasing the predictive power of a model but also in illuminating the underlying drivers of a target variable. In this research we propose a novel feature learning technique in which Symbolic regression is endowed with a ``Range Terminal\u27\u27 that allows it to explore functions of the aggregate of variables over time. We test the Range Terminal on a synthetic data set and a real world data in which we predict seasonal greenness using satellite derived temperature and snow data over a portion of the Arctic. On the synthetic data set we find Symbolic regression with the Range Terminal outperforms standard Symbolic regression and Lasso regression. On the Arctic data set we find it outperforms standard Symbolic regression, fails to beat the Lasso regression, but finds useful features describing the interaction between Land Surface Temperature, Snow, and seasonal vegetative growth in the Arctic

    A survey on utilization of data mining approaches for dermatological (skin) diseases prediction

    Get PDF
    Due to recent technology advances, large volumes of medical data is obtained. These data contain valuable information. Therefore data mining techniques can be used to extract useful patterns. This paper is intended to introduce data mining and its various techniques and a survey of the available literature on medical data mining. We emphasize mainly on the application of data mining on skin diseases. A categorization has been provided based on the different data mining techniques. The utility of the various data mining methodologies is highlighted. Generally association mining is suitable for extracting rules. It has been used especially in cancer diagnosis. Classification is a robust method in medical mining. In this paper, we have summarized the different uses of classification in dermatology. It is one of the most important methods for diagnosis of erythemato-squamous diseases. There are different methods like Neural Networks, Genetic Algorithms and fuzzy classifiaction in this topic. Clustering is a useful method in medical images mining. The purpose of clustering techniques is to find a structure for the given data by finding similarities between data according to data characteristics. Clustering has some applications in dermatology. Besides introducing different mining methods, we have investigated some challenges which exist in mining skin data

    A survey of cost-sensitive decision tree induction algorithms

    Get PDF
    The past decade has seen a significant interest on the problem of inducing decision trees that take account of costs of misclassification and costs of acquiring the features used for decision making. This survey identifies over 50 algorithms including approaches that are direct adaptations of accuracy based methods, use genetic algorithms, use anytime methods and utilize boosting and bagging. The survey brings together these different studies and novel approaches to cost-sensitive decision tree learning, provides a useful taxonomy, a historical timeline of how the field has developed and should provide a useful reference point for future research in this field

    Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks

    Full text link
    Future wireless networks have a substantial potential in terms of supporting a broad range of complex compelling applications both in military and civilian fields, where the users are able to enjoy high-rate, low-latency, low-cost and reliable information services. Achieving this ambitious goal requires new radio techniques for adaptive learning and intelligent decision making because of the complex heterogeneous nature of the network structures and wireless services. Machine learning (ML) algorithms have great success in supporting big data analytics, efficient parameter estimation and interactive decision making. Hence, in this article, we review the thirty-year history of ML by elaborating on supervised learning, unsupervised learning, reinforcement learning and deep learning. Furthermore, we investigate their employment in the compelling applications of wireless networks, including heterogeneous networks (HetNets), cognitive radios (CR), Internet of things (IoT), machine to machine networks (M2M), and so on. This article aims for assisting the readers in clarifying the motivation and methodology of the various ML algorithms, so as to invoke them for hitherto unexplored services as well as scenarios of future wireless networks.Comment: 46 pages, 22 fig

    Second CLIPS Conference Proceedings, volume 1

    Get PDF
    Topics covered at the 2nd CLIPS Conference held at the Johnson Space Center, September 23-25, 1991 are given. Topics include rule groupings, fault detection using expert systems, decision making using expert systems, knowledge representation, computer aided design and debugging expert systems

    A Prediction Modeling Framework For Noisy Welding Quality Data

    Get PDF
    Numerous and various research projects have been conducted to utilize historical manufacturing process data in product design. These manufacturing process data often contain data inconsistencies, and it causes challenges in extracting useful information from the data. In resistance spot welding (RSW), data inconsistency is a well-known issue. In general, such inconsistent data are treated as noise data and removed from the original dataset before conducting analyses or constructing prediction models. This may not be desirable for every design and manufacturing applications since every data can contain important information to further explain the process. In this research, we propose a prediction modeling framework, which employs bootstrap aggregating (bagging) with support vector regression (SVR) as the base learning algorithm to improve the prediction accuracy on such noisy data. Optimal hyper-parameters for SVR are selected by particle swarm optimization (PSO) with meta-modeling. Constructing bagging models require 114 more computational costs than a single model. Also, evolutionary computation algorithms, such as PSO, generally require a large number of candidate solution evaluations to achieve quality solutions. These two requirements greatly increase the overall computational cost in constructing effective bagging SVR models. Meta-modeling can be employed to reduce the computational cost when the fitness or constraints functions are associated with computationally expensive tasks or analyses. In our case, the objective function is associated with constructing bagging SVR models with candidate sets of hyper-parameters. Therefore, in regards to PSO, a large number of bagging SVR models have to be constructed and evaluated, which is computationally expensive. The meta-modeling approach, called MUGPSO, developed in this research assists PSO in evaluating these candidate solutions (i.e., sets of hyper-parameters). MUGPSO approximates the fitness function of candidate solutions. Through this method, the numbers of real fitness function evaluations (i.e., constructing bagging SVR models) are reduced, which also reduces the overall computational costs. Using the Meta2 framework, one can expect an improvement in the prediction accuracy with reduced computational time. Experiments are conducted on three artificially generated noisy datasets and a real RSW quality dataset. The results indicate that Meta2 is capable of providing promising solutions with noticeably reduced computational costs
    corecore