12 research outputs found

    Attention mechanisms in the CHREST cognitive architecture

    Get PDF
    In this paper, we describe the attention mechanisms in CHREST, a computational architecture of human visual expertise. CHREST organises information acquired by direct experience from the world in the form of chunks. These chunks are searched for, and verified, by a unique set of heuristics, comprising the attention mechanism. We explain how the attention mechanism combines bottom-up and top-down heuristics from internal and external sources of information. We describe some experimental evidence demonstrating the correspondence of CHREST's perceptual mechanisms with those of human subjects. Finally, we discuss how visual attention can play an important role in actions carried out by human experts in domains such as chess.
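    To make the combination of heuristics concrete, here is a minimal Python sketch of the general idea: mix a bottom-up (stimulus-driven) salience score with a top-down (chunk-driven) one when choosing the next fixation. This is an illustration of the scheme, not CHREST's actual implementation; the scoring functions, weights, and board representation are all assumptions.

        # Minimal sketch (not CHREST's implementation): choose the next
        # fixation by mixing a bottom-up, stimulus-driven salience score
        # with a top-down, chunk-driven one. Weights and scoring rules
        # are illustrative assumptions.

        def bottom_up_salience(square, board):
            """Stimulus-driven salience: count of occupied neighbouring squares."""
            r, c = square
            neighbours = [(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            return sum(1 for n in neighbours if board.get(n))

        def top_down_salience(square, expected_chunks):
            """Memory-driven salience: squares that would verify a retrieved chunk."""
            return sum(1 for chunk in expected_chunks if square in chunk)

        def next_fixation(board, expected_chunks, w_bottom=0.4, w_top=0.6):
            return max(
                board,
                key=lambda sq: w_bottom * bottom_up_salience(sq, board)
                + w_top * top_down_salience(sq, expected_chunks),
            )

        # Toy usage: three pieces on a board, one expected chunk of related squares.
        board = {(0, 0): "R", (0, 1): "N", (4, 4): "K"}
        chunks = [{(0, 0), (0, 1)}]
        print(next_fixation(board, chunks))  # -> a square inside the expected chunk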

    A New Approach Based on Quantum Clustering and Wavelet Transform for Breast Cancer Classification: Comparative Study

    Get PDF
    Feature selection involves identifying a subset of the most useful features that produce the same results as the original set of features. In this paper, we present a new approach for improving classification accuracy. This approach is based on quantum clustering for feature subset selection and wavelet transform for feature extraction. The feature selection is performed in three steps. First, the mammographic image undergoes a wavelet transform and some features are extracted. In the second step, the original feature space is partitioned into clusters in order to group similar features. This operation is performed using the Quantum Clustering algorithm. The third step deals with the selection of a representative feature for each cluster. This selection is based on similarity measures such as the correlation coefficient (CC) and the mutual information (MI); the feature that maximizes this information (CC or MI) is chosen by the algorithm. This approach is applied to breast cancer classification. The K-nearest neighbors (KNN) classifier is used to achieve the classification. We report classification accuracy as a function of feature type, wavelet transform, and the number of neighbors K in the KNN classifier. An accuracy of 100% was reached in some cases.
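    A hedged sketch of the three-step pipeline follows. KMeans stands in for the Quantum Clustering step (whose implementation the abstract does not give), a one-level Haar transform stands in for the wavelet stage, and the representative-feature rule (highest mean within-cluster correlation) plus the random data are illustrative assumptions.

        # Hedged sketch of the three-step pipeline described above. KMeans is a
        # stand-in for Quantum Clustering; the Haar transform, feature statistics,
        # and synthetic data are illustrative assumptions.
        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.neighbors import KNeighborsClassifier

        def haar2d(img):
            """One-level 2D Haar transform: returns LL, LH, HL, HH subbands."""
            a = (img[0::2] + img[1::2]) / 2.0   # row averages
            d = (img[0::2] - img[1::2]) / 2.0   # row details
            ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
            lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
            hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
            hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
            return ll, lh, hl, hh

        def wavelet_features(img):
            """Step 1: simple statistics (mean, std, energy) of each subband."""
            return np.array([f(s) for s in haar2d(img)
                             for f in (np.mean, np.std, lambda x: np.mean(x ** 2))])

        rng = np.random.default_rng(0)
        X_img = rng.random((60, 16, 16))        # stand-in "mammograms"
        y = rng.integers(0, 2, size=60)         # stand-in benign/malignant labels
        X = np.array([wavelet_features(im) for im in X_img])

        # Step 2: group similar features by clustering the transposed matrix.
        k = 4
        clusters = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X.T)

        # Step 3: keep one representative per cluster -- here, the feature with
        # the highest mean absolute correlation to the others in its cluster.
        selected = []
        for c in range(k):
            idx = np.flatnonzero(clusters == c)
            if len(idx) == 1:
                selected.append(idx[0])
                continue
            cc = np.abs(np.corrcoef(X[:, idx].T))
            selected.append(idx[cc.mean(axis=1).argmax()])

        knn = KNeighborsClassifier(n_neighbors=3).fit(X[:, selected], y)
        print("training accuracy:", knn.score(X[:, selected], y))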

    Toward optimal feature selection using ranking methods and classification algorithms

    Get PDF
    We present a comparison between several feature ranking methods used on two real datasets. We consider six ranking methods that can be divided into two broad categories: statistical and entropy-based. Four supervised learning algorithms are adopted to build models, namely IB1, Naive Bayes, the C4.5 decision tree, and the RBF network. We show that the selection of ranking method can be important for classification accuracy. In our experiments, ranking methods combined with different supervised learning algorithms give quite different results for balanced accuracy. Our cases confirm that, in order to be sure that a subset of features giving the highest accuracy has been selected, the use of many different indices is recommended.
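    The protocol can be illustrated with a small sketch: rank features once with a statistical index and once with an entropy-based one, then measure how the choice affects a downstream learner (here IB1, i.e. 1-NN, one of the four algorithms named above). The dataset and the choice of top-k are assumptions, not the paper's setup.

        # Illustrative sketch of the comparison protocol: rank features with a
        # statistical index and an entropy-based one, then check how the choice
        # of ranking affects a downstream learner. Dataset and k are assumptions.
        import numpy as np
        from sklearn.datasets import load_breast_cancer
        from sklearn.feature_selection import mutual_info_classif
        from sklearn.model_selection import cross_val_score
        from sklearn.neighbors import KNeighborsClassifier

        X, y = load_breast_cancer(return_X_y=True)

        # Statistical ranking: absolute Pearson correlation with the class.
        corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
        # Entropy-based ranking: mutual information with the class.
        mi = mutual_info_classif(X, y, random_state=0)

        top_k = 5
        for name, scores in [("correlation", corr), ("mutual information", mi)]:
            keep = np.argsort(scores)[::-1][:top_k]
            ib1 = KNeighborsClassifier(n_neighbors=1)   # IB1 is 1-NN
            acc = cross_val_score(ib1, X[:, keep], y, cv=5).mean()
            print(f"{name:18s} top-{top_k}: {acc:.3f}")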

    Elastic Step DQN: A novel multi-step algorithm to alleviate overestimation in Deep Q-Networks

    Full text link
    The Deep Q-Networks algorithm (DQN) was the first reinforcement learning algorithm to use a deep neural network to successfully surpass human-level performance in a number of Atari learning environments. However, divergent and unstable behaviour has been a long-standing issue in DQNs. The unstable behaviour is often characterised by overestimation of the Q-values, commonly referred to as the overestimation bias. To address the overestimation bias and the divergent behaviour, a number of heuristic extensions have been proposed. Notably, multi-step updates have been shown to drastically reduce unstable behaviour while improving the agent's training performance. However, agents are often highly sensitive to the selection of the multi-step update horizon (n), and our empirical experiments show that a poorly chosen static value for n can in many cases lead to worse performance than single-step DQN. Inspired by the success of n-step DQN and the effects that multi-step updates have on overestimation bias, this paper proposes a new algorithm that we call 'Elastic Step DQN' (ES-DQN). It dynamically varies the step-size horizon in multi-step updates based on the similarity of states visited. Our empirical evaluation shows that ES-DQN outperforms n-step with fixed n updates, Double DQN and Average DQN in several OpenAI Gym environments while at the same time alleviating the overestimation bias.
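    The core idea, a multi-step target whose horizon grows only while successive states remain similar, can be sketched as follows. The Euclidean distance test, threshold, maximum horizon, and toy trajectory are assumptions for illustration; the paper derives similarity from the states visited rather than from this simple distance check.

        # Minimal sketch of the idea behind an elastic-step update target: extend
        # the multi-step horizon while consecutive states stay similar, instead of
        # fixing n. The similarity test and Q stand-in are assumptions, not the
        # authors' implementation.
        import numpy as np

        def elastic_step_target(traj, q_next, gamma=0.99, sim_thresh=0.1, n_max=8):
            """traj: list of (state, reward) pairs; q_next(s) ~ max_a Q(s, a)."""
            n, ret = 0, 0.0
            for (s, r), (s_next, _) in zip(traj, traj[1:]):
                ret += (gamma ** n) * r
                n += 1
                # Stop growing the horizon once states become dissimilar,
                # or when the maximum horizon is reached.
                if np.linalg.norm(s_next - s) > sim_thresh or n >= n_max:
                    break
            return ret + (gamma ** n) * q_next(traj[n][0])

        rng = np.random.default_rng(1)
        states = np.cumsum(rng.normal(0, 0.05, size=(10, 4)), axis=0)
        rewards = rng.random(10)
        traj = list(zip(states, rewards))
        target = elastic_step_target(traj, q_next=lambda s: 1.0)
        print(f"elastic-step target: {target:.3f}")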

    Data Patterns Discovery Using Unsupervised Learning

    Get PDF
    Self-care activities classification poses significant challenges in identifying children's unique functional abilities and needs within the exceptional children healthcare system. The accuracy of diagnosing a child's self-care problem, such as toileting or dressing, is highly influenced by an occupational therapist's experience and time constraints. Thus, there is a need for objective means to detect and predict in advance the self-care problems of children with physical and motor disabilities. We use clustering to discover interesting information from self-care problems, perform automatic classification of binary data, and discover outliers. The advantages are twofold: the advancement of knowledge on identifying self-care problems in children, and comprehensive experimental results on clustering binary healthcare data. By using various distances and linkage methods, resampling techniques for imbalanced data, and feature selection preprocessing in a clustering framework, we find associations among patients and an Adjusted Rand Index (ARI) of 76.26%.
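    The clustering-and-evaluation loop the abstract describes can be sketched as follows: hierarchical clustering of binary records under different distances and linkage methods, scored against known labels with the Adjusted Rand Index. The synthetic binary data is an assumption standing in for the self-care dataset, and the resampling and feature selection steps are omitted for brevity.

        # Hedged sketch: cluster binary records with different distances and
        # linkage methods, then score agreement with known categories via ARI.
        # The synthetic data is an assumption, not the actual self-care set.
        import numpy as np
        from scipy.cluster.hierarchy import fcluster, linkage
        from scipy.spatial.distance import pdist
        from sklearn.metrics import adjusted_rand_score

        rng = np.random.default_rng(0)
        # Two latent patient groups with different activation probabilities.
        X = np.vstack([rng.random((30, 20)) < 0.8, rng.random((30, 20)) < 0.2])
        y_true = np.repeat([0, 1], 30)

        for metric in ("jaccard", "hamming"):
            for method in ("average", "complete"):
                Z = linkage(pdist(X, metric=metric), method=method)
                labels = fcluster(Z, t=2, criterion="maxclust")
                ari = adjusted_rand_score(y_true, labels)
                print(f"{metric:8s} {method:9s} ARI = {ari:.3f}")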

    Empirical Analysis of the Top 800 Cryptocurrencies using Machine Learning Techniques

    Get PDF
    The International Token Classification (ITC) Framework by the Blockchain Center in Frankfurt classifies 795 cryptocurrency tokens based on their economic, technological, legal and industry categorization. This work analyzes cryptocurrency data to evaluate the categorization with real-world market data. The feature space includes price, volume and market capitalization data. Additional metrics such as the moving average and the relative strength index are added to get a more in-depth understanding of market movements. The data set is used to build supervised and unsupervised machine learning models. The prediction accuracies varied amongst labels and all remained below 90%. The technological label had the highest prediction accuracy at 88.9% using Random Forests. The economic label could be predicted with an accuracy of 81.7% using K-Nearest Neighbors. The classification using machine learning techniques is not yet accurate enough to automate the classification process, but it can be improved by adding additional features. The unsupervised clustering shows that there are more layers to the data that can be added to the ITC. The additional categories are built upon a combination of token mining, maximal supply, volume and market capitalization data. As a result, we suggest that a data-driven extension of the categorization into a token profile would allow investors and regulators to gain a deeper understanding of token performance, maturity and usage.
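    As a sketch of the feature engineering involved, the snippet below derives a moving average and a relative strength index (RSI) from a price series and feeds the resulting per-token features to a Random Forest. The synthetic price series and the binary stand-in label are assumptions; the study's actual features and ITC labels are richer.

        # Sketch of the described feature engineering: moving average and RSI
        # from a price series, fed to a Random Forest. Synthetic prices and the
        # toy "label" are assumptions, not the ITC labels.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        def moving_average(prices, window=14):
            return np.convolve(prices, np.ones(window) / window, mode="valid")

        def rsi(prices, window=14):
            """Classic RSI: 100 - 100 / (1 + avg_gain / avg_loss)."""
            diff = np.diff(prices)
            gains = np.convolve(np.clip(diff, 0, None), np.ones(window) / window, "valid")
            losses = np.convolve(np.clip(-diff, 0, None), np.ones(window) / window, "valid")
            return 100 - 100 / (1 + gains / (losses + 1e-9))

        rng = np.random.default_rng(2)
        features, labels = [], []
        for _ in range(50):                       # 50 stand-in tokens
            drift = rng.normal(0, 0.01)
            prices = 100 * np.cumprod(1 + drift + rng.normal(0, 0.02, 200))
            features.append([moving_average(prices)[-1], rsi(prices)[-1], prices[-1]])
            labels.append(int(drift > 0))         # toy stand-in for an ITC category

        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        clf.fit(features, labels)
        print("training accuracy:", clf.score(features, labels))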

    Feature Selection as a Preprocessing Step for Hierarchical Clustering

    No full text
    Although feature selection is a central problem in inductive learning, as suggested by the growing amount of research in this area, most of the work has been carried out under the supervised learning paradigm, paying little attention to unsupervised learning tasks and, particularly, clustering tasks. In this paper, we analyze the particular benefits that feature selection may provide in hierarchical clustering tasks and explore the power of feature selection methods applied as a preprocessing step under the proposed dimensions. Instead of only predicting class labels, the focus is on a more general inference task over all the features. Empirical results suggest that feature selection as preprocessing provides only limited improvements in the performance task. In addition, they raise the problem of the notion of irrelevance in unsupervised settings.
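    A minimal sketch of the experimental setup: apply an unsupervised filter before hierarchical clustering and compare against clustering on all features. The variance-based filter and the synthetic data are assumptions; the paper evaluates several selection methods and performance tasks.

        # Small sketch of the setup studied: filter features before hierarchical
        # clustering and compare against clustering on all features. The variance
        # filter and toy data are assumptions.
        import numpy as np
        from scipy.cluster.hierarchy import fcluster, linkage
        from sklearn.metrics import adjusted_rand_score

        rng = np.random.default_rng(3)
        # 4 informative features separating two groups, plus 20 noise features.
        informative = np.vstack([rng.normal(0, 1, (40, 4)), rng.normal(3, 1, (40, 4))])
        noise = rng.normal(0, 1, (80, 20))
        X, y_true = np.hstack([informative, noise]), np.repeat([0, 1], 40)

        def cluster_ari(data):
            Z = linkage(data, method="ward")
            return adjusted_rand_score(y_true, fcluster(Z, 2, criterion="maxclust"))

        # Unsupervised filter: keep the features with the highest variance.
        keep = np.argsort(X.var(axis=0))[::-1][:4]
        print("all features :", round(cluster_ari(X), 3))
        print("selected only:", round(cluster_ari(X[:, keep]), 3))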

    High-dimensional Sparse Count Data Clustering Using Finite Mixture Models

    Get PDF
    Due to the massive amount of available digital data, automating its analysis and modeling for different purposes and applications has become an urgent need. One of the most challenging tasks in machine learning is clustering, which is defined as the process of assigning observations sharing similar characteristics to subgroups. Such a task is significant, especially when implementing complex algorithms to deal with high-dimensional data. Thus, the advancement of computational power in statistical-based approaches is increasingly becoming an interesting and attractive research domain. Among the successful methods, mixture models have been widely acknowledged and successfully applied in numerous fields, as they provide a convenient yet flexible formal setting for unsupervised and semi-supervised learning. An essential problem with these approaches is to develop a probabilistic model that represents the data well by taking into account its nature. Count data are widely used in machine learning and computer vision applications where an object, e.g., a text document or an image, can be represented by a vector corresponding to the appearance frequencies of words or visual words, respectively. Thus, they usually suffer from the well-known curse of dimensionality, as objects are represented with high-dimensional and sparse vectors, i.e., a few thousand dimensions with a sparsity of 95 to 99%, which dramatically degrades the performance of clustering algorithms. Moreover, count data systematically exhibit the burstiness and overdispersion phenomena, neither of which can be handled with a generic multinomial distribution, typically used to model count data, due to its dependency assumption.

    This thesis is constructed around six related manuscripts, in which we propose several approaches for high-dimensional sparse count data clustering via various mixture models based on hierarchical Bayesian modeling frameworks that have the ability to model the dependency of repetitive word occurrences. In such frameworks, a suitable distribution is used to introduce the prior information into the construction of the statistical model, based on a conjugate distribution to the multinomial, e.g., the Dirichlet, generalized Dirichlet, and the Beta-Liouville, which has numerous computational advantages. Thus, we propose a novel model that we call the Multinomial Scaled Dirichlet (MSD), based on using the scaled Dirichlet as a prior to the multinomial to allow more modeling flexibility. Although these frameworks can model burstiness and overdispersion well, they share similar disadvantages that make their estimation procedure very inefficient when the collection size is large. To handle high dimensionality, we consider two approaches. First, we derive close approximations to the distributions in a hierarchical structure to bring them to the exponential-family form, aiming to combine the flexibility and efficiency of these models with the desirable statistical and computational properties of the exponential family of distributions, including sufficiency, which reduces the complexity and computational effort, especially for sparse and high-dimensional data. Second, we propose a model-based unsupervised feature selection approach for count data to overcome several issues that may be caused by the high dimensionality of the feature space, such as over-fitting, low efficiency, and poor performance.
Furthermore, we handle two significant aspects of mixture-based clustering methods, namely parameter estimation and model selection. We consider the Expectation-Maximization (EM) algorithm, a broadly applicable iterative algorithm for estimating mixture model parameters, incorporating several techniques to avoid its initialization dependency and poor local maxima. For model selection, we investigate different approaches to find the optimal number of components based on the Minimum Message Length (MML) philosophy. The effectiveness of our approaches is evaluated on challenging real-life applications, such as sentiment analysis, hate speech detection on Twitter, topic novelty detection, human interaction recognition in films and TV shows, facial expression recognition, face identification, and age estimation.
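    As a minimal illustration of the mixture-modeling skeleton the thesis builds on, the sketch below runs EM for a plain multinomial mixture over sparse count vectors. The hierarchical priors (Dirichlet, scaled Dirichlet, Beta-Liouville), the initialization safeguards, and MML-based model selection discussed above are omitted; the toy data and smoothing constant are assumptions.

        # Minimal EM sketch for a multinomial mixture over count vectors -- the
        # basic skeleton before hierarchical priors are added. Toy data and
        # smoothing are assumptions; MML model selection is omitted.
        import numpy as np

        def em_multinomial_mixture(X, k, n_iter=50, eps=1e-10, seed=0):
            rng = np.random.default_rng(seed)
            n, d = X.shape
            pi = np.full(k, 1.0 / k)                   # mixing weights
            theta = rng.dirichlet(np.ones(d), size=k)  # per-component word probabilities
            for _ in range(n_iter):
                # E-step: responsibilities via log-likelihoods (numerically stable).
                log_r = np.log(pi + eps) + X @ np.log(theta + eps).T
                log_r -= log_r.max(axis=1, keepdims=True)
                r = np.exp(log_r)
                r /= r.sum(axis=1, keepdims=True)
                # M-step: re-estimate weights and component distributions.
                pi = r.mean(axis=0)
                theta = (r.T @ X) + eps
                theta /= theta.sum(axis=1, keepdims=True)
            return pi, theta, r.argmax(axis=1)

        rng = np.random.default_rng(4)
        # Two "topics" with different word distributions; sparse count vectors.
        t1, t2 = rng.dirichlet(np.ones(30) * 0.1, size=2)
        X = np.vstack([rng.multinomial(40, t1, size=50),
                       rng.multinomial(40, t2, size=50)])
        pi, theta, z = em_multinomial_mixture(X, k=2)
        print("mixing weights:", np.round(pi, 2))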