A Short Survey on Data Clustering Algorithms

Author: Wong Ka-Chun
Publication date: 25/11/2015

With rapidly increasing data, clustering algorithms are important tools for data analytics in modern research. They have been successfully applied to a wide range of domains, for instance bioinformatics, speech recognition, and financial analysis. Formally speaking, given a set of data instances, a clustering algorithm is expected to divide the set into subsets that maximize intra-subset similarity and inter-subset dissimilarity, where a similarity measure is defined beforehand. In this work, state-of-the-art clustering algorithms are reviewed from design concept to methodology. Different clustering paradigms are discussed, as are advanced clustering algorithms. After that, the existing clustering evaluation metrics are reviewed. A summary with future insights is provided at the end.
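
The formal definition in the abstract can be illustrated with a minimal sketch. It assumes Euclidean distance as the dissimilarity measure (the abstract leaves the measure open), and the function name `partition_quality` and the toy data are hypothetical, not taken from the survey. A good partition should show a small mean intra-cluster distance and a large mean inter-cluster distance.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def partition_quality(points, labels):
    """Return (mean intra-cluster distance, mean inter-cluster distance).

    Assumption: Euclidean distance stands in for the dissimilarity
    measure that the abstract says is "defined beforehand".
    """
    intra, inter = [], []
    # Visit every unordered pair of labelled points exactly once.
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        (intra if lp == lq else inter).append(dist(p, q))

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(intra), mean(inter)


# Hypothetical toy data: two visually obvious groups.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
good = [0, 0, 1, 1]  # groups nearby points together
bad = [0, 1, 0, 1]   # mixes the two groups

intra_good, inter_good = partition_quality(points, good)
intra_bad, inter_bad = partition_quality(points, bad)
print(intra_good, inter_good)
print(intra_bad, inter_bad)
```

Under these assumptions the `good` labelling has the lower intra-cluster distance and the higher inter-cluster distance, which is the property a clustering algorithm is expected to optimize.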

arXiv.org e-Print Archive

Crossref

An improved EEG pattern classification system based on dimensionality reduction and classifier fusion

Author: Alsukker ASM
Publication date: 01/01/2012

University of Technology, Sydney. Faculty of Engineering and Information Technology. Analysis of brain electrical activities (Electroencephalography, EEG) presents a rich source of information that helps in the advancement of affordable and effective biomedical applications such as psychotropic drug research, sleep studies, seizure detection and brain computer interface (BCI). Interpretation and understanding of EEG signals will provide clinicians and physicians with useful information for disease diagnosis and monitoring biological activities. It will also help in creating a new way of communication through brain waves. This thesis aims to investigate new algorithms for improving pattern recognition systems in two main EEG-based applications. The first application represents a simple Brain Computer Interface (BCI) based on imagined motor tasks, whilst the second one represents an automatic sleep scoring system in an intensive care unit. A BCI system in general aims to create a non-muscular link between the brain and external devices, thus providing a new control scheme that can most benefit extremely immobilised persons. This link is created by using a pattern recognition approach to interpret EEG into device commands. The commands can then be used to control wheelchairs, computers or any other equipment. The second application relates to creating an automatic scoring system through interpreting certain properties of several biomedical signals. Traditionally, sleep specialists record and analyse brain signals using electroencephalogram (EEG), muscle tone (EMG), eye movement (EOG), and other biomedical signals to detect five sleep stages: Rapid Eye Movement (REM), stage 1,... to stage 4. Acquired signals are then scored based on 30-second intervals, which requires manually inspecting one segment at a time for certain properties to interpret sleep stages. The process is time-consuming and demands competence.
It is thought that an automatic scoring system mimicking sleep expert rules will speed up the process and reduce the cost. The practicality of any EEG-based system depends upon accuracy and speed. The more accurate and faster classification systems are, the better the chance of integrating them into a wider range of applications. Thus, the performance of the previous systems is further enhanced using improved feature selection, projection and classification algorithms. As processing EEG signals requires dealing with multi-dimensional data, there is a need to minimize the dimensionality in order to achieve acceptable performance with less computational cost. The first candidate for dimensionality reduction is a channel feature selection approach. Four novel feature selection methods are developed utilizing genetic algorithms, ant colony, particle swarm and differential evolution optimization. The methods provide fast and accurate implementation in selecting the most informative features/channels that best represent mental tasks. Thus, the computational burden of the classifier is kept as light as possible by removing irrelevant and highly redundant features. As an alternative to this dimensionality reduction approach, a novel feature projection method is also introduced. The method maps the original feature set into a small informative subset of features that can best discriminate between the different classes. Unlike most existing methods based on discriminant analysis, the proposed method considers the fuzzy nature of input measurements in discovering the local manifold structure. It is able to find a projection that can maximize the margin between data points from different classes at each local area while considering the fuzzy nature. In the classification phase, a number of improvements to the traditional k-nearest neighbour classifier (kNN) are introduced. The improvements address limitations of the kNN weighting scheme.
The traditional kNN does not take into account class distribution, the importance of each feature, the contribution of each neighbour, and the number of instances for each class. The proposed kNN variants are based on an improved distance measure and weight optimization using differential evolution. A differential evolution optimizer is used to enhance kNN performance through optimizing the metric weights of features, neighbours and classes. Additionally, a fuzzy kNN variant has also been developed to favour classification of certain classes. This variant may find use in medical examination. An alternative classifier fusion method is introduced that aims to create a diverse ensemble of neural networks. The diversity is enhanced by altering the target output of each network to create a certain amount of bias towards each class. This enables the construction of a set of neural network classifiers that complement each other.
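
The three kinds of weights named in the abstract (features, neighbours, classes) can be made concrete with a rough sketch. This is not the thesis's method: the weights here are fixed by hand rather than optimized with differential evolution, the inverse-distance neighbour weighting is an assumption, and the function name `weighted_knn_predict` and its parameters are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def weighted_knn_predict(train_x, train_y, query, k=3,
                         feature_weights=None, class_weights=None):
    """Predict a label with a kNN that weights features, neighbours, classes.

    Assumptions (not from the thesis): weighted Euclidean distance,
    inverse-distance neighbour votes, hand-set weights.
    """
    fw = feature_weights or [1.0] * len(query)
    cw = class_weights or {}

    def distance(a, b):
        # Feature weights scale each dimension's contribution.
        return sqrt(sum(w * (u - v) ** 2 for w, u, v in zip(fw, a, b)))

    # Keep the k training instances closest to the query.
    neighbours = sorted(
        ((distance(x, query), y) for x, y in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )[:k]

    votes = defaultdict(float)
    for d, label in neighbours:
        # Closer neighbours count more; class weights bias the vote.
        votes[label] += cw.get(label, 1.0) / (d + 1e-9)
    return max(votes, key=votes.get)


# Hypothetical toy data.
train_x = [(0.0, 10.0), (1.0, 0.0)]
train_y = ["a", "b"]
print(weighted_knn_predict(train_x, train_y, (0.0, 0.0), k=1))
print(weighted_knn_predict(train_x, train_y, (0.0, 0.0), k=1,
                           feature_weights=[1.0, 0.0]))
```

In the second call the second feature is given zero weight, so the nearest neighbour changes. An optimizer such as differential evolution would search over `feature_weights` and `class_weights` instead of setting them by hand.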

OPUS - University of Technology Sydney

Arranging program statements for locality on the basis of neighbourhood preferences

Author: Leopold Claudia
Publication venue: Published by Elsevier Inc.
Publication date: 31/08/1998

The gradual property of computer programs, that their successive operations preferably access data from the same memory block, is called locality. The paper deals with locality optimization, more specifically with the sequencing aspect: N operations are to be brought into a sequence such that locality is maximized. We assume that we are given a matrix D = [D_ij] of neighbourhood preferences, where entry D_ij is smaller the higher the expected gain in locality from arranging operations o_i and o_j close together. The gain is supposed to have been estimated from the knowledge accumulated so far, which is still incomplete, in an overall locality optimization process. Our task consists in finding a sequencing function T : {o_1 … o_N} → [1 … N] ⊆ R that assigns to each operation a real time at which it will approximately be carried out. The motivation for T mapping into reals instead of integers is to transfer more knowledge on the certainty of operation ordering decisions into the next step of the overall locality optimization process. The goal for T consists in minimizing an objective function that was empirically designed to approximately quantify the intuitive meaning of the degree of locality. In addition, T has to spread the values T(o_i) quite evenly over the interval [1 … N]. We suggest a heuristic algorithm that approximately solves the problem, and report on experiments with the algorithm and several variants of it. Briefly, the algorithm starts with a random sequencing that is iteratively improved by alternately moving each T(o_i) in the direction of the value that minimizes the objective function for fixed T(o_j) (j ≠ i), and spreading the T(o_i) over [1 … N]. Experimental results indicate that our algorithm is efficient and reasonably accurate.

Elsevier - Publisher Connector

Towards the Evolution of Multi-Layered Neural Networks: A Dynamic Structured Grammatical Evolution Approach

Author: Assuncao Filipe, Fedorovici Lucian-Ovidiu, Gomez Faustino, Ryan Conor, Si Tapas, Sigillito Vincent G., Street W. Nick, Tirumala Sreenivas Sremath
Publication venue: Association for Computing Machinery (ACM)
Publication date: 26/06/2017

Current grammar-based NeuroEvolution approaches have several shortcomings. On the one hand, they do not allow the generation of Artificial Neural Networks (ANNs) composed of more than one hidden layer. On the other, there is no way to evolve networks with more than one output neuron. To properly evolve ANNs with more than one hidden layer and multiple output nodes, it is necessary to know the number of neurons available in previous layers. In this paper we introduce Dynamic Structured Grammatical Evolution (DSGE): a new genotypic representation that overcomes the aforementioned limitations. By enabling the creation of dynamic rules that specify the connection possibilities of each neuron, the methodology enables the evolution of multi-layered ANNs with more than one output neuron. Results on different classification problems show that DSGE evolves effective single and multi-layered ANNs, with a varying number of output neurons.

arXiv.org e-Print Archive

Crossref