17 research outputs found

    Neuroengineering of Clustering Algorithms

    Cluster analysis can be broadly divided into multivariate data visualization, clustering algorithms, and cluster validation. This dissertation contributes neural network-based techniques to perform all three unsupervised learning tasks. In particular, the first paper provides a comprehensive review of adaptive resonance theory (ART) models for engineering applications and provides context for the four subsequent papers. These papers are devoted to enhancements of ART-based clustering algorithms from (a) a practical perspective, by exploiting the visual assessment of cluster tendency (VAT) sorting algorithm as a preprocessor for ART offline training, thus mitigating ordering effects; and (b) an engineering perspective, by designing a family of multi-criteria ART models: dual vigilance fuzzy ART and distributed dual vigilance fuzzy ART (both of which can detect complex cluster structures), merge ART (which aggregates partitions and lessens ordering effects in online learning), and cluster validity index vigilance in fuzzy ART (which features robust vigilance parameter selection and alleviates ordering effects in offline learning). The sixth paper enhances data visualization with self-organizing maps (SOMs) by depicting, in the reduced-dimension and topology-preserving SOM grid, information-theoretic similarity measures between neighboring neurons. This visualization's parameters are estimated using samples selected via a single-linkage procedure, thereby generating heatmaps that portray more homogeneous within-cluster similarities and crisper between-cluster boundaries. The seventh paper presents incremental cluster validity indices (iCVIs) realized by (a) incorporating existing formulations of online computations for clusters' descriptors, or (b) modifying an existing ART-based model and incrementally updating local density counts between prototypes. Moreover, this last paper provides the first comprehensive comparison of iCVIs in the computational intelligence literature. --Abstract, page iv
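    The VAT preprocessing step named above can be sketched briefly. The snippet below is only a minimal illustration of the standard VAT reordering (Prim-like ordering of a pairwise dissimilarity matrix), not the dissertation's implementation; the function name and the choice of Euclidean dissimilarities are assumptions for the example.

```python
import numpy as np

def vat_order(X):
    """Return a VAT (visual assessment of cluster tendency) ordering of the rows of X.

    Classic reordering: start from one endpoint of the largest pairwise
    dissimilarity, then repeatedly append the unvisited point closest to the
    visited set. Presenting samples to an offline ART learner in this order is
    one way to mitigate ordering effects.
    """
    # pairwise Euclidean dissimilarity matrix
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    n = D.shape[0]
    # start at a row involved in the maximum dissimilarity
    i, _ = np.unravel_index(np.argmax(D), D.shape)
    order = [i]
    remaining = set(range(n)) - {i}
    while remaining:
        rem = list(remaining)
        # among unvisited points, pick the one nearest to any visited point
        sub = D[np.ix_(order, rem)]
        j = rem[int(np.argmin(sub.min(axis=0)))]
        order.append(j)
        remaining.remove(j)
    return np.array(order)

# usage sketch: reorder samples before offline ART training
# X_sorted = X[vat_order(X)]
```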

    A data science approach to portuguese road accidents’ data

    Integrated Master's dissertation in Informatics Engineering. We frequently hear about accidents and traffic news on television, radio and even social networks. Even though we have witnessed a decrease in the mortality rate on Portuguese roads, the number of road victims has been increasing recently, so we should be more aware of this problem, study it, and come up with solutions to decrease the mortality rate and the number of victims on Portuguese roads. One possible solution is the identification of blackspots (areas with a high number of accidents or an abnormal number of fatalities), combined with temporal and spatial analysis and the relations between them. By doing this, we will be closer to decreasing accidents as well as the mortality rate on Portuguese roads. This dissertation focuses on these concerns, using the information present in ANSR (Autoridade Nacional de Segurança Rodoviária) reports as well as other data gathered by the research team regarding road traffic incidents in Portuguese cities. After researching the state of the art, we realized that, on the one hand, traffic accidents and the resulting victims remain a serious concern to society; on the other hand, many techniques and methods have been developed and improved to help mitigate this problem. The data show that Portugal still has work to do in decreasing the number of accidents and victims, according to those evolution curves, the data collected in ANSR reports and the comparison of traffic numbers across EU countries. This dissertation focused on understanding, processing and exploring the data in depth, developing models to analyze the data, prevent accidents and enhance road safety, and deriving useful insights about the road network, published on a dashboard platform open to the community.
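    The abstract does not name a specific blackspot-detection method; one common approach, shown here purely as an illustrative sketch, is density-based clustering of accident coordinates. The column names (`lat`, `lon`, `fatalities`), radius and count thresholds are assumptions for the example, not the dissertation's pipeline.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN

EARTH_RADIUS_M = 6_371_000  # metres

def find_blackspots(accidents: pd.DataFrame, radius_m=150, min_accidents=10):
    """Group accident records into candidate blackspots.

    `accidents` is assumed to have 'lat', 'lon' (degrees) and 'fatalities' columns.
    DBSCAN with the haversine metric clusters accidents that fall within
    `radius_m` of each other; clusters with at least `min_accidents` records
    are reported with their accident and fatality counts.
    """
    coords = np.radians(accidents[["lat", "lon"]].to_numpy())
    labels = DBSCAN(
        eps=radius_m / EARTH_RADIUS_M,   # haversine distances are in radians
        min_samples=min_accidents,
        metric="haversine",
    ).fit_predict(coords)
    accidents = accidents.assign(cluster=labels)
    clustered = accidents[accidents["cluster"] >= 0]  # label -1 means noise
    return (
        clustered.groupby("cluster")
        .agg(accidents=("cluster", "size"),
             fatalities=("fatalities", "sum"),
             lat=("lat", "mean"),
             lon=("lon", "mean"))
        .sort_values("accidents", ascending=False)
    )
```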

    A Curious Robot Learner for Interactive Goal-Babbling (Strategically Choosing What, How, When and from Whom to Learn)

    The challenges posed by robots operating in human environments on a daily basis and in the long term point out the importance of adaptivity to changes which can be unforeseen at design time. The robot must learn continuously in an open-ended, non-stationary and high-dimensional space. It must be able to know which parts to sample and what kinds of skills are interesting to learn. One way is to decide what to explore by oneself. Another way is to refer to a mentor. We name these two ways of collecting data sampling modes. The first sampling mode corresponds to algorithms developed in the literature to autonomously drive the robot towards interesting parts of the environment or useful kinds of skills. Such algorithms are called artificial curiosity or intrinsic motivation algorithms. The second sampling mode corresponds to social guidance or imitation, where the teacher indicates where to explore as well as where not to explore. Starting from the study of the relationships between these two concurrent methods, we ended up building an algorithmic architecture with a hierarchical learning structure, called Socially Guided Intrinsic Motivation (SGIM). We have built an intrinsically motivated active learner which learns how its actions can produce varied consequences or outcomes. It actively learns online by sampling data which it chooses using several sampling modes. On the meta-level, it actively learns which data-collection strategy is most efficient for improving its competence and generalising from its experience to a wide variety of outcomes. The interactive learner thus learns multiple tasks in a structured manner, discovering by itself developmental sequences.
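    The meta-level choice among sampling modes can be illustrated with a small sketch. This is not the SGIM architecture itself, only a generic selector in which the learner favours whichever strategy (e.g. autonomous exploration vs. asking a teacher) has recently produced the most competence progress; all names, the softmax temperature and the window size are assumptions.

```python
import math
import random
from collections import deque

class StrategySelector:
    """Pick a sampling mode (e.g. 'self_exploration' or 'social_guidance')
    in proportion to the recent learning progress each mode has produced."""

    def __init__(self, strategies, window=20, temperature=0.1):
        self.progress = {s: deque(maxlen=window) for s in strategies}
        self.temperature = temperature

    def choose(self):
        # mean recent competence improvement per strategy (0 if untried)
        scores = {s: (sum(h) / len(h) if h else 0.0)
                  for s, h in self.progress.items()}
        # softmax over progress: favour productive strategies but keep exploring
        weights = {s: math.exp(v / self.temperature) for s, v in scores.items()}
        total = sum(weights.values())
        r, acc = random.random() * total, 0.0
        for s, w in weights.items():
            acc += w
            if r <= acc:
                return s
        return s  # fallback for floating-point edge cases

    def update(self, strategy, competence_before, competence_after):
        # learning progress = competence improvement attributed to this trial
        self.progress[strategy].append(competence_after - competence_before)
```

    A trial would then consist of choosing a strategy, attempting a goal, measuring competence before and after, and calling `update` so that more productive strategies are chosen more often.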

    Feature Subset Selection in Intrusion Detection Using Soft Computing Techniques

    Intrusions on computer network systems are a major security issue these days; it is therefore of utmost importance to prevent them. Prevention depends entirely on detection, which is a main component of any security tool such as an Intrusion Detection System (IDS), Intrusion Prevention System (IPS), Adaptive Security Appliance (ASA), checkpoints and firewalls. Accurate detection of network attacks is therefore imperative. A variety of intrusion detection approaches are available, but the main problem is their performance, which can be enhanced by increasing detection rates and reducing false positives. Such weaknesses of the existing techniques have motivated the research presented in this thesis. One weakness of existing intrusion detection approaches is the use of a raw dataset for classification: the classifier may be confused by redundant features and hence fail to classify correctly. To overcome this issue, Principal Component Analysis (PCA) has been employed to transform the raw features into the principal feature space and to select features based on their sensitivity, where sensitivity is determined by the eigenvalues. Recent approaches use PCA to project the feature space onto the principal feature space and select the features corresponding to the highest eigenvalues, but those features may not have optimal sensitivity for the classifier, because many sensitive features are ignored. Instead of the traditional approach of selecting the features with the highest eigenvalues, this research applied a Genetic Algorithm (GA) to search the principal feature space for a subset of features with optimal sensitivity and the highest discriminatory power. Classification is then performed on the selected features. A Support Vector Machine (SVM) and a Multilayer Perceptron (MLP) are used for classification due to their proven classification ability. This work uses the Knowledge Discovery and Data Mining (KDD) Cup dataset, which is considered a benchmark for evaluating security detection mechanisms. The performance of this approach was analyzed and compared with existing approaches. The results show that the proposed method provides an optimal intrusion detection mechanism that outperforms the existing approaches and has the capability to minimize the number of features and maximize the detection rate.
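    The pipeline described (project with PCA, search the principal-component space with a GA for the subset that best serves the classifier, then train an SVM on the selected components) can be sketched as follows. This is only an illustrative skeleton of that idea, not the thesis code; the GA settings, the fitness definition (cross-validated accuracy) and the data loading are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def fitness(mask, Z, y):
    """Cross-validated SVM accuracy using only the principal components in `mask`."""
    if not mask.any():
        return 0.0
    return cross_val_score(SVC(kernel="rbf"), Z[:, mask], y, cv=3).mean()

def ga_select(Z, y, pop_size=20, generations=30, p_mut=0.05):
    """Simple generational GA over binary masks of the PCA-transformed feature space."""
    n = Z.shape[1]
    pop = rng.random((pop_size, n)) < 0.5
    for _ in range(generations):
        scores = np.array([fitness(ind, Z, y) for ind in pop])
        # tournament selection of parents
        parents = pop[[max(rng.choice(pop_size, 2), key=lambda i: scores[i])
                       for _ in range(pop_size)]]
        # one-point crossover
        children = parents.copy()
        for i in range(0, pop_size - 1, 2):
            cut = rng.integers(1, n)
            children[i, cut:], children[i + 1, cut:] = (
                parents[i + 1, cut:].copy(), parents[i, cut:].copy())
        # bit-flip mutation
        children ^= rng.random((pop_size, n)) < p_mut
        pop = children
    scores = np.array([fitness(ind, Z, y) for ind in pop])
    return pop[np.argmax(scores)]

# usage sketch (X, y would come from a preprocessed KDD Cup-style dataset):
# Z = PCA(n_components=20).fit_transform(X)
# best_mask = ga_select(Z, y)
# final_clf = SVC().fit(Z[:, best_mask], y)
```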

    Non intrusive load monitoring & identification for energy management system using computational intelligence approach

    Includes bibliography. Electrical energy is the lifeline of every nation's or continent's development and economic progress. Owing to the recent growth in the demand for electricity and the shortage in production, it is indispensable to develop strategies for effective energy management and system delivery. Load monitoring, such as intrusive load monitoring, non-intrusive load monitoring, and identification of domestic electrical appliances, is proposed especially at the residential level, since it is the major energy consumer. Intrusive load monitoring provides accurate results and would allow each individual appliance's energy consumption to be transmitted to a central hub. Nevertheless, there are many practical disadvantages to this method that have motivated the introduction of non-intrusive load monitoring systems. The financial cost of manufacturing and installing enough monitoring devices to match the number of domestic appliances is one disadvantage. In addition, the installation of one meter per household appliance would lead to congestion in the house and thus cause inconvenience to its occupants; therefore, the non-intrusive load monitoring technique was developed to alleviate the aforementioned challenges of intrusive load monitoring. Non-intrusive load monitoring (NILM) is the process of disaggregating a household's total energy consumption into its contributing appliances. The total household load is monitored via a single monitoring device such as a smart meter (SM). NILM provides a cost-effective and convenient means of load monitoring and identification. Several non-intrusive load monitoring and identification techniques are reviewed. However, the literature lacks a comprehensive system that can identify appliances with small energy consumption, appliances with overlapping energy consumption, and a group of appliance ranges at once; this has been the major setback to most of the adopted techniques. In this dissertation, we propose techniques that overcome these setbacks by combining artificial neural networks (ANN) with a developed algorithm to identify the appliance ranges that contribute to the energy consumption within a given period of time, usually an hourly interval.
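    As a purely illustrative sketch of the ANN-based identification idea (not the dissertation's algorithm), an hourly aggregate-consumption window can be fed to a multi-label neural network whose outputs indicate which appliances or appliance ranges were active. The feature layout, labels and threshold below are assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Assumed data layout:
#   X: (n_intervals, n_samples_per_hour) aggregate power readings per hourly window
#   Y: (n_intervals, n_appliances) binary indicators of which appliances were active
def train_nilm_identifier(X, Y):
    """Train a multi-label MLP that maps an hourly aggregate-load window to the
    set of appliances believed to have contributed to it."""
    model = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
    model.fit(X, Y)  # a 2-D binary Y makes scikit-learn treat this as multi-label
    return model

def identify_appliances(model, window, appliance_names, threshold=0.5):
    """Return the appliances whose predicted activation probability exceeds the threshold."""
    proba = np.asarray(model.predict_proba(window.reshape(1, -1))).ravel()
    return [name for name, p in zip(appliance_names, proba) if p >= threshold]
```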

    Intelligent support system for CVA diagnosis by cerebral computerized tomography

    The Cerebral Vascular Accident (CVA) is one of the major causes of death in the USA and other developed countries, immediately following cardiac diseases and tumors. The increasing number of CVAs and the requirement of short-time diagnosis to minimize morbidity and mortality encourage the development of computer-aided diagnosis systems. Early stages of CVA are often undetected by human-eye observation of Computed Tomography (CT) images, so the incorporation of intelligent techniques in such systems is expected to greatly improve their performance. This thesis presents a Radial Basis Function Neural Network (RBFNN) based diagnosis system for automatic identification of CVA through analysis of CT images. The research hereby reported included the construction of a database of annotated CT images, supported by a web-based tool in which the Neuroradiologist registered his/her normal or abnormal interpretation of each CT image; in case of an abnormal identification, the medical doctor was prompted by the software application to designate the lesion type and to identify the abnormal region on each CT slice image. Once the annotated database was available, the processing of each CT image comprised a pre-processing stage for artefact removal and realignment of tilted images, followed by a feature extraction stage. A large number of features was considered, comprising first- and second-order pixel intensity statistics as well as symmetry/asymmetry information with respect to the ideal mid-sagittal line of each image. The intelligent image processing system then included the design of a neural network classifier. The architecture was determined by a Multi-Objective Genetic Algorithm (MOGA), where the classifier structure, parameters and image features (input features) were chosen on the basis of different (often conflicting) objectives, ensuring maximization of the classification precision and good generalization of its performance to unseen data. Several scenarios for choosing proper MOGA data sets were conducted; the best result was obtained from the scenario where all boundary data points of an enlarged dataset were included in the MOGA training set. Compared with the Neuroradiologist annotations, a specificity of 98.01% and a sensitivity of 98.22% were obtained by the computer-aided system at pixel level. These values were achieved when an ensemble of non-dominated models generated by MOGA in the best scenario was applied to a set of 150 CT slices (1,867,602 pixels). Present results show that the MOGA-designed RBFNN classifier achieved better classification results than Support Vector Machines (SVM), despite the huge difference in complexity of the two classifiers. The proposed approach also compares favorably with other published solutions, both in lesion-level specificity and in the degree of coincidence of marked lesions.
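    A minimal RBF-network classifier of the general kind described above can be sketched as follows: centres chosen by k-means, Gaussian hidden units with a shared width, and a linear output layer solved by regularised least squares. This is an assumed, simplified construction for illustration only; the MOGA-designed architecture, feature selection and training procedure of the thesis are not reproduced here.

```python
import numpy as np
from sklearn.cluster import KMeans

class SimpleRBFN:
    """Gaussian RBF network: k-means centres, shared width, ridge-regression output weights."""

    def __init__(self, n_centers=30, ridge=1e-3):
        self.n_centers = n_centers
        self.ridge = ridge

    def _hidden(self, X):
        # Gaussian activation of each sample with respect to every centre
        d2 = ((X[:, None, :] - self.centers_[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-d2 / (2.0 * self.width_ ** 2))

    def fit(self, X, y):
        km = KMeans(n_clusters=self.n_centers, n_init=10, random_state=0).fit(X)
        self.centers_ = km.cluster_centers_
        # heuristic shared width: mean distance between distinct centres
        dists = np.linalg.norm(self.centers_[:, None] - self.centers_[None, :], axis=-1)
        self.width_ = dists[dists > 0].mean()
        H = self._hidden(X)
        # regularised least squares for the linear output layer
        A = H.T @ H + self.ridge * np.eye(self.n_centers)
        self.weights_ = np.linalg.solve(A, H.T @ y)
        return self

    def decision_function(self, X):
        return self._hidden(X) @ self.weights_

    def predict(self, X):
        # threshold the linear output for binary normal/lesion labels in {0, 1}
        return (self.decision_function(X) >= 0.5).astype(int)
```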