13 research outputs found

    Evolving classifier TEDAClass for big data

    In the era of big data, huge amounts of data are generated and updated every day, and processing and analysing them is an important challenge. To tackle this challenge, it is necessary to develop specific techniques that can process large volumes of data within limited run times. TEDA is a new systematic framework for data analytics, which is based on the typicality and eccentricity of the data. This framework is spatially-aware, non-frequentist and non-parametric. TEDA can be used to develop alternative machine learning methods; in this work, we use it for classification (TEDAClass). Specifically, we present a TEDAClass-based approach that can process huge amounts of data items using a novel parallelization technique. This parallelization makes TEDAClass scalable. The proposed approach thus opens the door to high-performance big data processing, which could be particularly useful for healthcare, banking, scientific and many other purposes.

    Evolving Ensemble Fuzzy Classifier

    The concept of ensemble learning offers a promising avenue for learning from data streams in complex environments, because it addresses the bias and variance dilemma better than its single-model counterpart and features a reconfigurable structure, which is well suited to the given context. While various extensions of ensemble learning for mining non-stationary data streams can be found in the literature, most of them are crafted under a static base classifier and revisit preceding samples in a sliding window for a retraining step. This feature causes computationally prohibitive complexity and is not flexible enough to cope with rapidly changing environments. Their complexity is often demanding because they involve a large collection of offline classifiers, owing to the absence of structural complexity reduction mechanisms and the lack of an online feature selection mechanism. A novel evolving ensemble classifier, namely Parsimonious Ensemble (pENsemble), is proposed in this paper. pENsemble differs from existing architectures in that it is built upon an evolving classifier from data streams, termed Parsimonious Classifier (pClass). pENsemble is equipped with an ensemble pruning mechanism, which estimates a localized generalization error of a base classifier. A dynamic online feature selection scenario is integrated into pENsemble. This method allows for dynamic selection and deselection of input features on the fly. pENsemble adopts a dynamic ensemble structure to output a final classification decision and features a novel drift detection scenario to grow the ensemble structure. The efficacy of pENsemble has been demonstrated through rigorous numerical studies with dynamic and evolving data streams, where it delivers the most encouraging performance in attaining a tradeoff between accuracy and complexity. Comment: this paper has been published in IEEE Transactions on Fuzzy Systems.

    Intelligent video surveillance

    This thesis focuses on new and modified algorithms for object detection, recognition and tracking in the context of video analytics. Manual video surveillance has proven to be both ineffective and expensive, because it relies on human operators who are prone to erroneous decisions. As the number of surveillance cameras grows, there is a strong need to automate video analytics. The benefits of this approach can be found in both military and civilian applications. In military applications, it can help to localise and track objects of interest. In civilian applications, similar object localisation procedures can make criminal investigations more effective by extracting meaningful data from massive amounts of video footage. Recently, the wide availability of consumer unmanned aerial vehicles has become a new threat: even the simplest and cheapest aircraft can carry some cargo, which means they can be turned into a serious weapon. They can also be used for spying, which threatens privacy. Autonomous driving systems are now impossible without machine vision methods. Industrial applications require automatic quality control, including non-destructive methods and, in particular, methods based on video analysis. All these applications provide strong evidence of a practical need for machine vision algorithms for object detection, tracking and classification, and motivated this thesis. The contributions to knowledge of the thesis consist of two main parts, video tracking and object detection and recognition, unified by their common applicability to video analytics problems. The novel algorithms for object detection and tracking described in this thesis are unsupervised and have only a small number of parameters.
    The approach is based on rigid motion segmentation by Bayesian filtering. The Bayesian filter, which was proposed specifically for this method and contributes to its novelty, is formulated as a generic approach and then applied to video analytics problems. The method is augmented with optional object coordinate estimation under a flat two-dimensional terrain assumption, which provides a basis for using the algorithm inside larger sensor data fusion models. The proposed approach for object detection and classification is based on the evolving systems concept and the new Typicality-Eccentricity Data Analytics (TEDA) framework. The methods are capable of solving classical data mining problems: clustering, classification, and regression. They are formulated in a domain-independent way and can address shift and drift in data streams. Examples are given for the clustering and classification of imagery data. For all the developed algorithms, the experiments have shown sustainable results on the testing data. The practical applications of the proposed algorithms are carefully examined and tested.

    Real-Time Recognition of Calling Pattern and Behaviour of Mobile Phone Users through Anomaly Detection and Dynamically-Evolving Clustering

    In the competitive telecommunications market, the information that mobile telecom operators can obtain by regularly analysing their massive stored call logs is of great interest. Although the data that can be extracted from mobile phones nowadays have been enriched with much additional information, the call logs alone can provide vital information about customers. This information usually relates to the customers' calling behaviour and can be used to manage them. However, analysing these data is normally very complex because of the vast data stream involved. Thus, efficient data mining techniques are needed for this purpose. In this paper, a novel approach to analysing call detail records (CDR) is proposed, with the main goal of extracting and clustering different calling patterns or behaviours and detecting outliers. The main novelty of this approach is that it works in real time using an evolving and recursive framework. This work has been supported by the Spanish Ministry of Science and Innovation (MICINN) under projects TRA2015-63708-R and TRA2016-78886-C3-1-R.

    Evolving fuzzy and neuro-fuzzy approaches in clustering, regression, identification, and classification: A Survey

    Major assumptions in computational intelligence and machine learning are that a historical dataset is available for model development and that the resulting model will, to some extent, handle similar instances during its online operation. However, in many real-world applications these assumptions may not hold, as the amount of previously available data may be insufficient to represent the underlying system, and the environment and the system may change over time. As the amount of data increases, it is no longer feasible to process data efficiently using iterative algorithms, which typically require multiple passes over the same portions of data. Evolving modeling from data streams has emerged as a framework to address these issues through self-adaptation, single-pass learning steps, and the evolution and contraction of model components on demand and on the fly. This survey focuses on evolving fuzzy rule-based models and neuro-fuzzy networks for clustering, classification, regression and system identification in online, real-time environments where learning and model development should be performed incrementally. (C) 2019 Published by Elsevier Inc. Igor Škrjanc, Jose Antonio Iglesias and Araceli Sanchis would like to thank the Chair of Excellence of Universidad Carlos III de Madrid and the Bank of Santander Program for their support. Igor Škrjanc is grateful to the Slovenian Research Agency for the research program P2-0219, Modeling, simulation and control. Daniel Leite acknowledges the Minas Gerais Foundation for Research and Development (FAPEMIG), process APQ-03384-18. Igor Škrjanc and Edwin Lughofer acknowledge the support of the "LCM - K2 Center for Symbiotic Mechatronics" within the framework of the Austrian COMET-K2 program. Fernando Gomide is grateful to the Brazilian National Council for Scientific and Technological Development (CNPq) for grant 305906/2014-3.

    Empirical data analytics

    In this paper, we propose an approach to data analysis, which is based entirely on the empirical observations of discrete data samples and the relative proximity of these points in the data space. At the core of the proposed new approach is the typicality—an empirically derived quantity that resembles probability. This nonparametric measure is a normalized form of the square centrality (centrality is a measure of closeness used in graph theory). It is also closely linked to the cumulative proximity and eccentricity (a measure of the tail of the distributions that is very useful for anomaly detection and analysis of extreme values). In this paper, we introduce and study two types of typicality, namely its local and global versions. The local typicality resembles the well-known probability density function (pdf), probability mass function, and fuzzy set membership but differs from all of them. The global typicality, on the other hand, resembles well-known histograms but also differs from them. A distinctive feature of the proposed new approach, empirical data analysis (EDA), is that it is not limited by restrictive impractical prior assumptions about the data generation model as the traditional probability theory and statistical learning approaches are. Moreover, it does not require an explicit and binary assumption of either randomness or determinism of the empirically observed data, their independence, or even their number (it can be as low as a couple of data samples). The typicality is considered as a fundamental quantity in the pattern analysis, which is derived directly from data and is stated in a discrete form in contrast to the traditional approach where a continuous pdf is assumed a priori and estimated from data afterward. The typicality introduced in this paper is free from the paradoxes of the pdf. Typicality is objectivist while the fuzzy sets and the belief-based branch of the probability theory are subjectivist. 
    The local typicality is expressed in a closed analytical form and can be calculated recursively and thus very efficiently. The other nonparametric ensemble properties of the data introduced and studied in this paper, namely the square centrality, cumulative proximity, and eccentricity, can also be updated recursively for various types of distance metrics. Finally, a new type of classifier called naïve typicality-based EDA class is introduced, which is based on the newly introduced global typicality. This is only one of the wide range of possible applications of EDA, including but not limited to anomaly detection, clustering, classification, control, prediction, rare events analysis, etc., which will be the subject of further research.
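
The abstract above states that cumulative proximity and eccentricity can be updated recursively. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the squared Euclidean distance and assumes the normalisation in which the eccentricity of a sample is twice its cumulative proximity divided by the sum of all cumulative proximities; neither the formulas nor the names `batch_eccentricity` and `RecursiveEccentricity` appear in the abstract. Under these assumptions the pairwise sums collapse to a running mean and a running mean of squared norms, so each new sample costs O(dimension) instead of O(number of samples).

```python
def batch_eccentricity(points):
    """Eccentricity of every point from all pairwise squared distances, O(K^2).

    Assumed definition: q_i = sum_j ||x_i - x_j||^2 (cumulative proximity),
    eccentricity_i = 2 * q_i / sum_j q_j.
    """
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    q = [sum(sq_dist(p, other) for other in points) for p in points]
    total = sum(q)
    return [2.0 * qi / total for qi in q]


class RecursiveEccentricity:
    """Single-pass equivalent for the squared Euclidean distance.

    Keeps only the sample count, the running mean, and the running mean
    of squared norms; no past samples are stored.
    """

    def __init__(self, dim):
        self.k = 0
        self.mean = [0.0] * dim
        self.mean_sq_norm = 0.0

    def update(self, x):
        """Absorb one new sample x (a sequence of floats)."""
        self.k += 1
        self.mean = [m + (xi - m) / self.k for m, xi in zip(self.mean, x)]
        sq_norm = sum(xi * xi for xi in x)
        self.mean_sq_norm += (sq_norm - self.mean_sq_norm) / self.k

    def eccentricity(self, x):
        """Eccentricity of x with respect to all samples absorbed so far."""
        variance = self.mean_sq_norm - sum(m * m for m in self.mean)
        if self.k < 2 or variance <= 0.0:
            raise ValueError("need at least two distinct samples")
        sq_dev = sum((xi - m) ** 2 for xi, m in zip(x, self.mean))
        return 1.0 / self.k + sq_dev / (self.k * variance)


if __name__ == "__main__":
    # Hypothetical toy data: four clustered points and one far away.
    data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (10.0, 10.0)]
    estimator = RecursiveEccentricity(dim=2)
    for sample in data:
        estimator.update(sample)
    for sample, ecc in zip(data, batch_eccentricity(data)):
        print(sample, ecc, estimator.eccentricity(sample))
```

With these definitions the eccentricities of all samples sum to 2, and the sample farthest from the rest receives the largest value, which is what makes the quantity usable for flagging anomalies.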

    Autonomous Data Density pruning fuzzy neural network for Optical Interconnection Network

    Traditionally, fuzzy neural networks use parametric clustering methods based on equally spaced membership functions to fuzzify the inputs of the model. This produces an excessive number of calculations for defining the parameters of the network architecture, which may be a problem, especially for real-time large-scale tasks. Therefore, this paper proposes a new model that uses a non-parametric technique for the fuzzification process. The proposed model uses an autonomous data density approach in a pruned fuzzy neural network, which favours the compactness of the model. The performance of the proposed approach is evaluated using databases related to the Optical Interconnection Network. Finally, binary pattern classification tests for the identification of temporal distribution (asynchronous or client–server) were performed and compared with state-of-the-art fuzzy neural-based and traditional machine learning approaches. Results demonstrated that the proposed model is an efficient tool for these challenging classification tasks.

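    Uma abordagem baseada em regras fuzzy auto-organizáveis para classificação de ambientes internos em aplicações de IoT [A self-organizing fuzzy rule-based approach for indoor environment classification in IoT applications]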

    Nowadays, a large share of the sensors adopted in IoT use wireless technology to facilitate the construction of sensor networks. Classifying the type of environment in which these sensors are located plays an important role in the performance of these sensor networks, since it can be used to determine more efficient levels of power consumption for the deployed IoT sensors. Thus, this dissertation presents an enhancement of the Self-Organizing Fuzzy Classifier model, which classifies indoor environments from real-time measurements of the radio-frequency signal of a real wireless sensor network. The original classifier model was compared with the model proposed in this dissertation, as well as with other machine learning methods common in the literature. The evaluated metrics were mean accuracy, F-Score, Kappa coefficient, and MSE. The experimental results show that the proposed approach obtained high performance in solving the presented problem. CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superio