22 research outputs found
Evolving clustering, classification and regression with TEDA
This article describes the novel clustering and regression methods TEDACluster and TEDAPredict, in addition to the recently proposed evolving classifier TEDAClass. The algorithms for classification, clustering and regression are based on the recently proposed AnYa type fuzzy rule-based system. The novel methods use the recently proposed TEDA framework, which is capable of recursive processing of large amounts of data. The framework supports a computationally cheap, exact per-sample update and can be used for training "from scratch". All three algorithms are evolving, that is, they are capable of changing their own structure during the update stage, which allows them to follow changes within the model pattern.
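The idea of a cheap, exact per-sample update can be illustrated with a recursive mean and variance. The following is a minimal sketch, not the authors' code: the class name `RunningStats` and its methods are hypothetical, and it uses the standard Welford recursion rather than any TEDA-specific formula. It only shows that a per-sample update can reproduce the batch statistics without storing past samples.

```python
import math


class RunningStats:
    """Hypothetical helper: exact per-sample update of mean and variance."""

    def __init__(self):
        self.k = 0        # number of samples seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        # Welford's recursion: no past samples are stored or revisited.
        self.k += 1
        delta = x - self.mean
        self.mean += delta / self.k
        self.m2 += delta * (x - self.mean)

    def variance(self):
        # Population variance (divides by k); 0.0 before any sample arrives.
        return self.m2 / self.k if self.k > 0 else 0.0


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
stats = RunningStats()
for x in data:
    stats.update(x)

# Batch statistics computed from the full data set, for comparison.
batch_mean = sum(data) / len(data)
batch_var = sum((x - batch_mean) ** 2 for x in data) / len(data)

assert math.isclose(stats.mean, batch_mean)
assert math.isclose(stats.variance(), batch_var)
```

Each call to `update` costs constant time and memory, which is what makes this kind of recursion suitable for large data streams.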
A nested hierarchy of dynamically evolving clouds for big data structuring and searching
The need to analyse big data streams and prescribe actions proactively is pervasive in nearly every industry. As the volume of unstructured data grows, using analytical systems to assimilate and interpret images and videos, as well as structured data, is essential. In this paper, we propose a novel approach to transform an image dataset into higher-level constructs that can be analysed quickly, reliably and with greater computational efficiency. The proposed approach provides a result of high visual quality between the query image and data clouds with a hierarchical, dynamically nested evolving structure. The results illustrate that the introduced approach can be an effective yet computationally efficient way to analyse and manipulate stored images, which have become the centre of attention of many professional fields and institutional sectors over the last few years.
Evolving Ensemble Fuzzy Classifier
The concept of ensemble learning offers a promising avenue in learning from
data streams under complex environments because it addresses the bias and
variance dilemma better than its single model counterpart and features a
reconfigurable structure, which is well suited to the given context. While
various extensions of ensemble learning for mining non-stationary data streams
can be found in the literature, most of them are built on a static base
classifier and revisit preceding samples in a sliding window for a
retraining step. This causes computationally prohibitive complexity and
is not flexible enough to cope with rapidly changing environments. Their
complexity is often high because they involve a large collection of
offline classifiers, owing to the absence of structural complexity reduction
mechanisms and the lack of an online feature selection mechanism. A novel evolving
ensemble classifier, namely Parsimonious Ensemble (pENsemble), is proposed in
this paper. pENsemble differs from existing architectures in that it
is built upon an evolving classifier for data streams, termed Parsimonious
Classifier (pClass). pENsemble is equipped with an ensemble pruning mechanism,
which estimates a localized generalization error of a base classifier. A
dynamic online feature selection scenario is integrated into the pENsemble.
This method allows for dynamic selection and deselection of input features on
the fly. pENsemble adopts a dynamic ensemble structure to output a final
classification decision where it features a novel drift detection scenario to
grow the ensemble structure. The efficacy of pENsemble has been
demonstrated through rigorous numerical studies with dynamic and evolving data
streams, where it delivers the most encouraging performance in attaining a
trade-off between accuracy and complexity.
Comment: this paper has been published in IEEE Transactions on Fuzzy Systems.
Intelligent video surveillance
This thesis focuses on new and modified algorithms for object detection, recognition and tracking in the context of video analytics. Manual video surveillance has proven to be ineffective and, at the same time, expensive because it relies on the manual labour of operators, who are additionally prone to erroneous decisions. As the number of surveillance cameras increases, there is a strong need to automate video analytics. The benefits of this approach can be found in both military and civilian applications. For military applications, it can help in the localisation and tracking of objects of interest. For civilian applications, similar object localisation procedures can make criminal investigations more effective by extracting meaningful data from massive video footage. Recently, the wide accessibility of consumer unmanned aerial vehicles has become a new threat, as even the simplest and cheapest airborne vessels can carry some cargo, which means they can be upgraded to a serious weapon. Additionally, they can be used for spying, which poses a threat to private life. Autonomous car driving systems are now impossible without machine vision methods. Industrial applications require automatic quality control, including non-destructive methods and particularly methods based on video analysis. All these applications give strong evidence of a practical need for machine vision algorithms for object detection, tracking and classification, and motivated this thesis. The contributions to knowledge of the thesis consist of two main parts, video tracking and object detection and recognition, unified by their common applicability to video analytics problems. The novel algorithms for object detection and tracking described in this thesis are unsupervised and have only a small number of parameters.
The approach is based on rigid motion segmentation by Bayesian filtering. The Bayesian filter, which was proposed specifically for this method and contributes to its novelty, is formulated as a generic approach and then applied to video analytics problems. The method is augmented with optional object coordinate estimation using a plain two-dimensional terrain assumption, which provides a basis for using the algorithm inside larger sensor data fusion models. The proposed approach for object detection and classification is based on the evolving systems concept and the new Typicality-Eccentricity Data Analytics (TEDA) framework. The methods are capable of solving classical problems of data mining: clustering, classification, and regression. The methods are proposed in a domain-independent way and are capable of addressing shift and drift of the data streams. Examples are given for the clustering and classification of imagery data. For all the developed algorithms, the experiments have shown sustainable results on the testing data. The practical applications of the proposed algorithms are carefully examined and tested.
Online fault detection based on typicality and eccentricity data analytics
Fault detection is a task of major importance in industry nowadays, since it can considerably reduce the risk of accidents involving human lives, in addition to production and, consequently, financial losses. Therefore, fault detection systems have been widely studied in the past few years, resulting in many different methods and approaches to solve this problem. This paper presents a detailed study of fault detection in industrial processes based on the recently introduced typicality and eccentricity data analytics (TEDA) approach. TEDA is a recursive and non-parametric method, first proposed for the general problem of anomaly detection in data streams. It is based on measures of data density and of the proximity of each data point read to the analyzed data set. TEDA is an online autonomous learning algorithm that does not require a priori knowledge about the process, is completely free of user- and problem-defined parameters, requires very low computational effort and, thus, is very suitable for real-time applications. The results presented were generated by applying TEDA to a pilot plant for industrial process.
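A recursive, eccentricity-based detector for a scalar stream might be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the recursion and the test `zeta > (m**2 + 1) / (2 * k)` follow the commonly cited TEDA formulation rather than anything stated in this abstract, and the class name `TedaDetector` and the sensitivity `m = 3` are hypothetical choices made for the example.

```python
class TedaDetector:
    """Hypothetical scalar-stream detector based on normalized eccentricity."""

    def __init__(self, m=3.0):
        self.m = m        # assumed sensitivity, analogous to an "m-sigma" rule
        self.k = 0        # number of samples seen so far
        self.mean = 0.0   # recursively updated mean
        self.var = 0.0    # recursively updated variance

    def update(self, x):
        """Process one sample; return True if it is flagged as anomalous."""
        self.k += 1
        k = self.k
        if k == 1:
            self.mean = x
            return False  # a single sample cannot be judged
        # Recursive updates: only the previous mean and variance are needed.
        self.mean = (k - 1) / k * self.mean + x / k
        dist2 = (x - self.mean) ** 2
        self.var = (k - 1) / k * self.var + dist2 / (k - 1)
        if self.var == 0.0:
            return False  # all samples identical so far
        eccentricity = 1.0 / k + dist2 / (k * self.var)
        zeta = eccentricity / 2.0  # normalized eccentricity
        return zeta > (self.m ** 2 + 1) / (2 * k)


# Synthetic readings (made up for illustration): steady values, then a spike.
stream = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 50.0]
detector = TedaDetector()
flags = [detector.update(x) for x in stream]
```

In this sketch only the final reading is flagged. Because the variance update already includes the current sample, a sample can only be flagged once more than `m**2 + 1` samples have been seen, so the first ten readings can never raise an alarm when `m = 3`.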
An evolving approach to data streams clustering based on typicality and eccentricity data analytics
In this paper we propose an algorithm for online clustering of data streams. This algorithm is called AutoCloud and is based on the recently introduced concept of Typicality and Eccentricity Data Analytics, mainly used for anomaly detection tasks. AutoCloud is an evolving, online and recursive technique that does not need training or prior knowledge about the data set. Thus, AutoCloud is fully online, requiring no offline processing. It allows creation and merging of clusters autonomously as new data observations become available. The clusters created by AutoCloud are called data clouds, which are structures without pre-defined shape or boundaries. AutoCloud allows each data sample to belong to multiple data clouds simultaneously using fuzzy concepts. AutoCloud is also able to handle concept drift and concept evolution, which are problems inherent in data streams in general. Since the algorithm is recursive and online, it is suitable for applications that require a real-time response. We validate our proposal with applications to multiple well-known data sets from the literature.
Multi-Objective Evolutionary Optimisation for Prototype-Based Fuzzy Classifiers
Evolving intelligent systems (EISs), particularly zero-order ones, have demonstrated strong performance on many real-world problems concerning data stream classification, while offering high model transparency and interpretability thanks to their prototype-based nature. Zero-order EISs typically learn prototypes by clustering streaming data online in a “one pass” manner for greater computational efficiency. However, prototypes identified in this way often lack optimality, resulting in less precise classification boundaries and thereby limiting the potential classification performance of the systems. To address this issue, a commonly adopted strategy is to minimise the training error of the models on historical training data or, alternatively, to iteratively minimise the intra-cluster variance of the clusters obtained via online data partitioning. This recognises the fact that the ultimate classification performance of zero-order EISs is driven by the positions of prototypes in the data space. Yet simply minimising the training error may lead to overfitting, whilst minimising the intra-cluster variance does not necessarily ensure that the optimised prototype-based models attain improved classification outcomes. To achieve better classification performance whilst avoiding overfitting for zero-order EISs, this paper presents a novel multi-objective optimisation approach, enabling EISs to obtain optimal prototypes by applying these two disparate but complementary strategies simultaneously. Five decision-making schemes are introduced for selecting a suitable solution to deploy from the final non-dominated set of the resulting optimised models. Systematic experimental studies are carried out to demonstrate the effectiveness of the proposed optimisation approach in improving the classification performance of zero-order EISs.
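The notion of a non-dominated set over two minimised objectives can be sketched as below. This is a generic Pareto filter, not the optimisation approach of the paper: the function names, the candidate values (training error, intra-cluster variance) and the final selection rule are all hypothetical, and the selection rule is not one of the five decision-making schemes mentioned above.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better
    in at least one (all objectives are minimised)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(points):
    """Return the points that no other point dominates (the Pareto front)."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical candidate models as (training error, intra-cluster variance).
candidates = [
    (0.10, 0.90),
    (0.20, 0.50),
    (0.30, 0.55),  # dominated by (0.20, 0.50)
    (0.40, 0.20),
    (0.15, 0.95),  # dominated by (0.10, 0.90)
]

front = non_dominated(candidates)

# Illustrative selection only: pick the front member with the smallest
# sum of objectives. This rule is an assumption made for the example.
chosen = min(front, key=sum)
```

Every member of `front` represents a different trade-off between the two objectives, which is why a separate decision step is needed to choose the one model that gets deployed.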