485 research outputs found
Ensembles of wrappers for automated feature selection in fish age classification
In feature selection, the most informative features must be chosen so as to reduce their number while retaining their discriminatory information. Within this context, a novel feature selection method based on an ensemble of wrappers is proposed and applied to automatically select features in fish age classification. The effectiveness of this procedure has been tested on an Atlantic cod database with several powerful statistical learning classifiers. The subsets built from the few selected features, e.g. otolith weight and fish weight, are particularly noteworthy given current biological findings and practices in fishery research, and the classification results obtained with them outperform those of previous studies in which feature selection was performed manually.
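The vote-aggregation step that such an ensemble of wrappers implies can be sketched as follows; the feature names and the 60% vote threshold are illustrative and not taken from the paper:

```python
from collections import Counter

def aggregate_wrapper_subsets(subsets, n_wrappers, threshold=0.6):
    """Keep features selected by at least `threshold` of the wrapper runs."""
    votes = Counter(f for subset in subsets for f in subset)
    return sorted(f for f, v in votes.items() if v / n_wrappers >= threshold)

# Hypothetical feature subsets returned by four wrapper runs
# on a fish-age dataset.
runs = [
    {"otolith_weight", "fish_weight", "fish_length"},
    {"otolith_weight", "fish_weight"},
    {"otolith_weight", "otolith_area"},
    {"otolith_weight", "fish_weight", "otolith_area"},
]
selected = aggregate_wrapper_subsets(runs, n_wrappers=len(runs))
print(selected)  # ['fish_weight', 'otolith_weight']
```

Each wrapper run searches feature subsets with a classifier in the loop (not shown); the ensemble then retains only features that recur across runs, which stabilises the selection.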
Reducing Spatial Data Complexity for Classification Models
Intelligent data analytics is gradually becoming a day-to-day reality of today's businesses. However, despite rapidly increasing storage and computational power, current state-of-the-art predictive models still cannot handle massive and noisy corporate data warehouses. What is more, adaptive and real-time operational environments require multiple models to be frequently retrained, which further hinders their use. Various data reduction techniques, ranging from data sampling up to density retention models, attempt to address this challenge by capturing a summarised data structure, yet they either do not account for labelled data or degrade the classification performance of the model trained on the condensed dataset. Our response is a proposition of a new general framework for reducing the complexity of labelled data by means of controlled spatial redistribution of class densities in the input space. Using the Parzen Labelled Data Compressor (PLDC) as an example, we demonstrate a simulated data condensation process directly inspired by electrostatic field interaction, in which the data are moved and merged following attracting and repelling interactions with the other labelled data. The process is controlled by the class density function built on the original data, which acts as a class-sensitive potential field ensuring preservation of the original class density distributions, yet allowing data to rearrange and merge, joining together their soft class partitions. As a result we achieved a model that reduces labelled datasets much further than any competitive approach, yet with maximum retention of the original class densities and hence the classification performance. PLDC leaves the reduced dataset with soft accumulative class weights, allowing for efficient online updates, and, as shown in a series of experiments, when coupled with the Parzen Density Classifier (PDC) it significantly outperforms competitive data condensation methods in terms of classification performance at comparable compression levels.
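A drastically simplified stand-in for PLDC's move-and-merge idea, greedily merging nearby same-class points while accumulating soft weights, might look like this (the fixed radius, the greedy processing order, and the absence of the potential-field simulation are all simplifications of the paper's method):

```python
def condense(points, radius):
    """Greedily merge same-class 2-D points closer than `radius`,
    accumulating per-point weights. Each entry in the result is
    [x, y, label, weight], where the position is the weighted
    centroid of all points merged into it."""
    merged = []
    for x, y, label in points:
        for m in merged:
            if m[2] == label and (m[0] - x) ** 2 + (m[1] - y) ** 2 <= radius ** 2:
                w = m[3]
                m[0] = (m[0] * w + x) / (w + 1)  # weighted centroid update
                m[1] = (m[1] * w + y) / (w + 1)
                m[3] = w + 1                      # soft accumulative weight
                break
        else:
            merged.append([x, y, label, 1])
    return merged

reduced = condense([(0.0, 0.0, "a"), (0.1, 0.0, "a"), (5.0, 5.0, "b")], radius=1.0)
print(len(reduced))  # 2
```

The accumulated weights are what make efficient online updates possible: a new point either increments an existing prototype's weight or starts a new one, without revisiting the original data.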
AutoQML: Automatic Generation and Training of Robust Quantum-Inspired Classifiers by Using Genetic Algorithms on Grayscale Images
We propose a new hybrid system for automatically generating and training
quantum-inspired classifiers on grayscale images by using multiobjective
genetic algorithms. We define a dynamic fitness function to obtain the smallest
possible circuit and highest accuracy on unseen data, ensuring that the
proposed technique is generalizable and robust. We minimize the complexity of
the generated circuits in terms of the number of entanglement gates by
penalizing their appearance. We reduce the size of the images with two
dimensionality reduction approaches: principal component analysis (PCA), which
is encoded in the individual for optimization purposes, and a small
convolutional autoencoder (CAE). These two methods are compared with one
another and with a classical nonlinear approach to understand their behaviors
and to ensure that the classification ability is due to the quantum circuit and
not the preprocessing technique used for dimensionality reduction.
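The PCA front-end described above can be sketched with a plain SVD; the image size and component count below are illustrative, and the CAE alternative is omitted:

```python
import numpy as np

def pca_reduce(images, n_components):
    """Project flattened grayscale images onto their top principal
    components (one of the two dimensionality-reduction front-ends
    the paper compares)."""
    X = images.reshape(len(images), -1).astype(float)
    X = X - X.mean(axis=0)                       # centre the data
    _, _, Vt = np.linalg.svd(X, full_matrices=False)  # rows of Vt are the axes
    return X @ Vt[:n_components].T

rng = np.random.default_rng(0)
imgs = rng.random((10, 8, 8))   # ten hypothetical 8x8 grayscale images
Z = pca_reduce(imgs, n_components=4)
print(Z.shape)  # (10, 4)
```

The reduced vector `Z[i]` is what would be fed into the quantum-inspired circuit; in the paper the number of retained components is itself part of the evolved individual.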
CCNETS: A Novel Brain-Inspired Approach for Enhanced Pattern Recognition in Imbalanced Datasets
This study introduces CCNETS (Causal Learning with Causal Cooperative Nets),
a novel generative model-based classifier designed to tackle the challenge of
generating data for imbalanced datasets in pattern recognition. CCNETS is
uniquely crafted to emulate brain-like information processing and comprises
three main components: Explainer, Producer, and Reasoner. Each component is
designed to mimic specific brain functions, which aids in generating
high-quality datasets and enhancing classification performance.
The model is particularly focused on addressing the common and significant
challenge of handling imbalanced datasets in machine learning. CCNETS's
effectiveness is demonstrated through its application to a "fraud dataset,"
where normal transactions significantly outnumber fraudulent ones (99.83% vs.
0.17%). Traditional methods often struggle with such imbalances, leading to
skewed performance metrics. However, CCNETS exhibits superior classification
ability, as evidenced by its performance metrics. Specifically, it achieved an
F1-score of 0.7992, outperforming traditional models like Autoencoders and
Multi-layer Perceptrons (MLP) in the same context. This performance indicates
CCNETS's proficiency in more accurately distinguishing between normal and
fraudulent patterns.
The innovative structure of CCNETS enhances the coherence between generative and classification models, helping to overcome the limitations of pattern recognition approaches that rely solely on generative models. This study emphasizes CCNETS's potential in diverse applications, especially where quality data generation and pattern recognition are key. It proves effective in machine learning, particularly for imbalanced datasets, overcoming current challenges in such datasets and advancing machine learning with brain-inspired approaches.
Dynamic Time Warping Averaging of Time Series Allows Faster and More Accurate Classification
Recent years have seen significant progress in improving both the efficiency and effectiveness of time series classification. However, because the best solution is typically the Nearest Neighbor algorithm with the relatively expensive Dynamic Time Warping as the distance measure, successful deployments on resource-constrained devices remain elusive. Moreover, the recent explosion of interest in wearable devices, which typically have limited computational resources, has created a growing need for very efficient classification algorithms. A commonly used technique to glean the benefits of the Nearest Neighbor algorithm, without inheriting its undesirable time complexity, is to use the Nearest Centroid algorithm. However, because of the unique properties of (most) time series data, the centroid typically does not resemble any of the instances, an unintuitive and underappreciated fact. In this work we show that we can exploit a recent result to allow meaningful averaging of 'warped' time series, and that this result allows us to create ultra-efficient Nearest 'Centroid' classifiers that are at least as accurate as their more lethargic Nearest Neighbor cousins.
Faster and more accurate classification of time series by exploiting a novel dynamic time warping averaging algorithm
A concerted research effort over the past two decades has heralded significant improvements in both the efficiency and effectiveness of time series classification. The consensus that has emerged in the community is that the best solution is a surprisingly simple one. In virtually all domains, the most accurate classifier is the nearest neighbor algorithm with dynamic time warping as the distance measure. The time complexity of dynamic time warping means that successful deployments on resource-constrained devices remain elusive. Moreover, the recent explosion of interest in wearable computing devices, which typically have limited computational resources, has greatly increased the need for very efficient classification algorithms. A classic technique to obtain the benefits of the nearest neighbor algorithm, without inheriting its undesirable time and space complexity, is to use the nearest centroid algorithm. Unfortunately, the unique properties of (most) time series data mean that the centroid typically does not resemble any of the instances, an unintuitive and underappreciated fact. In this paper we demonstrate that we can exploit a recent result by Petitjean et al. to allow meaningful averaging of “warped” time series, which then allows us to create super-efficient nearest “centroid” classifiers that are at least as accurate as their more computationally challenged nearest neighbor relatives. We demonstrate empirically the utility of our approach by comparing it to all the appropriate strawmen algorithms on the ubiquitous UCR Benchmarks and with a case study in supporting insect classification on resource-constrained sensors.
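The building blocks of such a nearest-"centroid" classifier can be sketched as follows; the DTW recursion is the textbook one, and the class centroids are assumed to be precomputed (e.g. by the Petitjean et al. averaging method, which is not reproduced here):

```python
def dtw(a, b):
    """Classic O(len(a)*len(b)) dynamic time warping distance with a
    squared-difference local cost and no warping window."""
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def nearest_centroid(query, centroids):
    """Label of the DTW-nearest class centroid."""
    return min(centroids, key=lambda label: dtw(query, centroids[label]))

centroids = {"up": [1, 2, 3], "down": [3, 2, 1]}  # illustrative averaged series
print(nearest_centroid([1, 2, 2.5], centroids))   # up
```

The efficiency gain is that classification costs one DTW computation per class rather than one per training instance, which is what makes the approach viable on constrained sensors.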
Development and Investigation of Cost-Sensitive Pruned Decision Tree Model for Improved Schizophrenia Diagnosis
Schizophrenia, often characterized by delusions, hallucinations, and other cognitive difficulties, affects approximately seventy million adults globally. This study presents a cost-sensitive pruned Decision Tree J48 model for fast and accurate diagnosis of schizophrenia. The model implements supervised learning procedures with a 10-fold cross-validation resampling method and utilizes an unsupervised filter to replace missing values in the data with the modal values of the corresponding features. Features are selected using Pearson’s correlation on one-hot-encoded data to detect redundancy in the data. A cost matrix is designed to minimize the tendency of the J48 algorithm to predict false negative outcomes, which in turn reduces the risk of the model diagnosing a schizophrenia candidate as free from the disease. The model is found to diagnose schizophrenia with 78% accuracy, 89.7% sensitivity, 57.4% specificity and an area under the Receiver Operating Characteristic (ROC) curve of 0.895. The ROC curve also distinguishes schizophrenia from other conditions with similar symptoms. These results show the potential of machine-learning models for quick, effective diagnosis of schizophrenia.
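The effect of a cost matrix that penalizes false negatives can be illustrated with the expected-cost decision rule below; the 5:1 cost ratio is a hypothetical choice, not the matrix used in the study:

```python
def min_cost_label(p_positive, cost_fn=5.0, cost_fp=1.0):
    """Pick the label with the lower expected misclassification cost.
    Predicting negative risks a false negative (cost_fn, weighted by
    p_positive); predicting positive risks a false positive.
    The 5:1 ratio is illustrative only."""
    expected_cost_neg = p_positive * cost_fn        # cost if truly positive
    expected_cost_pos = (1 - p_positive) * cost_fp  # cost if truly negative
    return 1 if expected_cost_pos < expected_cost_neg else 0

print(min_cost_label(0.30))  # 1: even a 30% risk warrants a positive call
print(min_cost_label(0.10))  # 0
```

With these costs the decision threshold drops from 0.5 to cost_fp / (cost_fn + cost_fp) ≈ 0.17, which is exactly how a cost matrix trades specificity for the higher sensitivity the abstract reports.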
Detecting and Diagnosing Incipient Building Faults Using Uncertainty Information from Deep Neural Networks
Early detection of incipient faults is of vital importance to reducing
maintenance costs, saving energy, and enhancing occupant comfort in buildings.
Popular supervised learning models such as deep neural networks are considered
promising due to their ability to directly learn from labeled fault data;
however, it is known that the performance of supervised learning approaches
highly relies on the availability and quality of labeled training data. In
Fault Detection and Diagnosis (FDD) applications, the lack of labeled incipient
fault data has posed a major challenge to applying these supervised learning
techniques to commercial buildings. To overcome this challenge, this paper
proposes using Monte Carlo dropout (MC-dropout) to enhance the supervised
learning pipeline, so that the resulting neural network is able to detect and
diagnose unseen incipient fault examples. We also examine the proposed
MC-dropout method on the RP-1043 dataset to demonstrate its effectiveness in
indicating the most likely incipient fault types.
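The core of MC-dropout, keeping dropout active at test time and reading the predictive spread as uncertainty, can be sketched with a toy single-layer model (the weights, dropout rate, and number of passes are illustrative, not the paper's network):

```python
import numpy as np

def mc_dropout_predict(x, W, b, p_drop=0.5, T=200, rng=None):
    """Run T stochastic forward passes with dropout kept ON at inference
    and return the predictive mean and standard deviation; the standard
    deviation serves as a per-input uncertainty estimate."""
    rng = rng or np.random.default_rng(0)
    outs = []
    for _ in range(T):
        mask = rng.random(x.shape) >= p_drop
        h = (x * mask) / (1 - p_drop)   # inverted-dropout scaling
        outs.append(h @ W + b)
    outs = np.stack(outs)
    return outs.mean(axis=0), outs.std(axis=0)

mean, std = mc_dropout_predict(np.ones(4), np.ones((4, 1)), np.zeros(1))
```

For fault detection, a high `std` flags inputs the network is unsure about, which is how unseen incipient faults can be surfaced even though they never appeared in the labelled training data.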
A Predictive maintenance model for heterogeneous industrial refrigeration systems
The automatic assessment of the degradation state of industrial refrigeration systems is becoming increasingly important and plays a key role within predictive maintenance approaches. Lately, data-driven methods in particular have become the focus of research in this respect. As they only rely on historical data in the development phase, they offer great advantages in terms of flexibility and generalisability by circumventing the need for specific domain knowledge. While most scientific contributions employ methods emerging from the field of machine learning (ML), only very few consider their applicability across different heterogeneous systems. In fact, the majority of existing contributions in this field solely apply supervised ML models, which assume the availability of labelled fault data for each system respectively. However, this places restrictions on the overall applicability, as data labelling is mostly conducted by humans and therefore constitutes a non-negligible cost and time factor. Moreover, such methods assume that all considered fault types occurred in the past, a condition that may not always be satisfied.
Therefore, this dissertation proposes a predictive maintenance model for industrial refrigeration systems, especially addressing its transferability onto different but related heterogeneous systems. In particular, it aims at solving a sub-problem known as condition-based maintenance (CBM) to automatically assess the system’s state of degradation. To this end, the model not only estimates how far a possible malfunction has progressed, but also determines the fault type present. As will be described in greater detail throughout this dissertation, the proposed model also utilises techniques from the field of ML but bypasses the strict assumptions accompanying supervised ML. Accordingly, it assumes the data of the target system to be primarily unlabelled, while a few labelled samples are expected to be retrievable from the fault-free operational state, which can be obtained at low cost. Yet, to enable the model’s intended functionality, it additionally employs data from only one fully labelled source dataset and thus allows the benefits of data-driven approaches towards predictive maintenance to be further exploited.
Following the introduction, the dissertation presents the related concepts as well as the terms and definitions, and delimits this work from other fields of research. The scope of application is then outlined and the latest scientific work is presented. This is followed by the explanation of the open research gap, from which the research questions are derived. The third chapter deals with the main principles of the model, including the mathematical notation and the individual concepts. It furthermore delivers an overview of the variety of problems arising in this context and presents the associated solutions from a theoretical point of view. Subsequently, the data acquisition phase is described, addressing both the data collection procedure and the outcome of the test cases. In addition, the considered fault characteristics are presented and compared with those obtained from the related publicly available dataset. In essence, both datasets form the basis for the model validation, as discussed in the following chapter. This chapter further comprises the results obtained from the model, which are compared with those retrieved from several baseline models derived from the literature. This work then closes with a summary and the conclusions drawn from the model results. Lastly, an outlook on the presented dissertation is provided.