Enhancing Decision Tree based Interpretation of Deep Neural Networks through L1-Orthogonal Regularization
One obstacle that has so far prevented the adoption of machine learning models,
primarily in critical areas, is the lack of explainability. In this work, a
practicable approach to gaining explainability of deep artificial neural
networks (NN) using an interpretable surrogate model based on decision trees is
presented. Simply fitting a decision tree to a trained NN usually leads to
unsatisfactory results in terms of accuracy and fidelity. Using L1-orthogonal
regularization during training, however, preserves the accuracy of the NN,
while it can be closely approximated by small decision trees. Tests with
different data sets confirm that L1-orthogonal regularization yields models of
lower complexity and at the same time higher fidelity compared to other
regularizers.
Comment: 8 pages, 18th IEEE International Conference on Machine Learning and
Applications (ICMLA) 201
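The abstract above evaluates surrogate trees by accuracy and fidelity. As a minimal sketch, assuming fidelity means the fraction of inputs on which the surrogate tree reproduces the network's prediction (the abstract does not spell out its exact definition), the two quantities can be computed as follows. The label lists and the helper name `agreement` are hypothetical and serve only as illustration.

```python
def agreement(a, b):
    """Fraction of positions where two label sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical labels for six test inputs (illustrative, not from the paper).
y_true = [0, 1, 1, 0, 1, 0]     # ground-truth labels
nn_pred = [0, 1, 1, 0, 0, 0]    # predictions of the trained network
tree_pred = [0, 1, 1, 1, 0, 0]  # predictions of the surrogate decision tree

nn_accuracy = agreement(nn_pred, y_true)      # network vs. ground truth
tree_accuracy = agreement(tree_pred, y_true)  # surrogate vs. ground truth
fidelity = agreement(tree_pred, nn_pred)      # surrogate vs. network

print(f"NN accuracy:   {nn_accuracy:.3f}")
print(f"tree accuracy: {tree_accuracy:.3f}")
print(f"fidelity:      {fidelity:.3f}")
```

Note that fidelity is measured against the network's outputs rather than the ground truth, so a surrogate can have high fidelity even where the network itself is wrong.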
Big Data and the Internet of Things
Advances in sensing and computing capabilities are making it possible to
embed increasing computing power in small devices. This has enabled the sensing
devices not just to passively capture data at very high resolution but also to
take sophisticated actions in response. Combined with advances in
communication, this is resulting in an ecosystem of highly interconnected
devices referred to as the Internet of Things - IoT. In conjunction, the
advances in machine learning have made it possible to build models on these
ever-increasing amounts of data. Consequently, devices ranging from heavy assets
such as aircraft engines to wearables such as health monitors can now not
only generate massive amounts of data but also draw on aggregate analytics
to "improve" their performance over time. Big data analytics has been
identified as a key enabler for the IoT. In this chapter, we discuss various
avenues of the IoT where big data analytics either is already making a
significant impact or is on the cusp of doing so. We also discuss social
implications and areas of concern.
Comment: 33 pages. Draft of upcoming book chapter in Japkowicz and Stefanowski
(eds.) Big Data Analysis: New algorithms for a new society, Springer Series
on Studies in Big Data, to appea
Scalable and Interpretable One-class SVMs with Deep Learning and Random Fourier features
One-class support vector machine (OC-SVM) has long been one of the
most effective anomaly detection methods and is extensively adopted in both
research and industrial applications. The biggest remaining issue for OC-SVM is
its limited ability to operate on large and high-dimensional datasets due to
optimization complexity. These problems might be mitigated via dimensionality
reduction techniques such as manifold learning or autoencoders. However,
previous work often treats representation learning and anomaly prediction
separately. In this paper, we propose an autoencoder-based one-class support
vector machine (AE-1SVM) that brings OC-SVM, with the aid of random Fourier
features to approximate the radial basis kernel, into the deep learning context by
combining it with a representation learning architecture and jointly exploiting
stochastic gradient descent to obtain end-to-end training. Interestingly, this
also opens up the possible use of gradient-based attribution methods to explain
the decision making for anomaly detection, which has long been challenging as a
result of the implicit mappings between the input space and the kernel space.
To the best of our knowledge, this is the first work to study the
interpretability of deep learning in anomaly detection. We evaluate our method
on a wide range of unsupervised anomaly detection tasks in which our end-to-end
training architecture achieves significantly better performance than
previous work using separate training.
Comment: Accepted at European Conference on Machine Learning and Principles
and Practice of Knowledge Discovery in Databases (ECML-PKDD) 201
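The abstract above relies on random Fourier features to approximate the radial basis kernel. The following is a minimal sketch of that general technique, not the authors' implementation: it assumes the kernel form k(x, y) = exp(-gamma * ||x - y||^2), and the function names, the value of `gamma`, the number of features, and the two sample points are all hypothetical choices made for illustration.

```python
import math
import random


def rbf_kernel(x, y, gamma):
    """Exact RBF kernel exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def make_rff_map(dim, n_features, gamma, seed=0):
    """Build an explicit feature map z with z(x) . z(y) ~ rbf_kernel(x, y)."""
    rng = random.Random(seed)
    # Frequencies are drawn from N(0, 2 * gamma * I), the spectral density
    # of the kernel above; phases are uniform on [0, 2 * pi].
    scale = math.sqrt(2.0 * gamma)
    freqs = [[rng.gauss(0.0, scale) for _ in range(dim)]
             for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    norm = math.sqrt(2.0 / n_features)

    def feature_map(x):
        return [norm * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(freqs, phases)]

    return feature_map


# Hypothetical inputs and settings, chosen only for illustration.
gamma = 0.5
x = [0.2, -0.1, 0.4]
y = [0.0, 0.3, 0.1]

z = make_rff_map(dim=3, n_features=4000, gamma=gamma, seed=0)
approx = sum(a * b for a, b in zip(z(x), z(y)))
exact = rbf_kernel(x, y, gamma)

print(f"exact kernel:      {exact:.4f}")
print(f"RFF approximation: {approx:.4f}")
```

Because the feature map is explicit and differentiable, a linear model on top of `z(x)` can be trained with stochastic gradient descent, which is what makes this approximation convenient in an end-to-end setting like the one the abstract describes.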
Defect Analysis of 3D Printed Cylinder Object Using Transfer Learning Approaches
Additive manufacturing (AM) is gaining attention across various industries
such as healthcare, aerospace, and automotive. Identifying defects early
in the AM process, which can reduce production costs and improve productivity,
remains a key challenge. This study explored the effectiveness of machine learning (ML)
approaches, specifically transfer learning (TL) models, for defect detection in
3D-printed cylinders. Images of cylinders were analyzed using models including
VGG16, VGG19, ResNet50, ResNet101, InceptionResNetV2, and MobileNetV2.
Performance was compared across two datasets using accuracy, precision, recall,
and F1-score metrics. In the first study, VGG16, InceptionResNetV2, and
MobileNetV2 achieved perfect scores. In contrast, ResNet50 had the lowest
performance, with an average F1-score of 0.32. Similarly, in the second study,
MobileNetV2 correctly classified all instances, while ResNet50 struggled with
more false positives and fewer true positives, resulting in an F1-score of
0.75. Overall, the findings suggest certain TL models like MobileNetV2 can
deliver high accuracy for AM defect classification, although performance varies
across algorithms. The results provide insights into model optimization and
integration needs for reliable automated defect analysis during 3D printing. By
identifying the top-performing TL techniques, this study aims to enhance AM
product quality through robust image-based monitoring and inspection.
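The abstract above compares models by accuracy, precision, recall, and F1-score. As a short sketch of how these metrics follow from binary confusion-matrix counts, the function below uses the standard definitions; the counts are hypothetical and are not taken from the study, and the function name is an illustrative choice.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for a defect / no-defect classifier.
metrics = classification_metrics(tp=8, fp=2, fn=4, tn=6)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

More false positives lower precision and fewer true positives lower recall, so both effects pull the F1-score down, which matches the pattern the abstract reports for the weaker model.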
Explainable Deep Reinforcement Learning for Production Control
Due to the growing number of variants and smaller batch sizes, manufacturing companies have to cope with increasing material flow complexity, which makes it more difficult for production planning and control (PPC) to create a feasible and economic production plan. Despite significant advances in PPC research, current PPC systems do not yet sufficiently meet the industry’s requirements (e.g., decision quality, reaction time, user trust). However, recent progress in the digitalization of production systems results in an increased amount of data being collected, thus enabling the use of data-intensive technologies, e.g., machine learning (ML). ML provides new possibilities for PPC to handle the increasing complexity caused by rising numbers of product variants paired with smaller lot sizes. At the same time, ML can increase the decision quality and reduce the reaction time to disturbances in the production system, e.g., machine breakdowns. However, some ML models, e.g., artificial neural networks (ANN), are perceived as black-box models, which reduces users’ trust in the decisions proposed by an ML-based PPC system. The approach presented in this publication aims at a more functional and user-friendly PPC system by leveraging multi-agent reinforcement learning (MARL), an established approach within the field of ML-based production control, and approaches for explaining decisions made by reinforcement learning (RL) algorithms. With the help of MARL, short reaction times and high decision quality can be realized. Subsequently, the developed MARL system is combined with methods from the field of explainable Artificial Intelligence (XAI) to increase the users’ trust. The use case results show that the developed system can outperform rule-based controls, which are often used in industry, while providing explainable decisions.
Modeling Persistent Trends in Distributions
We present a nonparametric framework to model a short sequence of probability
distributions that vary due to both the underlying effects of sequential
progression and confounding noise. To distinguish between these two types of
variation and estimate the sequential-progression effects, our approach
leverages an assumption that these effects follow a persistent trend. This work
is motivated by the recent rise of single-cell RNA-sequencing experiments over
a brief time course, which aim to identify genes relevant to the progression of
a particular biological process across diverse cell populations. While
classical statistical tools focus on scalar-response regression or
order-agnostic differences between distributions, it is desirable in this
setting to consider both the full distributions as well as the structure
imposed by their ordering. We introduce a new regression model for ordinal
covariates where responses are univariate distributions and the underlying
relationship reflects consistent changes in the distributions over increasing
levels of the covariate. This concept is formalized as a "trend" in
distributions, which we define as an evolution that is linear under the
Wasserstein metric. Implemented via a fast alternating projections algorithm,
our method exhibits numerous strengths in simulations and analyses of
single-cell gene expression data.
Comment: To appear in: Journal of the American Statistical Associatio
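The abstract above defines a trend through the Wasserstein metric on univariate distributions. As a minimal sketch, assuming equal-size empirical samples and the 2-Wasserstein distance (the abstract does not state which order is used), the distance reduces to a root-mean-square difference between sorted samples. The sample values, the shift of 0.5 per level, and the function name are hypothetical.

```python
import math


def wasserstein2(xs, ys):
    """2-Wasserstein distance between two equal-size univariate samples.

    For univariate empirical distributions with the same number of points,
    the optimal coupling matches sorted values (i.e. matching quantiles).
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    sq = sum((a - b) ** 2 for a, b in zip(sorted(xs), sorted(ys)))
    return math.sqrt(sq / len(xs))


# Hypothetical sequence: each covariate level shifts every quantile by 0.5.
base = [0.1, 0.4, 0.9, 1.3, 2.0]
sequence = [[v + 0.5 * level for v in base] for level in range(4)]

# Distance from the first distribution grows linearly with the level.
distances = [wasserstein2(sequence[0], sample) for sample in sequence]
for level, d in enumerate(distances):
    print(f"level {level}: W2 distance from level 0 = {d:.3f}")
```

In this toy sequence the quantiles move at a constant rate, so the distance from the first level grows in proportion to the level. This is one simple illustration of an evolution that is linear under the metric, not a reproduction of the paper's model or its alternating projections algorithm.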