Hardware Architecture Proposal for TEDA algorithm to Data Streaming
  Anomaly Detection by da Silva, Lucileide M. D. et al.
Lucileide M. D. da Silva (a,b), Maria G. F. Coutinho (a), Carlos E. B. Santos (a),
Mailson R. Santos (d), Luiz Affonso Guedes (d), M. Dolores Ruiz (c), Marcelo A. C.
Fernandes (a,d,1,∗)
(a) Laboratory of Machine Learning and Intelligent Instrumentation, Federal University of Rio
Grande do Norte, Natal 59078-970, Brazil.
(b) Federal Institute of Education, Science and Technology of Rio Grande do Norte, Paraiso,
Santa Cruz, RN, 59200-000, Brazil.
(c) Department of Statistics and Operations Research, University of Granada, Spain.
(d) Department of Computer Engineering and Automation, Federal University of Rio Grande
do Norte, Natal, RN, 59078-970, Brazil.
Abstract
The amount of real-time data available today, such as time series and streaming
data, continues to grow. Being able to analyze these data the moment they
arrive can add immense value. However, it also requires considerable
computational effort and new acceleration techniques. As a possible solution to
this problem, this paper proposes a hardware architecture for the Typicality and
Eccentricity Data Analytics (TEDA) algorithm, implemented on Field Programmable
Gate Arrays (FPGAs), for use in data streaming anomaly detection. TEDA is
based on a new approach to outlier detection in the data stream context. To
validate the proposal, occupation and throughput results of the proposed
hardware are presented, together with bit-accurate simulation results. The
design targets the Xilinx Virtex-6 xc6vlx240t-1ff1156 FPGA.
∗Corresponding author
Email addresses: lucileide.dantas@ifrn.edu.br (Lucileide M. D. da Silva),
gracielly@dca.ufrn.br (Maria G. F. Coutinho), ceduardobsantos@gmail.com (Carlos E. B.
Santos), mailsonribeiro@ufrn.edu.br (Mailson R. Santos), affonso@dca.ufrn.br (Luiz
Affonso Guedes), mariloruiz@ugr.es (M. Dolores Ruiz), mfernandes@dca.ufrn.br (Marcelo
A. C. Fernandes)
1Present address: John A. Paulson School of Engineering and Applied Sciences, Harvard
University, Cambridge, MA 02138, USA.
Preprint submitted to arXiv.org March 10, 2020
arXiv:2003.03837v1 [cs.DC] 8 Mar 2020
Keywords: FPGA, TEDA, data streaming, reconfigurable computing
1. Introduction
Outlier detection, or anomaly detection, consists of detecting rare events in a
data set. It is a central problem in many application areas, such as time series
forecasting, data mining and industrial process monitoring. Due to the increasing
number of sensors in the most diverse areas and applications, there has been a
huge rise in the availability of time series data. Thus, outlier detection for
temporal data has become a central problem [1], especially when data are captured
and processed continuously and online. In this case, the data are considered
data streams [2].
Some important aspects need to be considered when choosing an anomaly
detection method, such as the computational effort needed to handle large
streaming data, since the received information needs to be stored and analyzed
without compromising memory and run-time. Many of the solutions presented in the
literature require prior knowledge of the process and system, such as mathematical
models, data distribution, and predefined parameters [3]. Anomaly detection
is traditionally based on statistical analysis, using probability and making a
series of initial assumptions that in most cases do not hold in practice.
A disadvantage of the traditional statistical method is that it compares a single
point with the average of all points rather than with individual samples or data
pairs. This way, the information is no longer punctual and local. Moreover,
probability theory was developed from examples where processes and variables
are purely random. However, real processes are not purely random and show
dependency between samples. Another problem is that real problems are usually
addressed with offline processing, where the entire data set needs to be known
in advance, so all samples must be available from the beginning of the algorithm
execution [4]. This makes such approaches impossible to use in real-time and
data stream applications. This type of data presents new technical challenges
and opportunities in new fields of work. Detecting anomalies in real time can
provide valuable information in critical scenarios, but it is a computationally
demanding problem that still lacks reliable solutions capable of providing high
processing capabilities.
Typicality and Eccentricity Data Analytics (TEDA) is based on a new approach
to outlier detection in the data stream context [5] and it can be applied, for
example, with an algorithm to detect autonomous behavior in industrial process
operation. TEDA analyzes the density of each data sample read, calculated
according to the distance from the sample to the other samples previously read.
It is an online algorithm that learns autonomously, without the need for prior
knowledge about the process or parameters. Therefore, the computational effort
required is smaller, allowing its use in real-time applications [3].
TEDA can be used as an alternative statistical framework for analyzing most
data, except for purely random processes. It is based on new metrics, all based
on the similarity/proximity of data in the data space, not on density or entropy
as in traditional methods. The metrics used with TEDA are typicality, defined
in [5] as the extent to which objects are "good examples" of a concept, and
eccentricity, defined as how distinct the object is from the rest of the group. A
sample with high eccentricity has low typicality and is usually an outlier [3].
Eccentricity can be very useful for anomaly detection, image processing,
fault detection, particle physics, etc. It allows the analysis of individual data
samples, which can also be done in real time for data streams [6]. It is also
relevant in clustering processes, since the elements of a cluster are naturally
opposed to atypical ones [5].
Another area where anomaly detection has been increasingly used is Industry
4.0 projects. One of the challenges of Industry 4.0 is the detection
of production failures and defects [7]. New technologies aim to add value and
increase process productivity, but face difficulties in performing complex and
massive-scale computing due to the large amount of data generated [8]. The
huge accumulation of real-time data flowing in a network, for example, can
quickly overload traditional computing systems due to the large amount of data
that originates from the sensors and the requirement for intensive,
high-performance processing. The development of specialized hardware presents
itself as a possible solution to overcome these bottlenecks, making it possible
to create solutions for mass data processing while meeting ultra-low-latency,
low-power, high-throughput, security and ultra-reliability conditions, which are
important requirements for increasing productivity and quality in Industry 4.0
processes.
Considering the challenges presented, this work proposes a specialized
hardware architecture of TEDA for anomaly detection. Implementing the technique
in hardware allows systems to be made even faster than their software
counterparts, extending the possibilities of use to situations where time
constraints are even more severe, and allowing its use in applications with
large data processing demands. The works [9, 10, 11, 12, 13] were developed in
hardware, specifically on FPGA, for the acceleration of complex algorithms.
The development of machine learning algorithms in hardware has grown
significantly. This is justified by performance data on system sampling
times compared to software equivalents. One of the motivations for this work
is the possibility of accelerating the TEDA algorithm and handling large volumes
of streaming and real-time data.
In this work, all validation and synthesis results were obtained using a
Virtex-6 xc6vlx240t-1ff1156 FPGA. This FPGA was chosen because of its high
performance. Modern FPGAs can deliver performance and density comparable to
Application Specific Integrated Circuits (ASICs), without the disadvantage of
long development times and with the ability to be reprogrammed, as FPGAs have a
flexible architecture.
The rest of this paper is organized as follows. This first section presented
an introduction to the work, explaining the motivation behind it and its major
contributions. Section 2 discusses related works and the state of the art.
Section 3 presents the theoretical foundation of the TEDA technique. Section 4
presents the implementation details of the proposed architecture. Section 5
presents the validation and synthesis results of the proposed hardware, as well
as comparisons with software implementations. Finally, Section 6 presents the
conclusions regarding the obtained results.
2. Related work
Real-time anomaly detection in data streams has potential applications in
many areas, such as preventive maintenance, fault detection, fraud detection
and signal monitoring, among others. These concepts can be used in many different
sectors of industry, such as information technology, finance, medicine, security,
energy, e-commerce, agriculture and social media, among others. In the literature
there are some uses of the TEDA technique for anomaly detection and even for
classification.
The article presented in [6] proposes a new TEDA-based anomaly detection
algorithm. The proposed method, called σ gap by the author, combines the
accumulated proximity information for all samples with the comparison of
specific point pairs suspected of being anomalies, using local spatial
distribution information about the vicinity of the suspect point. In that article,
TEDA is compared to an approach using traditional statistical methods,
emphasizing that the set of initial assumptions is different. TEDA is shown
to be a generalization of traditional statistics when compared to a well-known
analysis, nσ, which is a widely used principle for threshold-based anomaly
detection. The same result was obtained for both approaches, although TEDA does
not need the initial assumptions. In addition, for various types of proximity
measures (such as Euclidean, cosine and Mahalanobis), it has been shown that,
due to its recursive nature, TEDA is computationally more efficient and suitable
for online and real-time applications.
In [14], a study is presented on the use of TEDA for fault detection in
industrial processes. The work pioneers the use of this approach on real
industry data. For the experiments, TEDA was applied online to the
dataset provided by the DAMADICS (Development and Application of Methods
for Actuator Diagnosis in Industrial Control Systems) database, one of the
most widely used benchmarks in fault detection and diagnosis applications. The
experiments showed good results in both accuracy and execution time, which
shows the suitability of this approach for real-time industrial applications.
Finally, it was found that the TEDA algorithm is capable of dealing with the
limitations and particularities of the industrial environment.
The paper [15] aims to enable the use of TEDAClass, which consists of the
TEDA algorithm for classification, in big data processing. The main
feature of the proposed algorithm, called TEDAClassBDp, is the processing of
data in blocks, where each block uses TEDAClass so that all blocks operate in
parallel. As with TEDAClass, the proposed algorithm does not require
information from previous data, and it operates recursively, online and in
real time. The results indicated a reduction in time and computational
complexity without significantly compromising accuracy, which indicates the
strong potential of the proposal for problems where it is necessary to process
large volumes of data quickly.
The work presented in [16] proposes a new non-frequentist, density-based
data analysis tool, the Typicality Distribution Function (TDF), classified by
the author as a further development of TEDA and an effective alternative to the
probability distribution function (pdf). The TDF can provide valuable
information for extreme process analysis and for fault detection and
identification, where the number of observations of extreme events or faults is
often disproportionately small. The proposed tool offers a closed-form,
non-parametric analytical (quadratic) description, extracted exactly from the
actual realizations of the data, in contrast to the usual practice in which
these distributions are assumed or approximated. In addition, for various types
of proximity and similarity measures (such as Euclidean, Mahalanobis and cosine
distances), it can be calculated recursively, and is thus computationally
efficient and suitable for online and real-time algorithms. As a fundamental
theoretical innovation, the application areas of TDF and TEDA range from anomaly
detection, clustering, classification, prediction and control to regression and
(Kalman-like) filtering. Practical applications may be even broader, so
it is difficult to list them all.
The paper presented in [3] proposes the application of TEDA to fault
detection in industrial processes. The effectiveness of the proposal was
demonstrated on two real industrial plants, using data streaming, and compared
with traditional fault detection methods. The first application uses the
well-known DAMADICS database, which provides actual data on the water
evaporation process of an operating Polish sugar manufacturing plant. The second
application was based on data from a pilot plant in the authors' university
laboratory, equipped with real industrial instruments used for process control.
The work [4] presents a new proposal for an unsupervised fuzzy classifier,
capable of aggregating the main characteristics of evolving classifiers, as well
as making fuzzy classifications of real-time data streams completely online. The
proposed algorithm uses TEDA concepts, replacing traditional clusters with
data clouds, granular structures without predefined shape or boundaries. For
data classification, the proposed approach uses the concept of soft labeling
rather than mutually exclusive classes. Experiments performed using data
obtained from different operational failures of a real industrial plant showed
very significant results for both unsupervised and semi-supervised learning,
requiring minimal human interaction.
The manuscript presented in [2] introduces a new algorithm for detecting
anomalies based on an online sequence memory algorithm called Hierarchical
Temporal Memory (HTM). The performance of the proposed algorithm was evaluated
and compared with a set of real-time anomaly detection algorithms. The
comparative analysis served as a way to evaluate anomaly detection algorithms
for data streaming. All analyses were performed using the Numenta Anomaly
Benchmark (NAB) [17], which is a benchmark of real streaming data.
The paper [18] presents a study on anomaly detection in TCP/IP networks.
Its purpose is to detect computer network anomalies during virtual machine (VM)
live migration from local to cloud environments, comparing TEDA, K-Means
clustering, and static analysis. The authors used the tuple (Source IP,
Destination IP, Source Port, Destination Port) to create a signature process and
validate errors, including those in traffic flows hidden in the legitimate
network. Testing was done using the dataset of the SECCRIT (SEcure Cloud
Computing for CRitical Infrastructure IT, http://www.seccrit.eu) project, which
allows anomalies or attacks on the environment to be analyzed under live
migration and other background traffic conditions. The results demonstrate that
the proposed method makes it possible to automatically and successfully detect
anomalies in network port scan (NPS) and network scan (NS) attacks. A major
difficulty is distinguishing a high-volume attack from a denial of service (DoS)
attack, for example. Accuracy and false negative rates were calculated to
compare K-Means and the proposed solution, with TEDA having better rates in
almost all measurements performed.
As the amount of data that needs to be processed grows exponentially and
autonomous systems become increasingly important and necessary,
implementations of machine learning and streaming algorithms have been studied
in the literature. The work presented in [19] describes how to use run-time
reconfiguration on FPGAs to improve the efficiency of streaming data
transmission over a shared communication channel with real-time applications.
The proposed reconfigurable architecture consists of two subsystems: the
reconfiguration subsystem, which runs the modules, and the scheduling subsystem,
which controls which modules are loaded into the reconfiguration subsystem.
In addition, many works in the literature have studied fault and anomaly
detection in hardware. In [20], an FPGA implementation of target and anomaly
detection algorithms for real-time hyperspectral imaging was proposed.
The algorithms were implemented in streaming fashion, similar to this work.
The results, obtained from a Kintex-7 FPGA using a fixed-point structure, were
very satisfactory and demonstrated that the implementation can be used in
different detection circumstances. The work [21] presented a study of the impact
of neural network architectures compared to statistical methods in the
implementation of an electrocardiogram (ECG) anomaly detection algorithm on FPGA.
The fixed-point implementation helps reduce the amount of resources needed.
However, the design was made with High-Level Synthesis (HLS), which could not
optimize the FPGA resource consumption. Regarding the TEDA algorithm, no studies
exploring its hardware implementation on FPGA had been identified in the
literature at the time of writing, which this work proposes to accomplish in a
pioneering manner.
3. TEDA
TEDA was introduced by [22] as a statistical framework influenced by
recursive density estimation algorithms. However, unlike algorithms that use data
density as a measure of similarity, TEDA uses the concepts of typicality and
eccentricity to infer whether a given sample is normal or abnormal with respect
to the dataset. The methodology used in TEDA does not require prior information
about the data, and can be applied to problems involving fault detection,
clustering and classification, among others [22].
TEDA is a data structure-based anomaly detection algorithm that aims to
generalize and avoid the need for the well-known, but very restrictive, initial
conditions inherent in traditional statistics and probability theory [23]. The
TEDA approach has some advantages over traditional statistical anomaly
detection methods. Its recursive nature allows it to handle large volumes of
data, such as data streams, online and with low computational cost, enabling
faster processing.
The main features of TEDA include [6]:
• It is entirely based on data and its distribution in data spaces;
• No previous assumptions are made;
• Limits and parameters do not need to be pre-specified;
• No sample independence is required;
• An infinite number of observations is not required.
The typicality of TEDA is the similarity of a given data sample to the rest
of the samples in the dataset to which it belongs. Eccentricity, on the other
hand, is the opposite of typicality, and indicates how much a sample is
dissociated from the other samples in its set. Thus, an outlier can be defined
as a sample with high eccentricity and low typicality, considering a threshold
established for comparison. It is important to note that no parameter or
threshold is required to calculate eccentricity and typicality.
To calculate the eccentricity of each sample, TEDA uses the sum of the
geometric distances between the analyzed sample $x_k$ and the other samples in
the set. Thus, the higher this value, the greater the eccentricity of the sample
and, consequently, the lower its typicality. A recursive calculation of
eccentricity was proposed in [6]. Thus, the eccentricity, $\xi$, can be expressed as
$$\xi_k(x) = \frac{1}{k} + \frac{(\mu_k^x - x_k)^T (\mu_k^x - x_k)}{k\,[\sigma^2]_k^x}, \quad [\sigma^2]_k^x > 0 \tag{1}$$
where $k$ is the discretization instant; $x_k$ is an input set of $N$ elements at
the $k$-th iteration, $x_k = [x_k^1 \; x_k^2 \; \ldots \; x_k^N]$; $\mu_k^x$ is
also an $N$-element vector, equal to the average of $x_k$ at the $k$-th
iteration; and $[\sigma^2]_k^x$ is the variance of $x_k$ at the $k$-th
iteration. The calculation of $\mu_k^x$ and $[\sigma^2]_k^x$ is also done
recursively, using the following equations
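$$\mu_k^x = \frac{k-1}{k}\,\mu_{k-1}^x + \frac{1}{k}\,x_k, \quad k \geq 1, \quad \mu_0^x = 0 \tag{2}$$
and
$$[\sigma^2]_k^x = \frac{k-1}{k}\,[\sigma^2]_{k-1}^x + \frac{1}{k}\,\|x_k - \mu_k^x\|^2, \quad k \geq 1, \quad [\sigma^2]_0^x = 0. \tag{3}$$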
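The recursive updates of Equations 2 and 3 can be sketched in software as
follows. This is an illustrative Python sketch, not the hardware design
proposed in this work; the function name `update_mean_variance` and the use of
plain lists for the vectors are assumptions made here for the example.

```python
from typing import List, Sequence, Tuple


def update_mean_variance(
    k: int,
    x: Sequence[float],
    mean_prev: Sequence[float],
    var_prev: float,
) -> Tuple[List[float], float]:
    """One recursive step of Equations 2 and 3 (k starts at 1)."""
    # Equation 2: element-wise recursive mean.
    mean = [((k - 1) / k) * mu + (1.0 / k) * xi for mu, xi in zip(mean_prev, x)]
    # Squared distance between the new sample and the updated mean.
    dist2 = sum((xi - mu) ** 2 for xi, mu in zip(x, mean))
    # Equation 3: recursive variance, a single scalar for the whole vector.
    var = ((k - 1) / k) * var_prev + (1.0 / k) * dist2
    return mean, var
```

Only the previous mean, the previous variance and the counter $k$ are kept
between samples, which is why the memory cost does not grow with the length of
the stream.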
The typicality of a given sample xk, at the k-th iteration, can be expressed
as a complement to eccentricity [6], as follows
$$\tau_k(x) = 1 - \xi_k(x). \tag{4}$$
In addition, [6] also defined that normalized eccentricity can be calculated
as
$$\zeta_k(x) = \frac{\xi_k(x)}{2}, \quad \sum_{i=1}^{k} \zeta_k(x_i) = 1, \quad k \geq 2. \tag{5}$$
In order to separate normal state data from abnormal state data, it is
necessary to define a comparison threshold. For anomaly detection, the use of
the $m\sigma$ threshold [24] is widespread. However, this principle must first
assume the distribution of the analyzed data, such as the Gaussian distribution
[6]. The Chebyshev inequality can be used for any data distribution: it states
that the probability that a data sample is more than $m\sigma$ from the average
is less than or equal to $1/m^2$, where $\sigma$ is the standard deviation of
the data [25].
The condition that produces the same results as Chebyshev’s inequality,
discarding any assumptions about data and its independence, can be expressed
as [6]
$$\zeta_k > \frac{m^2 + 1}{2k}, \quad m > 0 \tag{6}$$
where m corresponds to the comparison threshold.
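To make the link between Equation 6 and the $m\sigma$ rule explicit, the short
derivation below substitutes Equations 1 and 5 into Equation 6. It is a sketch
added for illustration and uses only the symbols already defined above.

```latex
\begin{align*}
\zeta_k(x) = \frac{\xi_k(x)}{2}
  = \frac{1}{2k} + \frac{\|x_k - \mu_k^x\|^2}{2k\,[\sigma^2]_k^x}
  &> \frac{m^2 + 1}{2k} \\
\iff\quad 1 + \frac{\|x_k - \mu_k^x\|^2}{[\sigma^2]_k^x} &> m^2 + 1 \\
\iff\quad \|x_k - \mu_k^x\|^2 &> m^2\,[\sigma^2]_k^x .
\end{align*}
```

In other words, a sample is flagged when its distance to the mean exceeds $m$
standard deviations, which is the $m\sigma$ condition obtained without assuming
any particular data distribution.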
For a better understanding of the technique implemented in hardware in this
work, Algorithm 1 details the operation of TEDA, based on the equations
presented above.
As presented in Algorithm 1, only the input data samples, $x_k$, and a
comparison threshold, $m$, are used as input to the algorithm. The output for
each input, $x_k$, is the classification of the sample as abnormal (outlier =
true) or normal (outlier = false).
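As a software reference for Algorithm 1, a minimal Python sketch is given
below. It is not the hardware architecture described in this work. The class
name `TedaDetector`, the method name `update` and the choice to report
"normal" while the variance is still zero (where Equation 1 is undefined) are
assumptions made for this example.

```python
from typing import List, Sequence


class TedaDetector:
    """Streaming TEDA outlier detector following Algorithm 1 (illustrative)."""

    def __init__(self, m: float = 3.0) -> None:
        self.m = m                    # comparison threshold
        self.k = 0                    # number of samples seen so far
        self.mean: List[float] = []   # recursive mean (Equation 2)
        self.var = 0.0                # recursive variance (Equation 3)

    def update(self, x: Sequence[float]) -> bool:
        """Consume one sample and return True if it is classified as an outlier."""
        self.k += 1
        k = self.k
        if k == 1:
            # First iteration: initialise the statistics (lines 3-5).
            self.mean = list(x)
            self.var = 0.0
            return False
        # Lines 7-8: recursive mean and variance.
        self.mean = [((k - 1) / k) * mu + xi / k for mu, xi in zip(self.mean, x)]
        dist2 = sum((xi - mu) ** 2 for xi, mu in zip(x, self.mean))
        self.var = ((k - 1) / k) * self.var + dist2 / k
        if self.var <= 0.0:
            # Assumption: Equation 1 needs a positive variance, so report normal.
            return False
        ecc = 1.0 / k + dist2 / (k * self.var)   # line 9, Equation 1
        zeta = ecc / 2.0                         # line 10, Equation 5
        return zeta > (self.m ** 2 + 1.0) / (2.0 * k)   # lines 11-14, Equation 6
```

A possible use is to call `update` once per incoming sample, for example
`flags = [detector.update(x) for x in stream]`, where `stream` is any iterable
of equal-length vectors.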
4. Implementation description
In this work, a TEDA FPGA proposal was implemented at the Register
Transfer Level (RTL), as in the works presented in [9, 10, 11, 12, 13]. In the
following sections, the characteristics of the proposal are presented, as well as
details regarding processing time. A design overview can be seen in Figure 1.
4.1. Architecture proposal overview
As illustrated in Figure 1, the proposed implementation of TEDA has
four different block structures: the MEAN module, which implements the average
described in Equation 2; the VARIANCE module, responsible for calculating
the variance as presented in Equation 3; the ECCENTRICITY module,
which calculates the eccentricity as presented in Equation 1; and the OUTLIER
module, which normalizes the eccentricity as in Equation 5 and compares it with
the threshold, as shown in Equation 6. The architecture was developed to
pipeline the operations presented in Algorithm 1 in order to decrease the TEDA
processing time. Thus, the outputs of the ECCENTRICITY and OUTLIER modules are
delayed by one clock cycle in relation to the VARIANCE module and by two in
relation to the MEAN module. Likewise, the VARIANCE module is delayed by one
clock cycle in relation to the MEAN module. Each of the modules is detailed in
the following sections.
Algorithm 1: TEDA
Input: $x_k$: k-th sample; $m$: threshold
Output: outlier: sample classification as abnormal or normal
 1: begin
 2:   while receive x do
 3:     if k = 1 then
 4:       $\mu_k^x$ ← $x_k$;
 5:       $[\sigma^2]_k^x$ ← 0;
 6:     else
 7:       update $\mu_k^x$ using equation 2;
 8:       update $[\sigma^2]_k^x$ using equation 3;
 9:       update $\xi_k(x)$ using equation 1;
10:       update $\zeta_k(x)$ using equation 5;
11:       if $\zeta_k(x) > (m^2 + 1)/(2k)$ then
12:         outlier ← true;
13:       else
14:         outlier ← false;
15:     k ← k + 1;
The implementation has Algorithm 1 as reference. The system receives
the FPGA clock and the $k$-th sample vector $x_k$ as inputs. The iteration
number $k$ is updated by incrementing a counter, and the threshold $m$ is used
as a constant, stored in the OUTLIER module. As in line 7 of the algorithm, the
MEAN module computes the average of each single element of the $x_k$ vector.
There are $N$ MEAN blocks, where $N$ is the vector size. This block
is detailed in Section 4.2. After this step, moving to the next line (8), the
variance is calculated in the VARIANCE module, which is detailed in
Section 4.3. The ECCENTRICITY block has as inputs the output signals of the
VARIANCE block and $k$, as referred to in line 9, and is detailed in Section 4.4.
The OUTLIER block is detailed in Section 4.5. It receives the eccentricity,
$\xi_k(x)$, and calculates the normalized eccentricity to compare with the
threshold, as presented in lines 10, 11 and 12.
Figure 1: General architecture overview.
4.2. Module I - MEAN
Each $n$-th MEAN module computes the average of the $n$-th element of the
vector $x_k$ acquired at run time. The implementation is based on Equation 2
and is detailed in Figure 2. In addition to receiving the $n$-th element of
vector $x_k$ as an input, the MEAN block uses a counter to define the sample
iteration number, $k$. The implementation uses a comparator block, identified
in Figure 2 as MCOMPn, which is used to verify whether the system is in the first
iteration, as in line 3 of Algorithm 1. MMUXn is a multiplexer that acts
as a conditional evaluation, using the output of the MCOMPn comparator as its
select value. The register MREGn stores the $n$-th element of $\mu_k^x$
($\mu_k^n$). The $\mu_k^n$ value stored in MREGn is multiplied by $(k-1)/k$ in
MMULT1n and added in MSUMn to the output of MMULT2n, which has as inputs $x_k^n$
and the inverse of $k$. Each $n$-th element of vector $x_k$, $x_k^n$, requires a
MEAN block.
Figure 2: MEAN module.
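The datapath of one MEAN block can be mimicked by a small behavioural model, as
sketched below in Python. This is not RTL and not the authors' implementation:
the class `MeanModule` and its `clock` method are hypothetical, and the role
given to MDIVn (computing $1/k$) is an assumption based on the block names in
Figure 2.

```python
class MeanModule:
    """Behavioural model of one MEAN block (one element of the input vector)."""

    def __init__(self) -> None:
        self.mreg = 0.0  # MREGn: stores the n-th element of the mean

    def clock(self, x_n: float, k: int) -> float:
        """Advance one clock cycle with input element x_n at iteration k (k >= 1)."""
        inv_k = 1.0 / k                        # MDIVn (assumed): inverse of k
        mult1 = self.mreg * ((k - 1) * inv_k)  # MMULT1n: previous mean times (k-1)/k
        mult2 = x_n * inv_k                    # MMULT2n: input times 1/k
        msum = mult1 + mult2                   # MSUMn: Equation 2
        first = (k == 1)                       # MCOMPn: first-iteration check
        self.mreg = x_n if first else msum     # MMUXn selects what MREGn stores
        return self.mreg
```

For a vector of size $N$, $N$ such objects would be instantiated, one per
element, mirroring the $N$ MEAN blocks of the architecture.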
4.3. Module II - VARIANCE
The VARIANCE module is illustrated in Figure 3. It computes the variance
of the $x_k$ vector samples by receiving the $x_k$ vector itself and its
average, $\mu_k^x$, calculated in the previous MEAN blocks.
The VARIANCE module, like the MEAN module, uses a comparator, identified
in Figure 3 as VCOMP1, to verify whether the system is in the first iteration
(line 3 of Algorithm 1). VMUX1 is a multiplexer that also implements a
conditional evaluation, releasing the value 0 at the output of register VREG1 in
the first iteration. The register VREG1 stores the variance value,
$[\sigma^2]_k^x$, from the second iteration onwards. The other registers in the
block, the VREG2 register and the $N$ VREGn registers, are used to delay by one
clock cycle the iteration number $k$ and the elements of $x_k$, respectively.
Figure 3: VARIANCE module.
As shown in Equation 3, the variance calculation is done recursively.
It is necessary to calculate $\|x_k - \mu_k^x\|^2$ and, to do that, $N$
subtractors (VSUBn) and $N$ multipliers (VMULT1_n) are used, as well as an adder
(VSUM1) with $N$ inputs. Each element of vector $\mu_k^x$ is subtracted from its
respective element in vector $x_k$, and the result of this operation is
multiplied by itself (squared) and then added to the other results. The
$\|x_k - \mu_k^x\|^2$ value is then multiplied (at VMULT2) by $1/k$. It is then
added, at the VSUM2 adder, to the variance calculated in the previous iteration,
$[\sigma^2]_{k-1}^x$, multiplied (at VMULT3) by $(k-1)/k$. From the second
iteration on, this value passes through the VMUX1 multiplexer to the VREG1
register, delivering the calculated variance at the VARIANCE block output. The
values of $\|x_k - \mu_k^x\|^2$ and $1/k$ are also delivered at the output of
the VARIANCE block to avoid redundant operations, as they will be used in the
next block, the ECCENTRICITY block.
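The VARIANCE datapath described above can be illustrated with the following
behavioural Python sketch. It is a simplified model, not the RTL design: the
class `VarianceModule` is hypothetical, the one-cycle delay registers are
omitted, and the role given to VDIV1 (computing $1/k$) is an assumption based
on the block names in Figure 3.

```python
from typing import Sequence, Tuple


class VarianceModule:
    """Behavioural model of the VARIANCE block (delay registers omitted)."""

    def __init__(self) -> None:
        self.vreg1 = 0.0  # VREG1: stores the variance

    def clock(
        self, x: Sequence[float], mean: Sequence[float], k: int
    ) -> Tuple[float, float, float]:
        """Return (variance, squared distance, 1/k) for iteration k (k >= 1)."""
        diffs = [xi - mu for xi, mu in zip(x, mean)]  # VSUBn: N subtractors
        squares = [d * d for d in diffs]              # VMULT1_n: N multipliers
        dist2 = sum(squares)                          # VSUM1: N-input adder
        inv_k = 1.0 / k                               # VDIV1 (assumed): 1/k
        new_term = dist2 * inv_k                      # VMULT2
        old_term = self.vreg1 * ((k - 1) * inv_k)     # VMULT3: previous variance
        vsum2 = new_term + old_term                   # VSUM2: Equation 3
        # VCOMP1 + VMUX1: force 0 in the first iteration.
        self.vreg1 = 0.0 if k == 1 else vsum2
        # dist2 and inv_k are forwarded so the next block does not recompute them.
        return self.vreg1, dist2, inv_k
```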
4.4. Module III - ECCENTRICITY
The ECCENTRICITY module is a simpler block than those previously presented,
because it uses operations already performed in the VARIANCE block to calculate
the eccentricity. The geometric distance $\|x_k - \mu_k^x\|^2$ (equivalent
to $(\mu_k^x - x_k)^T(\mu_k^x - x_k)$) is stored in register EREG3 and $1/k$ is
stored in register EREG4. As the ECCENTRICITY module is the architectural design
of Equation 1 (line 9 of Algorithm 1), the variance value $[\sigma^2]_k^x$ is
multiplied by $k$ (EMULT1) and the result is used to divide (EDIV1) the geometric
distance $(\mu_k^x - x_k)^T(\mu_k^x - x_k)$. The output of this operation is
added to $1/k$ in the ESUM1 adder, calculating the eccentricity of the sample,
$\xi_k(x)$, which is delivered at the ECCENTRICITY block output.
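The three arithmetic steps of the ECCENTRICITY block can be written as a short
Python function, shown below as an illustration only. The function name
`eccentricity_module` is hypothetical and the input registers (EREG1 to EREG4)
are not modelled.

```python
def eccentricity_module(var: float, dist2: float, inv_k: float, k: int) -> float:
    """Compute Equation 1 from values already produced by the VARIANCE block.

    Requires var > 0, as stated in Equation 1.
    """
    emult1 = k * var          # EMULT1: k times the variance
    ediv1 = dist2 / emult1    # EDIV1: squared distance divided by k * variance
    esum1 = ediv1 + inv_k     # ESUM1: add 1/k
    return esum1
```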
Figure 4: ECCENTRICITY module.
4.5. Module IV - OUTLIER
Finally, in the OUTLIER block, the samples are classified as abnormal
(outlier = true) or normal (outlier = false). The module design can be seen in
Figure 5. To classify the samples, the OUTLIER block normalizes the eccentricity
by dividing it (ODIV1) by a constant, as shown in Equation 5, and compares
(OCOMP1) this normalized eccentricity with a threshold, as shown in lines
11, 12, 13 and 14 of Algorithm 1. The registers OREG1 and OREG2 are used
to synchronize the iteration number $k$ since, as the modules operate in a
pipeline, the operations carried out in the OUTLIER block (as well as in the
ECCENTRICITY module) are delayed by two clock cycles in relation to the system
input.
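The decision made by the OUTLIER block can be illustrated with the Python
sketch below. It is a behavioural approximation, not the authors' circuit: the
function name `outlier_module` is hypothetical, the synchronization registers
are omitted, and the mapping of OMULT1, OSUM1 and ODIV2 to the individual
operations of the threshold is an assumption based on the block names in
Figure 5.

```python
def outlier_module(ecc: float, k: int, m: float = 3.0) -> bool:
    """Classify a sample from its eccentricity at iteration k (Equations 5 and 6)."""
    zeta = ecc / 2.0              # ODIV1: normalized eccentricity (Equation 5)
    omult1 = 2.0 * k              # OMULT1 (assumed): 2k
    osum1 = m * m + 1.0           # OSUM1 (assumed): constant m^2 plus 1
    threshold = osum1 / omult1    # ODIV2 (assumed): (m^2 + 1) / (2k)
    return zeta > threshold       # OCOMP1: outlier if above the threshold
```

Because $m$ is a constant, the term $m^2 + 1$ does not change at run time and
only the division by $2k$ depends on the iteration number.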
Figure 5: OUTLIER module.
4.6. Processing time
The proposed architecture has an initial delay, $d$, that can be expressed as
$$d = 3 \times t_c \tag{7}$$
where $t_c$ is the system critical path time.
The execution time of the circuit implemented for the TEDA algorithm is
determined by the system critical path time, $t_c$. So, after the initial delay,
the execution time of the proposed TEDA, $t_{TEDA}$, can be expressed as
$$t_{TEDA} = t_c \tag{8}$$
thus, every $t_{TEDA}$ it is possible to obtain the output for one inserted
sample, that is, the classification of the sample as abnormal or normal.
The throughput of the implementation, $th_{TEDA}$, in samples per second
(SPS), can be expressed as
$$th_{TEDA} = \frac{1}{t_{TEDA}}. \tag{9}$$
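Equations 7 to 9 can be evaluated directly once the critical path time is
known. The Python sketch below is only an illustration; the function name
`timing_from_critical_path` is hypothetical, and the value of 100 ns used in
the usage note is a placeholder, not a result from this work.

```python
from typing import Tuple


def timing_from_critical_path(t_c: float) -> Tuple[float, float, float]:
    """Return (initial delay, execution time, throughput) from t_c in seconds."""
    d = 3.0 * t_c              # Equation 7: initial delay
    t_teda = t_c               # Equation 8: one result per critical path time
    th_teda = 1.0 / t_teda     # Equation 9: throughput in samples per second
    return d, t_teda, th_teda
```

For example, with a placeholder critical path of 100 ns,
`timing_from_critical_path(100e-9)` gives an initial delay of about 300 ns and
a throughput of about $10^7$ samples per second.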
5. Results
This section presents the hardware validation and synthesis results for the
architecture proposed in this work. All cases were validated and
synthesized in floating point. The validation results were used to verify the
hardware functionality, while the synthesis results allow the system to be
analyzed with respect to important parameters for the design of hardware
architectures, such as hardware occupancy and processing time, considering
factors such as throughput and speedup.
5.1. Validation results
To validate the hardware architecture of the TEDA algorithm, we used the
DAMADICS (Development and Application of Methods for Actuator Diagnosis
in Industrial Control Systems) benchmark dataset [26]. The benchmark
provides a real data set of the water evaporation process in a Polish sugar
factory. It is a plant with three actuators: a control valve, which controls the
flow of water in the pipes; a pneumatic motor, which controls variable valve
openings; and a positioner. This dataset has faults at different times of the
day on specific days. There are four different fault types, as shown in Table 1.
Artificial failures were introduced into the plant operation data on specific
days. The dataset has a set of 19 faults in these 3 actuators. To validate
the architecture, actuator 1 failures were simulated. Table 2 shows a detailed
description of some of the faults introduced for actuator 1.
Table 1: Fault types [26].
Fault  Description
f16    Positioner supply pressure drop
f17    Unexpected pressure change across the valve
f18    Fully or partly opened bypass valves
f19    Flow rate sensor fault
Table 2: List of artificial failures introduced to actuator 1 [26].
Item  Fault  Sample       Date          Description
1     f18    58800-59800  Oct 30, 2001  Partly opened bypass valve
2     f16    57275-57550  Nov 9, 2001   Positioner supply pressure drop
3     f18    58830-58930  Nov 9, 2001   Partly opened bypass valve
4     f18    58520-58625  Nov 9, 2001   Partly opened bypass valve
5     f18    54600-54700  Nov 17, 2001  Partly opened bypass valve
6     f16    56670-56770  Nov 17, 2001  Positioner supply pressure drop
7     f17    37780-38400  Nov 20, 2001  Unexpected pressure drop across the valve
Figure 6 shows the results obtained for the item 1 signal of Table 2. Figure 6a illustrates the behavior of the two simulated input variables of the hardware architecture ($x_k = [x_k^1 \; x_k^2]$). A failure occurs between iterations k = 58900 and k = 59800. In Figure 6b, the normalized eccentricity (black curve) changes abruptly and exceeds the comparison threshold for m = 3 (red curve).
Figure 7 shows the results obtained for the item 7 signal of Table 2. As in Figure 6, Figure 7a illustrates the behavior of the two elements of the input $x_k = [x_k^1 \; x_k^2]$ of the hardware architecture, and Figure 7b shows a change in the normalized eccentricity (black curve), which exceeds the comparison threshold (red curve), also for m = 3. The failure happens between iterations k = 37700 and k = 38400.
[Figure 6 plots. Both panels: amplitude versus iterations (k), with k from 5.75 x 10^4 to 6.05 x 10^4.]
(a) Fault item 1 - input vector $x_k$ (curves: $x_k^1$ and $x_k^2$).
(b) Fault item 1 - normalized eccentricity $\zeta_k(x)$ with 5/k (m = 3) threshold.
Figure 6: Detection of outliers in the dataset: Behavior of fault item 1.
[Figure 7 plots. Both panels: amplitude versus iterations (k), with k from 3.7 x 10^4 to 3.9 x 10^4.]
(a) Fault item 7 - input vector $x_k$ (curves: $x_k^1$ and $x_k^2$).
(b) Fault item 7 - normalized eccentricity $\zeta_k(x)$ with 5/k (m = 3) threshold.
Figure 7: Detection of outliers in the dataset: Behavior of fault item 7.
The validation results of the hardware architecture were compared with results obtained from a Python software implementation of the TEDA algorithm. The hardware architecture was designed using a floating-point number format.
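For orientation, a software reference of the kind used in such a comparison might look like the sketch below. It is not the authors' Python code. It assumes the recursive mean, variance, and eccentricity updates commonly given in the TEDA literature (for example [3] and [14]) and a threshold of (m^2 + 1)/(2k). The class name `TedaDetector` and its `update` method are hypothetical.

```python
import numpy as np


class TedaDetector:
    """Streaming TEDA outlier detector (software sketch, not the paper's RTL)."""

    def __init__(self, m: float = 3.0):
        self.m = m
        self.k = 0        # iteration counter
        self.mean = None  # recursive mean
        self.var = 0.0    # recursive variance

    def update(self, x) -> bool:
        """Consume one sample and return True if it is classified as an outlier."""
        x = np.asarray(x, dtype=np.float64)
        self.k += 1
        k = self.k
        if k == 1:
            # The first sample only initializes the statistics.
            self.mean = x.copy()
            self.var = 0.0
            return False
        # Recursive mean and variance updates (assumed standard TEDA form).
        self.mean = ((k - 1) / k) * self.mean + x / k
        diff = x - self.mean
        dist2 = float(diff @ diff)
        self.var = ((k - 1) / k) * self.var + dist2 / (k - 1)
        if self.var == 0.0:
            return False  # all samples identical so far; nothing to flag
        ecc = 1.0 / k + dist2 / (k * self.var)  # eccentricity
        zeta = ecc / 2.0                        # normalized eccentricity
        return zeta > (self.m ** 2 + 1.0) / (2.0 * k)
```

Feeding the same input vectors to such a model and to the bit-accurate hardware simulation allows the two classification sequences to be compared sample by sample.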
5.2. Synthesis results
After validating the implemented circuit, hardware synthesis was performed to obtain the FPGA resource occupation report, as well as the critical time used to calculate the processing time of the proposed implementation. The floating-point synthesis results were obtained for a Xilinx Virtex-6 xc6vlx240t-1ff1156 FPGA.
5.2.1. Hardware occupation
Table 3 presents data related to the hardware occupation of the circuit imple-
mented in the target FPGA. The first column shows the number of multipliers
used, the second column displays the number of registers, and the third column
shows the number of logical cells used as LUT (nLUT ) throughout the circuit.
Table 3: Hardware occupation.
Multipliers  Registers   nLUT
27 (3%)      414 (< 1%)  11,567 (7%)
Table 3 shows that even with a floating-point representation, which demands more hardware resources than a fixed-point implementation, only a small portion of the target FPGA was occupied: about 3% of the multipliers, less than 1% of the registers, and about 7% of the logic cells used as LUTs. This suggests that the proposed circuit could also be deployed on low-cost FPGAs, where fewer hardware resources are available. In addition, multiple TEDA modules could be instantiated in parallel for anomaly detection in the same dataset, in order to further reduce processing time.
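A back-of-the-envelope bound on how many TEDA modules might fit in the target device can be derived from the percentages in Table 3. The sketch below is only a rough estimate: it assumes resource use scales linearly with the number of instances and ignores routing, timing closure, and any shared logic. The register share, reported only as "< 1%", is taken as 1%. The function name `max_parallel_instances` is hypothetical.

```python
import math

# Approximate utilization of one TEDA instance, in percent (rounded values from Table 3).
utilization_pct = {"multipliers": 3.0, "registers": 1.0, "LUTs": 7.0}


def max_parallel_instances(util_pct: dict) -> int:
    """Upper bound on instance count, assuming linear scaling of every resource."""
    return min(math.floor(100.0 / pct) for pct in util_pct.values())


# The resource with the highest utilization limits the number of instances.
limiting_resource = max(utilization_pct, key=utilization_pct.get)
print(max_parallel_instances(utilization_pct), limiting_resource)
```

Under these assumptions the LUT count is the limiting resource, so any real estimate would have to start from the post-synthesis LUT usage of a multi-instance design.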
5.2.2. Processing time
Table 4 presents information about the processing time (from Line 3 to Line
14 in Algorithm 1) of the architecture implemented for the TEDA technique.
The first column indicates the circuit critical time, tc, the second column shows
the initial delay, expressed by Equation 7, the third column the TEDA run-time,
expressed by Equation 8, and the last column the implementation throughput
in samples per second (SPS), expressed by Equation 9, which consists of the
amount of samples processed and classified (as normal or outlier) by TEDA
every second.
Table 4: Processing time.
Critical time  Delay   TEDA time  Throughput
138 ns         414 ns  138 ns     7.2 MSPS
The results in Table 4 are significant. The circuit critical time, which also corresponds to the TEDA run-time, was only tc = 138 ns. Thus, after the 414 ns initial delay, one classified sample is output every 138 ns, which yields a throughput of about 7.2 million classified samples per second. These results indicate the feasibility of using the proposal presented in this work to process large data flows in real time.
5.3. Platforms comparison
To date, no previous work exploring TEDA hardware implementations has been found in the literature. Thus, this paper presents, for the first time, a proposal to implement the TEDA technique on FPGA. To assess the advantages of the proposed hardware over software implementations on other platforms, the FPGA processing time was compared with that of several software implementations. Table 5 presents the results. The first column indicates the platform used, the second presents the processing time required to classify each sample, and the third shows the speedup achieved by the proposal presented in this paper.
Table 5: Software implementations comparison.
Platform                                   Time     Speedup
This work proposal on FPGA                 138 ns   −
Python (Colab without GPU)                 435 ms   3,000,000×
Python (Colab with Tesla K80 GPU)          39.2 ms  280,000×
Python (Local execution with 940 MX GPU)   23.1 ms  167,000×
The data presented in Table 5 reaffirm the importance of this work. The hardware implementation on FPGA proposed here achieved speedups of up to 3 million times compared to a Python TEDA implementation using the Colab tool (without GPU processing). For the same Python implementation using Colab with Tesla K80 GPU processing, a speedup of 280 thousand times was obtained. In addition, when compared to a Python implementation on an Intel(R) Core(TM) i7-7500U CPU @ 2.70 GHz with 16 GB of RAM and a GeForce 940 MX GPU, the FPGA implementation still had an advantage of 167 thousand times. These results demonstrate the advantages of using the proposal presented in this work to accelerate the TEDA technique through implementation on FPGA.
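The speedups in Table 5 are the ratio between each software time per sample and the FPGA time per sample. The sketch below recomputes them from the tabulated times. The names `FPGA_TIME_S`, `software_times_s`, and `speedup` are chosen here for illustration only.

```python
FPGA_TIME_S = 138e-9  # time per sample on the FPGA, from Table 5

# Time per sample of each software platform, from Table 5, in seconds.
software_times_s = {
    "Python (Colab without GPU)": 435e-3,
    "Python (Colab with Tesla K80 GPU)": 39.2e-3,
    "Python (Local execution with 940 MX GPU)": 23.1e-3,
}


def speedup(t_software: float, t_hardware: float = FPGA_TIME_S) -> float:
    """Speedup of the hardware over a software platform (ratio of times per sample)."""
    return t_software / t_hardware


for platform, t in software_times_s.items():
    print(f"{platform}: {speedup(t):,.0f}x")
```

The unrounded ratios come out at roughly 3.15 million, 284 thousand, and 167 thousand, which Table 5 reports as rounded figures.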
6. Conclusion
This work presented a proposal for a hardware implementation of the TEDA data streaming anomaly detection technique. The hardware was implemented in RTL using a floating-point format. Synthesis results were obtained for a Xilinx Virtex-6 xc6vlx240t-1ff1156 FPGA. The proposed implementation used a small portion of the target FPGA resources and delivered results in a short processing time. The high speedups obtained in comparison with software platforms reaffirm the importance of this work, which pioneers the hardware implementation of the TEDA technique on FPGA. The proposed architecture is feasible for practical fault detection applications in real industrial processes with severe time constraints, as well as for handling large data volumes, such as data streams, with low processing time.
References
[1] M. Gupta, J. Gao, C. C. Aggarwal, J. Han, Outlier detection for temporal data: A survey, IEEE Transactions on Knowledge and Data Engineering 26 (9) (2014) 2250–2267.
[2] S. Ahmad, A. Lavin, S. Purdy, Z. Agha, Unsupervised real-time anomaly detection for streaming data, Neurocomputing 262 (2017) 134–147, Online Real-Time Learning Strategies for Data Streams. doi:https://doi.org/10.1016/j.neucom.2017.04.070.
URL http://www.sciencedirect.com/science/article/pii/S0925231217309864
[3] C. G. Bezerra, B. S. J. Costa, L. A. Guedes, P. P. Angelov, An evolving approach to unsupervised and real-time fault detection in industrial processes, Expert Systems with Applications 63 (2016) 134–144. doi:https://doi.org/10.1016/j.eswa.2016.06.035.
URL http://www.sciencedirect.com/science/article/pii/S0957417416303153
[4] B. S. J. Costa, C. G. Bezerra, L. A. Guedes, P. P. Angelov, Unsuper-
vised classification of data streams based on typicality and eccentricity
data analytics, in: 2016 IEEE International Conference on Fuzzy Systems
(FUZZ-IEEE), 2016, pp. 58–63. doi:10.1109/FUZZ-IEEE.2016.7737668.
[5] D. Osherson, E. E. Smith, On typicality and vagueness, Cognition 64 (2) (1997) 189–206. doi:https://doi.org/10.1016/S0010-0277(97)00025-5.
URL http://www.sciencedirect.com/science/article/pii/S0010027797000255
[6] P. Angelov, Anomaly detection based on eccentricity analysis, in: 2014
IEEE Symposium on Evolving and Autonomous Learning Systems (EALS),
2014, pp. 1–8. doi:10.1109/EALS.2014.7009497.
[7] P. Napoletano, F. Piccoli, R. Schettini, Anomaly detection in nanofibrous materials by cnn-based self-similarity, Sensors 18 (1). doi:10.3390/s18010209.
URL https://www.mdpi.com/1424-8220/18/1/209
[8] I. A. T. Hashem, I. Yaqoob, N. B. Anuar, S. Mokhtar, A. Gani, S. U. Khan, The rise of "big data" on cloud computing: Review and open research issues, Information Systems 47 (2015) 98–115. doi:https://doi.org/10.1016/j.is.2014.07.006.
URL http://www.sciencedirect.com/science/article/pii/S0306437914001288
[9] M. G. F. Coutinho, M. F. Torquato, M. A. C. Fernandes, Deep neural
network hardware implementation based on stacked sparse autoencoder,
IEEE Access 7 (2019) 40674–40694. doi:10.1109/ACCESS.2019.2907261.
[10] L. M. D. Da Silva, M. F. Torquato, M. A. C. Fernandes, Parallel imple-
mentation of reinforcement learning q-learning technique for fpga, IEEE
Access 7 (2019) 2782–2798. doi:10.1109/ACCESS.2018.2885950.
[11] M. F. Torquato, M. A. Fernandes, High-performance parallel implementa-
tion of genetic algorithm on fpga, Circuits, Systems, and Signal Processing
38 (9) (2019) 4014–4039.
[12] F. F. Lopes, J. C. Ferreira, M. A. Fernandes, Parallel implementation on
fpga of support vector machines using stochastic gradient descent, Elec-
tronics 8 (6) (2019) 631.
[13] A. L. X. Da Costa, C. A. D. Silva, M. F. Torquato, M. A. C. Fernan-
des, Parallel implementation of particle swarm optimization on fpga, IEEE
Transactions on Circuits and Systems II: Express Briefs 66 (11) (2019)
1875–1879. doi:10.1109/TCSII.2019.2895343.
[14] B. S. J. Costa, C. G. Bezerra, L. A. Guedes, P. P. Angelov, Online fault detection based on typicality and eccentricity data analytics, in: 2015 International Joint Conference on Neural Networks (IJCNN), 2015, pp. 1–6. doi:10.1109/IJCNN.2015.7280712.
[15] D. Kangin, P. Angelov, J. A. Iglesias, A. Sanchis, Evolving classifier tedaclass for big data, Procedia Computer Science 53 (2015) 9–18, INNS Conference on Big Data 2015 Program San Francisco, CA, USA 8-10 August 2015. doi:https://doi.org/10.1016/j.procs.2015.07.274.
URL http://www.sciencedirect.com/science/article/pii/S1877050915017779
[16] P. Angelov, Typicality distribution function – a new density-based data analytics tool, in: 2015 International Joint Conference on Neural Networks (IJCNN), 2015, pp. 1–8. doi:10.1109/IJCNN.2015.7280438.
[17] A. Lavin, S. Ahmad, Evaluating real-time anomaly detection algorithms –
the numenta anomaly benchmark, in: 2015 IEEE 14th International Con-
ference on Machine Learning and Applications (ICMLA), 2015, pp. 38–44.
doi:10.1109/ICMLA.2015.141.
[18] R. S. Martins, P. Angelov, B. Sielly Jales Costa, Automatic detection of
computer network traffic anomalies based on eccentricity analysis, in: 2018
IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 2018, pp.
1–8. doi:10.1109/FUZZ-IEEE.2018.8491507.
[19] T. Ziermann, J. Teich, Adaptive traffic scheduling techniques for mixed real-time and streaming applications on reconfigurable hardware, in: 2010 IEEE International Symposium on Parallel Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010, pp. 1–4. doi:10.1109/IPDPSW.2010.5470738.
[20] B. Yang, M. Yang, A. Plaza, L. Gao, B. Zhang, Dual-mode fpga implementation of target and anomaly detection algorithms for real-time hyperspectral imaging, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 8 (6) (2015) 2950–2961. doi:10.1109/JSTARS.2015.2388797.
[21] M. Wess, P. D. S. Manoj, A. Jantsch, Neural network based ecg anomaly detection on fpga and trade-off analysis, in: 2017 IEEE International Symposium on Circuits and Systems (ISCAS), 2017, pp. 1–4. doi:10.1109/ISCAS.2017.8050805.
[22] P. Angelov, Outside the box: an alternative data analytics framework,
Journal of Automation Mobile Robotics and Intelligent Systems 8 (2)
(2014) 29–35.
[23] A. Plamen, Outside the box: an alternative data analytics framework, Jour-
nal of Automation, Mobile Robotics and Intelligent Systems 8 (2) (2014)
29–35.
[24] A. Bernieri, G. Betta, C. Liguori, On-line fault detection and diagnosis
obtained by implementing neural algorithms on a digital signal processor,
IEEE Transactions on Instrumentation and Measurement 45 (5) (1996)
894–899. doi:10.1109/19.536707.
[25] J. G. Saw, M. C. Yang, T. C. Mo, Chebyshev inequality with estimated
mean and variance, The American Statistician 38 (2) (1984) 130–132.
[26] E. F. R. T. Network, Damadics rtn information web site (2002).
URL http://diag.mchtr.pw.edu.pl/damadics/
