747 research outputs found

LSTM Networks for Detection and Classification of Anomalies in Raw Sensor Data

Author: Verner Alexander
Publication venue: NSUWorks
Publication date: 01/01/2019

To ensure the validity of sensor data, it must be thoroughly analyzed for various types of anomalies. Traditional machine learning methods of anomaly detection in sensor data are based on domain-specific feature engineering. A typical approach is to use domain knowledge to analyze sensor data and manually create statistics-based features, which are then used to train machine learning models to detect and classify the anomalies. Although this methodology is used in practice, it has a significant drawback: feature extraction is usually labor intensive and requires considerable effort from domain experts. An alternative approach is to use deep learning algorithms. Research has shown that modern deep neural networks are very effective at automatically extracting abstract features from raw data in classification tasks. Long short-term memory networks, or LSTMs for short, are a special kind of recurrent neural network capable of learning long-term dependencies. These networks have proved especially effective in classifying raw time-series data in various domains. This dissertation systematically investigates the effectiveness of the LSTM model for anomaly detection and classification in raw time-series sensor data. As a proof of concept, this work used time-series data from sensors that measure blood glucose levels. A large number of time-series sequences was created from a genuine medical diabetes dataset. Anomalous series were constructed by six methods that interspersed patterns of common anomaly types in the data. An LSTM network model was trained with k-fold cross-validation on both anomalous and valid series to classify raw time-series sequences into one of seven classes: non-anomalous, and one class for each of the six anomaly types.
As a control, the accuracy of detection and classification of the LSTM was compared to that of four traditional machine learning classifiers: support vector machines, Random Forests, naive Bayes, and shallow neural networks. The performance of all the classifiers was evaluated on nine metrics: precision, recall, and the F1-score, each computed with micro, macro, and weighted averaging. The traditional models were trained on feature vectors derived from the raw data using knowledge of common sources of anomaly, whereas the LSTM was trained on the raw time-series data. Experimental results indicate that the LSTM performed comparably to the best traditional classifiers, achieving 99% in all nine metrics. The model requires no labor-intensive feature engineering, and the fine-tuning of its architecture and hyper-parameters can be fully automated. This study, therefore, finds LSTM networks an effective solution to anomaly detection and classification in sensor data.
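The nine metrics named in the abstract (precision, recall, and F1, each under micro, macro, and weighted averaging) can be illustrated with a minimal pure-Python sketch. The function names `prf` and `nine_metrics` and the toy labels are assumptions made for illustration; this is not the dissertation's evaluation code.

```python
from collections import Counter


def prf(tp, fp, fn):
    """Precision, recall and F1 from raw counts; 0.0 where undefined."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def nine_metrics(y_true, y_pred):
    """Return (precision, recall, F1) under micro, macro and weighted averaging."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    support = Counter(y_true)  # number of true samples per class
    per_class = {c: prf(tp[c], fp[c], fn[c]) for c in labels}
    n = len(y_true)
    # Micro: pool the counts over all classes, then compute once.
    micro = prf(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: unweighted mean of the per-class scores.
    macro = tuple(sum(per_class[c][i] for c in labels) / len(labels)
                  for i in range(3))
    # Weighted: mean of the per-class scores weighted by class support.
    weighted = tuple(sum(per_class[c][i] * support[c] for c in labels) / n
                     for i in range(3))
    return {"micro": micro, "macro": macro, "weighted": weighted}


if __name__ == "__main__":
    # Hypothetical three-class toy example, not data from the study.
    scores = nine_metrics([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2])
    for name, (p, r, f) in scores.items():
        print(f"{name:9s} precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Macro averaging treats every class equally, while weighted averaging follows class frequency, so the two diverge when the seven classes are imbalanced.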

NSU Works

ExplainIt! -- A declarative root-cause analysis engine for time series data (extended version)

Author: Benjamini Y.
Cohen I.
Jeyakumar V.
Pedregosa F.
Seth A. K.
Shimizu S.
Tenenbaum J. B.
Wang Y.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 22/03/2019

We present ExplainIt!, a declarative, unsupervised root-cause analysis engine that uses time series monitoring data from large complex systems such as data centres. ExplainIt! empowers operators to succinctly specify a large number of causal hypotheses to search for causes of interesting events. ExplainIt! then ranks these hypotheses, reducing the number of causal dependencies from hundreds of thousands to a handful for human understanding. We show how a declarative language, such as SQL, can be effective in enumerating hypotheses that probe the structure of an unknown probabilistic graphical causal model of the underlying system. Our thesis is that databases are in a unique position to enable users to rapidly explore the possible causal mechanisms in data collected from diverse sources. We empirically demonstrate how ExplainIt! has helped us resolve over 30 performance issues in a commercial product since late 2014, of which we discuss a few cases in detail. Comment: SIGMOD Industry Track 201
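The enumerate-then-rank idea in this abstract can be sketched in a few lines of Python. The scoring function here (absolute Pearson correlation between a candidate series and the target series) is an assumption chosen for brevity and is not claimed to be ExplainIt!'s scoring method; `pearson`, `rank_hypotheses`, and the metric names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def rank_hypotheses(target, candidates, top_k=3):
    """Score each hypothesis 'candidate explains target' and keep the top_k.

    ASSUMPTION: the score is |Pearson correlation|, a stand-in for whatever
    scoring a real engine would use.
    """
    scored = [(name, abs(pearson(series, target)))
              for name, series in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical monitoring series, invented for illustration.
    latency = [1, 2, 3, 4, 5]
    candidates = {
        "cpu": [2, 4, 6, 8, 10],
        "disk": [5, 5, 5, 5, 5],
        "net": [1, 3, 2, 5, 4],
    }
    for name, score in rank_hypotheses(latency, candidates):
        print(name, round(score, 3))
```

Correlation alone does not establish causation; the sketch only shows how a large hypothesis set can be reduced to a short ranked list for a human to inspect.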

arXiv.org e-Print Archive

Crossref

Integrating genetics and epigenetics in breast cancer: biological insights, experimental, computational methods and therapeutic potential

Author: Claudia Cava
Gloria Bertoli
Isabella Castiglioni
Publication venue: 'Springer Science and Business Media LLC'

Crossref

An outlier detection method to improve gathered datasets for network behavior analysis in IoT

Author: Haugen Øystein
Shahraki Amin
Publication venue: 'Engineering and Technology Publishing'
Publication date: 01/01/2019

Outlier detection is a subfield of data mining that identifies data points deviating notably from the rest of a dataset. Such deviation can indicate that the data points were generated by errors and should therefore be removed or repaired. Outliers in a network dataset have many causes, such as human or instrument errors, noise, or changes in system behavior. Network Behavior Analysis (NBA), in turn, is a way to monitor traffic and recognize unusual actions in a network. Analyzing data trends is a common way for NBA methods to interpret the state of the network. Outliers can distort these trends and thereby influence the results of the NBA methods. This paper presents an approach that uses a trend-detection method to divide the dataset into subsets, within which contextual outliers are discovered. The outliers can then be removed to obtain a clean dataset that better shows the network behavior when using NBA methods. The goals of our method are to increase accuracy and reliability. We compare the proposed method with the Hampel method on simulated IoT network data.
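For reference, the Hampel method used as the comparison baseline can be sketched as a sliding-window identifier built on the median and the median absolute deviation (MAD). This is a minimal sketch of the baseline only, not the paper's proposed trend-based method; the window size, the threshold, and the sample values are assumptions.

```python
from statistics import median


def hampel_outliers(values, half_window=3, n_sigmas=3.0):
    """Return indices flagged by a sliding-window Hampel identifier.

    A point is flagged when it lies more than n_sigmas scaled MADs from the
    median of its window. Windows are truncated at the ends of the series.
    """
    k = 1.4826  # scales the MAD to the standard deviation for Gaussian data
    flagged = []
    for i, x in enumerate(values):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        med = median(window)
        mad = median([abs(v - med) for v in window])
        # With mad == 0, any point that differs from the median is flagged.
        if abs(x - med) > n_sigmas * k * mad:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Hypothetical traffic readings with one injected spike at index 5.
    readings = [10, 11, 10, 12, 11, 95, 10, 11, 12, 10, 11]
    print(hampel_outliers(readings))
```

Because the median and MAD are robust statistics, a single spike barely shifts the window's centre or scale, which is what lets the spike stand out.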

Crossref

HIØ Brage

NORA - Norwegian Open Research Archives

Intelligent Sensor Networks

Publication venue: 'Informa UK Limited'

In the last decade, wireless and wired sensor networks have attracted much attention. However, most designs target general sensor network issues, including the protocol stack (routing, MAC, etc.) and security. This book focuses on the close integration of sensing, networking, and smart signal processing via machine learning. Based on their world-class research, the authors present the fundamentals of intelligent sensor networks. They cover sensing and sampling, distributed signal processing, and intelligent signal learning. In addition, they present cutting-edge research results from leading experts.

OAPEN Library

Automated anomaly recognition in real time data streams for oil and gas industry.

Author: Majdani Shabestari Farzan
Publication date: 30/06/2020

There is a growing demand for computer-assisted real-time anomaly detection - from the identification of suspicious activities in cyber security, to the monitoring of engineering data for various applications across the oil and gas, automotive and other engineering industries. To reduce the reliance on field experts' knowledge for identification of these anomalies, this thesis proposes a deep-learning anomaly-detection framework that can support effective real-time condition monitoring. The aim of this research is to develop a real-time and re-trainable generic anomaly-detection framework, which is capable of predicting and identifying anomalies with a high level of accuracy - even when a specific anomalous event has no precedent. Machine-based condition monitoring is preferable in many practical situations where fast data analysis is required, and where there are harsh climates or otherwise life-threatening environments. For example, automated condition-monitoring systems are ideal in deep sea exploration studies, offshore installations and space exploration. This thesis first reviews studies about anomaly detection using machine learning. It then adopts the best practices from those studies in order to propose a multi-tiered framework for anomaly detection with heterogeneous input sources, which can deal with unseen anomalies in a real-time dynamic problem environment. The thesis then applies the developed generic multi-tiered framework to two fields of engineering: data analysis and malicious cyber attack detection. Finally, the framework is further refined based on the outcomes of those case studies and is used to develop a secure cross-platform API, capable of re-training and data classification on a real-time data feed.

Open Access Institutional Repository at Robert Gordon University