BINet: Multi-perspective Business Process Anomaly Classification
In this paper, we introduce BINet, a neural network architecture for
real-time multi-perspective anomaly detection in business process event logs.
BINet is designed to handle both the control flow and the data perspective of a
business process. Additionally, we propose a set of heuristics for setting the
threshold of an anomaly detection algorithm automatically. We demonstrate that
BINet can be used to detect anomalies in event logs not only on a case level
but also on event attribute level. Finally, we demonstrate that a simple set of
rules can be used to utilize the output of BINet for anomaly classification. We
compare BINet to eight other state-of-the-art anomaly detection algorithms and
evaluate their performance on an elaborate data corpus of 29 synthetic and 15
real-life event logs. BINet outperforms all other methods on both the synthetic
and the real-life datasets.
Ensemble Feature Learning-Based Event Classification for Cyber-Physical Security of the Smart Grid
Power grids are transforming into the cyber-physical smart grid with increasing two-way communications and abundant data flows. Despite the efficiency and reliability promised by this transformation, the growing threats and incidence of cyber attacks targeting the physical power systems have exposed severe vulnerabilities. To tackle such vulnerabilities, intrusion detection systems (IDS) are proposed to monitor threats for the cyber-physical security of electrical power and energy systems in the smart grid with increasing machine-to-machine communication. However, the multi-sourced, correlated, and often noisy data, which record various concurring cyber and physical events, pose significant challenges to IDS in accurately distinguishing inadvertent events from malicious ones. Hence, in this research, ensemble learning-based feature learning and classification methods for the cyber-physical smart grid are designed and implemented. The contributions of this research are (i) the design, implementation and evaluation of an ensemble learning-based attack classifier using extreme gradient boosting (XGBoost) to effectively detect and identify attack threats from the heterogeneous cyber-physical information in the smart grid; (ii) the design, implementation and evaluation of a stacked denoising autoencoder (SDAE) to extract a highly representative feature space that allows reconstruction of a noise-free input from noise-corrupted perturbations; (iii) the design, implementation and evaluation of a novel ensemble learning-based feature extractor that combines multiple autoencoder (AE) feature extractors and random forest base classifiers, so as to enable accurate reconstruction of each feature and reliable classification against malicious events. The simulation results validate the usefulness of the ensemble learning approach in detecting malicious events in the cyber-physical smart grid.
Click Fraud Detection in Online and In-app Advertisements: A Learning Based Approach
Click Fraud is the fraudulent act of clicking on pay-per-click advertisements to increase a site’s revenue, to drain revenue from the advertiser, or to inflate the popularity of content on social media platforms. In-app advertisements on mobile platforms are among the most common targets for click fraud, which makes companies hesitant to advertise their products. Fraudulent clicks are supposed to be caught by ad providers as part of their service to advertisers, which is commonly done using machine learning methods. However: (1) there is a lack of research in current literature addressing and evaluating the different techniques of click fraud detection and prevention, (2) threat models composed of active learning systems (smart attackers) can mislead the training process of the fraud detection model by polluting the training data, (3) current deep learning models have significant computational overhead, (4) training data is often in an imbalanced state, and balancing it still results in noisy data that can train the classifier incorrectly, and (5) datasets with high dimensionality cause increased computational overhead and decreased classifier correctness -- while existing feature selection techniques address this issue, they have their own performance limitations. By extending the state-of-the-art techniques in the field of machine learning, this dissertation provides the following solutions: (i) To address (1) and (2), we propose a hybrid deep-learning-based model which consists of an artificial neural network, auto-encoder and semi-supervised generative adversarial network. (ii) As a solution for (3), we present Cascaded Forest and Extreme Gradient Boosting with less hyperparameter tuning. (iii) To overcome (4), we propose a row-wise data reduction method, KSMOTE, which filters out noisy data samples both in the raw data and the synthetically generated samples. 
(iv) For (5), we propose different column-reduction methods such as multi-time-scale time series analysis for fraud forecasting, using binary labeled imbalanced datasets and hybrid filter-wrapper feature selection approaches.
IoT Data Analytics in Dynamic Environments: From An Automated Machine Learning Perspective
With the wide spread of sensors and smart devices in recent years, the data
generation speed of the Internet of Things (IoT) systems has increased
dramatically. In IoT systems, massive volumes of data must be processed,
transformed, and analyzed on a frequent basis to enable various IoT services
and functionalities. Machine Learning (ML) approaches have shown their capacity
for IoT data analytics. However, applying ML models to IoT data analytics tasks
still faces many difficulties and challenges, specifically, effective model
selection, design/tuning, and updating, which have brought massive demand for
experienced data scientists. Additionally, the dynamic nature of IoT data may
introduce concept drift issues, causing model performance degradation. To
reduce human efforts, Automated Machine Learning (AutoML) has become a popular
field that aims to automatically select, construct, tune, and update machine
learning models to achieve the best performance on specified tasks. In this
paper, we conduct a review of existing methods in the model selection, tuning,
and updating procedures in the area of AutoML in order to identify and
summarize the optimal solutions for every step of applying ML algorithms to IoT
data analytics. To justify our findings and help industrial users and
researchers better implement AutoML approaches, a case study of applying AutoML
to IoT anomaly detection problems is conducted in this work. Lastly, we discuss
and classify the challenges and research directions for this domain.
Comment: Published in Engineering Applications of Artificial Intelligence
(Elsevier, IF:7.8); code and an AutoML tutorial are available on GitHub:
https://github.com/Western-OC2-Lab/AutoML-Implementation-for-Static-and-Dynamic-Data-Analytic
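The concept drift issue mentioned in the abstract above can be illustrated with a minimal sketch. This is not a method taken from the paper or its repository: it is a simple window comparison, assuming that a stream of correct/incorrect prediction outcomes is available. The class name `WindowDriftMonitor` and the parameters `window` and `tolerance` are hypothetical.

```python
from collections import deque


class WindowDriftMonitor:
    """Flag possible concept drift when the recent error rate exceeds the
    baseline error rate by more than `tolerance` (illustrative only)."""

    def __init__(self, window: int = 50, tolerance: float = 0.15):
        self.window = window
        self.tolerance = tolerance
        self.reference = []                  # first `window` outcomes (baseline)
        self.recent = deque(maxlen=window)   # sliding window of latest outcomes

    def update(self, correct: bool) -> bool:
        """Record one prediction outcome; return True if drift is flagged."""
        error = 0 if correct else 1
        if len(self.reference) < self.window:
            # Still collecting the baseline window.
            self.reference.append(error)
            return False
        self.recent.append(error)
        if len(self.recent) < self.window:
            # Not enough recent outcomes to compare yet.
            return False
        ref_rate = sum(self.reference) / self.window
        recent_rate = sum(self.recent) / self.window
        return recent_rate - ref_rate > self.tolerance
```

In a deployed pipeline, a flagged drift would typically trigger retraining or model reselection; the fixed baseline window used here is the simplest possible choice and is only meant to show the idea.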
The Multiple Facets of Software Diversity: Recent Developments in Year 2000 and Beyond
Early experiments with software diversity in the mid-1970s investigated N-version programming and recovery blocks to increase the reliability of embedded systems. Four decades later, the literature about software diversity has expanded in multiple directions: goals (fault tolerance, security, software engineering); means (managed or automated diversity); and analytical studies (quantification of diversity and its impact). Our paper contributes to the field of software diversity as the first paper that adopts an inclusive vision of the area, with an emphasis on the most recent advances in the field. This survey includes classical work about design and data diversity for fault tolerance, as well as the cybersecurity literature that investigates randomization at different system levels. It broadens this standard scope of diversity to include the study and exploitation of natural diversity and the management of diverse software products. Our survey includes the most recent works, with an emphasis on work from 2000 to the present. The targeted audience is researchers and practitioners in one of the surveyed fields who miss the big picture of software diversity. Assembling the multiple facets of this fascinating topic sheds new light on the field.
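The N-version programming idea mentioned in the abstract above can be sketched as follows: several independently written versions of the same function run on the same input, and a voter returns the output that a strict majority agrees on. This is a minimal illustration, not code from the survey; the three `version_*` functions and the `majority_vote` helper are hypothetical, and the example assumes non-negative inputs.

```python
from collections import Counter
from typing import Callable, Sequence


def majority_vote(versions: Sequence[Callable[[float], int]], x: float) -> int:
    """Run every version on x and return the output backed by a strict majority."""
    outputs = []
    for version in versions:
        try:
            outputs.append(version(x))
        except Exception:
            # A crashing version contributes no vote.
            continue
    if not outputs:
        raise RuntimeError("all versions failed")
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(versions):
        raise RuntimeError("no majority among versions")
    return value


# Three hypothetical implementations of "round half up" for x >= 0.
def version_a(x: float) -> int:
    return int(x + 0.5)


def version_b(x: float) -> int:
    return int(x // 1) + (1 if x % 1 >= 0.5 else 0)


def version_c(x: float) -> int:
    # Deliberately faulty for this spec: round() rounds halves to even.
    return round(x)


result = majority_vote([version_a, version_b, version_c], 2.5)
```

Here `version_c` disagrees on the input 2.5, and the voter masks that fault because the other two versions agree. The approach only helps when the versions fail independently, which is why diversity between them matters.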
Big data academic and learning analytics: connecting the dots for academic excellence in higher education
Purpose
Although big data analytics has great benefits for higher education institutions (HEIs), the lack of sufficient evidence on how investment in big data analytics pays off makes it hard for HEI practitioners to realize value from its adoption. The current study proposes a big data academic and learning analytics enabled business value model to explain the potential benefits and business value that can be obtained by developing such analytics capabilities in HEIs.
Design/methodology/approach
The study examined 47 case descriptions from 26 HEIs to investigate the causal association between the big data analytics current and potential benefits and business value creation path for big data academic and learning analytics success in higher education institutions.
Findings
Pressure to comply with legal and regulatory requirements, together with competition, has pushed higher education institutions to adopt big data analytics (BDA) tools. However, the study found that the application of risk and security analytics and of predictive analytics in higher education is still in its infancy. Using this theoretical model, our results provide new insights to higher education administrators on ways to create big data analytics capabilities for transforming higher education institutions, and suggest an empirical foundation that can lead to more thorough analysis of big data analytics implementation.
Originality/value
A distinctive theoretical contribution of this study is its conceptualization of business value from big data analytics in the typical setting of higher education. The study provides HEIs with an all-inclusive understanding of big data analytics and gives insights on how it helps to transform HEIs. The new perspectives associated with the big data academic and learning analytics enabled business value model will contribute to future research in this area.
Detection And Evaluation Of Existing Pavement System With Brick Base
At the turn of the century, the City of Orlando initiated the Neighborhood Horizon Program. This program involved local citizens in improving their community resources through a planning process in which the problems facing their communities were identified. Many residents favored bringing back the brick roads that had been overlaid with asphalt concrete in the mid-1900s to provide better transportation. With the majority of the neighborhood streets already bricked, removing the asphalt ensured safety, served as a technique for slowing traffic, and added to the historical integrity. Since no official documentation confirmed the existence of bricks beneath the asphalt surface course, and coring hundreds of locations to locate these anomalies would have been impractical, a nondestructive instrument, Ground Penetrating Radar (GPR), was employed to detect the bricks, avoiding time delays and excessive coring costs. This geophysical survey system distinguishes materials based on their different electrical properties, which depend upon temperature, density, moisture content and impurities, by providing a continuous profile of the subsurface conditions. GPR operates on the principle of electromagnetic wave (EMW) theory. The main objectives of this study were to investigate the existing pavement by using GPR to detect the brick base and to analyze the performance of the pavement system for fatigue and rutting. The results of this study will assist the City of Orlando in removing the asphalt layer and rebuilding the brick roads, and will facilitate better zoning and planning of the city. The construction of a controlled test area provided a good sense of brick detection, which helped in precisely locating bricks for sections of Summerlin Avenue, Church Street and Cherokee Drive. The project demonstrated a good ability to detect subsurface anomalies such as bricks. The validation of the profile readings was close to 100%.
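As background on how a GPR reading relates to the depth of a buried layer, the standard relation for a low-loss, non-magnetic medium is v = c / sqrt(eps_r) and depth = v * t / 2, where t is the two-way travel time of the reflected pulse and eps_r is the relative permittivity of the material above the reflector. The sketch below applies that relation. It is not taken from the study: the function name `reflector_depth_m` is hypothetical, and the permittivity used in the example call is an illustrative assumption, not a measured value for the Orlando pavements.

```python
import math

C_M_PER_NS = 0.299792458  # speed of light in vacuum, metres per nanosecond


def reflector_depth_m(two_way_time_ns: float, rel_permittivity: float) -> float:
    """Estimate reflector depth from two-way travel time.

    Assumes a single homogeneous, low-loss, non-magnetic layer above the reflector.
    """
    if two_way_time_ns < 0:
        raise ValueError("travel time must be non-negative")
    if rel_permittivity < 1:
        raise ValueError("relative permittivity must be at least 1")
    velocity = C_M_PER_NS / math.sqrt(rel_permittivity)  # wave speed in the layer
    return velocity * two_way_time_ns / 2.0              # halve: pulse goes down and back


# Illustrative values only: a 2 ns echo through a layer with eps_r = 4.
depth = reflector_depth_m(2.0, 4.0)
```

Because the estimated depth scales with 1 / sqrt(eps_r), an error in the assumed permittivity translates directly into a depth error, which is one reason a controlled test area with known brick locations is useful for calibration.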
Resources of Near-Earth Space: Abstracts
The objectives are, by theory, experiment, and bench-level testing of small systems, to develop scientifically sound engineering processes and facility specifications for producing propellants and fuels, construction and shielding materials, and life support substances from the lithospheres and atmospheres of lunar, planetary, and asteroidal bodies. Current emphasis is on the production of oxygen, other useful gases, metallic, ceramic/composite, and related byproducts from lunar regolith, carbonaceous chondritic asteroids, and the carbon dioxide rich Martian atmosphere.