700 research outputs found
Behavioral Profiling of SCADA Network Traffic using Machine Learning Algorithms
Mixed traffic networks containing both traditional ICT network traffic and SCADA network traffic are now more commonplace due to the desire for remote control and monitoring of industrial processes. The ability to identify SCADA devices on a mixed traffic network with zero prior knowledge, such as port, protocol or IP address, is desirable since SCADA devices communicate over corporate networks but typically use non-standard ports and proprietary protocols. Four supervised ML algorithms are tested on a mixed traffic dataset containing 116,527 dataflows from both SCADA and traditional ICT networks: Naive Bayes, NBTree, BayesNet, and J4.8. Using packet timing, packet size and data throughput as traffic behavior categories, this research calculates 24 attributes from each device dataflow. All four algorithms are tested with three attribute subsets: a full set and two reduced attribute subsets. The attributes and ML algorithms chosen for experimentation successfully demonstrate that a true positive rate (TPR) of 0.9935 for SCADA network traffic is feasible on a given network. The research also identifies an optimal attribute subset that maintains a TPR of at least 0.99. This subset captures the SCADA network traffic behaviors that most effectively differentiate it from traditional ICT network traffic.
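As a rough illustration of the quantities named in this abstract, the sketch below derives one attribute from each of the three behavior categories (packet timing, packet size, data throughput) for a single dataflow and computes a TPR for the SCADA class. The function names, the three attributes, and the class labels are assumptions chosen for illustration; the abstract does not list the 24 attributes actually used.

```python
from statistics import mean


def flow_attributes(timestamps, sizes):
    """Summarize one dataflow (hypothetical attribute names).

    timestamps: packet arrival times in seconds, ascending
    sizes: packet sizes in bytes, same length as timestamps
    """
    # Gaps between consecutive packets describe the timing behavior.
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    duration = timestamps[-1] - timestamps[0]
    return {
        "mean_interarrival_s": mean(gaps) if gaps else 0.0,
        "mean_packet_size_bytes": mean(sizes),
        "throughput_bytes_per_s": sum(sizes) / duration if duration > 0 else 0.0,
    }


def true_positive_rate(actual, predicted, positive="scada"):
    """TPR = TP / (TP + FN) for the class named by `positive`."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0
```

Under this definition, a TPR of 0.9935 means that 99.35% of the dataflows that truly belong to SCADA devices were labeled as SCADA by the classifier.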
Scalability Benchmarking of Cloud-Native Applications Applied to Event-Driven Microservices
Cloud-native applications constitute a recent trend for designing large-scale software systems. This thesis introduces the Theodolite benchmarking method, allowing researchers and practitioners to conduct empirical scalability evaluations of cloud-native applications, their frameworks, configurations, and deployments. The benchmarking method is applied to event-driven microservices, a specific type of cloud-native application that employs distributed stream processing frameworks to scale with massive data volumes. Extensive experimental evaluations benchmark and compare the scalability of various stream processing frameworks under different configurations and deployments, including different public and private cloud environments. These experiments show that the presented benchmarking method provides statistically sound results in an adequate amount of time. In addition, three case studies demonstrate that the Theodolite benchmarking method can be applied to a wide range of applications beyond stream processing.
Prioritized Anomaly Catalog Generation Using Model-Based Reasoning
Anomaly management (the detection, diagnosis, and resolution of anomalies in a system) is traditionally performed using experiential techniques, which are quickly computed but poorly structured. Newer model-based approaches are more systematic and higher performing but are computationally expensive, which is a particular challenge for execution in an operational environment. This paper builds on a novel system to pre-compute model-based anomaly symptoms to enable quick retrieval and diagnosis in operational settings. New additions to this system include a simplified model interface, anomaly likelihoods associated with each component, and easier interpretation of results. The implemented system has been used successfully to detect and diagnose anomalies in a baseline test circuit as well as in an operational satellite monitoring network. Results show that this approach is promising; with a thorough model, the diagnosis and resolution processes of anomaly management could be greatly improved for more complex remote systems such as university-operated nanosatellites and field robotic vehicles.
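The idea of pre-computing symptoms for quick retrieval can be sketched as a lookup table that is built once and queried during operations. This is only a minimal illustration under stated assumptions, not the paper's system: the names `build_catalog` and `diagnose` are hypothetical, and the symptoms of each fault are passed in directly, whereas the paper derives them from a system model.

```python
from collections import defaultdict


def build_catalog(fault_models):
    """Build a symptom-to-candidates table ahead of time.

    fault_models: iterable of (component, likelihood, symptoms) tuples,
    where symptoms is the set of observations that fault would produce.
    """
    catalog = defaultdict(list)
    for component, likelihood, symptoms in fault_models:
        # frozenset makes the symptom set hashable and order-independent.
        catalog[frozenset(symptoms)].append((component, likelihood))
    # Prioritize: most likely component first for each symptom set.
    for candidates in catalog.values():
        candidates.sort(key=lambda candidate: candidate[1], reverse=True)
    return dict(catalog)


def diagnose(catalog, observed_symptoms):
    """Return the ranked candidates for an exact symptom match, or []."""
    return catalog.get(frozenset(observed_symptoms), [])
```

With hypothetical entries such as `("regulator", 0.20, {"low_bus_voltage"})` and `("battery", 0.05, {"low_bus_voltage"})`, a call to `diagnose(catalog, ["low_bus_voltage"])` returns the regulator ahead of the battery. The expensive reasoning happens in `build_catalog`; the operational step is a single dictionary lookup.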
Integrating Data Science and Earth Science
This open access book presents the results of three years of collaboration between earth scientists and data scientists in developing and applying data science methods for scientific discovery. The book will be highly beneficial for other researchers at senior and graduate level who are interested in applying visual data exploration, computational approaches, and scientific workflows.
Process-Driven and Flow-Based Processing of Industrial Sensor Data
For machine manufacturing companies, besides the production of high-quality and reliable machines, requirements have emerged to maintain machine-related aspects through digital services. The development of such services in the field of the Industrial Internet of Things (IIoT) deals with solutions such as effective condition monitoring and predictive maintenance. However, appropriate data sources are needed on which digital services can be technically based. As many powerful and cheap sensors have been introduced in recent years, their integration into complex machines is promising for developing digital services for various scenarios. Components that handle the data recorded by these sensors must usually deal with large amounts of data. In particular, the labeling of raw sensor data must be supported by a technical solution. To deal with these data handling challenges in a generic way, a sensor processing pipeline (SPP) was developed, which provides effective methods to capture, process, store, and visualize raw sensor data based on a processing chain. The SPP approach is presented in this work using the example of a machine manufacturing company. For the company involved, the approach has revealed promising results.
Data Management in Microservices: State of the Practice, Challenges, and Research Directions
We have recently witnessed an increased adoption of microservice architectures by the industry for achieving scalability by functional decomposition, fault-tolerance by deployment of small and independent services, and polyglot persistence by the adoption of different database technologies specific to the needs of each service. Despite the accelerating industrial adoption and the extensive research on microservices, there is a lack of thorough investigation of the state of the practice and the major challenges faced by practitioners with regard to data management. To bridge this gap, this paper presents a detailed investigation of data management in microservices. Our exploratory study is based on the following methodology: we conducted a systematic literature review of articles reporting the adoption of microservices in industry, where more than 300 articles were filtered down to 11 representative studies; we analyzed a set of 9 popular open-source microservice-based applications, selected out of more than 20 open-source projects; furthermore, to strengthen our evidence, we conducted an online survey that we then used to cross-validate the findings of the previous steps with the perceptions and experiences of over 120 practitioners and researchers. Through this process, we were able to categorize the state of practice and reveal several principled challenges that cannot be solved by software engineering practices, but rather need system-level support to alleviate the burden on practitioners. Based on the observations, we also identified a series of research directions to achieve this goal. Fundamentally, novel database systems and data management tools that support isolation for microservices, including fault isolation, performance isolation, data ownership, and independent schema evolution across microservices, must be built to address the needs of this growing architectural style.
- …