
    Performance Analysis of Open Source Machine Learning Frameworks for Various Parameters in Single-Threaded and Multi-Threaded Modes

    The basic features of some of the most versatile and popular open source frameworks for machine learning (TensorFlow, Deep Learning4j, and H2O) are considered and compared. A comparative analysis was performed and conclusions were drawn about the advantages and disadvantages of these platforms. Performance tests on the de facto standard MNIST dataset were carried out on the H2O framework for deep learning algorithms on CPU and GPU platforms in single-threaded and multi-threaded modes of operation. We also present the results of testing neural network architectures on the H2O platform for various activation functions, stopping metrics, and other parameters of the machine learning algorithm. For the use case of the MNIST database of handwritten digits in single-threaded mode, it was demonstrated that blind selection of these parameters can increase the runtime by 2-3 orders of magnitude without a significant gain in precision. This result can have a crucial influence on the optimization of available and new machine learning methods, especially for image recognition problems.
    Comment: 15 pages, 11 figures, 4 tables; this paper summarizes activities which were started recently and described briefly in the previous conference presentations arXiv:1706.02248 and arXiv:1707.04940; it is accepted for the Springer book series "Advances in Intelligent Systems and Computing"
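    As a hedged illustration (not the authors' exact configuration), the sketch below shows how activation functions and stopping metrics are varied through H2O's Python API; the file path, column names, layer sizes, and stopping thresholds are placeholders.

```python
# Minimal sketch: sweeping activation functions and a stopping metric
# in H2O's deep learning estimator. Paths and hyperparameters are assumed.
import h2o
from h2o.estimators.deeplearning import H2ODeepLearningEstimator

h2o.init()
# "mnist_train.csv" is a placeholder path; the paper uses the standard MNIST set.
train = h2o.import_file("mnist_train.csv")
target = "label"                        # assumed column name for the digit class
train[target] = train[target].asfactor()
features = [c for c in train.columns if c != target]

for activation in ["Rectifier", "Tanh", "Maxout"]:
    model = H2ODeepLearningEstimator(
        hidden=[128, 128],              # assumed hidden layer sizes
        activation=activation,
        epochs=10,
        stopping_metric="misclassification",
        stopping_rounds=3,
        stopping_tolerance=1e-3,
    )
    model.train(x=features, y=target, training_frame=train)
    print(activation, model.logloss())
```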

    Batch Size Influence on Performance of Graphic and Tensor Processing Units during Training and Inference Phases

    The impact of the maximum possible batch size (chosen for the best runtime) on the performance of graphic processing units (GPU) and tensor processing units (TPU) during the training and inference phases is investigated. Numerous runs of the selected deep neural network (DNN) were performed on the standard MNIST and Fashion-MNIST datasets. A significant speedup was obtained even for extremely low-scale usage of Google TPUv2 units (8 cores only) in comparison to the quite powerful NVIDIA Tesla K80 GPU card: up to 10x for the training stage (without taking overheads into account) and up to 2x for the prediction stage (with and without taking overheads into account). The precise speedup values depend on the utilization level of the TPUv2 units and increase with the volume of data being processed, but for the datasets used in this work (MNIST and Fashion-MNIST, with images of size 28x28) the speedup was observed for batch sizes >512 images in the training phase and >40,000 images in the prediction phase. It should be noted that these results were obtained without detriment to the prediction accuracy and loss, which were equal for the GPU and TPU runs up to the 3rd significant digit for the MNIST dataset and up to the 2nd significant digit for the Fashion-MNIST dataset.
    Comment: 10 pages, 7 figures, 2 tables
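    A minimal sketch of the kind of measurement involved: timing one training epoch of a small Keras CNN on MNIST at several batch sizes. The model, the batch-size grid, and the timing method are illustrative assumptions, not the paper's exact benchmark.

```python
# Minimal sketch: measuring how batch size affects training time per epoch
# on whatever accelerator is available. Model and sizes are assumptions.
import time
import tensorflow as tf

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None].astype("float32") / 255.0   # add channel dim, scale

def make_model():
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

for batch_size in [64, 512, 4096]:      # spans the >512 regime discussed above
    model = make_model()
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    start = time.perf_counter()
    model.fit(x_train, y_train, batch_size=batch_size, epochs=1, verbose=0)
    print(f"batch={batch_size}: {time.perf_counter() - start:.1f}s per epoch")
```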

    A deep learning approach for intrusion detection in Internet of Things using bi-directional long short-term memory recurrent neural network

    The Internet of Things (IoT) connects every ‘thing’ to the Internet and allows these ‘things’ to communicate with each other. IoT comprises innumerable interconnected devices of diverse complexities and trends. This fundamental nature of the IoT structure increases the number of attack targets, which might affect the sustainable growth of IoT. Thus, security issues become a crucial factor to be addressed. This thesis proposes a novel deep learning approach for real-time detection of security threats in IoT systems using a bi-directional long short-term memory recurrent neural network (BLSTM RNN). The proposed approach has been implemented using the Google TensorFlow framework and the Python programming language. To train and test the proposed approach, the UNSW-NB15 dataset has been employed, which is the most up-to-date benchmark dataset with sequential samples and contemporary attack patterns. This thesis work employs binary classification of attack and normal patterns. The experimental results demonstrate the proficiency of the introduced model with respect to recall, precision, false alarm rate (FAR), and F1 score. The model attains over 97% detection accuracy. The test results demonstrate that the BLSTM RNN is profoundly effective for building a highly efficient intrusion detection model and offers a novel research methodology.
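    A minimal sketch, assuming a Keras front end to TensorFlow, of a bidirectional LSTM binary classifier of the kind described; the sequence length and feature count are placeholders standing in for preprocessed UNSW-NB15 records, not the thesis' actual architecture.

```python
# Minimal sketch: BLSTM binary classifier (attack vs. normal).
# SEQ_LEN and N_FEATURES are assumed, not taken from the thesis.
import tensorflow as tf

SEQ_LEN, N_FEATURES = 10, 42            # assumed window length and feature count

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(SEQ_LEN, N_FEATURES)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # attack vs. normal
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy",
                       tf.keras.metrics.Precision(),
                       tf.keras.metrics.Recall()])
model.summary()
```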

    A review on deep-learning-based cyberbullying detection

    Bullying is described as an undesirable behavior by others that harms an individual physically, mentally, or socially. Cyberbullying is a virtual form (e.g., textual or image-based) of bullying or harassment, also known as online bullying. Cyberbullying detection is a pressing need in today’s world, as the prevalence of cyberbullying is continually growing, resulting in mental health issues. Conventional machine learning models were previously used to identify cyberbullying. However, current research demonstrates that deep learning surpasses traditional machine learning algorithms in identifying cyberbullying for several reasons, including its ability to handle extensive data, classify text and images efficiently, and extract features automatically through hidden layers, among others. This paper reviews the existing surveys and identifies the gaps in those studies. We also present a deep-learning-based defense ecosystem for cyberbullying detection, including data representation techniques and different deep-learning-based models and frameworks. We critically analyze the existing DL-based cyberbullying detection techniques and identify their significant contributions and the future research directions they present. We also summarize the datasets in use, the DL architectures applied to each, and the tasks accomplished on each dataset. Finally, several challenges faced by existing researchers and the open issues to be addressed in the future are presented.
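    As a hedged illustration of one model family such reviews cover, the sketch below defines a small embedding-plus-CNN binary text classifier in Keras; the vocabulary size, sequence length, and layer sizes are assumptions, not taken from any surveyed paper.

```python
# Minimal sketch: 1-D CNN over word embeddings for binary text
# classification (bullying vs. non-bullying). All sizes are assumed.
import tensorflow as tf

VOCAB, MAX_LEN = 20000, 100             # assumed vocabulary size and padded length

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(MAX_LEN,)),
    tf.keras.layers.Embedding(VOCAB, 128),
    tf.keras.layers.Conv1D(64, 5, activation="relu"),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # bullying vs. non-bullying
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```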

    Application of probabilistic deep learning models to simulate thermal power plant processes

    Deep learning has gained traction in thermal engineering due to its applications to process simulations, the deeper insights it can provide, and its ability to circumvent the shortcomings of classic thermodynamic simulation approaches by capturing complex inter-dependencies. This work sets out to apply probabilistic deep learning to power plant operations using historic plant data. The first study presented entails the development of a steady-state mixture density network (MDN) capable of predicting effective heat transfer coefficients (HTC) for the various heat exchanger components inside a utility-scale boiler. Selected directly controllable input features, including the excess air ratio, steam temperatures, flow rates, and pressures, are used to predict the HTCs. In the second case study, an encoder-decoder mixture density network (MDN) is developed using recurrent neural networks (RNN) for the prediction of utility-scale air-cooled condenser (ACC) backpressure. The effects of ambient conditions and plant operating parameters, such as extraction flow rate, on ACC performance are investigated. In both case studies, hyperparameter searches are done to determine the best performing architectures for these models. Comparisons are drawn between the MDN models and standard model architectures in both case studies. The HTC predictor model achieved 90% accuracy, which equates to an average error of 4.89 W/m²K across all heat exchangers. The resultant time-series ACC model achieved an average error of 3.14 kPa, which translates into a model accuracy of 82%.
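    A minimal sketch of a univariate mixture density network in Keras, assuming a Gaussian mixture head and a custom negative log-likelihood loss; the component count, input width, and hidden sizes are placeholders, not the study's architecture.

```python
# Minimal sketch: MDN predicting a distribution over one scalar target
# (e.g. a heat transfer coefficient). K, D_IN, and layer sizes are assumed.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

K = 5        # assumed number of mixture components
D_IN = 8     # assumed number of input features (excess air ratio, temps, ...)

inputs = layers.Input(shape=(D_IN,))
h = layers.Dense(64, activation="relu")(inputs)
h = layers.Dense(64, activation="relu")(h)
pi = layers.Dense(K, activation="softmax")(h)      # mixing weights
mu = layers.Dense(K)(h)                            # component means
sigma = layers.Dense(K, activation="softplus")(h)  # positive std devs
params = layers.Concatenate()([pi, mu, sigma])
model = tf.keras.Model(inputs, params)

def mdn_nll(y_true, y_pred):
    """Negative log-likelihood of a univariate Gaussian mixture."""
    y_true = tf.reshape(y_true, [-1, 1])           # broadcast against K components
    pi, mu, sigma = tf.split(y_pred, 3, axis=-1)
    sigma = sigma + 1e-6                           # numerical floor
    pdf = tf.exp(-0.5 * tf.square((y_true - mu) / sigma)) \
          / (sigma * np.sqrt(2.0 * np.pi))
    likelihood = tf.reduce_sum(pi * pdf, axis=-1)
    return -tf.reduce_mean(tf.math.log(likelihood + 1e-9))

model.compile(optimizer="adam", loss=mdn_nll)
```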