
    Is One Hyperparameter Optimizer Enough?

    Hyperparameter tuning is the black art of automatically finding a good combination of control parameters for a data miner. While widely applied in empirical Software Engineering, there has not been much discussion on which hyperparameter tuner is best for software analytics. To address this gap in the literature, this paper applied a range of hyperparameter optimizers (grid search, random search, differential evolution, and Bayesian optimization) to the defect prediction problem. Surprisingly, no hyperparameter optimizer was observed to be 'best' and, for one of the two evaluation measures studied here (F-measure), hyperparameter optimization was no better than using default configurations in 50% of cases. We conclude that hyperparameter optimization is more nuanced than previously believed. While such optimization can certainly lead to large improvements in the performance of classifiers used in software analytics, it remains to be seen which specific optimizers should be applied to a new dataset.
    Comment: 7 pages, 2 columns, accepted for SWAN1
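
    A minimal sketch of the kind of comparison the abstract describes, assuming a scikit-learn workflow and a synthetic stand-in for a defect prediction dataset (none of the names or numbers below come from the paper): a classifier with default settings is scored by F-measure against the same model tuned by grid search and by random search over an identical parameter space.

        # Illustrative only: default configuration vs. grid search vs. random search,
        # scored by F-measure on a synthetic, imbalanced stand-in dataset.
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, cross_val_score

        X, y = make_classification(n_samples=500, n_features=20,
                                   weights=[0.8, 0.2], random_state=0)

        param_grid = {"n_estimators": [50, 100, 200], "max_depth": [3, 5, None]}

        # Baseline: default configuration, no tuning.
        default_f1 = cross_val_score(RandomForestClassifier(random_state=0),
                                     X, y, scoring="f1", cv=5).mean()

        # Grid search and random search over the same parameter space.
        grid = GridSearchCV(RandomForestClassifier(random_state=0), param_grid,
                            scoring="f1", cv=5).fit(X, y)
        rand = RandomizedSearchCV(RandomForestClassifier(random_state=0), param_grid,
                                  n_iter=5, scoring="f1", cv=5, random_state=0).fit(X, y)

        print(f"default F1:       {default_f1:.3f}")
        print(f"grid search F1:   {grid.best_score_:.3f}")
        print(f"random search F1: {rand.best_score_:.3f}")

    Depending on the dataset, the tuned scores may not beat the default baseline, which is the kind of outcome the paper reports for F-measure.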

    Forecasting and Optimizing Dual Media Filter Performance via Machine Learning

    Five machine learning algorithms, Decision Tree (DT), Random Forest (RF), Multivariable Linear Regression (MLR), Support Vector Regression (SVR), and Gaussian Process Regression (GPR), were applied to predict the performance of a multi-media filter as a function of raw water quality and plant operating variables. The models were trained on data collected over a seven-year period covering water quality and operating variables, including true colour, turbidity, plant flow, and chemical doses for chlorine, KMnO4, FeCl3, and cationic polymer (PolyDADMAC). The machine learning algorithms showed that prediction is best at a 1-day time lag between the input variables and unit filter run volume (UFRV). Furthermore, the RF algorithm with grid search, using the input metrics above with a 1-day time lag, provided the most reliable prediction of UFRV, with an RMSE of 31.58 and an R² of 0.98. Similarly, RF with grid search had the shortest training time and accurately forecast extreme wet weather events in a ROC-AUC analysis (AUC over 0.8). Therefore, Random Forest with grid search and a 1-day time lag is an effective and robust machine learning algorithm that can predict filter performance and aid water treatment operators in their decision making by providing real-time warning of potential turbidity breakthrough from the filters.
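
    A hedged sketch of the modelling approach summarised above: Random Forest with grid search over input variables lagged by one day to predict UFRV. The column names, parameter grid, and synthetic data are illustrative assumptions, not the study's dataset or settings.

        # Illustrative only: 1-day-lagged water-quality and operating variables
        # feeding a grid-searched Random Forest regressor for UFRV.
        import numpy as np
        import pandas as pd
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.model_selection import GridSearchCV

        rng = np.random.default_rng(0)
        n = 365
        df = pd.DataFrame({
            "true_colour": rng.normal(20, 5, n),
            "turbidity":   rng.normal(3, 1, n),
            "plant_flow":  rng.normal(100, 10, n),
            "fecl3_dose":  rng.normal(40, 5, n),
            "ufrv":        rng.normal(300, 30, n),
        })

        # 1-day time lag: today's UFRV is predicted from yesterday's inputs.
        features = ["true_colour", "turbidity", "plant_flow", "fecl3_dose"]
        X = df[features].shift(1).dropna()
        y = df["ufrv"].iloc[1:]

        grid = GridSearchCV(
            RandomForestRegressor(random_state=0),
            {"n_estimators": [100, 300], "max_depth": [5, 10, None]},
            scoring="neg_root_mean_squared_error",
            cv=5,
        ).fit(X, y)

        print("best params:", grid.best_params_)
        print("cross-validated RMSE:", -grid.best_score_)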

    Optimising WLANs Power Saving: Context-Aware Listen Interval

    Energy is a vital resource in wireless computing systems. Despite the increasing popularity of Wireless Local Area Networks (WLANs), one of the most important outstanding issues remains the power consumed by the Wireless Network Interface Controller (WNIC). To save this energy and reduce the overall power consumption of wireless devices, a number of power saving approaches have been devised, including Static Power Save Mode (SPSM), Adaptive PSM (APSM), and Smart Adaptive PSM (SAPSM). However, the existing literature has highlighted several issues and limitations regarding their power consumption and performance degradation, warranting further enhancements. This thesis proposes a novel Context-Aware Listen Interval (CALI), in which the wireless network interface, with the aid of a Machine Learning (ML) classification model, sleeps and awakes based on the level of network activity of each application. We focused on the network activity of a single smartphone application while ignoring the network activity of applications running simultaneously. We introduced a context-aware network traffic classification approach based on ML classifiers to classify the network traffic of wireless devices in WLANs. Smartphone applications’ network traffic, reflecting a diverse array of network behaviour and interactions, was used as contextual input for training ML classifiers of output traffic, constructing an ML classification model. A real-world dataset was constructed from the network traffic of nine smartphone applications; it was used first to evaluate the performance of five ML classifiers using cross-validation, followed by extensive experimentation to assess the generalisation capacity of the selected classifiers on unseen testing data. The experimental results further validated the practical application of the selected ML classifiers and indicated that they can be usefully employed to classify the network traffic of smartphone applications based on different levels of behaviour and interaction. Furthermore, to optimise the sleep and awake cycles of the WNIC in accordance with the smartphone applications’ network activity, four CALI power saving modes were developed based on the classified output traffic. The ML classification model classifies new unseen samples into one of the classes, and the WNIC is adjusted to operate in one of the CALI power saving modes. In addition, the performance of CALI’s power saving modes was evaluated by comparing their energy consumption with existing benchmark power saving approaches using three varied sets of energy parameters. The experimental results show that CALI consumes up to 75% less power than the power saving mechanism currently deployed on the latest generation of smartphones, and up to 14% less energy than the SAPSM power saving approach, which also employs an ML classifier.
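
    The following sketch illustrates the core idea of CALI under assumed feature names, traffic classes, and mode mappings (none taken from the thesis): an ML classifier labels a window of application traffic, and the predicted class selects a listen-interval power saving mode for the WNIC.

        # Illustrative only: traffic-level classification driving a hypothetical
        # listen-interval selection. Features, labels, and intervals are made up.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        # Toy training data: [packets/sec, mean inter-arrival time (ms), bytes/sec]
        X_train = np.array([
            [0.5,   900, 200],     # background / idle traffic
            [5.0,   120, 4000],    # light interactive traffic
            [40.0,  15,  90000],   # streaming-like traffic
            [200.0, 3,   500000],  # bulk transfer
        ])
        y_train = ["idle", "light", "streaming", "bulk"]

        clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)

        # Hypothetical mapping from traffic class to a listen interval
        # (expressed in beacon intervals): longer sleep for quieter traffic.
        LISTEN_INTERVAL = {"idle": 10, "light": 5, "streaming": 2, "bulk": 1}

        sample = np.array([[6.0, 100, 5000]])  # unseen traffic measurements
        traffic_class = clf.predict(sample)[0]
        print("class:", traffic_class, "-> listen interval:", LISTEN_INTERVAL[traffic_class])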