4,860 research outputs found

Clustering as an example of optimizing arbitrarily chosen objective functions

Author: A. Dempster
A. Fraser
A. Jain
J. Dunn
M. Budka
R. Dubes
R. Duda
R. Hamming
R. Jenssen
R. Sibson
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

This paper reflects on the common practice of solving various types of learning problems by optimizing arbitrarily chosen criteria in the hope that they correlate well with the criterion actually used to assess the results. The issue is investigated using clustering as an example. A unified view of clustering as an optimization problem is first proposed, stemming from the belief that typical design choices in clustering, such as the number of clusters or the similarity measure, can be, and often are, suboptimal, including from the point of view of the clustering quality measures later used for algorithm comparison and ranking. To illustrate our point, we propose a generalized clustering framework and provide a proof of concept using standard benchmark datasets and two popular clustering methods for comparison.

Crossref

Bournemouth University Research Online

Fuzzy Adaptive Tuning of a Particle Swarm Optimization Algorithm for Variable-Strength Combinatorial Test Suite Generation

Author: Afzal Wasif
Ahmed Bestoun S.
Mahmoud Thair
Zamli Kamal Z.
Publication date: 01/01/2018

Combinatorial interaction testing is an important software testing technique that has attracted considerable recent interest. It can reduce the number of test cases needed by considering interactions between combinations of input parameters. Empirical evidence shows that it detects faults effectively, in particular for highly configurable software systems. In real-world software testing, the input variables may vary in how strongly they interact; variable-strength combinatorial interaction testing (VS-CIT) can exploit this for higher effectiveness. The generation of variable-strength test suites is an NP-hard (non-deterministic polynomial-time hard) computational problem \cite{BestounKamalFuzzy2017}. Research has shown that stochastic population-based algorithms such as particle swarm optimization (PSO) can be efficient compared to alternatives for VS-CIT problems. Nevertheless, they require careful control of the trade-off between exploitation and exploration to avoid premature convergence (i.e. being trapped in local optima) and to enhance solution diversity. Here, we present a new variant of PSO based on the Mamdani fuzzy inference system \cite{Camastra2015,TSAKIRIDIS2017257,KHOSRAVANIAN2016280}, which permits adaptive selection of its global and local search operations. We detail the design of this combined algorithm and evaluate it through experiments on multiple synthetic and benchmark problems. We conclude that fuzzy adaptive selection of global and local search operations is, at least, feasible, as it performs second-best only to a discrete variant of PSO called DPSO. With respect to the best mean test suite size, the fuzzy adaptation even outperforms DPSO occasionally. We discuss the reasons behind this performance and outline relevant areas of future work. Comment: 21 pages
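To make the claim about reducing test cases concrete, here is a minimal sketch of the coverage check that underlies combinatorial interaction testing. It is not the fuzzy PSO generator described in the abstract; it only counts which t-way value combinations a given suite leaves uncovered. The function name, the parameter names (`cache`, `tls`, `gzip`), and the example suite are all hypothetical.

```python
from itertools import combinations, product

def uncovered_interactions(parameters, suite, strength=2):
    """Return the t-way value combinations that no test in `suite` covers.

    parameters: dict mapping a parameter name to its list of possible values.
    suite: list of dicts, each assigning one value to every parameter.
    strength: interaction strength t (2 means pairwise).
    """
    names = sorted(parameters)
    missing = set()
    for group in combinations(names, strength):
        for values in product(*(parameters[n] for n in group)):
            wanted = tuple(zip(group, values))
            covered = any(all(test[n] == v for n, v in wanted) for test in suite)
            if not covered:
                missing.add(wanted)
    return missing

# Hypothetical system with three on/off options (8 tests if tested exhaustively).
params = {"cache": [0, 1], "tls": [0, 1], "gzip": [0, 1]}

# Four hand-picked tests that cover every pair of option values.
suite = [
    {"cache": 0, "tls": 0, "gzip": 0},
    {"cache": 0, "tls": 1, "gzip": 1},
    {"cache": 1, "tls": 0, "gzip": 1},
    {"cache": 1, "tls": 1, "gzip": 0},
]

print(len(uncovered_interactions(params, suite, strength=2)))
```

In a variable-strength setting, a check like this would presumably be run with a different `strength` for different subsets of parameters; a generator such as PSO then searches for a small suite that leaves nothing uncovered.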

arXiv.org e-Print Archive

Crossref

Research Online @ ECU

Ensemble learning of model hyperparameters and spatiotemporal data for calibration of low-cost PM2.5 sensors.

Author: Bhanu Bir
Day Rong-Fuh
Tsai Chih-Chun
Tung Ching-Ying
Yin Peng-Yeng
Publication venue: eScholarship, University of California
Publication date: 01/07/2019

The PM2.5 air quality index (AQI) measurements from government-built supersites are accurate but cannot provide dense coverage of monitoring areas. Low-cost PM2.5 sensors can be deployed as a fine-grained internet-of-things (IoT) network to complement government facilities. Calibration of low-cost sensors by reference to high-accuracy supersites is thus essential. Moreover, the imputation of missing values in the training data may affect the calibration result, the best performance of the calibration model requires hyperparameter optimization, and the factors affecting PM2.5 concentrations, such as climate, geographical landscape and anthropogenic activities, are uncertain in the spatial and temporal dimensions. In this paper, an ensemble learning approach for imputation method selection, calibration model hyperparameterization, and spatiotemporal training data composition is proposed. Three government supersites in central Taiwan are chosen for the deployment of low-cost sensors, and hourly PM2.5 measurements are collected for 60 days for the experiments. Three optimizers, the Sobol sequence, Nelder-Mead, and particle swarm optimization (PSO), are compared on their performance with various versions of the ensembles. The best calibration results are obtained using PSO, and the improvement ratios with respect to R2, RMSE, and NME are 4.92%, 52.96%, and 56.85%, respectively.

eScholarship - University of California

Easy over Hard: A Case Study on Deep Learning

Author: Bergstra James
Mou Lili
Pedregosa Fabian
Pennington Jeffrey
Rehurek Radim
Romano Jeanine
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 24/06/2017

While deep learning is an exciting new technique, the benefits of this method need to be assessed with respect to its computational cost. This is particularly important for deep learning since these learners need hours (to weeks) to train the model. Such long training times limit the ability of (a) a researcher to test the stability of their conclusion via repeated runs with different random seeds; and (b) other researchers to repeat, improve, or even refute that original work. For example, deep learning was recently used to find which questions in the Stack Overflow programmer discussion forum can be linked together. That deep learning system took 14 hours to execute. We show here that a very simple optimizer called DE, applied to fine-tune an SVM, can achieve similar (and sometimes better) results. The DE approach terminated in 10 minutes, i.e. 84 times faster than the deep learning method. We offer these results as a cautionary tale to the software analytics community and suggest that not every new innovation should be applied without critical analysis. If researchers deploy some new and expensive process, that work should be baselined against some simpler and faster alternatives. Comment: 12 pages, 6 figures, accepted at FSE201
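The speedup quoted in the abstract can be checked with a few lines of arithmetic. This is only a sanity check on the two run times reported above; the variable names are illustrative.

```python
# Run times as reported in the abstract.
deep_learning_minutes = 14 * 60  # 14 hours expressed in minutes
de_minutes = 10                  # DE-tuned SVM run time in minutes

# Ratio of the two run times: 840 / 10.
speedup = deep_learning_minutes / de_minutes
print(speedup)  # 84.0
```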

arXiv.org e-Print Archive

Crossref

SQG-Differential Evolution for difficult optimization problems under a tight function evaluation budget

Author: AK Qin
CA Floudas
Cullen Schaffer
DH Wolpert
F Duddeck
F Wilcoxon
GG Wang
J Brest
J Knowles
J Sobieszczanski-Sobieski
J Zhang
LM Rios
M Kiani
R Mallipeddi
R Sala
R Storn
RT Haftka
S Das
S García
S Venkataraman
T Krityakierne
TJ Carrigan
X Yao
Y Wang
YM Ermoliev
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 12/07/2018

In the context of industrial engineering, it is important to integrate efficient computational optimization methods into the product development process. Some of the most challenging simulation-based engineering design optimization problems are characterized by a large number of design variables, the absence of analytical gradients, highly non-linear objectives and a limited function evaluation budget. Although a huge variety of optimization algorithms is available, the development and selection of efficient algorithms for problems with these industrially relevant characteristics remains a challenge. In this communication, a hybrid variant of Differential Evolution (DE) is introduced which combines aspects of Stochastic Quasi-Gradient (SQG) methods within the framework of DE, in order to improve optimization efficiency on problems with the previously mentioned characteristics. The performance of the resulting derivative-free algorithm is compared with other state-of-the-art DE variants on 25 commonly used benchmark functions, under a tight function evaluation budget of 1000 evaluations. The experimental results indicate that the new algorithm performs excellently on the 'difficult' (high-dimensional, multi-modal, inseparable) test functions. The operations used in the proposed mutation scheme are computationally inexpensive and can be easily implemented in existing differential evolution variants or other population-based optimization algorithms with a few lines of program code as a non-invasive optional setting. Besides the applicability of the presented algorithm by itself, the described concepts can serve as a useful and interesting addition to the algorithmic operators in the frameworks of heuristics and evolutionary optimization and computing.
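For readers unfamiliar with the baseline being extended, the following is a minimal sketch of classic DE (the DE/rand/1/bin scheme) run under a fixed budget of 1000 function evaluations. It is not the SQG-DE hybrid from the abstract: the mutation step here is the standard difference-vector one, and the function name, the default settings (`pop_size`, `f`, `cr`) and the sphere test function are assumptions chosen for illustration.

```python
import random

def differential_evolution(objective, bounds, budget=1000, pop_size=20,
                           f=0.5, cr=0.9, seed=0):
    """Minimize `objective` with DE/rand/1/bin, stopping after `budget` evaluations."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fitness = [objective(x) for x in pop]
    evaluations = pop_size  # the initial population already consumes budget
    while evaluations < budget:
        for i in range(pop_size):
            if evaluations >= budget:
                break
            # Pick three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < cr:
                    gene = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    gene = min(max(gene, lo), hi)  # clip to the search box
                else:
                    gene = pop[i][d]
                trial.append(gene)
            trial_fitness = objective(trial)
            evaluations += 1
            # Greedy selection, applied in place: keep the trial if it is not worse.
            if trial_fitness <= fitness[i]:
                pop[i], fitness[i] = trial, trial_fitness
    best = min(range(pop_size), key=fitness.__getitem__)
    return pop[best], fitness[best], evaluations

def sphere(x):
    """Simple separable test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

best_x, best_f, used = differential_evolution(sphere, [(-5.0, 5.0)] * 5)
print(used, best_f)
```

A hybrid along the lines the abstract describes would presumably replace or augment the mutation line with a quasi-gradient estimate; the budget accounting and greedy selection shown here would stay the same.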

arXiv.org e-Print Archive

Crossref