8 research outputs found

    Hyperparameter Selection with Good Region Recognition for SVM Based Fault Diagnosis

    Get PDF
    This paper proposes a novel good-region recognition method for hyperparameter selection of SVM. The method provides a much smaller good region for optimization-search-based methods and thus greatly reduces computation time. Experimental results show that the proposed method improves the efficiency of fault diagnosis of rolling bearings with no loss of accuracy.
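
    The abstract does not detail how the good region is recognised, so the following minimal Python sketch only illustrates the general idea: cheaply screen a coarse (C, gamma) grid first, keep the region whose scores are close to the best, and run the finer search only inside it. The dataset, the subsample screening heuristic and the 0.02 tolerance are illustrative assumptions, not the paper's method.

```python
# Sketch: bound a smaller "good" (C, gamma) region with a cheap coarse screen,
# then search finely only inside it. The screening heuristic is illustrative,
# not the method proposed in the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=20, random_state=0)

coarse_C = np.logspace(-2, 3, 6)
coarse_gamma = np.logspace(-4, 1, 6)

# Coarse screen on a subsample to cheaply score each (C, gamma) cell.
rng = np.random.default_rng(0)
idx = rng.choice(len(X), size=200, replace=False)
scores = np.array([[cross_val_score(SVC(C=C, gamma=g), X[idx], y[idx], cv=3).mean()
                    for g in coarse_gamma] for C in coarse_C])

# Keep only cells close to the best coarse score: the "good region".
good = np.argwhere(scores >= scores.max() - 0.02)
C_lo, C_hi = coarse_C[good[:, 0]].min(), coarse_C[good[:, 0]].max()
g_lo, g_hi = coarse_gamma[good[:, 1]].min(), coarse_gamma[good[:, 1]].max()

# Fine search restricted to the good region, now on the full data.
fine_C = np.logspace(np.log10(C_lo), np.log10(C_hi), 8)
fine_gamma = np.logspace(np.log10(g_lo), np.log10(g_hi), 8)
best = max((cross_val_score(SVC(C=C, gamma=g), X, y, cv=5).mean(), C, g)
           for C in fine_C for g in fine_gamma)
print("best CV accuracy %.3f at C=%.3g, gamma=%.3g" % best)
```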

    Hybrid ACO and SVM algorithm for pattern classification

    Get PDF
    Ant Colony Optimization (ACO) is a metaheuristic algorithm that can be used to solve a variety of combinatorial optimization problems. A new direction for ACO is to optimize continuous and mixed (discrete and continuous) variables. Support Vector Machine (SVM) is a pattern classification approach that originated from statistical learning. However, SVM suffers from two main problems: feature subset selection and parameter tuning. Most approaches to tuning SVM parameters discretize the continuous parameter values, which negatively affects classification performance. This study presents four algorithms for tuning the SVM parameters and selecting a feature subset, which improved SVM classification accuracy with a smaller feature subset. This is achieved by performing SVM parameter tuning and feature subset selection simultaneously. Hybridization algorithms between ACO and SVM techniques were proposed. The first two algorithms, ACOR-SVM and IACOR-SVM, tune the SVM parameters, while the other two, ACOMV-R-SVM and IACOMV-R-SVM, tune the SVM parameters and select the feature subset simultaneously. Ten benchmark datasets from the University of California, Irvine, were used in the experiments to validate the performance of the proposed algorithms. Experimental results show that the proposed algorithms outperform other approaches in terms of classification accuracy and feature subset size. The average classification accuracies for the ACOR-SVM, IACOR-SVM, ACOMV-R and IACOMV-R algorithms are 94.73%, 95.86%, 97.37% and 98.1%, respectively. The average feature subset size is eight for the ACOR-SVM and IACOR-SVM algorithms and four for the ACOMV-R and IACOMV-R algorithms. This study contributes a new direction for ACO, extending it to continuous and mixed-variable optimization.
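
    As a companion to the abstract above, the sketch below shows the kind of evaluation step such hybrid methods share: a candidate consisting of continuous SVM parameters (C, gamma) plus a binary feature mask is scored by cross-validated accuracy. A plain random sampler stands in for the ant colony; the ACOR/IACOMV-R sampling rules themselves are not reproduced, and the dataset and iteration budget are illustrative assumptions.

```python
# Sketch of the shared evaluation step: score a candidate made of continuous
# SVM parameters (C, gamma) plus a binary feature mask by cross-validated
# accuracy. A random sampler stands in for the ACO colony; the actual
# ACOR/IACOMV-R sampling rules are not reproduced here.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)
rng = np.random.default_rng(42)

def fitness(C, gamma, mask):
    if not mask.any():                      # at least one feature must be kept
        return 0.0
    clf = SVC(C=C, gamma=gamma)
    return cross_val_score(clf, X[:, mask], y, cv=5).mean()

best = (0.0, None)
for _ in range(50):                          # stand-in for the colony's iterations
    C = 10 ** rng.uniform(-2, 3)             # continuous parameters
    gamma = 10 ** rng.uniform(-4, 1)
    mask = rng.random(X.shape[1]) < 0.5      # binary feature-subset decision
    score = fitness(C, gamma, mask)
    if score > best[0]:
        best = (score, (C, gamma, int(mask.sum())))

print("best CV accuracy %.3f with (C, gamma, #features) =" % best[0], best[1])
```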

    A practical approach to model selection for support vector machines with a Gaussian kernel

    No full text
    When learning a support vector machine (SVM) from a set of labeled development patterns, the ultimate goal is to obtain a classifier attaining a low error rate on new patterns. This so-called generalization ability obviously depends on the choices of the learning parameters that control the learning process. Model selection is the method for identifying appropriate values for these parameters. In this paper, a novel model selection method for SVMs with a Gaussian kernel is proposed. Its aim is to find suitable values for the kernel parameter and the cost parameter C with a minimum amount of central processing unit time. The determination of the kernel parameter is based on the argument that, for most patterns, the decision function of the SVM should consist of a sufficiently large number of significant contributions. A unique property of the proposed method is that it retrieves the kernel parameter as a simple analytical function of the dimensionality of the feature space and the dispersion of the classes in that space. An experimental evaluation on a test bed of 17 classification problems has shown that the new method competes favorably with two recently published methods: the classification of new patterns is equally good, but the computational effort to identify the learning parameters is substantially lower.
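
    The paper's analytical expression for the kernel parameter is not reproduced in the abstract, so the sketch below only illustrates the general idea of fixing the Gaussian kernel width from the dimensionality and spread of the data instead of searching for it. The two heuristics shown (scikit-learn's 1/(d·Var[X]) rule and the median pairwise-distance rule) are common stand-ins, not the authors' formula.

```python
# Sketch: set the Gaussian kernel width analytically from the data's scale
# instead of searching for it. The paper derives its own closed-form expression
# from feature-space dimensionality and class dispersion; the two heuristics
# below are common stand-ins, not the paper's formula.
import numpy as np
from sklearn.datasets import load_wine
from sklearn.metrics import pairwise_distances
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X = StandardScaler().fit_transform(X)
d = X.shape[1]

gamma_scale = 1.0 / (d * X.var())                   # sklearn's gamma='scale' rule
sigma_median = np.median(pairwise_distances(X))     # median pairwise distance
gamma_median = 1.0 / (2.0 * sigma_median ** 2)

for name, gamma in [("scale", gamma_scale), ("median", gamma_median)]:
    acc = cross_val_score(SVC(C=1.0, gamma=gamma), X, y, cv=5).mean()
    print(f"{name:6s} gamma={gamma:.4f}  CV accuracy={acc:.3f}")
```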

    Large-scale Machine Learning in High-dimensional Datasets

    Get PDF

    Smart management of the charging of electric vehicles

    Get PDF
    The objective of this thesis was to investigate the management of electric vehicle (EV) battery charging in distribution networks. Real EV charging event data were used to investigate charging demand profiles in a geographical area. A model was developed to analyse the charging demand characteristics and calculate the potential medium-term operating risk level for the distribution network of the corresponding geographical area. A case study with real charging and weather data from three counties in the UK was presented to demonstrate the modelling framework. The effectiveness of a charging control algorithm depends on early knowledge of future EV charging demand and local generation. To this end, two models were developed to provide this knowledge. The first model used data mining principles to forecast the day-ahead EV charging demand based on historical charging event data. The performance of four data mining methods in forecasting the charging demand of an EV fleet was evaluated using real charging data from the USA and France. The second model used a data-fitting approach to produce stochastic generation forecast scenarios based only on historical data. A case study was presented to evaluate the performance of the model based on real data from wind generators in the UK. An agent-based control algorithm was developed to manage EV battery charging according to the vehicle owners' preferences, distribution network technical constraints and local distributed generation. Three agent classes were considered: an EV/DG aggregator and "Responsive" or "Unresponsive" EVs. The real-time operation of the control system was experimentally demonstrated at the Electric Energy Systems Laboratory hosted at the National Technical University of Athens. A series of experiments demonstrated the adaptive behaviour of "Responsive" EV agents and proved their ability to charge preferentially from renewable energy sources.
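
    As a rough illustration of the day-ahead forecasting model described above, the sketch below trains a regressor on calendar features and the demand observed 24 h and 168 h earlier. The synthetic demand series and the choice of GradientBoostingRegressor are assumptions made for illustration; the thesis evaluates four data-mining methods on real charging data.

```python
# Sketch of a day-ahead EV charging demand forecaster: learn hourly fleet demand
# from calendar features and the demand observed 24 h and 168 h earlier. The
# data here are synthetic and the regressor is just one possible method.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic hourly demand with daily/weekly pattern plus noise (stand-in data).
idx = pd.date_range("2023-01-01", periods=24 * 120, freq="h")
demand = (50 + 30 * np.sin(2 * np.pi * idx.hour / 24)
          + 10 * (idx.dayofweek < 5)
          + np.random.default_rng(1).normal(0, 5, len(idx)))
df = pd.DataFrame({"demand": demand}, index=idx)

df["hour"] = df.index.hour
df["dow"] = df.index.dayofweek
df["lag24"] = df["demand"].shift(24)     # same hour yesterday
df["lag168"] = df["demand"].shift(168)   # same hour last week
df = df.dropna()

features = ["hour", "dow", "lag24", "lag168"]
train, test = df.iloc[:-24], df.iloc[-24:]          # hold out the final day

model = GradientBoostingRegressor().fit(train[features], train["demand"])
pred = model.predict(test[features])
mae = np.abs(pred - test["demand"].values).mean()
print(f"day-ahead MAE: {mae:.2f} kW (synthetic data)")
```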

    Grid-Quadtree method for parameter selection of the support vector classification (SVC) algorithm

    Get PDF
    Advisor: Prof. Dr. Arinei Carlos Lindbeck da Silva. Doctoral thesis, Universidade Federal do Paraná, Setor de Tecnologia, Programa de Pós-Graduação em Métodos Numéricos em Engenharia. Defense: Curitiba, 01/06/2016. Includes references: f. 143-149. Area of concentration: mathematical programming.
    Abstract: The Support Vector Classification (SVC) algorithm is a pattern recognition technique whose efficiency depends on the selection of its parameters: the penalty constant C, the kernel function and the kernel's own parameters. A wrong choice of these values directly impacts the algorithm's performance, leading to undesirable phenomena such as overfitting and underfitting. The task of searching for optimal parameters with respect to performance measures is called the SVC model selection problem. Because of the wide convergence domain of the Gaussian kernel, most model selection approaches focus on determining the constant C and the Gaussian kernel parameter γ. Among these, grid search is one of the most prominent because of its simplicity and good results. However, since it evaluates every parameter combination (C, γ) in the search space, it requires a long computation time and becomes impractical for large data sets. The aim of this thesis is therefore to propose an SVC model selection method, using the Gaussian kernel, that integrates the quadtree technique with grid search in order to reduce the number of operations performed by the grid and its computational cost. The main idea is to use the quadtree to delineate the good parameter region, avoiding unnecessary evaluations of parameters located in the underfitting and overfitting areas. To this end, the grid-quadtree (GQ) method was developed, implemented in VB.net together with the software of the LIBSVM library. In the execution of GQ, the quadtree was balanced and a refinement procedure was created that made it possible to delineate the generalization error curve of the parameters. To validate the proposed method, twenty benchmark classification data sets were used, separated into two groups. The results obtained with GQ were compared with those of the traditional grid search (GS), considering the number of operations performed by both methods, the cross-validation rate (CV) and the number of support vectors (SV) associated with the selected parameters, and the SVC accuracy on the test sets. These analyses showed that GQ found parameters of excellent quality, with high CV rates and few SV, reducing the GS operations by at least 78.8124% on average for the group 1 data and by 71.7172% to 88.7052% for group 2. This decrease in the number of calculations resulted in savings of hours of processing time. Furthermore, the accuracy of SVC-GQ was higher than that of SVC-GS for 11 of the 20 data sets studied and equal for four of them. This shows that GQ can find parameters as good as or better than those of GS while executing far fewer operations. Keywords: SVC model selection. Gaussian kernel. Quadtree. Reduction of operations.
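
    The following simplified Python sketch illustrates the search pattern behind the grid-quadtree idea: rather than scoring every (C, gamma) cell of a full grid, it recursively subdivides only the most promising quadrant of the log-scaled parameter plane. The greedy descent, depth limit and dataset are illustrative assumptions; the thesis's balanced quadtree and refinement procedure are not reproduced.

```python
# Simplified sketch of the grid-quadtree idea: instead of scoring every (C, gamma)
# cell of a full grid, recursively subdivide only the most promising quadrant of
# the log-scaled (C, gamma) plane. This only illustrates the search pattern.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)
evaluated = {}

def score(logC, logg):
    key = (round(logC, 3), round(logg, 3))
    if key not in evaluated:                       # cache: each point scored once
        clf = SVC(C=10 ** logC, gamma=10 ** logg)
        evaluated[key] = cross_val_score(clf, X, y, cv=3).mean()
    return evaluated[key]

def refine(c_lo, c_hi, g_lo, g_hi, depth):
    cm, gm = (c_lo + c_hi) / 2, (g_lo + g_hi) / 2
    s = score(cm, gm)
    if depth == 0:
        return s, (cm, gm)
    # Descend only into the quadrant whose corner looks best (greedy subdivision).
    corners = [(c_lo, g_lo), (c_lo, g_hi), (c_hi, g_lo), (c_hi, g_hi)]
    best_corner = max(corners, key=lambda p: score(*p))
    child = refine(min(cm, best_corner[0]), max(cm, best_corner[0]),
                   min(gm, best_corner[1]), max(gm, best_corner[1]), depth - 1)
    return max((s, (cm, gm)), child)

best = refine(-2, 3, -5, 1, depth=4)               # log10 ranges for C and gamma
print("evaluations: %d, best CV=%.3f at logC=%.2f, loggamma=%.2f"
      % (len(evaluated), best[0], *best[1]))
```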

    Automatic analysis of pathological speech

    Get PDF
    The severity of a speech disorder is often measured in terms of speech intelligibility. In clinical practice, this measure is usually determined with a perceptual test. Such a test is inherently subjective, since the therapist who administers it often knows the patient (and the disorder) and is also familiar with the test material. It is therefore worth investigating whether speech recognition can be used to create an objective assessor of intelligibility. This thesis develops a methodology to automate a standardized perceptual test, the Nederlandstalig Spraakverstaanbaarheidsonderzoek (NSVO, a Dutch-language speech intelligibility assessment). Speech recognition is used to characterize the patient phonologically and phonemically and to derive a speech intelligibility score from this characterization. Experiments have shown that the computed scores are highly reliable. Since the NSVO works with nonsense words, children in particular can make reading errors. New methods were therefore developed, based on meaningful running speech, that are robust against such errors and can also be used across different languages. With these new models it proved possible to compute reliable intelligibility scores for Flemish, Dutch and German speech. Finally, the research also took important steps towards the automatic characterization of other aspects of the speech disorder, such as articulation and voicing.
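
    The sketch below only illustrates the basic idea behind such automatic intelligibility scoring: align the phoneme sequence produced by a recogniser with the target sequence and report the phoneme-level accuracy as an intelligibility proxy. The alignment, the scoring formula and the example transcriptions are made up for illustration; the NSVO's phonological and phonemic feature models are not reproduced.

```python
# Sketch: score intelligibility as phoneme-level agreement between recognised
# and target transcriptions, using a simple edit-distance alignment. The example
# transcriptions are made up for illustration.
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences."""
    d = [[i + j if i * j == 0 else 0 for j in range(len(hyp) + 1)]
         for i in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            d[i][j] = min(d[i - 1][j] + 1,                               # deletion
                          d[i][j - 1] + 1,                               # insertion
                          d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]))  # substitution
    return d[-1][-1]

def intelligibility(targets, recognised):
    """Percentage of target phonemes correctly realised, aggregated over items."""
    total = sum(len(t) for t in targets)
    errors = sum(edit_distance(t, h) for t, h in zip(targets, recognised))
    return 100.0 * max(0.0, 1.0 - errors / total)

# Made-up target vs recognised phoneme strings for three test items.
targets    = [list("spraak"), list("verstaan"), list("baarheid")]
recognised = [list("spaak"),  list("verstaan"), list("baareit")]
print(f"intelligibility score: {intelligibility(targets, recognised):.1f}%")
```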