Unsupervised Discovery of Phonological Categories through Supervised Learning of Morphological Rules

Author: Berck Peter
Daelemans Walter
Gillis Steven
Publication date: 01/01/1996

We describe a case study in the application of symbolic machine learning techniques for the discovery of linguistic rules and categories. A supervised rule induction algorithm is used to learn to predict the correct diminutive suffix given the phonological representation of Dutch nouns. The system produces rules which are comparable to rules proposed by linguists. Furthermore, in the process of learning this morphological task, the phonemes used are grouped into phonologically relevant categories. We discuss the relevance of our method for linguistics and language technology.

arXiv.org e-Print Archive

CiteSeerX

Institutional Repository Universiteit Antwerpen

Tilburg University Repository

Min-Max Predictive Control of a Five-Phase Induction Machine

Author: Arahal Manuel R.
Kowal Gornig Agnieszka
Martín Torres Cristina
Rodríguez Ramírez Daniel
Publication venue: MDPI AG
Publication date: 01/09/2019

In this paper, a fuzzy-logic based operator is used instead of a traditional cost function for the predictive stator current control of a five-phase induction machine (IM). The min-max operator is explored for the first time as an alternative to the traditional loss function. With this proposal, the selection of voltage vectors does not need weighting factors that are normally used within the loss function and require a cumbersome procedure to tune. In order to cope with conflicting criteria, the proposal uses a decision function that compares predicted errors in the torque producing subspace and in the x-y subspace. Simulations and experimental results are provided, showing how the proposal compares with the traditional method of fixed tuning for predictive stator current control. Funding: Ministerio de Economía y Competitividad, DPI 2016-76493-C3-1-R and 2014/425; Unión Europea, DPI 2016-76493-C3-1-R and 2014/425; Universidad de Sevilla, DPI 2016-76493-C3-1-R and 2014/42

Multidisciplinary Digital Publishing Institute

idUS. Depósito de Investigación Universidad de Sevilla
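
The min-max selection mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not spell out the decision function, so this shows only the generic min-max operator: for each candidate voltage vector, take the larger of its two predicted errors and choose the candidate for which that worst case is smallest. The function name, the input layout, and the assumption that both errors are already normalised to comparable scales are hypothetical and not taken from the paper.

```python
from typing import Sequence, Tuple


def select_voltage_vector(predicted_errors: Sequence[Tuple[float, float]]) -> int:
    """Return the index of the candidate with the smallest worst-case error.

    Each entry is a hypothetical pair
    (error in the torque-producing subspace, error in the x-y subspace),
    assumed to be normalised so the two values are comparable.
    """
    if not predicted_errors:
        raise ValueError("at least one candidate voltage vector is required")
    # Min-max: minimise, over candidates, the maximum of the two errors.
    return min(range(len(predicted_errors)), key=lambda i: max(predicted_errors[i]))


# Illustrative numbers only, not results from the paper.
candidates = [(0.40, 0.10), (0.25, 0.30), (0.05, 0.60)]
best = select_voltage_vector(candidates)
print(best)
```

No weighting factor appears anywhere in the selection, which is the property the abstract emphasises; a weighted-sum cost would instead need a tuned coefficient to trade one error off against the other.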

A very simple safe-Bayesian random forest

Author: Ghahramani Zoubin
Quadrianto Novi
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/06/2015

Random forests work by averaging several predictions of de-correlated trees. We show a conceptually radical approach to generate a random forest: random sampling of many trees from a prior distribution, and subsequently performing a weighted ensemble of predictive probabilities. Our approach uses priors that allow sampling of decision trees even before looking at the data, and a power likelihood that explores the space spanned by combinations of decision trees. While each tree performs Bayesian inference to compute its predictions, our aggregation procedure uses the power likelihood rather than the likelihood and is therefore, strictly speaking, not Bayesian. Nonetheless, we refer to it as a Bayesian random forest but with built-in safety. The safety comes from its good predictive performance even if the underlying probabilistic model is wrong. We demonstrate empirically that our Safe-Bayesian random forest outperforms MCMC or SMC based Bayesian decision trees in terms of speed and accuracy, and achieves performance competitive with entropy- or Gini-optimised random forests, yet is very simple to construct.

Crossref

Sussex Research Online
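
The aggregation step described in the abstract above (trees sampled from a prior, then combined through a power likelihood) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the exponent name `eta`, and the inputs (one log-likelihood and one vector of class probabilities per sampled tree) are hypothetical. Because the trees are assumed to be drawn from the prior, each tree's weight is taken proportional to its likelihood raised to the power `eta`.

```python
import math
from typing import List, Sequence


def power_likelihood_weights(log_likelihoods: Sequence[float], eta: float) -> List[float]:
    """Normalised weights proportional to likelihood ** eta for each sampled tree."""
    scaled = [eta * ll for ll in log_likelihoods]
    shift = max(scaled)  # subtract the maximum for numerical stability
    unnormalised = [math.exp(s - shift) for s in scaled]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def ensemble_predict(tree_probs: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Weighted average of the per-tree predictive class probabilities."""
    n_classes = len(tree_probs[0])
    return [sum(w * p[c] for w, p in zip(weights, tree_probs)) for c in range(n_classes)]


# Illustrative values only: two sampled trees, two classes.
log_liks = [math.log(0.2), math.log(0.8)]
weights = power_likelihood_weights(log_liks, eta=0.5)
prediction = ensemble_predict([[0.9, 0.1], [0.3, 0.7]], weights)
print(weights, prediction)
```

Setting `eta` to 1 gives ordinary likelihood weighting, and setting it to 0 gives a uniform average over the sampled trees; values in between temper how strongly the best-fitting trees dominate the ensemble.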

See5 Algorithm versus Discriminant Analysis. An Application to the Prediction of Insolvency in Spanish Non-life Insurance Companies

Author: José Fernández Menéndez
Paloma Martínez Almodovar
Zuleyca Díaz Martínez

Prediction of insurance company insolvency has arisen as an important problem in the field of financial research, due to the necessity of protecting the general public whilst minimizing the costs associated with this problem. Most methods applied in the past to tackle this question are traditional statistical techniques which use financial ratios as explanatory variables. However, these variables do not usually satisfy statistical assumptions, which complicates the application of the mentioned methods. In this paper, a comparative study of the performance of a well-known parametric statistical technique (Linear Discriminant Analysis) and a non-parametric machine learning technique (See5) is carried out. We have applied the two methods to the problem of the prediction of insolvency of Spanish non-life insurance companies on the basis of a set of financial ratios. Results indicate a higher performance of the machine learning technique, which shows that this method can be a useful tool to evaluate the insolvency of insurance firms. Keywords: Insolvency, Insurance Companies, Discriminant Analysis, See5.

Research Papers in Economics

Ensembles of wrappers for automated feature selection in fish age classification

Author: Bermejo Sánchez Sergi
Publication venue: Elsevier BV
Publication date: 01/01/2017

In feature selection, the most important features must be chosen so as to reduce their number while retaining their discriminatory information. Within this context, a novel feature selection method based on an ensemble of wrappers is proposed and applied to automatically select features in fish age classification. The effectiveness of this procedure has been tested on an Atlantic cod database with several powerful statistical learning classifiers. The selected subsets, which rely on few features (e.g. otolith weight and fish weight), are particularly noteworthy given current biological findings and practices in fishery research, and the classification results obtained with them outperform those of previous studies in which feature selection was performed manually. Peer reviewed. Postprint (author's final draft)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

A model-based multithreshold method for subgroup identification

Author: Anderson TW
Breiman L
Giri NC
Golub GH
Jolliffe IT
Loh WY
Messenger R
Paul D
Rao CR
Su X
Thomson GH
Publication venue: eScholarship, University of California
Publication date: 11/02/2019

The thresholding variable plays a crucial role in subgroup identification for personalized medicine. Most existing partitioning methods split the sample based on one predictor variable. In this paper, we consider setting the splitting rule from a combination of multivariate predictors, such as latent factors, principal components, and weighted sums of predictors. Such a subgrouping method may lead to more meaningful partitioning of the population than using a single variable. In addition, our method is based on a change point regression model and thus yields straightforward model-based prediction results. After choosing a particular thresholding variable form, we apply a two-stage multiple change point detection method to determine the subgroups and estimate the regression parameters. We show that our approach can produce two or more subgroups from the multiple change points and identify the true grouping with high probability. In addition, our estimation results enjoy oracle properties. We design a simulation study to compare the performance of our proposed method with existing methods and apply them to analyze data sets from a scleroderma trial and a breast cancer study.

Crossref

eScholarship - University of California

Structured Prediction of Sequences and Trees using Infinite Contexts

Author: C Zhu
F Wood
M Johnson
MP Marcus
T Cohn
Publication date: 09/03/2015

Linguistic structures exhibit a rich array of global phenomena; however, commonly used Markov models are unable to adequately describe these phenomena due to their strong locality assumptions. We propose a novel hierarchical model for structured prediction over sequences and trees which exploits global context by conditioning each generation decision on an unbounded context of prior decisions. This builds on the success of Markov models but without imposing a fixed bound, in order to better represent global phenomena. To facilitate learning of this large and unbounded model, we use a hierarchical Pitman-Yor process prior which provides a recursive form of smoothing. We propose prediction algorithms based on A* and Markov Chain Monte Carlo sampling. Empirical results demonstrate the potential of our model compared to baseline finite-context Markov models on part-of-speech tagging and syntactic parsing.

arXiv.org e-Print Archive

Crossref