8 research outputs found
Combining models in discrete discriminant analysis
In discrete discriminant analysis, alternative models achieve different levels of predictive accuracy, which has encouraged research into combined models. This line of research seems especially promising for small or moderate-sized samples, which often occur in practice. In this work we evaluate the performance of a linear combination of two discrete discriminant analysis models: the first-order independence model and the dependence trees model. For multi-class classification problems, the proposed methodology also uses a hierarchical coupling model, which decomposes the multi-class problem into several two-class problems arranged in a binary tree. The analysis is based on both simulated and real datasets. The results of the proposed approach are compared with those obtained by random forests and are generally more accurate. Measures of precision on a training set, on a test set and by cross-validation are presented. The R software is used to implement the algorithms.
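As an illustration of the linear combination described in this abstract, the following is a minimal Python sketch (the original work uses R). It assumes the combination acts on class-conditional probability estimates as beta * P_A(x | class) + (1 - beta) * P_B(x | class) with 0 <= beta <= 1, which the abstract does not state explicitly. Only the first-order independence model is estimated here; the dependence trees model is not implemented and would supply the second set of estimates. The function names, the smoothing constant `alpha` and the coefficient `beta` are hypothetical.

```python
import numpy as np


def foim_fit(X, y, n_states, n_classes, alpha=1.0):
    """Estimate P(x_j = s | class) for every variable j.

    X is an integer array (n_samples, n_vars) with states coded 0..n_states-1.
    alpha is an additive smoothing constant (an assumption of this sketch).
    """
    n_vars = X.shape[1]
    probs = np.zeros((n_classes, n_vars, n_states))
    for c in range(n_classes):
        Xc = X[y == c]
        for j in range(n_vars):
            counts = np.bincount(Xc[:, j], minlength=n_states)
            probs[c, j] = (counts + alpha) / (len(Xc) + alpha * n_states)
    return probs


def foim_likelihood(probs, X):
    """P(x | class) under independence: product of per-variable marginals."""
    n_classes, n_vars, _ = probs.shape
    lik = np.ones((X.shape[0], n_classes))
    for c in range(n_classes):
        for j in range(n_vars):
            lik[:, c] *= probs[c, j, X[:, j]]
    return lik


def combine(lik_a, lik_b, beta):
    """Linear combination beta * lik_a + (1 - beta) * lik_b."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return beta * lik_a + (1.0 - beta) * lik_b


def classify(lik, priors):
    """Assign each row to the class maximising prior * likelihood."""
    return np.argmax(lik * np.asarray(priors), axis=1)
```

With `beta = 1` the rule reduces to the first model alone and with `beta = 0` to the second, so intermediate values interpolate between the two sets of estimates.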
Combining models in discrete discriminant analysis
Abstract of a poster presented at the International Conference on Trends and Perspectives in Linear Statistical Inference (LinStat'2010), Tomar, Portugal, 27-31 July 2010. Different Discrete Discriminant Analysis (DDA) models perform differently on different sample observations (Brito et al. (2006)). This fact has encouraged research into combined models for DDA. This line of research seems especially promising when the a priori classes are not well separated or when small or moderate-sized samples are considered, which often occurs in practice. In this work we evaluate the performance of a linear combination of two DDA models (Marques et al. (2008)): the First-Order Independence Model (FOIM) and the Dependence Trees Model (DTM) (Celeux and Nakache (1994)). For multiclass classification problems, the proposed methodology also uses a Hierarchical Coupling Model (HIERM), which decomposes the multiclass problem into several two-class problems arranged in a binary tree (Sousa Ferreira (2000)). The analysis is based on both simulated and real datasets. Results include measures of precision on a training set, on a test set and by cross-validation. The R software is used to implement the algorithm.
Combining Models in Discrete Discriminant Analysis
Combining models in discrete discriminant analysis in the multiclass case
Abstract of an oral poster communication presented at COMPSTAT2008 - 18th International Conference on Computational Statistics, Porto, Portugal, 24-29 August 2008. The idea of combining models in Discrete Discriminant Analysis (DDA) appears in a growing number of papers that aim to obtain models that are more robust and more stable than any of the competing ones. This seems to be a promising approach, since different DDA models are known to perform differently on different subjects (Brito et al. (2006)). The issue is particularly relevant when the groups are not well separated, which often occurs in practice.
In the present work we suggest a new methodological approach based on the combination of DDA models. The multiclass problem is decomposed into several dichotomous problems that are nested in a hierarchical binary tree (Sousa Ferreira (2000), Brito et al. (2006)), and at each level of the binary tree a new combining model is proposed to derive the decision rule. This combining model is based on two models that are well known in the literature: the First-order Independence Model (FOIM) and the Dependence Trees Model (DTM) (Celeux and Nakache (1994)). The MATLAB software is used to implement the algorithms, and the proposed approach is illustrated in a DDA application.
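The hierarchical decomposition described in this abstract can be pictured with a short Python sketch (the original work uses MATLAB). It only shows how an observation descends a binary tree of two-class decisions until it reaches a single class. How the classes are split at each level, and the combined FOIM/DTM rule used at each node, are not given in the abstract, so the `decide` callables below are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union


@dataclass
class Node:
    """One two-class problem in the hierarchy.

    decide(x) returns True to send x to the left branch, False to the right.
    A branch is either another Node or an integer class label (a leaf).
    """
    decide: Callable[[Sequence[int]], bool]
    left: Union["Node", int]
    right: Union["Node", int]


def classify(node: Union[Node, int], x: Sequence[int]) -> int:
    """Descend the binary tree until a class label is reached."""
    while isinstance(node, Node):
        node = node.left if node.decide(x) else node.right
    return node
```

For K classes the tree contains K - 1 two-class decisions in total, and a single observation passes through at most K - 1 of them.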
Classification and combining models
Paper presented at SMTDA 2010: Stochastic Modeling Techniques and Data Analysis International Conference, Chania, Crete, Greece, 8-11 June 2010. In the context of Discrete Discriminant Analysis (DDA), the idea of combining models appears in a growing number of papers that aim to obtain more robust and more stable models. This seems to be a promising approach, since different DDA models are known to perform differently on different subjects. Furthermore, combining models is particularly relevant when the groups are not well separated, which often occurs in practice. Recently, we proposed a new DDA approach based on a linear combination of the First-order Independence Model (FOIM) and the Dependence Trees Model (DTM). In the present work we apply this new approach to classify consumers of a Portuguese cultural institution. We focus specifically on the performance of alternative combinations of models, assessing the error rate and the Huberty index in a test sample. We use the R software to implement and evaluate the algorithms.
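The two evaluation measures named in this abstract can be sketched in Python as follows (the original work uses R). The abstract does not define the Huberty index; the sketch assumes the common form (P_c - P_d) / (1 - P_d), where P_c is the proportion of correctly classified observations and P_d is the proportion that would be correct by always predicting the majority class. Computing P_d on the evaluated sample itself is a further assumption of this sketch.

```python
import numpy as np


def error_rate(y_true, y_pred):
    """Proportion of misclassified observations."""
    return float(np.mean(np.asarray(y_true) != np.asarray(y_pred)))


def huberty_index(y_true, y_pred):
    """Improvement over the majority-class rule, (P_c - P_d) / (1 - P_d).

    Class labels are assumed to be non-negative integers.
    """
    y_true = np.asarray(y_true)
    p_correct = 1.0 - error_rate(y_true, y_pred)
    p_default = np.bincount(y_true).max() / len(y_true)
    if p_default == 1.0:
        raise ValueError("index undefined when only one class is present")
    return float((p_correct - p_default) / (1.0 - p_default))
```

An index of 0 means the classifier does no better than always predicting the majority class, and an index of 1 means every observation is classified correctly.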
Resultados de uma Escala de Sugestionabilidade: Classificação em grupos demográficos (Results of a Suggestibility Scale: Classification into demographic groups)
Abstract of an oral communication presented at the XVI Jornadas de Classificação e Análise de Dados (JOCLAD2009), Faro, 2-4 April 2009. The imperfect nature of memory retrieval processes, namely forgetting and distortions, has important implications for clinical and forensic psychology. The GSS1 (Gudjonsson Suggestibility Scale) was developed to assess the tendency of some people to yield to misleading questions when interviewed. In this work we compare the results of several discriminant analysis techniques in order to study the association between suggestibility and some demographic characteristics of the respondents.
Análise discriminante sobre variáveis qualitativas (Discriminant analysis on qualitative variables)
This study falls within the scope of Discrete Discriminant Analysis (DDA) and proposes a combination of models, since combining models has generally been found to lead to more stable and robust methods. The work focuses particularly on the case of poorly separated a priori classes and/or small or moderate-sized samples, situations in which the DDA task is more difficult.
This contribution sets out to overcome the difficulty of estimating a large number of parameters in DDA and to find classifiers that are better suited to the data under study, given that the classification errors made by different models do not occur on the same objects (Sousa Ferreira, 2000; Brito, 2002 and Brito et al., 2006).
To this end, we have proposed a combination of two models with different specificities, the First-order Independence Model (Goldstein and Dillon, 1978) and the Dependence Tree Model (Celeux and Nakache, 1994; Pearl, 1988).
In several applications of the proposed model, we were confronted with an excessive number of explanatory variables relative to the sample size under study. Our work was therefore directed towards variable selection methods, so as to reduce the complexity of the data to be analysed.
It was also necessary to evaluate the impact of certain factors on the performance of the proposed classifiers, namely: the relationship among the explanatory variables within classes; the degree of separation between classes; balanced or unbalanced classes; the number of missing states; and the sample size.
Representações euclidianas de dados: uma abordagem para variáveis heterogéneas (Euclidean representations of data: an approach for heterogeneous variables)
Doctoral thesis, Medicine (Biomathematics), Universidade de Lisboa, Faculdade de Medicina, 2009. Available in the document.