205 research outputs found

    Stability of a clustering structure: customer segments of a cultural institution

    This work implements, as a means of assessing the stability of a clustering, a new proposal for cross-validation of clusterings that dispenses with classifiers and relies instead on weighted training and test samples (Cardoso, Faceli et al. 2009). The proposed methodology is illustrated on a clustering of clients of the CCB - Centro Cultural de Belém (Belém Cultural Centre). The clustering is obtained by estimating a finite mixture model. In forming the clusters, or segments, the ordinal nature of the base variables (measurements on a Likert-type scale) is taken into account, as an alternative to the usual modeling that would treat the same variables as metric. In addition, methodologies considered more appropriate for interpreting and discriminating the resulting segments are pointed out.

    CART algorithm: predicting secondary-school mathematics performance

    In the present study, the CART (Classification and Regression Trees) algorithm is used to predict mathematics grades for a sample of secondary-school students. Students from public and private schools are modeled separately. Predictors include socio-demographic factors, personal attributes and some specific school-related characteristics. The models have good predictive capacity: the proportion of explained variance in grades, estimated by cross-validation, is 83.5% and 90.5% for the regression trees for public and private schools, respectively. The relative importance of the predictors is also evaluated; the most important is the student's average grade in the remaining secondary-school subjects (excluding mathematics).
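The abstract reports a cross-validated proportion of explained variance but does not give the formula. A minimal sketch, assuming the common definition 1 - SSE/SST computed on held-out predictions; the function name and the sample grades are hypothetical and are not taken from the study:

```python
def explained_variance_proportion(y_true, y_pred):
    """Return 1 - SSE/SST for held-out predictions (assumed definition)."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equally sized samples with at least 2 values")
    mean_y = sum(y_true) / len(y_true)
    # Squared error of the model's held-out predictions.
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of the trivial model that always predicts the mean.
    sst = sum((t - mean_y) ** 2 for t in y_true)
    if sst == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - sse / sst


# Hypothetical held-out grades and tree predictions, for illustration only.
observed = [10, 12, 14, 16]
predicted = [11, 12, 14, 15]
share = explained_variance_proportion(observed, predicted)
```

In a k-fold setting, the predictions for each fold would come from a tree fitted on the other folds, and the statistic would be computed over all held-out predictions pooled together.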

    Incremental cluster analysis: segmentation of retail points

    This study concerns the use of the incremental Two-Step clustering algorithm to identify homogeneous groups of points of sale within a universe of retail points for frozen food products. The aim is to segment that universe in order to support decision making by marketing and sales managers. The groups are identified using information from a data warehouse that aggregates data on the characteristics of each retail point and the sales it generates. The results allowed the identification of 4 clusters, whose profiles were drawn and evaluated using hypothesis tests.

    Stock market series analysis using self-organizing maps

    In this work a new clustering technique is implemented and tested. The proposed approach is based on the application of a SOM (self-organizing map) neural network and provides a means to cluster data on the U-MAT, the map's distance matrix. It relies on a flooding algorithm operating on the U-MAT and uses the Calinski and Harabasz index to assess the depth of flooding, thereby determining an adequate number of clusters. The method is tuned for the analysis of stock market series. The results obtained are promising, although limited in scope.
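The index named in the abstract can be sketched as follows. This is a plain implementation of the standard Calinski and Harabasz ratio of between-cluster to within-cluster dispersion, not the authors' code; the flooding step is not shown, and the function name and toy points are hypothetical:

```python
from collections import defaultdict


def calinski_harabasz(points, labels):
    """Between/within dispersion ratio; higher values indicate better separation."""
    n = len(points)
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    k = len(groups)
    if k < 2 or k >= n:
        raise ValueError("need at least 2 clusters and fewer clusters than points")
    dim = len(points[0])
    overall = [sum(p[d] for p in points) / n for d in range(dim)]
    between = 0.0
    within = 0.0
    for members in groups.values():
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        # Size-weighted squared distance from cluster centroid to global centroid.
        between += len(members) * sum((c - o) ** 2 for c, o in zip(centroid, overall))
        # Squared distances from each member to its own centroid.
        within += sum(sum((x - c) ** 2 for x, c in zip(p, centroid)) for p in members)
    if within == 0:
        raise ValueError("within-cluster dispersion is zero")
    return (between / (k - 1)) / (within / (n - k))


# Toy data: two well-separated pairs of points (illustrative only).
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good_score = calinski_harabasz(toy_points, [0, 0, 1, 1])
bad_score = calinski_harabasz(toy_points, [0, 1, 0, 1])
```

Under the approach described, one would presumably compute this score for the partition produced at each candidate flooding depth and keep the depth with the highest value.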

    Mapping atmospheric pollutants emissions in European countries

    In this paper we present a methodology that enables the graphical representation, in a two-dimensional Euclidean space, of atmospheric pollutant emissions in European countries. The approach relies on Multidimensional Unfolding (MDU), an exploratory multivariate data analysis technique. This technique illustrates both the relationships among the emitted gases and those between the gases and their geographical origins. The main contribution of this work concerns the evaluation of MDU solutions. We use simulated data to define thresholds for the model-fit measures, which allows the quality of the MDU output to be evaluated. The quality of the model fit is thus assessed as a step before the results for gas types and geographical origins are interpreted. Analysis of the MDU maps generates useful insights, yields an immediate substantive result and enables the formulation of hypotheses for further analysis and modeling.

    The heterogeneous best-worst choice method in market research

    Web of Science accession number: WOS:000280441000009. The article presents a market research technique for obtaining information on consumer preferences, called the heterogeneous best-worst (HBW) choice method. It accounts for preference heterogeneity, making it more accurate than the direct method (DM), and causes less information overload than other indirect methods. An example involving undergraduates picking a business school is presented to illustrate how the HBW choice method operates. It is demonstrated that the HBW and DM approaches produce very similar results, but the HBW method allows for more differentiation of extreme preferences.

    An MML embedded approach for estimating the number of clusters

    Assuming that the data originate from a finite mixture of multinomial distributions, we study the performance of an integrated Expectation Maximization (EM) algorithm that uses the Minimum Message Length (MML) criterion to select the number of mixture components. Rather than selecting one among a set of pre-estimated candidate models (which requires running EM several times), this EM-MML approach seamlessly integrates estimation and model selection in a single algorithm. Comparisons are provided with EM combined with well-known information criteria, e.g. the Bayesian Information Criterion. We resort to synthetic data examples and a real application. The EM-MML computation time is a clear advantage of this method; also, the real-data solution it provides is more parsimonious, which reduces the risk of model order overestimation and improves interpretability.
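For contrast, the baseline that the abstract compares against (fit several candidate models, then pick one with an information criterion) can be sketched as below. This is not the EM-MML algorithm itself. The BIC formula is the standard one; the function names, log-likelihoods and parameter counts are hypothetical placeholders standing in for separate EM runs:

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Standard Bayesian Information Criterion: lower is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def select_by_bic(candidates, n_obs):
    """candidates maps number of components K -> (log_likelihood, n_params)."""
    return min(candidates, key=lambda k: bic(*candidates[k], n_obs))


# Hypothetical results of three separate EM runs (K = 1, 2, 3) on 500 observations.
em_runs = {
    1: (-1200.0, 4),
    2: (-1100.0, 9),
    3: (-1095.0, 14),
}
best_k = select_by_bic(em_runs, n_obs=500)
```

Each entry in `em_runs` would require its own EM fit, which is the repeated cost the integrated approach is said to avoid.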