79 research outputs found

    Estabilidade de uma estrutura de agrupamento: segmentos de clientes de uma instituição cultural [Stability of a clustering structure: customer segments of a cultural institution]

    This work implements, as a means of assessing the stability of a clustering, a new proposal for cross-validation of clusterings that dispenses with classifiers and instead uses weighted training and test samples (Cardoso, Faceli et al. 2009). The proposed methodology is illustrated on a clustering of clients of the CCB (Centro Cultural de Belém). The clustering is obtained by estimating a finite mixture model. In forming the clusters, or segments, the ordinal nature of the base variables (measurements on a Likert-type scale) is taken into account, as an alternative to the usual modeling that would treat the same variables as metric. In addition, methodologies considered more appropriate for interpreting and discriminating the resulting segments are pointed out.

    Algoritmo CART: previsão do desempenho na matemática do secundário [CART algorithm: predicting secondary-school mathematics performance]

    In the present study, the CART (Classification and Regression Trees) algorithm is used to predict mathematics grades for a sample of secondary-school students. Students from public and private schools are modeled separately, considering socio-demographic factors, school-specific factors and personal factors. The proposed models show good predictive capacity: the proportion of explained variance in grades, estimated by cross-validation, is 83.5% for the public-school model and 90.5% for the private-school model. The relative importance of the predictors is also evaluated, the most important being the student's average grade in the remaining secondary-school subjects (excluding mathematics).
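    The cross-validated "proportion of explained variance" quoted above can be illustrated with a minimal sketch. This is not the authors' procedure: the fold scheme, the function names (`cv_explained_variance`, `fit_line`) and the use of a least-squares line as a stand-in for a regression tree are all assumptions made for illustration.

```python
def cv_explained_variance(xs, ys, fit, k=5):
    """Estimate 1 - SSE/SST, with SSE accumulated over k held-out folds.

    `fit(train_xs, train_ys)` must return a function mapping x to a prediction.
    """
    n = len(ys)
    mean_y = sum(ys) / n
    sst = sum((y - mean_y) ** 2 for y in ys)  # total sum of squares
    sse = 0.0
    for fold in range(k):
        # Assumed fold scheme: observation i belongs to fold i % k.
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        predict = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        sse += sum((ys[i] - predict(xs[i])) ** 2 for i in test_idx)
    return 1.0 - sse / sst


def fit_line(xs, ys):
    """Hypothetical stand-in model: a simple least-squares line, not CART."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x
```

    Any model-fitting function with the same interface, including a regression tree, could be passed as `fit`; only held-out predictions contribute to the estimate.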

    Análise de agrupamento incremental: segmentação de pontos de retalho [Incremental cluster analysis: segmentation of retail outlets]

    This study concerns the use of the Two-Step incremental clustering procedure to identify homogeneous groups of retail points within a universe of retail points for frozen food products. The aim is to segment this universe in order to support decision making by marketing and sales managers. The groups are identified using information from a data warehouse that aggregates data on the characteristics of each retail point and the sales it generates. The results allowed the identification of 4 clusters, whose profiles were drawn and evaluated using hypothesis tests.

    The heterogeneous best-worst choice method in market research

    WOS:000280441000009 (Web of Science accession number)
    The article presents a market research technique for obtaining information on consumer preferences, called the heterogeneous best-worst (HBW) choice method. It accounts for preference heterogeneity, making it more accurate than the direct method (DM), and causes less information overload than other indirect methods. An example involving undergraduates picking a business school is presented to illustrate how the HBW choice method operates. It is demonstrated that the HBW and DM approaches produce very similar results, but the HBW method allows for more differentiation of extreme preferences.

    Stock market series analysis using self-organizing maps

    In this work a new clustering technique is implemented and tested. The proposed approach is based on the application of a SOM (self-organizing map) neural network and provides a means to cluster data aggregated on the distance matrix (U-MAT). It relies on a flooding algorithm operating on the U-MAT and uses the Calinski and Harabasz index to assess the depth of flooding, thereby determining an adequate number of clusters. The method is designed specifically for the analysis of stock market time series. The results obtained are promising, although limited in scope.

    Mapping atmospheric pollutants emissions in European countries

    In this paper we present a methodology that enables the graphical representation, in a two-dimensional Euclidean space, of atmospheric pollutant emissions in European countries. This approach relies on Multidimensional Unfolding (MDU), an exploratory multivariate data analysis technique. The technique illustrates both the relationships among the emitted gases and those between the gases and their geographical origins. The main contribution of this work concerns the evaluation of MDU solutions. We use simulated data to define thresholds for the model-fit measures, which allows the quality of the MDU output to be evaluated. The quality of the model fit is thus assessed as a step before interpreting the results on gas types and geographical origins. The analysis of the MDU maps generates useful insights, yields an immediate substantive result, and enables the formulation of hypotheses for further analysis and modeling.

    An MML embedded approach for estimating the number of clusters

    Assuming that the data originate from a finite mixture of multinomial distributions, we study the performance of an integrated Expectation Maximization (EM) algorithm that uses the Minimum Message Length (MML) criterion to select the number of mixture components. Rather than selecting one among a set of pre-estimated candidate models (which requires running EM several times), this EM-MML approach integrates estimation and model selection in a single algorithm. Comparisons are provided with EM combined with well-known information criteria, e.g. the Bayesian Information Criterion. We resort to synthetic data examples and a real application. The EM-MML computation time is a clear advantage of this method; also, the real data solution it provides is more parsimonious, which reduces the risk of model order overestimation and improves interpretability.

    La cultura de la ciberseguridad en las organizaciones portuguesas: un análisis exploratorio [Cybersecurity culture in Portuguese organizations: an exploratory analysis]

    Cybersecurity is currently one of the hottest topics for millions of organizations around the world. They depend on information technology to conduct their business processes, which are exposed to a wide range of security threats, and Portuguese organizations are no exception. Are these organizations taking a holistic approach that allows them to face and handle those threats with confidence? Are the proper technical mechanisms being put in place, and are the appropriate information security skills and awareness programs implemented internally? How is all of this being handled by Portuguese organizations? These are some of the questions tackled in a survey addressed to directors of Portuguese organizations in order to map their current cybersecurity culture and corresponding processes.

    Quality indices for (practical) clustering evaluation

    WOS:000271584000004 (Web of Science accession number)
    Clustering quality or validation indices allow the evaluation of the quality of clustering in order to support the selection of a specific partition or clustering structure in its natural unsupervised environment, where the real solution is unknown or not available. In this paper, we investigate the use of quality indices, mostly based on the concepts of cluster compactness and separation, for the evaluation of clustering results (partitions in particular). This work intends to offer a general perspective on the appropriate use of quality indices for clustering evaluation. After presenting some commonly used indices, as well as indices recently proposed in the literature, key issues regarding the practical use of quality indices are addressed. A general methodological approach is presented that considers the identification of appropriate index thresholds. This general approach is compared with the simple use of quality indices for evaluating a clustering solution.
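    As an illustration of an index built on compactness and separation, the sketch below computes the Calinski and Harabasz index for a given partition. It is a minimal example under stated assumptions (Euclidean distance, points given as equal-length tuples, a hypothetical function name) and is not taken from the paper.

```python
def calinski_harabasz(points, labels):
    """Ratio of between-cluster to within-cluster dispersion, each divided
    by its degrees of freedom: (B / (k - 1)) / (W / (n - k))."""
    n = len(points)
    dim = len(points[0])
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need at least 2 clusters and fewer clusters than points")

    overall = [sum(p[d] for p in points) / n for d in range(dim)]
    between = 0.0  # separation: weighted squared distances of centroids to the overall mean
    within = 0.0   # compactness: squared distances of points to their own centroid
    for members in clusters.values():
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        between += len(members) * sum((c - o) ** 2 for c, o in zip(centroid, overall))
        within += sum(
            sum((x - c) ** 2 for x, c in zip(p, centroid)) for p in members
        )
    if within == 0.0:
        return float("inf")  # every point coincides with its centroid
    return (between / (k - 1)) / (within / (n - k))
```

    Higher values indicate more compact, better separated clusters, so partitions of the same data can be compared by their scores.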