14 research outputs found

    Eigenvectors of a kurtosis matrix as interesting directions to reveal cluster structure

    In this paper we study the properties of a kurtosis matrix and propose its eigenvectors as interesting directions to reveal the possible cluster structure of a data set. Under a mixture of elliptical distributions with proportional scatter matrix, it is shown that a subset of the eigenvectors of the fourth-order moment matrix corresponds to Fisher's linear discriminant subspace. The eigenvectors of the estimated kurtosis matrix are consistent estimators of this subspace, and their calculation is easy to implement and computationally efficient, which is particularly favourable when the ratio n/p is large.
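The idea in this abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not define the kurtosis matrix, so the sketch assumes it is the sample fourth-order moment matrix of the standardized data, the average of (z'z) z z'. The function names and the toy data are hypothetical. In this balanced two-cluster example the separating direction is expected to carry the smallest eigenvalue, because a bimodal projection has low kurtosis.

```python
import numpy as np


def kurtosis_matrix_directions(X):
    """Eigen-decompose an assumed sample kurtosis matrix of standardized data.

    Returns eigenvalues in ascending order and the matching directions
    expressed in the original coordinates of X.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n
    # Inverse symmetric square root of the covariance, used to standardize.
    evals, evecs = np.linalg.eigh(S)
    S_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ S_inv_sqrt
    # Assumed kurtosis matrix: mean over observations of (z'z) * z z'.
    sq_norms = np.sum(Z ** 2, axis=1)
    K = (Z * sq_norms[:, None]).T @ Z / n
    kvals, kvecs = np.linalg.eigh(K)
    # Map eigenvectors back so that Xc @ directions gives the projections.
    return kvals, S_inv_sqrt @ kvecs


def separation_accuracy(seed=0, n=5000, p=4):
    """Toy check: two balanced Gaussian clusters separated along axis 0."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n)
    X = rng.standard_normal((n, p))
    X[:, 0] += np.where(labels == 1, 3.0, -3.0)
    _, directions = kurtosis_matrix_directions(X)
    # Project on the smallest-eigenvalue direction and split at zero.
    proj = (X - X.mean(axis=0)) @ directions[:, 0]
    acc = np.mean((proj > 0) == (labels == 1))
    return max(acc, 1.0 - acc)  # the sign of an eigenvector is arbitrary
```

Calling `separation_accuracy()` projects the toy data on a single eigenvector and reports how well a split at zero recovers the two groups. The cost is dominated by one p x p covariance and one p x p eigendecomposition, which is consistent with the abstract's remark about large n/p.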

    The use of a common location measure in the invariant coordinate selection and projection pursuit

    Invariant coordinate selection (ICS) and projection pursuit (PP) are two methods that can be used to detect clustering directions in multivariate data by optimizing criteria sensitive to non-normality. In particular, ICS finds clustering directions using a relative eigen-decomposition of two scatter matrices with different levels of robustness; PP is a one-dimensional variant of ICS. Each of the two scatter matrices includes an implicit or explicit choice of location. However, when different measures of location are used, ICS and PP can behave counter-intuitively. In this paper we explore this behavior in a variety of examples and propose a simple and natural solution: use the same measure of location for both scatter matrices.
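A minimal sketch of the relative eigen-decomposition with one shared location is given here. It is an assumption-laden illustration rather than the paper's procedure: the function name is hypothetical, the first scatter is the plain second-moment matrix about the chosen location, and the second is a hypothetical one-step weighted scatter with weights 1/d², where d² is the squared Mahalanobis distance. The point is only that the same `location` argument feeds both matrices.

```python
import numpy as np


def ics_common_location(X, location):
    """Solve S2 b = lambda S1 b with both scatters taken about `location`."""
    X = np.asarray(X, dtype=float)
    D = X - np.asarray(location, dtype=float)
    n = D.shape[0]
    # Scatter 1: second-moment matrix about the shared location.
    S1 = D.T @ D / n
    # Squared Mahalanobis distances relative to S1 and the same location.
    d2 = np.einsum("ij,jk,ik->i", D, np.linalg.inv(S1), D)
    # Scatter 2 (illustrative choice): down-weight distant points by 1/d^2.
    w = 1.0 / np.maximum(d2, 1e-12)
    S2 = (D * w[:, None]).T @ D / w.sum()
    # Reduce the generalized problem to a symmetric one via S1 = L L^T.
    L = np.linalg.cholesky(S1)
    L_inv = np.linalg.inv(L)
    lam, U = np.linalg.eigh(L_inv @ S2 @ L_inv.T)
    B = L_inv.T @ U  # columns are the invariant-coordinate directions
    return lam, B, S1, S2
```

For example, `ics_common_location(X, np.median(X, axis=0))` uses the coordinate-wise median for both scatter matrices, and `D @ B` then gives the invariant coordinates. The returned `B` satisfies `B.T @ S1 @ B = I` and `B.T @ S2 @ B = diag(lam)`.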

    Activity report. 2010


    BIG-DATA and the Challenges for Statistical Inference and Economics Teaching and Learning

    The increasing automation of data collection, in structured or unstructured formats, together with the development of reading, concatenation and comparison algorithms and the growing analytical skills that characterize the era of Big Data, cannot be considered only a technological achievement; it is also an organizational, methodological and analytical challenge for knowledge, which is necessary to generate opportunities and added value. In fact, exploiting the potential of Big Data involves all fields of community activity; and given its ability to extract behaviour patterns, we are interested in the challenges for the field of teaching and learning, particularly in statistical inference and economic theory. Big Data can improve the understanding of concepts, models and techniques used in both statistical inference and economic theory, and it can also generate reliable and robust short- and long-term predictions. These facts have led to a demand for analytical capabilities, which in turn encourages teachers and students to demand access to the massive information produced by individuals, companies and public and private organizations in their transactions and interrelationships. Mass data (Big Data) is changing the way people access, understand and organize knowledge, which in turn is causing a shift in the approach to statistics and economics teaching, considering them as a real way of thinking rather than just operational and technical disciplines. Hence, the question is how teachers can use automated collection and analytical skills to their advantage when teaching statistics and economics, and whether it will lead to a change in what is taught and how it is taught.
    Peñaloza Figueroa, J.; Vargas Perez, C. (2017). BIG-DATA and the Challenges for Statistical Inference and Economics Teaching and Learning. Multidisciplinary Journal for Education, Social and Technological Sciences. 4(1):64-87. doi:10.4995/muse.2017.6350

    Eigenvectors of a kurtosis matrix as interesting directions to reveal cluster structure

    In this paper we study the properties of a kurtosis matrix and propose its eigenvectors as interesting directions to reveal the possible cluster structure of a data set. Under a mixture of elliptical distributions with proportional scatter matrix, it is shown that a subset of the eigenvectors of the fourth-order moment matrix corresponds to Fisher's linear discriminant subspace. The eigenvectors of the estimated kurtosis matrix are consistent estimators of this subspace, and their calculation is easy to implement and computationally efficient, which is particularly favourable when the ratio n/p is large.
    Keywords: Cluster analysis; Dimension reduction; Fisher subspace; Kurtosis matrix; Multivariate kurtosis; Projection pursuit

    Hyperspectral Anomaly Detection: Comparative Evaluation in Scenes with Diverse Complexity


    A kernel method for independent component analysis (Ydinmenetelmä riippumattomien komponenttien analyysiin)

    This thesis introduces and derives a new method, kernel FOBI, which is a kernel method for independent component analysis. It also introduces MDS-FOBI, with which the FOBI solution can be produced from the distance matrix of the observations alone. As an introduction to the topic, principal component analysis, its kernel version and multidimensional scaling are presented and derived, and independent component analysis is introduced and its linear FOBI solution derived. Finally, the methods are compared on three data sets. In independent component analysis, the variables of the observation vector are assumed to be linear combinations of independent random variables. The aim is to attribute the variation back to these components. FOBI is one solution to the independent component problem, and it is based on the eigendecomposition of the kurtosis matrix formed from fourth moments. The thesis presents a way to compute FOBI using only the inner-product matrix of the observations. When the inner-product matrix is replaced by a kernel matrix, kernel FOBI is obtained, and when it is replaced by a certain matrix based on the distance matrix, MDS-FOBI is obtained. Examining the methods shows that kernel FOBI can be seen as kernel principal component analysis to whose scores linear FOBI is applied. A study on simulated data shows that the components produced by kernel FOBI can be used to separate groups in the data. An example on a second data set shows that kernel FOBI is also suitable for producing so-called eigenfaces. The advantage of a suitable kernel function is then that it separates edges in the images fairly sharply, even though the faces are in slightly different positions from image to image. The third example shows that MDS-FOBI can be used in the same way as ordinary multidimensional scaling. In that case, a property of MDS-FOBI is that it separates the points into groups somewhat more strongly than ordinary multidimensional scaling.
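The claim that FOBI can be computed from the inner-product matrix alone can be illustrated with a short NumPy sketch. This is a hedged reconstruction for the linear case, not the thesis's own derivation: the function names and the `rank` parameter are hypothetical, and the Gram matrix is assumed to be built from column-centered data. The eigenvectors of the n x n Gram matrix, scaled by sqrt(n), serve as whitened scores, and the usual fourth-moment rotation is then applied to them.

```python
import numpy as np


def _kurtosis_rotation(Z):
    """Rotate whitened scores Z by the eigenvectors of their kurtosis matrix."""
    n = Z.shape[0]
    sq_norms = np.sum(Z ** 2, axis=1)
    K = (Z * sq_norms[:, None]).T @ Z / n
    kvals, W = np.linalg.eigh(K)
    return kvals, Z @ W


def fobi_from_data(X):
    """Standard linear FOBI: whiten with the covariance, then rotate."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    evals, evecs = np.linalg.eigh(Xc.T @ Xc / n)
    Z = Xc @ evecs @ np.diag(evals ** -0.5) @ evecs.T
    return _kurtosis_rotation(Z)


def fobi_from_gram(G, rank):
    """FOBI from a centered n x n inner-product matrix G = Xc @ Xc.T only."""
    n = G.shape[0]
    evals, evecs = np.linalg.eigh(G)
    top = np.argsort(evals)[::-1][:rank]
    # Leading eigenvectors scaled by sqrt(n) have identity sample covariance.
    Z = np.sqrt(n) * evecs[:, top]
    return _kurtosis_rotation(Z)
```

With distinct kurtosis eigenvalues, the two routes should return the same components up to sign. Replacing `G` with a centered kernel matrix would correspond to the abstract's description of kernel FOBI as kernel principal component analysis followed by linear FOBI on the scores, although the thesis's exact construction is not reproduced here.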

    Activity report. 2013
