Multivariate Approaches to Classification in Extragalactic Astronomy
Clustering objects into synthetic groups is a natural activity of any
science. Astrophysics is not an exception and is now facing a deluge of data.
For galaxies, the century-old Hubble classification and the Hubble tuning
fork are still largely in use, together with numerous mono- or bivariate
classifications most often made by eye. However, a classification must be
driven by the data, and sophisticated multivariate statistical tools are used
more and more often. In this paper we review these different approaches in
order to situate them in the general context of unsupervised and supervised
learning. We emphasize the astrophysical outcomes of these studies to show that
multivariate analyses provide an obvious path toward a renewal of our
classification of galaxies and are invaluable tools to investigate the physics
and evolution of galaxies.
Comment: Open Access paper.
http://www.frontiersin.org/milky_way_and_galaxies/10.3389/fspas.2015.00003/abstract
<10.3389/fspas.2015.00003>
History of art paintings through the lens of entropy and complexity
Art is the ultimate expression of human creativity that is deeply influenced
by the philosophy and culture of the corresponding historical epoch. The
quantitative analysis of art is therefore essential for better understanding
human cultural evolution. Here we present a large-scale quantitative analysis
of almost 140 thousand paintings, spanning nearly a millennium of art history.
Based on the local spatial patterns in the images of these paintings, we
estimate the permutation entropy and the statistical complexity of each
painting. These measures map the degree of visual order of artworks into a
scale of order-disorder and simplicity-complexity that locally reflects
qualitative categories proposed by art historians. The dynamical behavior of
these measures reveals a clear temporal evolution of art, marked by transitions
that agree with the main historical periods of art. Our research shows that
different artistic styles have a distinct average degree of entropy and
complexity, thus allowing a hierarchical organization and clustering of styles
according to these metrics. We have further verified that the identified groups
correspond well with the textual content used to qualitatively describe the
styles, and that the employed complexity-entropy measures can be used for an
effective classification of artworks.
Comment: 10 two-column pages, 5 figures; accepted for publication in PNAS
[supplementary information available at
http://www.pnas.org/highwire/filestream/824089/field_highwire_adjunct_files/0/pnas.1800083115.sapp.pdf]
Evolutionary framework for DNA Microarray Cluster Analysis
This research proposes an evolutionary framework that combines a hierarchical clustering
method based on an evolutionary model, a set of cluster validation measures,
and a tool for visualizing clusterings. The goal is to create a suitable framework for
extracting knowledge from DNA microarray data. On the one hand, the evolutionary
clustering model of our framework is a novel alternative that attempts to solve some of the
problems present in existing clustering methods. On the other hand, our clustering
visualization alternative, implemented as a tool, incorporates new properties and new
visualization components, which make it possible to validate and analyze the results of the clustering task. In this
way, the integration of the evolutionary clustering model with the visual clustering model makes our
evolutionary framework a novel data mining application compared with conventional methods.
Holistic corpus-based dialectology
This paper is concerned with sketching future directions for corpus-based dialectology. We advocate a holistic approach to the study of geographically conditioned linguistic variability, and we present a suitable methodology, 'corpus-based dialectometry', in exactly this spirit. Specifically, we argue that in order to live up to the potential of the corpus-based method, practitioners need to (i) abandon their exclusive focus on individual linguistic features in favor of the study of feature aggregates, (ii) draw on computationally advanced multivariate analysis techniques (such as multidimensional scaling, cluster analysis, and principal component analysis), and (iii) aid interpretation of empirical results by marshalling state-of-the-art data visualization techniques. To exemplify this line of analysis, we present a case study which explores joint frequency variability of 57 morphosyntax features in 34 dialects all over Great Britain.
A visual analytics framework for cluster analysis of DNA microarray data
Cluster analysis of DNA microarray data is an important but difficult task in knowledge discovery processes.
Many clustering methods are applied to the analysis of gene expression data, but none of them
is able to deal fully with the challenges that this technology raises. Because of this, many
applications have been developed for visually representing clustering algorithm results on DNA microarray
data, usually providing dendrogram and heat map visualizations. Most of these applications focus
only on the above visualizations, and do not offer further visualization components to validate the
clustering methods or to validate one another. This paper proposes using a visual analytics framework
in cluster analysis of gene expression data. Additionally, it presents a new method for finding cluster
boundaries based on properties of metric spaces. Our approach presents a set of visualization components
able to interact with each other; namely, parallel coordinates, cluster boundary genes, 3D cluster
surfaces and DNA microarray visualizations as heat maps. Experimental results have shown that our
framework can be very useful in the process of more fully understanding DNA microarray data. The
software has been implemented in Java, and the framework is publicly available at
http://www.analiticavisual.com/jcastellanos/3DVisualCluster/3D-VisualCluster.
This work has been partially funded by the Spanish Ministry of Science and Innovation, the Plan E from the Spanish Government, the European Union from the ERDF (TIN2009-14057-C03-02)