4,852 research outputs found
A systematic review of data quality issues in knowledge discovery tasks
Hay un gran crecimiento en el volumen de datos porque las organizaciones capturan permanentemente la cantidad colectiva de datos para lograr un mejor proceso de toma de decisiones. El desafío mas fundamental es la exploración de los grandes volúmenes de datos y la extracción de conocimiento útil para futuras acciones por medio de tareas para el descubrimiento del conocimiento; sin embargo, muchos datos presentan mala calidad. Presentamos una revisión sistemática de los asuntos de calidad de datos en las áreas del descubrimiento de conocimiento y un estudio de caso aplicado a la enfermedad agrícola conocida como la roya del café.Large volume of data is growing because the organizations are continuously capturing the collective amount of data for better decision-making process. The most fundamental challenge is to explore the large volumes of data and extract useful knowledge for future actions through knowledge discovery tasks, nevertheless many data has poor quality. We presented a systematic review of the data quality issues in knowledge discovery tasks and a case study applied to agricultural disease named coffee rust
Multivariate Approaches to Classification in Extragalactic Astronomy
Clustering objects into synthetic groups is a natural activity of any
science. Astrophysics is not an exception and is now facing a deluge of data.
For galaxies, the one-century old Hubble classification and the Hubble tuning
fork are still largely in use, together with numerous mono-or bivariate
classifications most often made by eye. However, a classification must be
driven by the data, and sophisticated multivariate statistical tools are used
more and more often. In this paper we review these different approaches in
order to situate them in the general context of unsupervised and supervised
learning. We insist on the astrophysical outcomes of these studies to show that
multivariate analyses provide an obvious path toward a renewal of our
classification of galaxies and are invaluable tools to investigate the physics
and evolution of galaxies.Comment: Open Access paper.
http://www.frontiersin.org/milky\_way\_and\_galaxies/10.3389/fspas.2015.00003/abstract\>.
\<10.3389/fspas.2015.00003 \&g
Unsupervised Feature Learning through Divergent Discriminative Feature Accumulation
Unlike unsupervised approaches such as autoencoders that learn to reconstruct
their inputs, this paper introduces an alternative approach to unsupervised
feature learning called divergent discriminative feature accumulation (DDFA)
that instead continually accumulates features that make novel discriminations
among the training set. Thus DDFA features are inherently discriminative from
the start even though they are trained without knowledge of the ultimate
classification problem. Interestingly, DDFA also continues to add new features
indefinitely (so it does not depend on a hidden layer size), is not based on
minimizing error, and is inherently divergent instead of convergent, thereby
providing a unique direction of research for unsupervised feature learning. In
this paper the quality of its learned features is demonstrated on the MNIST
dataset, where its performance confirms that indeed DDFA is a viable technique
for learning useful features.Comment: Corrected citation formattin
- …