Automatic Bayesian Density Analysis
Making sense of a dataset in an automatic and unsupervised fashion is a
challenging problem in statistics and AI. Classical approaches for exploratory
data analysis are usually not flexible enough to deal with the uncertainty
inherent to real-world data: they are often restricted to fixed latent
interaction models and homogeneous likelihoods; they are sensitive to missing,
corrupt and anomalous data; moreover, their expressiveness generally comes at
the price of intractable inference. As a result, supervision from statisticians
is usually needed to find the right model for the data. However, since domain
experts are not necessarily also experts in statistics, we propose Automatic
Bayesian Density Analysis (ABDA) to make exploratory data analysis accessible
at large. Specifically, ABDA allows for automatic and efficient missing value
estimation, statistical data type and likelihood discovery, anomaly detection
and dependency structure mining, on top of providing accurate density
estimation. Extensive empirical evidence shows that ABDA is a suitable tool for
automatic exploratory analysis of mixed continuous and discrete tabular data.
Comment: In proceedings of the Thirty-Third AAAI Conference on Artificial
Intelligence (AAAI-19)
Methods of Hierarchical Clustering
We survey agglomerative hierarchical clustering algorithms and discuss
efficient implementations that are available in R and other software
environments. We look at hierarchical self-organizing maps, and mixture models.
We review grid-based clustering, focusing on hierarchical density-based
approaches. Finally we describe a recently developed very efficient (linear
time) hierarchical clustering algorithm, which can also be viewed as a
hierarchical grid-based algorithm.
Comment: 21 pages, 2 figures, 1 table, 69 references
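As a rough illustration of the agglomerative idea mentioned in the abstract above, the following is a minimal sketch of single-linkage clustering in plain Python. It is not taken from the surveyed paper: the function name `single_linkage`, its parameters, and the choice of Euclidean distance are assumptions made for this example, and the naive merge loop is roughly cubic in the number of points, unlike the efficient and linear-time algorithms the survey discusses.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def single_linkage(points, num_clusters):
    """Naive agglomerative clustering (hypothetical helper for illustration).

    Starts with one cluster per point and repeatedly merges the two
    clusters whose closest members are nearest to each other, until
    only `num_clusters` clusters remain.
    Returns (clusters, merges), where clusters hold point indices and
    merges records (cluster_a, cluster_b, distance) in merge order.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > num_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merges.append((clusters[a], clusters[b], linkage(clusters[a], clusters[b])))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)

    return clusters, merges


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
    clusters, merges = single_linkage(pts, 3)
    print(sorted(sorted(c) for c in clusters))
```

The recorded merge order and distances are the information a dendrogram would display; cutting the sequence at a different `num_clusters` yields a coarser or finer partition of the same hierarchy.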
A survey on utilization of data mining approaches for dermatological (skin) diseases prediction
Due to recent advances in technology, large volumes of medical data are being collected. These data contain valuable information, and data mining techniques can be used to extract useful patterns from them. This paper introduces data mining and its main techniques and surveys the available literature on medical data mining, focusing on applications to skin diseases. The surveyed work is categorized by data mining technique, and the utility of each methodology is highlighted. Association mining is generally suitable for extracting rules and has been used especially in cancer diagnosis. Classification is a robust method in medical mining; we summarize its different uses in dermatology, where it is one of the most important methods for diagnosing erythemato-squamous diseases. Methods applied in this area include neural networks, genetic algorithms and fuzzy classification. Clustering is useful in medical image mining: it seeks structure in the data by grouping items that are similar according to their characteristics, and it has some applications in dermatology. Besides introducing the different mining methods, we investigate some of the challenges that arise in mining skin data.