25,870 research outputs found
On the role of pre and post-processing in environmental data mining
The quality of discovered knowledge is highly depending on data quality. Unfortunately real data use to contain noise, uncertainty, errors, redundancies or even irrelevant information. The more complex is the reality to be analyzed, the higher the risk of getting low quality data. Knowledge Discovery from Databases (KDD) offers a global framework to prepare data in the right form to perform correct analyses. On the other hand, the quality of decisions taken upon KDD results, depend not only on the quality of the results themselves, but on the capacity of the system to communicate those results in an understandable form. Environmental systems are particularly complex and environmental users particularly require clarity in their results. In this paper some details about how this can be achieved are provided. The role of the pre and post processing in the whole process of Knowledge Discovery in environmental systems is discussed
Recommended from our members
The role of human factors in stereotyping behavior and perception of digital library users: A robust clustering approach
To deliver effective personalization for digital library users, it is necessary to identify which human factors are most relevant in determining the behavior and perception of these users. This paper examines three key human factors: cognitive styles, levels of expertise and gender differences, and utilizes three individual clustering techniques: k-means, hierarchical clustering and fuzzy clustering to understand user behavior and perception. Moreover, robust clustering, capable of correcting the bias of individual clustering techniques, is used to obtain a deeper understanding. The robust clustering approach produced results that highlighted the relevance of cognitive style for user behavior, i.e., cognitive style dominates and justifies each of the robust clusters created. We also found that perception was mainly determined by the level of expertise of a user. We conclude that robust clustering is an effective technique to analyze user behavior and perception
A survey on utilization of data mining approaches for dermatological (skin) diseases prediction
Due to recent technology advances, large volumes of medical data is obtained. These data contain valuable information. Therefore data mining techniques can be used to extract useful patterns. This paper is intended to introduce data mining and its various techniques and a survey of the available literature on medical data mining. We emphasize mainly on the application of data mining on skin diseases. A categorization has been provided based on the different data mining techniques. The utility of the various data mining methodologies is highlighted. Generally association mining is suitable for extracting rules. It has been used especially in cancer diagnosis. Classification is a robust method in medical mining. In this paper, we have summarized the different uses of classification in dermatology. It is one of the most important methods for diagnosis of erythemato-squamous diseases. There are different methods like Neural Networks, Genetic Algorithms and fuzzy classifiaction in this topic. Clustering is a useful method in medical images mining. The purpose of clustering techniques is to find a structure for the given data by finding similarities between data according to data characteristics. Clustering has some applications in dermatology. Besides introducing different mining methods, we have investigated some challenges which exist in mining skin data
Stigmergy-based modeling to discover urban activity patterns from positioning data
Positioning data offer a remarkable source of information to analyze crowds
urban dynamics. However, discovering urban activity patterns from the emergent
behavior of crowds involves complex system modeling. An alternative approach is
to adopt computational techniques belonging to the emergent paradigm, which
enables self-organization of data and allows adaptive analysis. Specifically,
our approach is based on stigmergy. By using stigmergy each sample position is
associated with a digital pheromone deposit, which progressively evaporates and
aggregates with other deposits according to their spatiotemporal proximity.
Based on this principle, we exploit positioning data to identify high density
areas (hotspots) and characterize their activity over time. This
characterization allows the comparison of dynamics occurring in different days,
providing a similarity measure exploitable by clustering techniques. Thus, we
cluster days according to their activity behavior, discovering unexpected urban
activity patterns. As a case study, we analyze taxi traces in New York City
during 2015
- …