Data Mining Decision Trees in Economy
Data Mining is the extraction of previously unknown and potentially useful information from data. Using Data Mining Decision Trees techniques, our investigation illustrates how to extract meaningful socio-economic knowledge from large data sets. Our tests find 5 attribute selection measures that perform more accurately than the best of the 17 algorithms presented in the literature.
Keywords: Data Mining, Decision Trees, classification error rate
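As a rough illustration of what an attribute selection measure does, the sketch below computes information gain, one widely used measure for decision trees. The abstract does not say which measures were tested, so this choice, the function names, and the toy records are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` after splitting `rows` on `attribute`."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    # Weighted average entropy of the partitions created by the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical toy records, not data from the paper.
rows = [
    {"sector": "agri", "size": "small", "growth": "low"},
    {"sector": "agri", "size": "large", "growth": "low"},
    {"sector": "tech", "size": "small", "growth": "high"},
    {"sector": "tech", "size": "large", "growth": "high"},
]

for attr in ("sector", "size"):
    print(attr, information_gain(rows, attr, "growth"))
```

In this toy data, "sector" separates the classes perfectly while "size" carries no information, so a tree builder using this measure would split on "sector" first.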
Region Based Data Mining on Agriculture Data
Spatial Data Mining is the process of discovering interesting and previously unknown, but potentially useful, patterns in large spatial databases. Most relationships in spatial datasets are regional, so there is a great need for regional regression methods that derive regional regression functions reflecting the different spatial characteristics of different regions. A central challenge in spatial data mining is the efficiency of its algorithms, given the often huge amount of spatial data and the complexity of spatial data types and spatial access methods. This paper proposes a regional regression technique for regions that are defined by a categorical attribute, in particular soil type. The result is a series of regions grouped hierarchically according to their similarity.
Intelligent data analysis - support for development of SMEs sector
The paper studies the possibilities of applying intelligent data analysis to discover knowledge hidden in the data of small and medium-sized enterprises (SMEs) in the province of Vojvodina. The knowledge revealed by intelligent analysis, which is not accessible by any other means, could be a valuable starting point for devising proactive and preventive actions for the development of the SMEs sector.
Keywords: Intelligent data analysis, CRISP-DM, clustering, small and medium enterprises, Research and Development/Tech Change/Emerging Technologies, C8, L2
Data Mining Applications in Big Data
Data mining is the process of extracting hidden, unknown, but potentially useful information from massive data. Big Data has a great impact on scientific discovery and value creation. This paper introduces methods in data mining and technologies in Big Data. It discusses the challenges of data mining in general and of data mining with big data, and presents some of the technological progress made in both.
Data mining in Cloud Computing
This paper describes how data mining is used in cloud computing. Data mining is used for extracting potentially useful information from raw data. The integration of data mining techniques into normal day-to-day activities has become commonplace. Every day people are confronted with targeted advertising, and data mining techniques help businesses to become more efficient by reducing costs. Data mining techniques and applications are very much needed in the cloud computing paradigm. Implementing data mining techniques through cloud computing will allow users to retrieve meaningful information from a virtually integrated data warehouse, which reduces the costs of infrastructure and storage.
New probabilistic interest measures for association rules
Mining association rules is an important technique for discovering meaningful
patterns in transaction databases. Many different measures of interestingness
have been proposed for association rules. However, these measures fail to take
the probabilistic properties of the mined data into account. In this paper, we
start by presenting a simple probabilistic framework for transaction data
which can be used to simulate transaction data when no associations are
present. We use such data and a real-world database from a grocery outlet to
explore the behavior of confidence and lift, two popular interest measures used
for rule mining. The results show that confidence is systematically influenced
by the frequency of the items in the left-hand side of rules and that lift
performs poorly at filtering random noise in transaction data. Based on the
probabilistic framework we develop two new interest measures, hyper-lift and
hyper-confidence, which can be used to filter or order mined association rules.
The new measures show significantly better performance than lift for
applications where spurious rules are problematic.
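For readers unfamiliar with the two standard measures discussed in this abstract, here is a minimal sketch of confidence and lift using their usual textbook definitions. The function names and the toy transactions are illustrative assumptions. The paper's new measures, hyper-lift and hyper-confidence, are not implemented here.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, lhs, rhs):
    """Estimate of P(rhs | lhs): support(lhs and rhs) / support(lhs)."""
    return support(transactions, set(lhs) | set(rhs)) / support(transactions, lhs)


def lift(transactions, lhs, rhs):
    """Confidence divided by support(rhs); 1.0 suggests independence."""
    return confidence(transactions, lhs, rhs) / support(transactions, rhs)


# Hypothetical toy transactions, not the grocery data used in the paper.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(confidence(transactions, {"bread"}, {"milk"}))
print(lift(transactions, {"bread"}, {"milk"}))
```

Both values are computed from raw co-occurrence counts alone, which is why item frequency and random noise can distort them.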
Discovery of Frequent Itemsets: Frequent Item Tree-Based Approach
Mining frequent patterns in large transactional databases is a highly researched area in the field of data mining. Existing frequent pattern discovery algorithms suffer from high memory dependency when mining large amounts of data, as well as high computational and I/O costs. Additionally, the recursive mining process used to mine these structures is too voracious in its use of memory. In this paper, we describe a more efficient algorithm for mining complete frequent itemsets from transactional databases. The suggested algorithm is partially based on the FP-tree hypothesis and extracts the frequent itemsets directly from the tree. Another benefit of the new method is that its memory requirement is independent of the number of processed transactions. We present performance comparisons of our algorithm against the Apriori algorithm and FP-growth.
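To make the task concrete, the following is a compact sketch of the classic level-wise Apriori procedure, one of the two baselines named in the abstract. It is not the tree-based algorithm the paper proposes, whose details the abstract does not give. The function name, the `min_count` parameter, and the toy data are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {frozenset: count} for all itemsets seen in >= min_count transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(frequent)
    k = 2
    while frequent:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a, b in combinations(list(frequent), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k - 1))}
        # Count step: one full pass over the transactions per level.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_count}
        result.update(frequent)
        k += 1
    return result


# Hypothetical toy database.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
itemsets = apriori(db, min_count=3)
print(sorted((sorted(s), c) for s, c in itemsets.items()))
```

The repeated full pass over the database at each level is the kind of computational and I/O cost that tree-based approaches aim to reduce.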
- …