    Neural Networks in Data Mining

    Data mining is the extraction of hidden predictive information from large databases. It is useful in many fields, including business, engineering, and web data analysis. Classification is a difficult data mining task that can be addressed with a variety of algorithms. The most common model functions in data mining include classification, clustering, rule generation, and knowledge discovery. Many technologies are available for data mining, including artificial neural networks, regression, and decision trees. This paper studies data mining based on neural networks in detail, along with the key technologies and ways of implementing it.

    The Impact of Data Imputation Methodologies on Knowledge Discovery

    The purpose of this research is to investigate the impact of Data Imputation Methodologies that are employed when a specific Data Mining algorithm is utilized within a KDD (Knowledge Discovery in Databases) process. This study will employ certain Knowledge Discovery processes that are widely accepted in both the academic and commercial worlds. Several Knowledge Discovery models will be developed utilizing secondary data containing known correct values. Tests will be conducted on the secondary data both before and after storing data instances with known results and then identifying imprecise data values. One of the integral stages in the accomplishment of successful Knowledge Discovery is the Data Mining phase. The actual Data Mining process deals significantly with prediction, estimation, classification, pattern recognition and the development of association rules. Neural Networks are the most commonly selected tools for Data Mining classification and prediction. Neural Networks employ various types of Transfer Functions when outputting data. The most commonly employed Transfer Function is the s-Sigmoid Function. Various Knowledge Discovery Models from various research and business disciplines were tested using this framework. However, missing and inconsistent data have been pervasive problems throughout the history of data analysis, ever since the origin of data collection. Due to advancements in the capacities of data storage and the proliferation of computer software, more historical data is being collected and analyzed today than ever before. The issue of missing data must be addressed, since ignoring this problem can introduce bias into the models being evaluated and lead to inaccurate data mining conclusions.
The objective of this research is to address the impact of Missing Data and Data Imputation on the Data Mining phase of Knowledge Discovery when Neural Networks that employ an s-Sigmoid Transfer Function are confronted with Missing Data and Data Imputation methodologies.
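
As a rough illustration of the two ingredients this abstract names, the sketch below pairs a sigmoid transfer function with one simple imputation method. It is not taken from the paper: it assumes that "s-Sigmoid" refers to the standard logistic function, and it picks mean imputation only as an example, since the abstract does not say which imputation methods were compared. The names `sigmoid`, `mean_impute`, and `neuron_output` are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic transfer function; maps any real input into the range 0..1."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def mean_impute(column):
    """Replace missing entries (None) with the mean of the observed values.

    Mean imputation is an assumption made for this sketch, not the paper's method.
    """
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]


def neuron_output(inputs, weights, bias):
    """Output of a single neuron: weighted sum of inputs passed through the sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


# Usage: fill the gap in a feature column, then feed one record to a neuron.
feature = mean_impute([1.0, None, 3.0])
activation = neuron_output([feature[1], 0.5], weights=[0.4, -0.2], bias=0.1)
```

Because the imputed value flows straight into the weighted sum, any bias introduced by the imputation method shifts the neuron's activation, which is the kind of effect the abstract sets out to measure.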

    A Survey of Parallel Data Mining

    With the fast, continuous increase in the number and size of databases, parallel data mining is a natural and cost-effective approach to tackling the problem of scalability in data mining. Recently there has been considerable research on parallel data mining. However, most projects focus on the parallelization of a single kind of data mining algorithm or paradigm. This paper surveys parallel data mining from a broader perspective. More precisely, we discuss the parallelization of data mining algorithms from four knowledge discovery paradigms, namely rule induction, instance-based learning, genetic algorithms and neural networks. Using the lessons learned from this discussion, we also derive a set of heuristic principles for designing efficient parallel data mining algorithms.

    Encapsulation of Soft Computing Approaches within Itemset Mining: A Survey

    Data mining discovers patterns and trends by extracting knowledge from large databases. Soft computing techniques such as fuzzy logic, neural networks, genetic algorithms, and rough sets aim to exploit the tolerance for imprecision and uncertainty in order to achieve tractability, robustness, and low-cost solutions. Fuzzy logic and rough sets are suitable for handling different types of uncertainty. Neural networks provide good learning and generalization. Genetic algorithms provide efficient search algorithms for selecting a model from mixed-media data. Data mining refers to information extraction, while soft computing is used for information processing. For effective knowledge discovery from large databases, soft computing and data mining can be merged. Association rule mining (ARM) and itemset mining focus on finding the most frequent itemsets and the corresponding association rules, and on extracting rare itemsets, including temporal and fuzzy concepts, from the discovered patterns. This survey paper explores the usage of soft computing approaches in itemset utility mining.
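
To make "frequent itemsets" and "association rules" concrete, here is a minimal sketch that counts itemset support by brute-force enumeration and derives rule confidence from it. It is not the survey's method and it is not Apriori or any soft computing variant. The function names, the `max_size` cap, and the toy transactions are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets of up to max_size items.

    Support is the fraction of transactions that contain the itemset.
    Brute-force enumeration; suitable only for small illustrative inputs.
    """
    n = len(transactions)
    counts = {}
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] = counts.get(itemset, 0) + 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def confidence(supports, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent, from known supports."""
    both = tuple(sorted(set(antecedent) | set(consequent)))
    return supports[both] / supports[tuple(sorted(antecedent))]


# Hypothetical market-basket data.
baskets = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
supports = frequent_itemsets(baskets, min_support=0.5)
rule_conf = confidence(supports, ["bread"], ["milk"])
```

In this toy data the pair (bread, milk) appears in two of four baskets, so its support is 0.5, and the rule bread -> milk has confidence 0.5 / 0.75, about 0.67.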

    The Use of Intelligent Systems for Planning and Scheduling of Product Development Projects

    The paper investigates the use of intelligent systems to identify the factors that significantly influence the duration of new product development. These factors are identified on the basis of an internal database of a production enterprise and further used to estimate the duration of phases in product development projects. In the paper, some models and methodologies of the knowledge discovery process are compared and a method of knowledge acquisition from an internal database is proposed. The presented approach is dedicated to industrial enterprises that develop modifications of previous products and are interested in obtaining more precise estimates for project planning and scheduling. The example contains four stages of the knowledge discovery process: data selection, data transformation, data mining, and interpretation of patterns. The example also presents a performance comparison of intelligent systems in the context of variable reduction and preprocessing. Among data mining techniques, artificial neural networks and the fuzzy neural system are chosen to seek relationships between the duration of a project phase and other data stored in the information system of an enterprise.

    Risk Management Based on Expert Rules and Data Mining: A Case Study in Insurance

    Correctness, transparency and effectiveness are the principal attributes of knowledge derived from databases using data mining. Current data mining research focuses on improving the efficiency of algorithms for knowledge discovery. However, improving the algorithms is often not sufficient. The limitations of data mining can only be overcome by integrating the knowledge of experts in the field, encoded in some accessible way, with knowledge derived from patterns in the databases. In this paper we discuss an approach for combining expert knowledge and knowledge derived from transactional databases. The proposed approach is applicable to a wide variety of risk management problems. We illustrate the approach with a case study on fraud detection in an insurance company. The case clearly shows that the combination of expert knowledge with monotonic neural networks leads to significant performance improvements.

    Efficient K-Mean Algorithm for Large Dataset

    The term data mining refers to discovering knowledge from large amounts of data. Many software tools have been developed for knowledge discovery; these data mining tools draw on statistics, machine learning, and neural networks. K-means and K-medoids are among the simplest and most widely used partition-based unsupervised learning algorithms for the well-known clustering problem. The procedure follows a simple way of partitioning a given data set into a certain number of clusters. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. Stored data is used to locate data in predetermined groups, called classes. Data items grouped according to logical relationships or consumer preferences form clusters. Data can be mined to identify associations, and to anticipate behavior patterns and trends, called sequential patterns.
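
The partitioning procedure mentioned in this abstract can be sketched with the standard K-means iteration (Lloyd's algorithm): assign each point to its nearest centroid, then move each centroid to the mean of its members. This is the textbook baseline, not the "efficient" variant the paper proposes, whose details the abstract does not give. The function names, the seeded random initialization, and the sample points are assumptions.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Plain K-means on a list of coordinate tuples.

    Returns (centroids, labels), where labels[i] is the index of the
    centroid assigned to points[i].
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    # Initialize centroids with k distinct points chosen at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return centroids, labels


# Two well-separated groups of hypothetical 2-D points.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(data, k=2)
```

Each iteration costs on the order of the number of points times k distance computations, which is why large data sets motivate more efficient variants such as the one this paper targets.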