    Improve Data Mining and Knowledge Discovery Through the Use of MatLab

    Data mining is widely used to mine business, engineering, and scientific data. It applies pattern-based queries, searches, or other analyses to one or more electronic databases or datasets in order to discover a predictive pattern or an anomaly indicative of system failure, criminal or terrorist activity, and so on. Various algorithms, techniques and methods are used to mine data, including neural networks, genetic algorithms, decision trees, the nearest neighbor method, rule induction, association analysis, slice and dice, segmentation, and clustering. These pattern-detection methods have been used in the development of numerous open source and commercially available data mining products and technologies. Data mining is best realized when latent information in a large quantity of stored data is discovered. No one technique solves all data mining problems; the challenge is to select algorithms or methods appropriate for strengthening data/text mining and trending within given datasets. In recent years, throughout industry, academia and government agencies, thousands of data systems have been designed and tailored to serve specific engineering and business needs. Many of these systems use databases with relational algebra and structured query language to categorize and retrieve data. In these systems, data analyses are limited and require prior explicit knowledge of metadata and database relations; they lack exploratory data mining and the discovery of latent information. This presentation introduces MatLab(R) (MATrix LABoratory), an engineering and scientific data analysis tool, as a means to perform data mining. MatLab was originally intended to perform purely numerical calculations (a glorified calculator). Now, in addition to having hundreds of mathematical functions, it is a programming language with hundreds of built-in standard functions and numerous available toolboxes. MatLab's ease of data processing and visualization, together with its large collection of built-in functions and toolboxes, makes it suitable both for numerical computations and simulations and as a data mining tool. Engineers and scientists can take advantage of the readily available functions and toolboxes to gain wider insight in their respective data mining experiments.

    Feature extraction algorithms from MRI to evaluate quality parameters on meat products by using data mining

    This thesis proposes a new methodology to determine the quality characteristics of meat products (Iberian loin and ham) in a non-destructive way. To that end, new algorithms have been developed to analyze Magnetic Resonance Imaging (MRI), and data mining techniques have been applied to the data obtained from the images. The general procedure consists of obtaining MRI of meat products and applying different computer vision algorithms (mainly texture and fractal approaches), which allow the extraction of sets of computational features. Figure 1 shows the design of the proposed procedure. To achieve this, several studies have been carried out, based on: high-field and low-field MRI scanners; different acquisition sequences: Spin Echo (SE), Gradient Echo (GE) and Turbo 3D (T3D); different texture approaches: Gray Level Co-occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM) and Neighboring Gray Level Dependence Matrix (NGLDM); and fractal algorithms: Classical Fractal Algorithm (CFA), Fractal Texture Algorithm (FTA) and One Point Fractal Texture Algorithm (OPFTA). FTA [1] and OPFTA [2] have been developed in this thesis. Both allow MRI images to be analyzed properly, with OPFTA standing out for its simplicity and lower computational cost. At the same time, the meat products, Iberian hams and loins, were also analyzed by means of physico-chemical and sensory techniques. Databases were constructed with all these data. Different data mining techniques have been applied to them: deductive (Multiple Linear Regression (MLR)) [3], classification (Decision Trees (DT) and Rules-based Systems (RBS)) [4], and prediction techniques [5-7]. Figure 2 shows the MRI images of fresh and dry-cured Iberian loins (Figure 2A and 2B) and fresh and dry-cured hams (Figure 2C and 2D). The accuracy of the analysis of the quality parameters of Iberian ham and loin is affected by the MRI acquisition sequence, the algorithm used to analyze the images and the data mining technique applied. Considering the data mining techniques, MLR and DT are appropriate, respectively, to deduce physico-chemical parameters of hams and to classify hams as a function of salt content. Regarding the predictive technique, MLR could be indicated, since it allows obtaining equations to determine the physico-chemical characteristics and sensory attributes of Iberian loins and hams with a high degree of reliability, and analyzing the quality of these meat products in a non-destructive, efficient, effective and accurate way.
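    As a rough illustration of the GLCM texture approach named in the abstract above, the following minimal sketch builds a normalized co-occurrence matrix for a single pixel offset and derives three commonly used texture features. It is not the thesis's implementation: the function names `glcm` and `texture_features`, the toy 4x4 image, the horizontal offset and the homogeneity definition are all assumptions chosen for illustration.

```python
def glcm(image, dx=1, dy=0, levels=4):
    """Normalized gray-level co-occurrence matrix for one pixel offset.

    `image` is a list of rows of integer gray levels in [0, levels).
    The offset (dx, dy) and the non-symmetric counting are illustrative choices.
    """
    rows, cols = len(image), len(image[0])
    counts = [[0.0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1.0
    total = sum(sum(row) for row in counts)
    if total == 0:
        return counts
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Contrast, energy and homogeneity of a normalized GLCM `p`.

    Homogeneity uses the 1 / (1 + |i - j|) weighting; other definitions exist.
    """
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v ** 2 for _, _, v in cells),
        "homogeneity": sum(v / (1.0 + abs(i - j)) for i, j, v in cells),
    }


# Hypothetical 4-level toy image standing in for a quantized MRI region.
toy_image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
matrix = glcm(toy_image)
features = texture_features(matrix)
```

    In a pipeline like the one described, feature values of this kind, computed per image, would form the columns of the database that the regression and decision tree models are trained on.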

    RESEARCH ISSUES CONCERNING ALGORITHMS USED FOR OPTIMIZING THE DATA MINING PROCESS

    In this paper, we describe some of the most widely used data mining algorithms, which have had considerable utility and influence in the research community. A data mining algorithm can be regarded as a tool that creates a data mining model. After analyzing a set of data, an algorithm searches for specific trends and patterns, then defines the parameters of the mining model based on the results of this analysis. These parameters play a significant role in identifying and extracting actionable patterns and detailed statistics. The most important algorithms within this research cover topics such as clustering, classification, association analysis, statistical learning and link mining. After a brief description of each algorithm, we analyze its application potential and the research issues concerning the optimization of the data mining process. We then describe the most important data mining algorithms included in Microsoft and Oracle software products, give suggestions and criteria for choosing the most suitable algorithm for a given task, and outline the advantages offered by these software products. Keywords: data mining optimization, data mining algorithms, software solutions

    Ensembles of probability estimation trees for customer churn prediction

    Customer churn prediction is one of the most important elements of a company's Customer Relationship Management (CRM) strategy. In this study, two strategies are investigated to increase the lift performance of ensemble classification models, i.e. (i) using probability estimation trees (PETs) instead of standard decision trees as base classifiers; and (ii) implementing alternative fusion rules based on lift weights for the combination of ensemble members' outputs. Experiments are conducted for four popular ensemble strategies on five real-life churn data sets. In general, the results demonstrate how lift performance can be substantially improved by using alternative base classifiers and fusion rules. However, the effect varies for the different ensemble strategies. In particular, the results indicate an increase of lift performance of (i) Bagging by implementing C4.4 base classifiers, (ii) the Random Subspace Method (RSM) by using lift-weighted fusion rules, and (iii) AdaBoost, by implementing both
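    The abstract does not state which lift measure is used. As a hedged illustration, the sketch below computes one common variant, top-decile lift: the churn rate among the highest-scored 10% of customers divided by the overall churn rate. The function name `top_decile_lift`, the `fraction` parameter and the toy data are assumptions, not taken from the study.

```python
def top_decile_lift(y_true, scores, fraction=0.1):
    """Churn rate in the top-scored `fraction` of customers over the base churn rate.

    `y_true` holds 0/1 churn labels, `scores` holds predicted churn probabilities.
    """
    n = len(y_true)
    if n == 0 or len(scores) != n:
        raise ValueError("y_true and scores must be non-empty and of equal length")
    base_rate = sum(y_true) / n
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no churners")
    k = max(1, int(round(n * fraction)))  # size of the targeted group
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(label for _, label in ranked[:k]) / k
    return top_rate / base_rate


# Hypothetical data: 20 customers, 4 churners, scores strictly decreasing.
labels = [1, 1, 0, 1, 0, 0, 0, 1] + [0] * 12
scores = [0.95 - 0.04 * i for i in range(20)]
lift = top_decile_lift(labels, scores)
```

    A lift of 1 corresponds to random targeting, so values above 1 indicate that the scores concentrate churners at the top of the ranking. This is why well-calibrated probability estimates from the base classifiers matter for this metric.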