
    A Framework to Discover Emerging Patterns for Application in Microarray Data

    Various supervised learning and gene selection methods have been used for cancer diagnosis. Most of these methods do not consider interactions between genes, although this might be interesting biologically and improve classification accuracy. Here we introduce a new CART-based method to discover emerging patterns. Emerging patterns are structures of the form (X1 > a1) AND (X2 < a2) that have differing frequencies in the considered classes. Interaction structures of this kind are of great interest in cancer research. Moreover, they can be used to define new variables for classification. Using simulated data sets, we show that our method allows the identification of emerging patterns with high efficiency. We also perform classification using two publicly available data sets (leukemia and colon cancer). For each data set, the method allows efficient classification as well as the identification of interesting patterns
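
To make the definition in this abstract concrete, the following is a minimal sketch of how the frequency of a pattern of the form (X1 > a1) AND (X2 < a2) could be compared across two classes. It is not the authors' CART-based discovery method; the helper names (`make_pattern`, `pattern_frequency`), the thresholds, and the toy samples are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence

Sample = Sequence[float]


def make_pattern(a1: float, a2: float) -> Callable[[Sample], bool]:
    """Build the predicate (X1 > a1) AND (X2 < a2) on a sample (X1, X2)."""
    return lambda x: x[0] > a1 and x[1] < a2


def pattern_frequency(samples: Sequence[Sample],
                      pattern: Callable[[Sample], bool]) -> float:
    """Fraction of samples in one class that satisfy the pattern."""
    if not samples:
        return 0.0
    return sum(1 for s in samples if pattern(s)) / len(samples)


# Toy expression values (X1, X2) for two classes; purely illustrative.
class_a = [(2.0, 0.5), (3.1, 0.2), (2.5, 0.9), (0.4, 0.3)]
class_b = [(0.5, 0.4), (2.2, 0.8), (0.1, 1.5), (0.3, 0.2)]

pattern = make_pattern(a1=1.0, a2=1.0)  # hypothetical thresholds
freq_a = pattern_frequency(class_a, pattern)
freq_b = pattern_frequency(class_b, pattern)

# A large gap between the two frequencies marks the pattern as "emerging".
print(f"class A: {freq_a:.2f}, class B: {freq_b:.2f}")
```

A pattern whose frequency differs strongly between the classes can then be turned into a binary variable (pattern satisfied or not) and passed to a classifier, which is the use the abstract describes.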

    Identification of Interaction Patterns and Classification with Applications to Microarray Data

    Emerging patterns represent a class of interaction structures that has recently been proposed as a tool in data mining. In this paper, a new and more general definition referring to underlying probabilities is proposed. The defined interaction patterns carry information about the relevance of combinations of variables for distinguishing between classes. Since they are formally quite similar to the leaves of a classification tree, we propose a fast and simple method based on the CART algorithm to find the corresponding empirical patterns in data sets. Simulations show that the method is quite effective in identifying patterns. In addition, the detected patterns can be used to define new variables for classification. Thus, we propose a simple scheme that uses the patterns to improve the performance of classification procedures. The method may also be seen as a scheme to improve the performance of CARTs with respect to both the identification of interaction patterns and the accuracy of prediction

    KTDA: emerging patterns based data analysis system

    Emerging patterns are a kind of relationship discovered in databases containing a decision attribute. They represent contrasting characteristics of individual decision classes. This form of knowledge can be useful for experts and has been successfully employed in the field of classification. In this paper we present the KTDA system. It discovers emerging patterns and applies them to classification. The system can also identify improper data by means of data credibility analysis, a new approach to assessing data typicality

    A new classification method using array Comparative Genome Hybridization data, based on the concept of Limited Jumping Emerging Patterns

    Background: Classification using aCGH data is an important and insufficiently investigated problem in bioinformatics. In this paper we propose a new classification method for DNA copy number data based on the concept of limited Jumping Emerging Patterns. We compare our limJEPClassifier to SVM, which is considered the most successful classifier for high-throughput data. Results: Our results revealed that the classification performance of limJEPClassifier is significantly higher than that of other methods. Furthermore, we show that applying limited JEPs can significantly improve classification when the data are strongly unbalanced. Conclusion: aCGH has become a very important tool in research on cancer and genomic disorders. Therefore, improving the classification of aCGH data can have a great impact on many medical issues, such as the process of diagnosis and finding disease-related genes. The performed experiment shows that the application of Jumping Emerging Patterns can be effective in the classification of high-dimensional data, including those from aCGH experiments.

    Globalisation and the euro area: simulation based analysis using the New Area Wide Model

    In this paper, we utilise the multi-country version of the NAWM to analyse the impact of globalisation on euro area macroeconomic aggregates. We provide alternative model-based definitions of globalisation associated with an increase in potential output in emerging Asia and its impact on total factor productivity in the euro area, and a shift in international specialisation patterns leading to changes in relative demand and import substitutions. The results indicate that globalisation has a positive impact on output, consumption, investment and real labour income in the long run. This impact is driven by the improvement in the terms of trade and associated positive wealth effects, as well as by spillovers of higher potential output in emerging Asia on euro area total factor productivity. Additionally, we provide evidence that structural reforms in goods and labour markets would amplify the benefits associated with globalisation. JEL Classification: E32, E62. Keywords: DSGE modelling, euro area, globalisation

    Financial Ratio Classification and Sub-sector Discrimination of Manufacturing Firms Evidence from an Emerging Market

    This article aims to develop an empirically based classification of financial ratios for manufacturing firms and to examine whether these ratios can be used to differentiate sub-sectors of the manufacturing industry. The article covers 160 manufacturing firms traded on the emerging Istanbul Stock Exchange (ISE) over the period between December 1992 and June 1999, with financial ratios of those companies calculated for 14 terms. Factor analysis was applied, both to isolate the independent patterns of financial ratios and to create an empirical classification for them. Factor analysis revealed four common factors, namely profitability, solvency/leverage, liquidity, and activity. The discriminating ability of the independent patterns of the financial ratios was evaluated by means of discriminant analysis. The eight sub-sectors of the manufacturing firms were included in the analysis, and it was concluded that those common factors are statistically significant in differentiating the sub-sectors of manufacturing firms of an emerging market

    Counting the learnable functions of structured data

    Cover's function counting theorem is a milestone in the theory of artificial neural networks. It provides an answer to the fundamental question of determining how many binary assignments (dichotomies) of p points in n dimensions can be linearly realized. Regrettably, it has proved hard to extend the same approach to more advanced problems than the classification of points. In particular, an emerging necessity is to find methods to deal with structured data, and specifically with non-pointlike patterns. A prominent case is that of invariant recognition, whereby identification of a stimulus is insensitive to irrelevant transformations on the inputs (such as rotations or changes in perspective in an image). An object is therefore represented by an extended perceptual manifold, consisting of inputs that are classified similarly. Here, we develop a function counting theory for structured data of this kind, by extending Cover's combinatorial technique, and we derive analytical expressions for the average number of dichotomies of generically correlated sets of patterns. As an application, we obtain a closed formula for the capacity of a binary classifier trained to distinguish general polytopes of any dimension. These results may help extend our theoretical understanding of generalization, feature extraction, and invariant object recognition by neural networks
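
For orientation, the classical point-counting result that this abstract starts from can be evaluated directly. The sketch below computes Cover's count C(p, n) = 2 * sum_{k=0}^{n-1} binom(p-1, k) for p points in general position in n dimensions, assuming separating hyperplanes through the origin. It covers only the classical theorem for points, not the paper's extension to structured data, and the function name `cover_count` is a hypothetical choice.

```python
from math import comb


def cover_count(p: int, n: int) -> int:
    """Number of linearly realizable dichotomies of p points in general
    position in n dimensions (hyperplanes through the origin), following
    Cover's formula C(p, n) = 2 * sum_{k=0}^{n-1} binom(p - 1, k)."""
    if p < 1 or n < 1:
        raise ValueError("p and n must be positive integers")
    return 2 * sum(comb(p - 1, k) for k in range(n))


# Fraction of all 2**p dichotomies that are linearly realizable.
for p, n in [(3, 3), (4, 2), (10, 5)]:
    total = 2 ** p
    print(p, n, cover_count(p, n), cover_count(p, n) / total)
```

Two familiar properties follow from the formula: every dichotomy is realizable when p <= n, and exactly half of them are realizable at p = 2n, which is the usual statement of the capacity of a linear classifier.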