21 research outputs found

    Cascaded High Dimensional Histograms: A Generative Approach to Density Estimation

    We present tree- and list-structured density estimation methods for high dimensional binary/categorical data. Our density estimation models are high dimensional analogues of variable bin width histograms. In each leaf of the tree (or list), the density is constant, similar to the flat density within the bin of a histogram. Histograms, however, cannot easily be visualized in higher dimensions, whereas our models can. The accuracy of histograms fades as dimensions increase, whereas our models have priors that help with generalization. Our models are sparse, unlike high-dimensional histograms. We present three generative models, where the first one allows the user to specify the number of desired leaves in the tree within a Bayesian prior. The second model allows the user to specify the desired number of branches within the prior. The third model returns lists (rather than trees) and allows the user to specify the desired number of rules and the length of rules within the prior. Our results indicate that the new approaches yield a better balance between sparsity and accuracy of density estimates than other methods for this task. Comment: 27 pages, 13 figures
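
The idea of a constant density in each leaf can be illustrated with a minimal sketch. It assumes binary features and a hand-written set of leaves that partition the space; the function names `leaf_density` and `density_at`, the toy data, and the leaves are all hypothetical. It uses plain relative frequencies and does not model the Bayesian priors described in the abstract.

```python
from itertools import product


def leaf_density(data, leaves, d):
    """Return one constant density value per leaf (assumed illustration).

    Each leaf is a dict {feature_index: required_value}. The leaves are
    assumed to partition {0, 1}^d, so a leaf with k conditions covers
    2 ** (d - k) points; that count plays the role of the bin width.
    """
    n = len(data)
    densities = []
    for cond in leaves:
        # Number of samples that satisfy every condition of this leaf.
        count = sum(all(x[j] == v for j, v in cond.items()) for x in data)
        volume = 2 ** (d - len(cond))
        densities.append(count / n / volume)
    return densities


def density_at(x, leaves, densities):
    """Look up the flat density of the leaf that contains point x."""
    for cond, dens in zip(leaves, densities):
        if all(x[j] == v for j, v in cond.items()):
            return dens
    raise ValueError("leaves do not cover x")


# Toy example: 3 binary features, a depth-2 tree with 3 leaves.
d = 3
leaves = [{0: 0}, {0: 1, 1: 0}, {0: 1, 1: 1}]
data = [
    (0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1),
    (1, 0, 1), (1, 1, 0), (0, 0, 1), (1, 0, 0),
]
densities = leaf_density(data, leaves, d)

# The estimate is a proper distribution: it sums to 1 over all 2**d points.
total = sum(density_at(x, leaves, densities) for x in product((0, 1), repeat=d))
```

Here the leaf with a single condition covers four points and the other two cover two points each, which mirrors the variable bin widths of a histogram.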

    Fast Implementation of Linear Discriminant Analysis

    Master's thesis, Master of Science

    The Effect of Trade Openness on Income Inequality with the Role of Institutional Quality

    This study investigates the effect of trade openness on income inequality using the panel system generalised method of moments (GMM). The sample consists of 65 developed and developing countries, and the period covered is 1984 to 2012. This study also provides new evidence on the role of institutional quality in influencing the effect of trade openness on income inequality. The empirical results reveal that trade openness tends to increase income inequality. In addition, the marginal effect reveals that institutional quality has a corrective effect on the trade openness–income inequality nexus.

    Impact of innovation on economic growth: evidence from Malaysia

    This study empirically investigates the effect of innovation on economic growth using the neoclassical economic growth model. Starting from the traditional labour growth, physical capital and human capital framework, innovation is postulated to be the main driver of robust economic growth. Using time series techniques, we obtain notable findings on the impact of innovation on economic growth in Malaysia. First, innovation measured by the total number of patent applications is statistically insignificant. The result is robust to alternative innovation measures, including total local patent applications and total foreign patent applications. Interestingly, when total patent grants are used instead of total patent applications (local or foreign), the empirical results show a significant impact on economic growth. The finding indirectly reveals that the quality of innovation matters more than its quantity. Neglecting both the quality and the commercialisation process of these new technologies may not solve the rigidity of the knowledge commercialisation paradox. Finally, we test the role of institutional quality in mediating economic growth under a knowledge-based economy. The interaction between institutional quality and total patent grants significantly strengthens the innovation channel to economic growth. The empirical findings imply that an inadequate flow of innovative technology over the long term has a detrimental effect on national innovative capacity. Thus, the innovation-economic growth nexus needs to be complemented with a good institutional quality framework, skilled human capital and broader networking to commercialise innovative products, so that innovation activities promote economic growth.

    A STUDY OF CONVEX M-MATRICES AND SOME RELATED ITERATIVE METHODS

    Bachelor's thesis, Bachelor of Science (Honours)

    Machine learning approaches to challenging problems : interpretable imbalanced classification, interpretable density estimation, and causal inference

    Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2018. By Siong Thye Goh. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 111-118).

    In this thesis, I address three challenging machine-learning problems. The first problem that we address is the imbalanced data problem. We propose two algorithms to handle highly imbalanced classification problems. The first algorithm uses mixed integer programming to optimize a weighted balance between positive and negative class accuracies. The second method uses an approximation in order to assist with scalability. Specifically, it follows a characterize-then-discriminate approach. The positive class is first characterized by boxes, and then each box boundary becomes a separate discriminative classifier. This method is computationally advantageous because it can be easily parallelized, and considers only the relevant regions of the feature space.

    The second problem is a density estimation problem for categorical data sets. We present tree- and list-structured density estimation methods for binary/categorical data. We present three generative models, where the first one allows the user to specify the number of desired leaves in the tree within a Bayesian prior. The second model allows the user to specify the desired number of branches within the prior. The third model returns lists (rather than trees) and allows the user to specify the desired number of rules and the length of rules within the prior.

    Finally, we present a new machine learning approach to estimate personalized treatment effects in the classical potential outcomes framework with binary outcomes. Strictly, both treatment and control outcomes must be measured for each unit in order to perform supervised learning. However, in practice, only one outcome can be observed per unit. To overcome the problem that both treatment and control outcomes for the same unit are required for supervised learning, we propose surrogate loss functions that incorporate both treatment and control data. The new surrogates yield tighter bounds than the sum of the losses for the treatment and control groups. A specific choice of loss function, namely a type of hinge loss, yields a minimax support vector machine formulation. The resulting optimization problem requires the solution to only a single convex optimization problem, incorporating both treatment and control units, and it enables the kernel trick to be used to handle nonlinear (also non-parametric) estimation.