
    Fast Conical Hull Algorithms for Near-separable Non-negative Matrix Factorization

    The separability assumption (Donoho & Stodden, 2003; Arora et al., 2012) turns non-negative matrix factorization (NMF) into a tractable problem. Recently, a new class of provably correct NMF algorithms has emerged under this assumption. In this paper, we reformulate the separable NMF problem as that of finding the extreme rays of the conical hull of a finite set of vectors. From this geometric perspective, we derive new separable NMF algorithms that are highly scalable and empirically noise robust, and that have several other favorable properties relative to existing methods. A parallel implementation of our algorithm demonstrates high scalability on shared- and distributed-memory machines. Comment: 15 pages, 6 figures
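
To make the geometric reformulation concrete, here is a minimal sketch of extreme-ray detection in the simplest possible setting: nonzero, non-negative 2-D vectors, where the conical hull is a wedge and its extreme rays are the vectors with the smallest and largest polar angle. This is an illustrative special case, not the algorithm from the paper; the function name `extreme_rays_2d` and the sample data are assumptions made for the example.

```python
import math


def extreme_rays_2d(vectors):
    """Return the indices of the vectors that generate the conical hull.

    Assumes every vector is nonzero with non-negative entries, so all
    polar angles lie in [0, pi/2] and the cone is bounded by the vectors
    with the minimum and maximum angle. Hypothetical helper, 2-D only.
    """
    angles = [math.atan2(y, x) for x, y in vectors]
    lowest = min(range(len(vectors)), key=angles.__getitem__)
    highest = max(range(len(vectors)), key=angles.__getitem__)
    # A set handles the degenerate case where all vectors share one ray.
    return sorted({lowest, highest})


# Columns of a toy data matrix: every column is a non-negative
# combination of columns 0 and 3, which play the role of the "anchors".
columns = [(1.0, 0.0), (2.0, 1.0), (1.0, 1.0), (1.0, 3.0), (0.5, 0.25)]
anchors = extreme_rays_2d(columns)
print(anchors)  # [0, 3]
```

In higher dimensions the angle ordering no longer exists, which is why dedicated conical hull algorithms such as those described in the abstract are needed.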

    A computationally efficient procedure for data envelopment analysis.

    This thesis is the final outcome of a project carried out for the UK's Department for Education and Skills (DfES). The DfES was interested in finding a fast algorithm for solving a Data Envelopment Analysis (DEA) model to compare the relative efficiency of 13216 primary schools in England based on 9 input-output factors. The standard approach for solving a DEA model comparing n units (such as primary schools) based on m factors requires solving 2n linear programming (LP) problems, each with m constraints and at least n variables. With m = 9 and n = 13216, this was proving difficult. The research reported in this thesis describes both theoretical and practical contributions to achieving faster computational performance. First, we establish that by analysing any unit t against only some critically important units, which we call generators, we can either (a) complete its efficiency analysis, or (b) find a new generator. This is an important contribution to the theory of solution procedures for DEA. It leads to our new Generator Based Algorithm (GBA), which solves only n LPs of maximum size (m x k), where k is the number of generators. As k is a small percentage of n, GBA significantly improves computational performance on large datasets. Further, GBA is capable of solving all the commonly used DEA models, including important extensions of the basic models such as weight-restricted models. In broad outline, the thesis describes four themes. First, it provides a comprehensive critical review of the extant literature on the computational aspects of DEA. Second, the thesis introduces the new computationally efficient algorithm GBA. It solves the practical problem in 105 seconds. The commercial software used by the DfES took, at best, more than an hour and often took 3 to 5 hours, making it impractical for model development work. Third, the thesis presents results of comprehensive computational tests involving GBA, Jose Dula's BuildHull (the best available DEA algorithm in the literature) and the standard approach. Dula's published result showing that BuildHull consistently outperforms the standard approach is confirmed by our experiments. It is also shown that GBA is consistently better than BuildHull and is a viable tool for solving large-scale DEA problems. An interesting by-product of this work is a new closed-form solution to the important practical problem of finding strictly positive factor weights without explicit weight restrictions for what are known in the DEA literature as "extreme-efficient units". To date, the only other methods for achieving this require solving additional LPs or a pair of Mixed Integer Linear Programs.
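
The abstract does not state which LP it solves, so the following is only a sketch using the textbook input-oriented CCR envelopment model as an assumed stand-in. The symbols $x_{ij}$ (input $i$ of unit $j$), $y_{rj}$ (output $r$ of unit $j$), $p$, $q$, $\theta$, $\lambda_j$ and the generator index set $G$ are notation introduced here, not taken from the thesis. For a unit $t$ with $p$ inputs and $q$ outputs, where $p + q = m$:

```latex
\begin{align*}
\min_{\theta,\,\lambda}\quad & \theta \\
\text{s.t.}\quad
  & \sum_{j=1}^{n} \lambda_j x_{ij} \le \theta\, x_{it}, && i = 1,\dots,p, \\
  & \sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{rt},          && r = 1,\dots,q, \\
  & \lambda_j \ge 0,                                     && j = 1,\dots,n.
\end{align*}
% m = p + q constraints and n + 1 variables, matching
% "m constraints and at least n variables" in the abstract.
% Restricting the sums to a generator set G with |G| = k,
%   \sum_{j \in G} \lambda_j x_{ij} \le \theta x_{it}, \quad
%   \sum_{j \in G} \lambda_j y_{rj} \ge y_{rt},
% gives an LP of size roughly m x k instead of m x n.
```

Under this reading, the standard approach with n = 13216 requires 2n = 26432 LPs with over 13000 variables each, whereas an approach that solves n LPs over k generators handles 13216 much smaller problems.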