
    Teaching Data Science

    We describe an introductory data science course, entitled Introduction to Data Science, offered at the University of Illinois at Urbana-Champaign. The course introduced general programming concepts by using the Python programming language with an emphasis on data preparation, processing, and presentation. The course had no prerequisites, and students were not expected to have any programming experience. This introductory course was designed to cover a wide range of topics, from the nature of data, to storage, to visualization, to probability and statistical analysis, to cloud and high performance computing, without becoming overly focused on any one subject. We conclude this article with a discussion of lessons learned and our plans to develop new data science courses. Comment: 10 pages, 4 figures, International Conference on Computational Science (ICCS 2016)

    Counting minimal generator matrices

    Given a particular convolutional code C, we wish to find all minimal generator matrices G(D) which represent that code. A standard form S(D) for a minimal matrix is defined, and then all standard forms for the code C are counted (this is equivalent to counting special pre-multiplication matrices P(D)). It is shown that all the minimal generator matrices G(D) are contained within the 'ordered row permutations' of these standard forms, and that all these permutations are distinct. Finally, the result is used to place a simple upper bound on the possible number of convolutional codes.

    When is multidimensional screening a convex program?

    A principal wishes to transact business with a multidimensional distribution of agents whose preferences are known only in the aggregate. Assuming a twist (= generalized Spence-Mirrlees single-crossing) hypothesis and that agents can choose only pure strategies, we identify a structural condition on the preference b(x,y) of agent type x for product type y -- and on the principal's costs c(y) -- which is necessary and sufficient for reducing the profit maximization problem faced by the principal to a convex program. This is a key step toward making the principal's problem theoretically and computationally tractable; in particular, it allows us to derive uniqueness and stability of the principal's optimum strategy -- and similarly of the strategy maximizing the expected welfare of the agents when the principal's profitability is constrained. We call this condition non-negative cross-curvature: it is also (i) necessary and sufficient to guarantee convexity of the set of b-convex functions, (ii) invariant under reparametrization of agent and/or product types by diffeomorphisms, and (iii) a strengthening of Ma, Trudinger and Wang's necessary and sufficient condition (A3w) for continuity of the correspondence between an exogenously prescribed distribution of agents and of products. We derive the persistence of economic effects such as the desirability for a monopoly to establish prices so high they effectively exclude a positive fraction of its potential customers, in nearly the full range of non-negatively cross-curved models. Comment: 23 pages

    Optimal transportation, topology and uniqueness

    The Monge-Kantorovich transportation problem involves optimizing with respect to a given cost function. Uniqueness is a fundamental open question about which little is known when the cost function is smooth and the landscapes containing the goods to be transported possess (non-trivial) topology. This question turns out to be closely linked to a delicate problem (#111) of Birkhoff [14]: give a necessary and sufficient condition on the support of a joint probability to guarantee extremality among all measures which share its marginals. Fifty years of progress on Birkhoff's question culminate in Hestir and Williams' necessary condition which is nearly sufficient for extremality; we slightly relax their subtle measurability hypotheses separating necessity from sufficiency, yet demonstrate by example that sufficiency certainly requires some measurability. Their condition amounts to the vanishing of the measure \gamma outside a countable alternating sequence of graphs and antigraphs in which no two graphs (or two antigraphs) have domains that overlap, and where the domain of each graph / antigraph in the sequence contains the range of the succeeding antigraph (respectively, graph). Such sequences are called numbered limb systems. We then explain how this characterization can be used to resolve the uniqueness of Kantorovich solutions for optimal transportation on a manifold with the topology of the sphere. Comment: 36 pages, 6 figures
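
    As a toy illustration of why uniqueness depends on the cost, the sketch below brute-forces a two-point discrete transport problem with equal masses. It is not the method of the paper above, which treats smooth costs on manifolds; the function name optimal_assignments, the sample points, and the two cost functions are assumptions chosen only for illustration. It compares permutation (Monge-type) plans only.

```python
from itertools import permutations


def optimal_assignments(sources, targets, cost):
    """Return (minimum total cost, all permutations attaining it).

    Hypothetical helper: brute-forces every one-to-one assignment of
    equal-mass source points to target points. Only feasible for tiny inputs.
    """
    n = len(sources)
    best_cost = None
    best_perms = []
    for perm in permutations(range(n)):
        # Total cost of sending sources[i] to targets[perm[i]].
        total = sum(cost(sources[i], targets[perm[i]]) for i in range(n))
        if best_cost is None or total < best_cost - 1e-12:
            best_cost, best_perms = total, [perm]
        elif abs(total - best_cost) <= 1e-12:
            best_perms.append(perm)  # a tie: another optimal plan
    return best_cost, best_perms


# Illustrative data (assumed): two unit masses on the real line.
sources = [0.0, 1.0]
targets = [1.0, 2.0]

# Distance cost |x - y|: shifting both points and "leapfrogging" tie.
linear_cost, linear_optima = optimal_assignments(
    sources, targets, lambda x, y: abs(x - y)
)

# Squared-distance cost (x - y)^2: strict convexity breaks the tie.
quadratic_cost, quadratic_optima = optimal_assignments(
    sources, targets, lambda x, y: (x - y) ** 2
)

print(linear_cost, linear_optima)
print(quadratic_cost, quadratic_optima)
```

    With the distance cost, both assignments achieve the minimum, so the optimal plan is not unique; with the squared-distance cost, only the order-preserving assignment is optimal.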

    Regularity of optimal transport maps on multiple products of spheres

    This article addresses regularity of optimal transport maps for cost="squared distance" on Riemannian manifolds that are products of arbitrarily many round spheres with arbitrary sizes and dimensions. Such manifolds are known to be non-negatively cross-curved [KM2]. Under boundedness and non-vanishing assumptions on the transferred source and target densities we show that optimal maps stay away from the cut-locus (where the cost exhibits singularity), and obtain injectivity and continuity of optimal maps. This together with the result of Liu, Trudinger and Wang [LTW] also implies higher regularity (C^{1,\alpha}/C^\infty) of optimal maps for smoother (C^\alpha/C^\infty) densities. These are the first global regularity results which we are aware of concerning optimal maps on non-flat Riemannian manifolds which possess some vanishing sectional curvatures. Moreover, such product manifolds have potential relevance in statistics (see [S]) and in statistical mechanics (where the state of a system consisting of many spins is classically modeled by a point in the phase space obtained by taking many products of spheres). For the proof we apply and extend the method developed in [FKM1], where we showed injectivity and continuity of optimal maps on domains in R^n for smooth non-negatively cross-curved cost. The major obstacle in the present paper is to deal with the non-trivial cut-locus and the presence of flat directions. Comment: 35 pages, 4 figures

    Subspace Methods for Data Attack on State Estimation: A Data Driven Approach

    Data attacks on state estimation modify part of system measurements such that the tampered measurements cause incorrect system state estimates. Attack techniques proposed in the literature often require detailed knowledge of system parameters. Such information is difficult to acquire in practice. The subspace methods presented in this paper, on the other hand, learn the system operating subspace from measurements and launch attacks accordingly. Conditions for the existence of an unobservable subspace attack are obtained under the full and partial measurement models. Using the estimated system subspace, two attack strategies are presented. The first strategy aims to affect the system state directly by hiding the attack vector in the system subspace. The second strategy misleads the bad data detection mechanism so that data not under attack are removed. Performance of these attacks is evaluated using the IEEE 14-bus network and the IEEE 118-bus network. Comment: 12 pages