    Algebraic geometry in experimental design and related fields

    The thesis is essentially concerned with two subjects, corresponding to the two grants under which the author was a research assistant during the last three years. The subject presented first, which chronologically comes second, addresses the issue of identifiability for polynomial models via algebraic geometry and leads to a deeper understanding of the classical theory. For example, the recent introduction of the idea of the fan of an experimental design gives a maximal class of models identifiable with a given design. The second area develops a theory of optimum orthogonal fractions for Fourier regression models based on integer lattice designs; these provide alternatives to product designs. For particular classes of Fourier models with a given number of interactions, the focus is on the study of orthogonal designs, with attention given to complexity issues as the dimension of the model increases. Thus multivariate identifiability is the field of concern of the thesis. A major link between these two parts is given by Part III, where the algebraic approach to identifiability is extended to Fourier models and lattice designs. The approach is algorithmic, and algorithms dealing with the various issues are to be found throughout the thesis. Both the application of algebraic geometry and computer algebra in statistics and the analysis of orthogonal fractions for Fourier models are new and rapidly growing fields. See, for example, the work by Koval and Schwabe (1997) [42] on qualitative Fourier models, Shi and Fang (1995) [67] on U-designs for Fourier regression, and Dette and Haller (1997) [25] on one-dimensional incomplete Fourier models. For algebraic geometry in experimental design see Fontana, Pistone and Rogantin (1997) [31] on two-level orthogonal fractions, Caboara and Robbiano (1997) [15] on the inversion problem, and Robbiano and Rogantin (1997) [61] on distracted fractions. The only previous extensive application of algebraic geometry in statistics is the work of Diaconis and Sturmfels (1993) [27] on sampling from conditional distributions.
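
    A flavour of the algebraic approach to identifiability can be given in a few lines of computer algebra. The sketch below is a minimal illustration, not the thesis's own software: it takes the 2x2 factorial design with levels -1 and +1, whose design ideal is generated by x^2 - 1 and y^2 - 1, computes a Groebner basis with sympy, and reads off the standard monomials 1, x, y, xy, which span a maximal model identifiable by the design.

        # Minimal sketch (not the thesis's software): identifiability of the
        # 2x2 factorial design {-1, +1}^2 via its design ideal.
        from sympy import symbols, groebner

        x, y = symbols("x y")

        # Generators of the design ideal: these polynomials vanish exactly at
        # the four design points (-1,-1), (-1,1), (1,-1), (1,1).
        gb = groebner([x**2 - 1, y**2 - 1], x, y, order="lex")
        print(gb.exprs)  # [x**2 - 1, y**2 - 1] -- already a Groebner basis

        # The leading terms are x**2 and y**2; the monomials they do not
        # divide, namely 1, x, y and x*y, form a basis of the quotient ring
        # and hence a maximal set of model terms identifiable by this design.
        # Repeating the computation over all term orders yields the different
        # leaves of the design's fan.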

    Graphs with few trivial characteristic ideals

    We give a characterization of the graphs with at most three trivial characteristic ideals. This implies a complete characterization of the regular graphs whose critical groups have at most three invariant factors equal to 1, and of the graphs whose Smith groups have at most three invariant factors equal to 1. We also give an alternative and simpler way to obtain the characterization of the graphs whose Smith groups have at most three invariant factors equal to 1, as well as a list of minimal forbidden graphs for the family of graphs whose Smith group has at most four invariant factors equal to 1.
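
    As a concrete illustration of the objects involved (illustrative code, not from the paper), the invariant factors in question can be computed from the Smith normal form of a graph's Laplacian. For the complete graph K4 the diagonal of the Smith normal form is (1, 4, 4, 0), so exactly one invariant factor equals 1 and the critical group is Z/4 + Z/4.

        # Illustrative sketch (not from the paper): invariant factors of a
        # graph via the Smith normal form of its Laplacian matrix.
        from sympy import Matrix, ZZ
        from sympy.matrices.normalforms import smith_normal_form

        # Laplacian of the complete graph K4: degree 3 on the diagonal,
        # -1 everywhere else.
        L = Matrix(4, 4, lambda i, j: 3 if i == j else -1)

        snf = smith_normal_form(L, domain=ZZ)
        invariant_factors = [snf[i, i] for i in range(4)]
        print(invariant_factors)  # [1, 4, 4, 0]

        # One invariant factor equals 1; dropping the final 0 (the graph is
        # connected), the critical group of K4 is Z/4 + Z/4, whose order 16
        # is the number of spanning trees of K4, as Kirchhoff's theorem
        # predicts.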

    Numerical algebraic fan of a design for statistical model building

    In this article we develop methods for the analysis of non-standard experimental designs using techniques from algebraic statistics. Our work is motivated by a thermal spraying process used to produce a particle coating on a surface, e.g. for wear protection or durable medical instruments. In this application non-standard designs occur as intermediate results of initial standard designs in a two-stage production process. We investigate algebraic methods to derive better identifiable models, with particular emphasis on the second stage of two-stage processes. We explore ideas from algebraic statistics in which the design, as a finite set of distinct experimental settings, is expressed as the solution of a system of polynomials; the design is thereby identified with a polynomial ideal, and features and properties of this ideal provide insight into the structure of the models identifiable by the design [Pistone et al., 2001, Riccomagno, 2009]. Holliday et al. [1999] apply these ideas to a problem from the automotive industry with an incomplete standard factorial design, and Bates et al. [2003] to the question of finding good polynomial metamodels for computer experiments.

    In our thermal spraying application, designs for the controllable process parameters are run and properties of particles in flight are measured as intermediate responses. The final output describes the coating properties, which are very time-consuming and expensive to measure as the specimen has to be destroyed. It is desirable to predict coating properties either on the basis of process parameters and/or from particle properties. Rudak et al. [2012] provide a first comparison of different modeling approaches. Open questions remain: which models are identifiable with the different choices of input (process parameters, particle properties, or both)? Is it better to base the second model, between particle and coating properties, on estimated expected values or on the observations themselves? The present article is a contribution in this direction.

    In the second stage especially, the particle properties used as input variables are values observed under the design originally chosen for the controllable factors. The resulting design on the particle-property level can be tackled with algebraic statistics to determine identifiable models. However, it turns out that the resulting models contain terms which are identifiable only because of small deviations of the design from more regular points, leading to unwanted, unstable model results. We tackle this problem with tools from algebraic statistics. Because the data in the second stage are very noisy, we extend existing theory by switching from symbolic, exact computations to numerical computations in the calculation of the design ideal and of its fan. Specifically, instead of polynomials whose solutions are the design points, we identify a design with a set of polynomials which "almost vanish" at the design points, using results and algorithms from Fassino [2010].

    The paper is organized as follows. In Section 2 three different approaches towards the modeling of a final output in a two-stage process are introduced and compared; the algebraic treatment and reasoning is the same whatever the approach. Section 3 contains the theoretical background of algebraic statistics for experimental design, always exemplified for the special application. Section 4 is the case study itself.
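
    The switch from exact to numerical computation can be sketched in a few lines. The toy code below illustrates the "almost vanishing" idea only; it is neither the authors' implementation nor Fassino's algorithm itself. It evaluates candidate monomials at noisy design points and uses the singular value decomposition of the evaluation matrix to detect polynomials whose values on the design are nearly zero.

        # Toy sketch of the "almost vanishing" idea (an illustration, not the
        # authors' implementation of Fassino's algorithms): find polynomials
        # whose evaluations at noisy design points are nearly zero, via SVD.
        import numpy as np

        rng = np.random.default_rng(0)

        # A 2^2 factorial design at levels -1/+1, perturbed by small noise,
        # as when observed particle properties replace planned settings.
        exact = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
        design = exact + 0.01 * rng.standard_normal(exact.shape)
        x, y = design[:, 0], design[:, 1]

        # Evaluation matrix of the candidate monomials 1, x, y, xy, x^2.
        monomials = np.column_stack([np.ones(4), x, y, x * y, x**2])

        # Right singular vectors with a small residual norm are coefficient
        # vectors of polynomials that almost vanish on the design.
        _, _, Vt = np.linalg.svd(monomials)
        for v in Vt:
            if np.linalg.norm(monomials @ v) < 0.1:
                print("almost vanishing:", np.round(v, 3))
        # Prints a vector close to a multiple of x^2 - 1: despite the noise,
        # the points nearly satisfy the relation defining the exact design.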

    Constructions and complexity of secondary polytopes

    The secondary polytope Σ(A) of a configuration A of n points in affine (d − 1)-space is an (n − d)-polytope whose vertices correspond to the regular triangulations of conv(A). In this article we present three constructions of Σ(A) and apply them to study various geometric, combinatorial, and computational properties of secondary polytopes. The first construction is due to Gel'fand, Kapranov, and Zelevinsky, who used it to describe the face lattice of Σ(A). For the second, we introduce the universal polytope u(A) ⊂ Λ^d(R^n), a combinatorial object depending only on the oriented matroid of A; the secondary polytope Σ(A) can be obtained as the image of u(A) under a canonical linear map onto R^n. The third construction is based upon Gale transforms, or oriented matroid duality. It is used to analyze the complexity of computing Σ(A) and to give bounds in terms of n and d for the number of faces of Σ(A).
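
    A small worked example may help fix the Gel'fand–Kapranov–Zelevinsky construction. The illustrative snippet below (the code and the helper names gkz_vector and area are not from the article) computes, for a triangulation T, the GKZ coordinates φ_T(a_i) = total volume of the simplices of T containing a_i, for the two triangulations of the unit square (n = 4, d = 3). This yields the two vertices of the secondary polytope, which here is a segment of dimension n − d = 1.

        # Worked example (illustrative code, not from the article): GKZ
        # vertices of the secondary polytope of the unit square's vertices.
        A = [(0, 0), (1, 0), (0, 1), (1, 1)]  # n = 4 points in the plane (d = 3)

        def area(p, q, r):
            """Unsigned area of the triangle pqr."""
            return abs((q[0] - p[0]) * (r[1] - p[1])
                       - (r[0] - p[0]) * (q[1] - p[1])) / 2

        def gkz_vector(triangulation):
            # i-th coordinate: total area of the triangles containing A[i].
            phi = [0.0] * len(A)
            for tri in triangulation:
                a = area(*(A[i] for i in tri))
                for i in tri:
                    phi[i] += a
            return phi

        # The two triangulations of the square, one per diagonal.
        T1 = [(0, 1, 3), (0, 2, 3)]  # diagonal from (0,0) to (1,1)
        T2 = [(0, 1, 2), (1, 2, 3)]  # diagonal from (1,0) to (0,1)

        print(gkz_vector(T1))  # [1.0, 0.5, 0.5, 1.0]
        print(gkz_vector(T2))  # [0.5, 1.0, 1.0, 0.5]
        # Sigma(A) is the segment between these two points: an (n - d)-
        # polytope of dimension 4 - 3 = 1, one vertex per triangulation.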

    Glimmerglass Volume 50 Number 09 (1991)

    Official student newspaper. This issue is 8 pages long.

    Design of competitive paging algorithms with good behaviour in practice

    Paging is one of the most prominent problems in the field of online algorithms. We have to serve a sequence of page requests using a cache that can hold up to k pages. If the currently requested page is in the cache we have a cache hit; otherwise a cache miss occurs and the requested page needs to be loaded into the cache. The goal is to minimize the number of cache misses by providing a good page-replacement strategy. This problem is part of memory management when data is stored in a two-level memory hierarchy, that is, a small, fast memory (the cache) and a large, slow memory (the disk). The most important application area is the virtual memory management of operating systems: accessed pages are either already in RAM or need to be loaded from the hard disk into RAM using expensive I/O. The time needed to access RAM is insignificant compared to an I/O operation, which takes several milliseconds.

    The traditional evaluation framework for online algorithms is competitive analysis, where the online algorithm is compared to the optimal offline solution. A shortcoming of competitive analysis is its overly pessimistic worst-case guarantees: for example, LRU has a theoretical competitive ratio of k, but in practice this ratio rarely exceeds 4. Reducing this gap between theory and practice has been an active research topic in recent years. More recent evaluation models have been used to prove that LRU is an optimal online algorithm, or part of a class of optimal algorithms, motivated by the assumption that LRU is one of the best algorithms in practice. Most of the newer models make LRU-friendly assumptions regarding the input, thus not leaving much room for new algorithms. Only a few works in the field of online paging have introduced new algorithms that can compete with LRU with respect to the number of cache misses.

    In the first part of this thesis we study strongly competitive randomized paging algorithms, i.e. algorithms with optimal competitive guarantees. Although the tight bound on the competitive ratio has been known for decades, current algorithms matching this bound are complex and have high running times and memory requirements. We propose the algorithm OnlineMin, which processes a page request in O(log k/log log k) time in the worst case; the best previously known solution requires O(k^2) time. Usually the memory requirement of a paging algorithm is measured by the maximum number of pages that the algorithm keeps track of. Any algorithm stores information about the k pages in the cache; in addition it can also store information about pages not in the cache, called bookmarks. We answer the open question of Bein et al. '07 of whether strongly competitive randomized paging algorithms using only o(k) bookmarks exist. To do so we modify the Partition algorithm of McGeoch and Sleator '85, which has unbounded bookmark complexity, and obtain Partition2, which uses O(k/log k) bookmarks.

    In the second part we extract ideas from the theoretical analysis of randomized paging algorithms in order to design deterministic algorithms that perform well in practice. We refine competitive analysis by introducing the attack rate parameter r, which ranges between 1 and k, and show that r is a tight bound on the competitive ratio of deterministic algorithms. We give empirical evidence that r is usually much smaller than k, so r-competitive algorithms have a reasonable performance on real-world traces.

    By introducing the r-competitive priority-based algorithm class OnOPT we obtain a collection of promising algorithms to beat the LRU standard. We single out the new algorithm RDM and show that it outperforms LRU and some of its variants on a wide range of real-world traces. Since RDM is more complex than LRU, one might think at first sight that the gain from lowering the number of cache misses is ruined by a high runtime for processing pages. We engineer a fast implementation of RDM and compare it to LRU and the very fast FIFO algorithm in an overall evaluation scheme, where we measure the runtime of the algorithms and add penalties for each cache miss. Experimental results show that for realistic penalties RDM still outperforms these two algorithms even if we grant the competitors an idealistic runtime of 0.
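
    To make the evaluation scheme concrete, the sketch below (an illustration, not the thesis's engineered implementation, and RDM itself is not shown) simulates LRU on a request sequence, counts cache misses, and charges an assumed per-miss I/O penalty, which is the kind of overall cost the final experiments compare.

        # Illustrative sketch (not the thesis's engineered code): simulate
        # LRU and score it by cache misses plus an assumed per-miss penalty.
        from collections import OrderedDict

        def simulate_lru(requests, k):
            """Return the number of cache misses of LRU with cache size k."""
            cache = OrderedDict()  # cached pages, in recency order
            misses = 0
            for page in requests:
                if page in cache:
                    cache.move_to_end(page)        # cache hit: refresh recency
                else:
                    misses += 1                    # cache miss: load the page
                    if len(cache) == k:
                        cache.popitem(last=False)  # evict least recently used
                    cache[page] = True
            return misses

        requests = [1, 2, 3, 1, 4, 2, 1, 3, 4, 2]
        misses = simulate_lru(requests, k=3)
        penalty_ms = 5.0  # assumed cost of one I/O, in milliseconds
        print(misses, misses * penalty_ms)  # 8 misses -> 40.0 ms of penalty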