4 research outputs found

    Constrained variable clustering and the best basis problem in functional data analysis

    Get PDF
    International audienceFunctional data analysis involves data described by regular functions rather than by a finite number of real valued variables. While some robust data analysis methods can be applied directly to the very high dimensional vectors obtained from a fine grid sampling of functional data, all methods benefit from a prior simplification of the functions that reduces the redundancy induced by the regularity. In this paper we propose to use a clustering approach that targets variables rather than individual to design a piecewise constant representation of a set of functions. The contiguity constraint induced by the functional nature of the variables allows a polynomial complexity algorithm to give the optimal solution

    Computing maximum-scoring segments in almost linear time

    No full text
    Given a sequence, the problem studied in this paper is to find a set of k disjoint continuous subsequences such that the total sum of all elements in the set is maximized. This problem arises naturally in the analysis of DNA sequences. The previous best known algorithm requires Θ(n log n) time in the worst case. For a given sequence of length n, we present an almost linear-time algorithm for this problem. Our algorithm uses a disjoint-set data structure and requires O(nα(n, n)) time in the worst case, where α(n, n) is the inverse Ackermann function
    corecore