114 research outputs found

    Identifying almost sorted permutations from TCP buffer dynamics

    Associate to each sequence $A$ of integers (intended to represent packet IDs) a sequence ${\mathcal M}(A)$ of positive integers of the same length. The $i$'th entry of ${\mathcal M}(A)$ is the size (at time $i$) of the smallest buffer needed to hold out-of-order packets, where space is accounted for unreceived packets as well. Call two sequences $A$, $B$ equivalent (written $A\equiv_{FB} B$) if ${\mathcal M}(A)={\mathcal M}(B)$. We prove the following result: any two permutations $A,B$ of the same length with $SUS(A), SUS(B)\leq 3$ (where SUS is the shuffled-up-sequences reordering measure) and such that $A\equiv_{FB} B$ are identical. The result (which is no longer valid if we replace the upper bound 3 by 4) was motivated by RESTORED, a receiver-oriented model of network traffic we have previously introduced.

    List Heaps

    This paper presents a simple extension of the binary heap, the List Heap. We use List Heaps to demonstrate the idea of adaptive heaps: heaps whose performance is a function of both the size and the disorder of the problem instance. We focus on the presortedness of the input sequence as the measure of disorder. A number of practical applications that rely on heaps deal with input that is not random, and even random input contains presorted subsequences. Devising heaps that exploit this structure may provide a means of improving practical performance. We present some basic empirical tests to support this claim. Additionally, adaptive heaps may provide an interesting direction for theoretical investigation.
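
    The abstract does not say which presortedness measure the paper uses. As a rough illustration only, the sketch below computes two measures that are standard in the adaptive-sorting literature: the number of maximal ascending runs and the number of inversions. The function names `count_runs` and `count_inversions` are hypothetical and are not taken from the paper.

```python
def count_runs(seq):
    """Number of maximal ascending runs; a sorted sequence has exactly 1."""
    if not seq:
        return 0
    # A new run starts at every descent, i.e. wherever seq[i] > seq[i + 1].
    return 1 + sum(1 for a, b in zip(seq, seq[1:]) if a > b)


def count_inversions(seq):
    """Number of pairs i < j with seq[i] > seq[j]; quadratic time, kept simple."""
    n = len(seq)
    return sum(1 for i in range(n) for j in range(i + 1, n) if seq[i] > seq[j])
```

    Both measures are smallest on sorted input and grow with disorder, which is the property an adaptive heap or sort would try to exploit.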

    Stationarily ordered types and the number of countable models

    We introduce notions of stationarily ordered types and theories; the latter generalizes weak o-minimality and the former is a relaxed version of weak o-minimality localized at the locus of a single type. We show that forking, as a binary relation on elements realizing stationarily ordered types, is an equivalence relation and that each stationarily ordered type in a model determines some order-type as an invariant of the model. We study weak and forking non-orthogonality of stationarily ordered types, show that they are equivalence relations and prove that invariants of non-orthogonal types are closely related. The developed techniques are applied to prove that in the case of a binary, stationarily ordered theory with fewer than $2^{\aleph_0}$ countable models, the isomorphism type of a countable model is determined by a certain sequence of invariants of the model. In particular, we confirm Vaught's conjecture for binary, stationarily ordered theories. Comment: Revised version accepted for publication in Annals of Pure and Applied Logic

    Restricted Patience Sorting and Barred Pattern Avoidance

    Patience Sorting is a combinatorial algorithm that can be viewed as an iterated, non-recursive form of the Schensted Insertion Algorithm. In recent work the authors have shown that Patience Sorting provides an algorithmic description for permutations avoiding the barred (generalized) permutation pattern $3-\bar{1}-42$. Motivated by this and a recently formulated geometric form for Patience Sorting in terms of certain intersecting lattice paths, we study the related themes of restricted input and avoidance of similar barred permutation patterns. One such result is to characterize those permutations for which Patience Sorting is an invertible algorithm as the set of permutations simultaneously avoiding the barred patterns $3-\bar{1}-42$ and $3-\bar{1}-24$. We then enumerate this avoidance set, which involves convolved Fibonacci numbers. Comment: 12 pages, LaTeX, uses pstricks, needs fpsac.cls v2: final version of extended abstract for FPSAC'0
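
    For readers unfamiliar with the algorithm, here is a minimal sketch of the usual pile-forming step of Patience Sorting: each value goes on the leftmost pile whose top is larger, or starts a new pile on the right. The function name `patience_piles` is hypothetical, and the paper's own conventions (for example, how piles are read back) may differ.

```python
from bisect import bisect_right


def patience_piles(perm):
    """Form patience-sorting piles from a sequence of distinct values."""
    piles = []  # piles[k] lists the cards of pile k, bottom first
    tops = []   # tops[k] is the top card of pile k; increasing left to right
    for card in perm:
        # Index of the leftmost pile whose top card is larger than `card`.
        k = bisect_right(tops, card)
        if k == len(piles):
            piles.append([card])  # no such pile: start a new one on the right
            tops.append(card)
        else:
            piles[k].append(card)
            tops[k] = card
    return piles
```

    For example, `patience_piles([3, 1, 4, 2])` gives the piles `[[3, 1], [4, 2]]`. Because the pile tops stay increasing from left to right, binary search finds the target pile.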

    CFOF: A Concentration Free Measure for Anomaly Detection

    We present a novel notion of outlier, called the Concentration Free Outlier Factor, or CFOF. As a main contribution, we formalize the notion of concentration of outlier scores and theoretically prove that CFOF does not concentrate in Euclidean space for arbitrarily large dimensionality. To the best of our knowledge, no other data analysis measure related to the Euclidean distance has been given theoretical evidence of immunity to the concentration effect. We determine the closed form of the distribution of CFOF scores in arbitrarily large dimensionalities and show that the CFOF score of a point depends on its squared norm standard score and on the kurtosis of the data distribution, thus providing a clear and statistically founded characterization of this notion. Moreover, we leverage this closed form to provide evidence that the definition does not suffer from the hubness problem affecting other measures. We prove that the number of CFOF outliers coming from each cluster is proportional to cluster size and kurtosis, a property that we call semi-locality. We determine that semi-locality characterizes existing reverse nearest neighbor-based outlier definitions, thus clarifying the exact nature of their observed local behavior. We also formally prove that classical distance-based and density-based outliers concentrate both for bounded and unbounded sample sizes and for fixed and variable values of the neighborhood parameter. We introduce the fast-CFOF algorithm for detecting outliers in large high-dimensional datasets. The algorithm has linear cost, supports multi-resolution analysis, and is embarrassingly parallel. Experiments highlight that the technique is able to efficiently process huge datasets, to deal even with large values of the neighborhood parameter, to avoid concentration, and to obtain excellent accuracy.

    Random Shuffling to Reduce Disorder in Adaptive Sorting Scheme

    In this paper we present a random shuffling scheme for use with adaptive sorting algorithms. Adaptive sorting algorithms exploit the presortedness present in a given sequence. We probabilistically increase the amount of presortedness in a sequence by using a random shuffling technique that requires little computation. Theoretical analysis suggests that the proposed scheme can improve the performance of adaptive sorting. Experimental results show that it significantly reduces the amount of disorder present in a given sequence and also improves the execution time of adaptive sorting algorithms. Comment: 7 pages, 2 tables

    Seq2Slate: Re-ranking and Slate Optimization with RNNs

    Ranking is a central task in machine learning and information retrieval. In this task, it is especially important to present the user with a slate of items that is appealing as a whole. This in turn requires taking into account interactions between items, since intuitively, placing an item on the slate affects the decision of which other items should be placed alongside it. In this work, we propose a sequence-to-sequence model for ranking called seq2slate. At each step, the model predicts the next 'best' item to place on the slate given the items already selected. The sequential nature of the model allows complex dependencies between the items to be captured directly in a flexible and scalable way. We show how to learn the model end-to-end from weak supervision in the form of easily obtained click-through data. We further demonstrate the usefulness of our approach in experiments on standard ranking benchmarks as well as in a real-world recommendation system.

    Estimation of Monge Matrices

    Monge matrices and their permuted versions known as pre-Monge matrices naturally appear in many domains across science and engineering. While the rich structural properties of such matrices have long been leveraged for algorithmic purposes, little is known about their impact on statistical estimation. In this work, we propose to view this structure as a shape constraint and study the problem of estimating a Monge matrix subject to additive random noise. More specifically, we establish the minimax rates of estimation of Monge and pre-Monge matrices. In the case of pre-Monge matrices, the minimax-optimal least-squares estimator is not efficiently computable, and we propose two efficient estimators and establish their rates of convergence. Our theoretical findings are supported by numerical experiments. Comment: 42 pages, 3 figures
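
    The abstract does not restate the Monge condition. Under the common convention, a matrix $M$ is Monge when $M_{i,j} + M_{i+1,j+1} \leq M_{i,j+1} + M_{i+1,j}$ for every pair of adjacent rows and columns, and this local condition implies the same inequality for all row and column pairs. The checker below is a sketch under that assumption. The name `is_monge` is hypothetical, and some authors use the reversed inequality, so the paper's convention may differ.

```python
def is_monge(m):
    """Check the Monge inequality on every adjacent 2x2 submatrix of m.

    m is a non-empty list of equal-length rows of numbers.
    """
    rows, cols = len(m), len(m[0])
    return all(
        m[i][j] + m[i + 1][j + 1] <= m[i][j + 1] + m[i + 1][j]
        for i in range(rows - 1)
        for j in range(cols - 1)
    )
```

    A pre-Monge matrix is one that becomes Monge after some permutation of its rows and columns; finding such permutations is a separate problem that this sketch does not attempt.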

    A Variable Neighborhood MOEA/D for Multiobjective Test Task Scheduling Problem

    The test task scheduling problem (TTSP) is a typical combinatorial optimization scheduling problem. This paper proposes a variable neighborhood MOEA/D (VNM) to solve the multiobjective TTSP. Two minimization objectives, the maximal completion time (makespan) and the mean workload, are considered together. To bring the obtained solutions closer to the true Pareto front, a variable neighborhood strategy is adopted. The variable neighborhood approach is proposed to keep the crossover span reasonable. Additionally, because the search space of the TTSP is so large that many duplicate solutions and local optima exist, the Starting Mutation is applied to prevent solutions from becoming trapped in local optima. Using a Markov chain and its transition matrix, it is proved that the solutions obtained by VNM can converge to the global optimum. Experimental comparisons of VNM, MOEA/D, and CNSGA (chaotic nondominated sorting genetic algorithm) indicate that VNM performs better than MOEA/D and CNSGA in solving the TTSP. The results demonstrate that the proposed VNM algorithm is an efficient approach to solving the multiobjective TTSP.
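
    To make the notion of a Pareto front concrete for the two minimization objectives named above (makespan and mean workload), here is a minimal sketch of Pareto dominance and non-dominated filtering. It is not the VNM or MOEA/D algorithm itself; the function names and the sample objective values are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one.

    All objectives are minimized; a and b are equal-length tuples.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (makespan, mean workload) values for three candidate schedules.
candidates = [(10, 5.0), (12, 4.0), (11, 6.0)]
front = pareto_front(candidates)  # (11, 6.0) is dominated by (10, 5.0)
```

    The filter takes quadratic time in the number of candidates, which is fine for illustration but not for large populations.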

    Identifying Almost Sorted Permutations from TCP Buffer Dynamics

    Associate to each sequence A of integers (intended to model packet IDs in a TCP/IP stream) a sequence M(A) of positive integers of the same length. The i'th entry of M(A) is the size (at time i) of the smallest buffer needed to hold out-of-order packets, where space is accounted for unreceived packets as well. Call two sequences A, B equivalent (written A ≡_FB B) if M(A) = M(B). For a sequence of integers A, define SUS(A) to be the shuffled-up-sequences reordering measure, that is, the smallest possible number of classes in a partition of the original sequence into increasing subsequences. We prove the following result: any two permutations A, B of the same length with SUS(A), SUS(B) ≤ 3 such that A ≡_FB B are identical. The result is no longer valid if we replace the upper bound 3 by 4. We also consider a similar problem for permutations with repeats. In this case the preimage is no longer unique, but we obtain a characterization of all the preimages of a given sequence, which in particular allows us to count them in polynomial time. The results were motivated by explaining the behavior of, and engineering, RESTORED, a receiver-oriented model of traffic we introduced and experimentally validated in earlier work.
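
    The SUS measure defined above can be computed greedily: scan the sequence and append each element to the class whose last element is the largest one still below it, opening a new class only when no such class exists. The sketch below assumes "increasing" means strictly increasing, which the abstract does not state explicitly. The function name `sus` is hypothetical.

```python
from bisect import bisect_left


def sus(seq):
    """Smallest number of strictly increasing subsequences that partition seq."""
    tails = []  # last element of each class so far, kept in ascending order
    for x in seq:
        # Index of the largest tail strictly below x, or -1 if there is none.
        j = bisect_left(tails, x) - 1
        if j >= 0:
            tails[j] = x        # extend that class; the list stays sorted
        else:
            tails.insert(0, x)  # every tail is >= x, so open a new class
    return len(tails)
```

    A sorted permutation has SUS equal to 1 and a reversed permutation of length n has SUS equal to n, so the bound SUS ≤ 3 in the theorem restricts attention to permutations that split into at most three increasing subsequences.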