
114 research outputs found

Identifying almost sorted permutations from TCP buffer dynamics

Author: Istrate Gabriel
Publication date: 09/10/2008

Associate to each sequence $A$ of integers (intending to represent packet IDs) a sequence of positive integers of the same length, ${\mathcal M}(A)$. The $i$'th entry of ${\mathcal M}(A)$ is the size (at time $i$) of the smallest buffer needed to hold out-of-order packets, where space is accounted for unreceived packets as well. Call two sequences $A$, $B$ equivalent (written $A\equiv_{FB} B$) if ${\mathcal M}(A)={\mathcal M}(B)$. We prove the following result: any two permutations $A,B$ of the same length with $SUS(A), SUS(B)\leq 3$ (where SUS is the shuffled-up-sequences reordering measure), and such that $A\equiv_{FB} B$, are identical. The result (which is no longer valid if we replace the upper bound 3 by 4) was motivated by RESTORED, a receiver-oriented model of network traffic we have previously introduced.

arXiv.org e-Print Archive

List Heaps

Author: Frohmader Andrew
Publication date: 15/02/2018

This paper presents a simple extension of the binary heap, the List Heap. We use List Heaps to demonstrate the idea of adaptive heaps: heaps whose performance is a function of both the size of the problem instance and the disorder of the problem instance. We focus on the presortedness of the input sequence as a measure of disorder for the problem instance. A number of practical applications that rely on heaps deal with input that is not random. Even random input contains presorted subsequences. Devising heaps that exploit this structure may provide a means for improving practical performance. We present some basic empirical tests to support this claim. Additionally, adaptive heaps may provide an interesting direction for theoretical investigation

arXiv.org e-Print Archive

Stationarily ordered types and the number of countable models

Author: Moconja Slavko
Tanović Predrag
Publication venue: 'Elsevier BV'
Publication date: 18/12/2019

We introduce notions of stationarily ordered types and theories; the latter generalizes weak o-minimality and the former is a relaxed version of weak o-minimality localized at the locus of a single type. We show that forking, as a binary relation on elements realizing stationarily ordered types, is an equivalence relation and that each stationarily ordered type in a model determines some order-type as an invariant of the model. We study weak and forking non-orthogonality of stationarily ordered types, show that they are equivalence relations and prove that invariants of non-orthogonal types are closely related. The developed techniques are applied to prove that in the case of a binary, stationarily ordered theory with fewer than $2^{\aleph_0}$ countable models, the isomorphism type of a countable model is determined by a certain sequence of invariants of the model. In particular, we confirm Vaught's conjecture for binary, stationarily ordered theories. Comment: Revised version accepted for publication in Annals of Pure and Applied Logic

arXiv.org e-Print Archive

Restricted Patience Sorting and Barred Pattern Avoidance

Author: Burstein Alexander
Lankham Isaiah
Publication date: 01/01/2006

Patience Sorting is a combinatorial algorithm that can be viewed as an iterated, non-recursive form of the Schensted Insertion Algorithm. In recent work the authors have shown that Patience Sorting provides an algorithmic description for permutations avoiding the barred (generalized) permutation pattern $3-\bar{1}-42$. Motivated by this and a recently formulated geometric form for Patience Sorting in terms of certain intersecting lattice paths, we study the related themes of restricted input and avoidance of similar barred permutation patterns. One such result is to characterize those permutations for which Patience Sorting is an invertible algorithm as the set of permutations simultaneously avoiding the barred patterns $3-\bar{1}-42$ and $3-\bar{1}-24$. We then enumerate this avoidance set, which involves convolved Fibonacci numbers. Comment: 12 pages, LaTeX, uses pstricks, needs fpsac.cls v2: final version of extended abstract for FPSAC'0

arXiv.org e-Print Archive

CiteSeerX
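
For readers unfamiliar with the algorithm named in the abstract above, the following is a minimal Python sketch of the classical textbook form of Patience Sorting (pile formation only). It is not the authors' restricted variant; the function name `patience_piles` is hypothetical, and the input is assumed to be a permutation with distinct entries. Each value goes on the leftmost pile whose top is larger, or starts a new pile on the right.

```python
from bisect import bisect_right


def patience_piles(perm):
    """Form patience-sorting piles for a sequence of distinct integers.

    Assumption: classical rule, each value is placed on the leftmost pile
    whose top card is larger; otherwise a new pile is started on the right.
    """
    piles = []  # piles[k] lists the cards of pile k, bottom to top
    tops = []   # tops[k] is the top card of pile k; stays sorted ascending
    for card in perm:
        # Index of the leftmost pile whose top is strictly greater than card.
        k = bisect_right(tops, card)
        if k == len(piles):
            piles.append([card])
            tops.append(card)
        else:
            piles[k].append(card)
            tops[k] = card
    return piles


if __name__ == "__main__":
    print(patience_piles([3, 1, 4, 2]))
```

In this classical form the number of piles equals the length of a longest increasing subsequence of the input, which is why the sorted input `[1, 2, 3, 4]` produces four piles and the reversed input a single pile.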

CFOF: A Concentration Free Measure for Anomaly Detection

Author: Angiulli Fabrizio
Publication date: 17/09/2019

We present a novel notion of outlier, called the Concentration Free Outlier Factor, or CFOF. As a main contribution, we formalize the notion of concentration of outlier scores and theoretically prove that CFOF does not concentrate in the Euclidean space for any arbitrarily large dimensionality. To the best of our knowledge, no other data analysis measure related to the Euclidean distance has been shown theoretically to be immune to the concentration effect. We determine the closed form of the distribution of CFOF scores in arbitrarily large dimensionalities and show that the CFOF score of a point depends on its squared norm standard score and on the kurtosis of the data distribution, thus providing a clear and statistically founded characterization of this notion. Moreover, we leverage this closed form to provide evidence that the definition does not suffer from the hubness problem affecting other measures. We prove that the number of CFOF outliers coming from each cluster is proportional to cluster size and kurtosis, a property that we call semi-locality. We determine that semi-locality characterizes existing reverse nearest neighbor-based outlier definitions, thus clarifying the exact nature of their observed local behavior. We also formally prove that classical distance-based and density-based outliers concentrate both for bounded and unbounded sample sizes and for fixed and variable values of the neighborhood parameter. We introduce the fast-CFOF algorithm for detecting outliers in large high-dimensional datasets. The algorithm has linear cost, supports multi-resolution analysis, and is embarrassingly parallel. Experiments highlight that the technique is able to efficiently process huge datasets and to deal even with large values of the neighborhood parameter, to avoid concentration, and to obtain excellent accuracy.

arXiv.org e-Print Archive

Random Shuffling to Reduce Disorder in Adaptive Sorting Scheme

Author: Karim Md. Enamul
Mahmood Abdun Naser
Publication date: 02/12/2000

In this paper we present a random shuffling scheme for use with adaptive sorting algorithms. Adaptive sorting algorithms utilize the presortedness present in a given sequence. We have probabilistically increased the amount of presortedness present in a sequence by using a random shuffling technique that requires little computation. Theoretical analysis suggests that the proposed scheme can improve the performance of adaptive sorting. Experimental results show that it significantly reduces the amount of disorder present in a given sequence and also improves the execution time of adaptive sorting algorithms. Comment: 7 pages, 2 tables

arXiv.org e-Print Archive

Seq2Slate: Re-ranking and Slate Optimization with RNNs

Author: Bello Irwan
Boutilier Craig
Chi Ed
Eban Elad
Jain Sagar
Kulkarni Sayali
Luo Xiyang
Mackey Alan
Meshi Ofer
Publication date: 19/03/2019

Ranking is a central task in machine learning and information retrieval. In this task, it is especially important to present the user with a slate of items that is appealing as a whole. This in turn requires taking into account interactions between items, since intuitively, placing an item on the slate affects the decision of which other items should be placed alongside it. In this work, we propose a sequence-to-sequence model for ranking called seq2slate. At each step, the model predicts the next `best' item to place on the slate given the items already selected. The sequential nature of the model allows complex dependencies between the items to be captured directly in a flexible and scalable way. We show how to learn the model end-to-end from weak supervision in the form of easily obtained click-through data. We further demonstrate the usefulness of our approach in experiments on standard ranking benchmarks as well as in a real-world recommendation system

arXiv.org e-Print Archive

Estimation of Monge Matrices

Author: Hütter Jan-Christian
Mao Cheng
Rigollet Philippe
Robeva Elina
Publication date: 05/04/2019

Monge matrices and their permuted versions known as pre-Monge matrices naturally appear in many domains across science and engineering. While the rich structural properties of such matrices have long been leveraged for algorithmic purposes, little is known about their impact on statistical estimation. In this work, we propose to view this structure as a shape constraint and study the problem of estimating a Monge matrix subject to additive random noise. More specifically, we establish the minimax rates of estimation of Monge and pre-Monge matrices. In the case of pre-Monge matrices, the minimax-optimal least-squares estimator is not efficiently computable, and we propose two efficient estimators and establish their rates of convergence. Our theoretical findings are supported by numerical experiments. Comment: 42 pages, 3 figures

arXiv.org e-Print Archive
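
The abstract above does not restate the Monge property, so the following minimal Python sketch assumes the standard definition: a matrix M is Monge when M[i][j] + M[i+1][j+1] <= M[i][j+1] + M[i+1][j] for every pair of adjacent rows and columns (which implies the same inequality for all row pairs and column pairs). The function name `is_monge` is hypothetical, and the sketch only checks the structure; it does not implement the estimators discussed in the paper.

```python
def is_monge(matrix):
    """Return True if `matrix` (a list of equal-length rows) is Monge.

    Assumption: the standard definition, checked on adjacent 2x2 blocks,
    M[i][j] + M[i+1][j+1] <= M[i][j+1] + M[i+1][j].
    """
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    for i in range(rows - 1):
        for j in range(cols - 1):
            main_diagonal = matrix[i][j] + matrix[i + 1][j + 1]
            anti_diagonal = matrix[i][j + 1] + matrix[i + 1][j]
            if main_diagonal > anti_diagonal:
                return False
    return True


if __name__ == "__main__":
    print(is_monge([[1, 2], [2, 3]]))
    print(is_monge([[1, 0], [0, 1]]))
```

A pre-Monge matrix, in the abstract's terms, is a permuted version of a Monge matrix, so a check along these lines would only succeed once the right row and column orderings are known.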

A Variable Neighborhood MOEA/D for Multiobjective Test Task Scheduling Problem

Author: Hui Lu
Lijuan Yin
Xiaoteng Wang
Zheng Zhu
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2014

The test task scheduling problem (TTSP) is a typical combinatorial optimization scheduling problem. This paper proposes a variable neighborhood MOEA/D (VNM) to solve the multiobjective TTSP. Two minimization objectives, the maximal completion time (makespan) and the mean workload, are considered together. To bring the obtained solutions closer to the real Pareto front, a variable neighborhood strategy is adopted. The variable neighborhood approach is proposed to render the crossover span reasonable. Additionally, because the search space of the TTSP is so large that many duplicate solutions and local optima exist, the Starting Mutation is applied to prevent solutions from becoming trapped in local optima. It is proved, using Markov chains and transition matrices, that the solutions obtained by VNM can converge to the global optimum. Experimental comparisons of VNM, MOEA/D, and CNSGA (chaotic nondominated sorting genetic algorithm) indicate that VNM performs better than MOEA/D and CNSGA in solving the TTSP. The results demonstrate that the proposed VNM algorithm is an efficient approach to solving the multiobjective TTSP.

Directory of Open Access Journals

Identifying Almost Sorted Permutations from TCP Buffer Dynamics

Author: G. Istrate
Publication venue: 'Scientific Annals of Computer Science'
Publication date: 01/06/2015

Associate to each sequence A of integers (intending to model packet IDs in a TCP/IP stream) a sequence of positive integers of the same length, M(A). The i'th entry of M(A) is the size (at time i) of the smallest buffer needed to hold out-of-order packets, where space is accounted for unreceived packets as well. Call two sequences A, B equivalent (written A ≡_{FB} B) if M(A) = M(B). For a sequence of integers A, define SUS(A) to be the shuffled-up-sequences reordering measure, that is, the smallest possible number of classes in a partition of the original sequence into increasing subsequences. We prove the following result: any two permutations A, B of the same length with SUS(A), SUS(B) ≤ 3 such that A ≡_{FB} B are identical. The result is no longer valid if we replace the upper bound 3 by 4. We also consider a similar problem for permutations with repeats. In this case the uniqueness of the preimage is no longer true, but we obtain a characterization of all the preimages of a given sequence, which in particular allows us to count them in polynomial time. The results were motivated by explaining the behavior and engineering of RESTORED, a receiver-oriented model of traffic we introduced and experimentally validated in earlier work.

Directory of Open Access Journals
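
The SUS measure defined in the abstract above (the smallest number of increasing subsequences into which a sequence can be partitioned) can be computed greedily. The sketch below is an illustration only, not code from the paper: the function name `sus` is hypothetical, and "increasing" is assumed to mean strictly increasing. Each element is appended to the existing subsequence whose last element is the largest one still smaller than it, and a new subsequence is opened only when none qualifies.

```python
from bisect import bisect_left


def sus(sequence):
    """Smallest number of strictly increasing subsequences covering `sequence`.

    Assumption: "increasing" is read as strictly increasing. Only the last
    element (tail) of each subsequence is tracked, kept in ascending order.
    """
    tails = []
    for x in sequence:
        # Position of the largest tail that is strictly smaller than x.
        k = bisect_left(tails, x) - 1
        if k >= 0:
            tails[k] = x        # extend that subsequence with x
        else:
            tails.insert(0, x)  # no tail is smaller: open a new subsequence
    return len(tails)


if __name__ == "__main__":
    for perm in ([1, 2, 3, 4], [2, 1, 4, 3], [4, 3, 2, 1]):
        print(perm, sus(perm))
```

Under this reading, the hypothesis SUS(A), SUS(B) ≤ 3 in the abstract covers permutations such as `[2, 1, 4, 3]` (two classes) but excludes `[4, 3, 2, 1]`, which needs four.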