8,305 research outputs found

Lattice path counting and the theory of queues

Author: Böhm Walter
Publication venue: Department of Statistics and Mathematics, WU Vienna University of Economics and Business
Publication date: 01/01/2008

In this paper we will show how recent advances in the combinatorics of lattice paths can be applied to solve interesting and nontrivial problems in the theory of queues. The problems we discuss range from classical ones like M^a/M^b/1 systems to open tandem systems with and without global blocking and to queueing models that are related to random walks in a quarter plane like the Flatto-Hahn model or systems with preemptive priorities. (author's abstract) Series: Research Report Series / Department of Statistics and Mathematics
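As a small illustration of how lattice path counting connects to queues (a sketch that is not taken from the paper), consider a single-server queue that starts empty and sees a given number of arrivals and departures. Each arrival is an up-step and each departure a down-step, and a valid history is a path that never goes below zero. The reflection principle gives a closed-form count, which the code checks against a direct dynamic program. The function names are hypothetical.

```python
from math import comb


def valid_histories(arrivals: int, departures: int) -> int:
    """Count arrival/departure orderings whose queue length never drops below 0.

    Reflection principle: all paths minus the paths that touch level -1.
    """
    if departures > arrivals:
        return 0
    steps = arrivals + departures
    total = comb(steps, departures)
    # Paths that go negative are in bijection with paths having one fewer down-step.
    bad = comb(steps, departures - 1) if departures >= 1 else 0
    return total - bad


def valid_histories_dp(arrivals: int, departures: int) -> int:
    """Same count by dynamic programming over (arrivals used, departures used)."""
    ways = [[0] * (departures + 1) for _ in range(arrivals + 1)]
    ways[0][0] = 1
    for a in range(arrivals + 1):
        for d in range(departures + 1):
            if d > a or (a == 0 and d == 0):
                continue  # d > a means the queue length would be negative
            from_arrival = ways[a - 1][d] if a > 0 else 0
            from_departure = ways[a][d - 1] if d > 0 else 0
            ways[a][d] = from_arrival + from_departure
    return ways[arrivals][departures]
```

With equal numbers of arrivals and departures the count reduces to the Catalan numbers.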

Structure-Aware Sampling: Flexible and Accurate Summarization

Authors: Cohen Edith; Cormode Graham; Duffield Nick
Publication date: 01/01/2011

In processing large quantities of data, a fundamental problem is to obtain a summary which supports approximate query answering. Random sampling yields flexible summaries which naturally support subset-sum queries with unbiased estimators and well-understood confidence bounds. Classic sample-based summaries, however, are designed for arbitrary subset queries and are oblivious to the structure in the set of keys. The particular structure, such as hierarchy, order, or product space (multi-dimensional), makes range queries much more relevant for most analysis of the data. Dedicated summarization algorithms for range-sum queries have also been extensively studied. They can outperform existing sampling schemes in terms of accuracy on range queries per summary size. Their accuracy, however, rapidly degrades when, as is often the case, the query spans multiple ranges. They are also less flexible, being targeted at range-sum queries alone, and are often quite costly to build and use. In this paper we propose and evaluate variance-optimal sampling schemes that are structure-aware. These summaries improve over the accuracy of existing structure-oblivious sampling schemes on range queries while retaining the benefits of sample-based summaries: flexible summaries, with high accuracy on both range queries and arbitrary subset queries.
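To make "subset-sum queries with unbiased estimators" concrete, the following is a minimal sketch of a structure-oblivious baseline, not the structure-aware scheme the abstract proposes. It assumes independent (Poisson) sampling with inclusion probability min(1, weight / tau) and a Horvitz-Thompson estimator. The threshold `tau` and all function names are hypothetical. Unbiasedness is checked exactly by enumerating every possible sample of a tiny data set.

```python
from itertools import product


def inclusion_probabilities(weights, tau):
    """Probability of keeping each item: heavy items (weight >= tau) are always kept."""
    return [min(1.0, w / tau) for w in weights]


def ht_estimate(sample, predicate):
    """Horvitz-Thompson estimate of the total weight of keys matching `predicate`.

    `sample` holds (key, weight, inclusion_probability) triples.
    """
    return sum(w / p for key, w, p in sample if predicate(key))


def expected_estimate(items, probs, predicate):
    """Exact expectation of the estimate, by enumerating all 2^n Poisson samples."""
    expectation = 0.0
    for mask in product([False, True], repeat=len(items)):
        sample_prob = 1.0
        sample = []
        for included, (key, w), p in zip(mask, items, probs):
            sample_prob *= p if included else (1.0 - p)
            if included:
                sample.append((key, w, p))
        expectation += sample_prob * ht_estimate(sample, predicate)
    return expectation
```

For any subset predicate, the expectation equals the true subset sum. What varies between sampling schemes is the variance, which is the quantity the abstract's schemes target.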

arXiv.org e-Print Archive

CiteSeerX

Dynamic Range Majority Data Structures

Authors: A. Andersson; E.D. Demaine; J. Bentley; J. Misra; L. Arge; M. Fredman; P. Bozanis; P. Gupta; R. Karp; S. Durocher; T. Gagie; T. Husfeldt; Y. Lai
Publication date: 01/01/2011

Given a set $P$ of coloured points on the real line, we study the problem of answering range $\alpha$-majority (or "heavy hitter") queries on $P$. More specifically, for a query range $Q$, we want to return each colour that is assigned to more than an $\alpha$-fraction of the points contained in $Q$. We present a new data structure for answering range $\alpha$-majority queries on a dynamic set of points, where $\alpha \in (0,1)$. Our data structure uses $O(n)$ space, supports queries in $O((\lg n) / \alpha)$ time, and updates in $O((\lg n) / \alpha)$ amortized time. If the coordinates of the points are integers, then the query time can be improved to $O(\lg n / (\alpha \lg \lg n) + (\lg(1/\alpha))/\alpha)$. For constant values of $\alpha$, this improved query time matches an existing lower bound, for any data structure with polylogarithmic update time. We also generalize our data structure to handle sets of points in $d$ dimensions, for $d \ge 2$, as well as dynamic arrays, in which each entry is a colour. Comment: 16 pages, Preliminary version appeared in ISAAC 201
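The query semantics described above can be pinned down with a brute-force reference, shown here as a sketch only. It scans every point in linear time and is not the data structure from the abstract, which achieves the stated logarithmic bounds. The function name and the closed-interval convention for the query range are assumptions.

```python
from collections import Counter


def range_alpha_majority(points, lo, hi, alpha):
    """Return the colours held by more than an alpha-fraction of points in [lo, hi].

    `points` is an iterable of (coordinate, colour) pairs. This is a linear scan,
    useful only as a correctness oracle for a faster structure.
    """
    colours_in_range = [colour for x, colour in points if lo <= x <= hi]
    counts = Counter(colours_in_range)
    threshold = alpha * len(colours_in_range)
    # Strict inequality: "more than" an alpha-fraction.
    return sorted(c for c, k in counts.items() if k > threshold)
```

Since each reported colour covers more than an alpha-fraction of the range, a query returns fewer than 1/alpha colours.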

arXiv.org e-Print Archive

CiteSeerX

Copenhagen University Research Information System

Multiple Comparative Metagenomics using Multiset k-mer Counting

Authors: Benoit Gaëtan; Drezen Erwan; Lavenier Dominique; Lemaitre Claire; Mariadassou Mahendra; Peterlongo Pierre; Schbath Sophie
Publication date: 28/04/2016

Background. Large-scale metagenomic projects aim to extract biodiversity knowledge between different environmental conditions. Current methods for comparing microbial communities face important limitations. Those based on taxonomic or functional assignment rely on a small subset of the sequences that can be associated with known organisms. On the other hand, de novo methods, which compare the whole sets of sequences, either do not scale up to ambitious metagenomic projects or do not provide precise and exhaustive results. Methods. These limitations motivated the development of a new de novo metagenomic comparative method, called Simka. This method computes a large collection of standard ecological distances by replacing species counts with k-mer counts. Simka scales up to today's metagenomic projects thanks to a new parallel k-mer counting strategy on multiple datasets. Results. Experiments on public Human Microbiome Project datasets demonstrate that Simka captures the essential underlying biological structure. Simka was able to compute in a few hours both qualitative and quantitative ecological distances on hundreds of metagenomic samples (690 samples, 32 billion reads). We also demonstrate that analyzing metagenomes at the k-mer level is highly correlated with extremely precise de novo comparison techniques which rely on an all-versus-all sequence alignment strategy or which are based on taxonomic profiling.
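The idea of "replacing species counts with k-mer counts" can be sketched in a few lines. This is a toy illustration, not Simka's implementation: it counts k-mers in memory with a plain dictionary, ignores reverse complements, and uses Bray-Curtis dissimilarity as one example of a standard ecological distance (the abstract does not name the distances, so that choice is an assumption). Function names are hypothetical.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every length-k substring across all reads of one sample.

    Reverse complements are not merged here; a real tool would likely canonicalize.
    """
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity, with k-mers playing the role of species."""
    shared = sum(min(counts_a[kmer], counts_b[kmer])
                 for kmer in counts_a.keys() & counts_b.keys())
    total = sum(counts_a.values()) + sum(counts_b.values())
    if total == 0:
        return 0.0  # two empty samples are treated as identical
    return 1.0 - 2.0 * shared / total
```

The result is 0 for samples with identical k-mer abundances and 1 for samples that share no k-mer.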

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

Directory of Open Access Journals

ProdInra

Hal-Diderot

HAL-Rennes 1

Biologically Inspired Approaches to Automated Feature Extraction and Target Recognition

Authors: Carpenter Gail; Martens Siegfried; Mingolla Ennio; Ogas Ogi; Sai Chaitanya
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/10/2004

Ongoing research at Boston University has produced computational models of biological vision and learning that embody a growing corpus of scientific data and predictions. Vision models perform long-range grouping and figure/ground segmentation, and memory models create attentionally controlled recognition codes that intrinsically combine bottom-up activation and top-down learned expectations. These two streams of research form the foundation of novel dynamically integrated systems for image understanding. Simulations using multispectral images illustrate road completion across occlusions in a cluttered scene and information fusion from incorrect labels that are simultaneously inconsistent and correct. The CNS Vision and Technology Labs (cns.bu.edu/visionlab and cns.bu.edu/techlab) are further integrating science and technology through analysis, testing, and development of cognitive and neural models for large-scale applications, complemented by software specification and code distribution. Air Force Office of Scientific Research (F40620-01-1-0423); National Geospatial-Intelligence Agency (NMA 201-001-1-2016); National Science Foundation (SBE-0354378; BCS-0235298); Office of Naval Research (N00014-01-1-0624); National Geospatial-Intelligence Agency and the National Society of Siegfried Martens (NMA 501-03-1-2030, DGE-0221680); Department of Homeland Security graduate fellowship

Boston University Institutional Repository (OpenBU)