String Synchronizing Sets: Sublinear-Time BWT Construction and Optimal LCE Data Structure
Burrows-Wheeler transform (BWT) is an invertible text transformation that,
given a text $T$ of length $n$, permutes its symbols according to the
lexicographic order of suffixes of $T$. BWT is one of the most heavily studied
algorithms in data compression with numerous applications in indexing, sequence
analysis, and bioinformatics. Its construction is a bottleneck in many
scenarios, and settling the complexity of this task is one of the most
important unsolved problems in sequence analysis that has remained open for 25
years. Given a binary string of length $n$, occupying $\mathcal{O}(n/\log n)$ machine
words, the BWT construction algorithm due to Hon et al. (SIAM J. Comput., 2009)
runs in $\mathcal{O}(n)$ time and $\mathcal{O}(n/\log n)$ space. Recent advancements (Belazzougui,
STOC 2014, and Munro et al., SODA 2017) focus on removing the alphabet-size
dependency in the time complexity, but they still require $\Omega(n)$ time.
In this paper, we propose the first algorithm that breaks the $\mathcal{O}(n)$-time
barrier for BWT construction. Given a binary string of length $n$, our
procedure builds the Burrows-Wheeler transform in $\mathcal{O}(n/\sqrt{\log n})$ time and
$\mathcal{O}(n/\log n)$ space. We complement this result with a conditional lower bound
proving that any further progress in the time complexity of BWT construction
would yield faster algorithms for the very well studied problem of counting
inversions: it would improve the state-of-the-art $\mathcal{O}(m\sqrt{\log m})$-time
solution by Chan and Pătrașcu (SODA 2010). Our algorithm is based on a
novel concept of string synchronizing sets, which is of independent interest.
As one of the applications, we show that this technique lets us design a data
structure of the optimal size $\mathcal{O}(n/\log n)$ that answers Longest Common
Extension queries (LCE queries) in $\mathcal{O}(1)$ time and, furthermore, can be
deterministically constructed in the optimal $\mathcal{O}(n/\log n)$ time.
Comment: Full version of a paper accepted to STOC 201
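For readers unfamiliar with the two objects named in this abstract, the following is a minimal Python sketch of what the BWT and an LCE query compute. It is a naive illustration, not the paper's sublinear-time algorithm or its synchronizing-set data structure. The function names `bwt` and `lce` and the `$` end-of-text sentinel are assumptions made for this example.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive BWT by explicit suffix sorting (illustration only, far from optimal)."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in text")
    s = text + sentinel  # assumed convention: unique, smallest end marker
    # Sort suffix start positions by the suffix that begins there.
    order = sorted(range(len(s)), key=lambda i: s[i:])
    # The BWT symbol of each suffix is the character preceding it (cyclically).
    return "".join(s[i - 1] for i in order)


def lce(text: str, i: int, j: int) -> int:
    """Longest common extension: length of the common prefix of text[i:] and text[j:]."""
    k = 0
    while i + k < len(text) and j + k < len(text) and text[i + k] == text[j + k]:
        k += 1
    return k


if __name__ == "__main__":
    print(bwt("banana"))
    print(lce("banana", 1, 3))
```

The direct scan in `lce` takes time proportional to the answer, which is the cost the abstract's constant-time data structure avoids.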
Counting "exotics"
An introduced or exotic species is commonly defined as an organism accidentally or intentionally introduced to a new location by human activity (Williamson 1996; Richardson et al. 2000; Guo and Ricklefs 2010). However, the counting of exotics is often inconsistent. For example, in the US, previously published plant richness data for each state classify species as native or exotic relative to the US as a whole (USDA and NRCS 2004), not relative to the state itself. Yet within-country species introductions (e.g., among states or counties), which form “homegrown exotics” (Cox 1999) or “native invaders” (Simberloff 2011), are undoubtedly numerous. The growing human population and associated activity increase species introductions at all levels, both international and internal, but to date the focus has always been on intercontinental introductions. Species introduced among neighboring areas often go unnoticed, but they are actually far more frequent because of proximity and environmental similarity. Many domestic exotic species exhibit high invasiveness, such as Spartina alterniflora (smooth cordgrass; introduced from the east coast to California) and Molothrus ater (brown-headed cowbird; introduced from the Great Plains to California).
Counting Carambolas
We give upper and lower bounds on the maximum and minimum number of geometric
configurations of various kinds present (as subgraphs) in a triangulation of
points in the plane. Configurations of interest include convex
polygons, star-shaped polygons and monotone paths. We also
consider related problems for directed planar straight-line graphs.
Comment: update reflects journal version, to appear in Graphs and
Combinatorics; 18 pages, 13 figures
Counting monomials
This paper presents two enumeration techniques based on Hilbert functions. The paper illustrates these techniques by solving two chessboard problems.
Double Counting in LDA+DMFT - The Example of NiO
An intrinsic issue of the LDA+DMFT approach is the so-called double counting
of interaction terms. How to choose the double-counting potential in a manner
that is both physically sound and consistent is unknown. We have conducted an
extensive study of the charge transfer system NiO in the LDA+DMFT framework
using quantum Monte Carlo and exact diagonalization as impurity solvers. By
explicitly treating the double-counting correction as an adjustable parameter
we systematically investigated the effects of different choices for the double
counting on the spectral function. Different methods for fixing the double
counting can drive the result from Mott insulating to almost metallic. We
propose a reasonable scheme for the determination of double-counting
corrections for insulating systems.
Comment: 7 pages, 6 figures
Field-normalized citation impact indicators and the choice of an appropriate counting method
Bibliometric studies often rely on field-normalized citation impact
indicators in order to make comparisons between scientific fields. We discuss
the connection between field normalization and the choice of a counting method
for handling publications with multiple co-authors. Our focus is on the choice
between full counting and fractional counting. Based on an extensive
theoretical and empirical analysis, we argue that properly field-normalized
results cannot be obtained when full counting is used. Fractional counting does
provide results that are properly field normalized. We therefore recommend the
use of fractional counting in bibliometric studies that require field
normalization, especially in studies at the level of countries and research
organizations. We also compare different variants of fractional counting. In
general, it seems best to use either the author-level or the address-level
variant of fractional counting.
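The difference between the two counting methods can be made concrete with a small Python sketch. It assumes a simplified author-level scheme in which each paper is given as a list of its authors' countries; the function name `count_publications` and the sample data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def count_publications(papers, fractional=True):
    """Credit countries for papers under full or author-level fractional counting.

    papers: list of papers, each a list with one country code per author
    (an assumed input format chosen for this illustration).
    """
    totals = defaultdict(float)
    for author_countries in papers:
        if not author_countries:
            continue  # skip papers with no author information
        if fractional:
            # Each author carries an equal share of one unit of credit.
            share = 1.0 / len(author_countries)
            for country in author_countries:
                totals[country] += share
        else:
            # Full counting: every participating country gets a whole paper.
            for country in set(author_countries):
                totals[country] += 1.0
    return dict(totals)


if __name__ == "__main__":
    sample = [["NL", "NL", "US"], ["US"]]  # hypothetical data
    print(count_publications(sample, fractional=False))
    print(count_publications(sample, fractional=True))
```

In this sketch fractional credits sum to the number of papers, while full counting credits co-authored papers more than once, which is the inflation the abstract argues interferes with field normalization.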
Analysis of General Power Counting Rules in Effective Field Theory
We derive the general counting rules for a quantum effective field theory
(EFT) in $d$ dimensions. The rules are valid for strongly and weakly
coupled theories, and predict that all kinetic energy terms are canonically
normalized. They determine the energy dependence of scattering cross sections
in the range of validity of the EFT expansion. We show that the size of cross
sections is controlled by the $\Lambda$ power counting of EFT, not by chiral
counting, even for chiral perturbation theory ($\chi$PT). The relation between
$\Lambda$ and $f$ is generalized to $d$ dimensions. We show that the
naive dimensional analysis $4\pi$ counting is related to $\hbar$ counting. The
EFT counting rules are applied to $\chi$PT, low-energy weak interactions,
Standard Model EFT and the non-trivial case of Higgs EFT.
Comment: V2: more details and examples added; version published in journal. 17
pages, 4 figures, 2 tables
Counting Supertubes
The quantum states of the supertube are counted by directly quantizing the
linearized Born-Infeld action near the round tube. The result is an entropy
$S = 2\pi \sqrt{2 (Q_{D0} Q_{F1} - J)}$, in accord with conjectures in the
literature. As a result, supertubes may be the generic D0-F1 bound state. Our
approach also shows directly that supertubes are marginal bound states with a
discrete spectrum. We also discuss the relation to recent suggestions of Mathur
et al. involving three-charge black holes.
Comment: 15 pages, v2: reference corrected; v3: few corrections and explicit
derivation of a relation are added to appendix
People, Penguins and Petri Dishes: Adapting Object Counting Models To New Visual Domains And Object Types Without Forgetting
In this paper we propose a technique to adapt a convolutional neural network
(CNN) based object counter to additional visual domains and object types while
still preserving the original counting function. Domain-specific normalisation
and scaling operators are trained to allow the model to adjust to the
statistical distributions of the various visual domains. The developed
adaptation technique is used to produce a singular patch-based counting
regressor capable of counting various object types including people, vehicles,
cell nuclei and wildlife. As part of this study a challenging new cell counting
dataset in the context of tissue culture and patient diagnosis is constructed.
This new collection, referred to as the Dublin Cell Counting (DCC) dataset, is
the first of its kind to be made available to the wider computer vision
community. State-of-the-art object counting performance is achieved in both the
ShanghaiTech (parts A and B) and Penguins datasets while competitive
performance is observed on the TRANCOS and Modified Bone Marrow (MBM) datasets,
all using a shared counting model.
Comment: 10 pages
