Fast Searching in Packed Strings
Given strings $P$ and $Q$, the (exact) string matching problem is to find all
positions of substrings in $Q$ matching $P$. The classical Knuth-Morris-Pratt
algorithm [SIAM J. Comput., 1977] solves the string matching problem in linear
time, which is optimal if we can only read one character at a time. However,
most strings are stored in a computer in a packed representation with several
characters in a single word, giving us the opportunity to read multiple
characters simultaneously. In this paper we study the worst-case complexity of
string matching on strings given in packed representation. Let $m \leq n$ be
the lengths of $P$ and $Q$, respectively, and let $\sigma$ denote the size of the
alphabet. On a standard unit-cost word-RAM with logarithmic word size we
present an algorithm using time $O\left(\frac{n}{\log_\sigma n} + m +
\mathrm{occ}\right)$. Here $\mathrm{occ}$ is the number of occurrences of $P$ in $Q$.
For $m = o(n)$ this improves the $O(n)$ bound of the Knuth-Morris-Pratt algorithm.
Furthermore, if $m = O(n/\log_\sigma n)$ our algorithm is optimal since any
algorithm must spend at least $\Omega(\frac{(n+m)\log \sigma}{\log n} +
\mathrm{occ}) = \Omega(\frac{n}{\log_\sigma n} + \mathrm{occ})$ time to
read the input and report all occurrences. The result is obtained by a novel
automaton construction based on the Knuth-Morris-Pratt algorithm combined with
a new compact representation of subautomata allowing an optimal
tabulation-based simulation.
Comment: To appear in Journal of Discrete Algorithms. Special Issue on CPM
200
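For orientation, the two ingredients named in the abstract above can be sketched in a few lines: the classical Knuth-Morris-Pratt scan that reads one character at a time, and a packed representation that stores several characters per machine word. This is only a minimal illustration, not the paper's automaton construction or its tabulation-based simulation; the function names `failure_function`, `kmp_search`, and `pack`, and the default `word_bits=64`, are assumptions made for the example.

```python
def failure_function(pattern):
    """For each prefix of pattern, length of its longest proper border."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(pattern, text):
    """Return start positions of all occurrences of pattern in text.

    Classical baseline: linear time, one character read per step.
    """
    if not pattern:
        return []
    fail = failure_function(pattern)
    occurrences = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            occurrences.append(i - k + 1)
            k = fail[k - 1]  # keep going to find overlapping matches
    return occurrences


def pack(text, alphabet, word_bits=64):
    """Pack text into integers holding several characters each.

    Each character uses ceil(log2(len(alphabet))) bits (at least 1).
    word_bits=64 is an assumed word size for illustration only.
    Returns (list of words, characters per full word).
    """
    bits = max(1, (len(alphabet) - 1).bit_length())
    code = {c: i for i, c in enumerate(alphabet)}
    per_word = word_bits // bits
    words = []
    for start in range(0, len(text), per_word):
        word = 0
        for c in text[start:start + per_word]:
            word = (word << bits) | code[c]
        words.append(word)
    return words, per_word
```

With a four-letter alphabet each character needs 2 bits, so a 64-bit word holds 32 characters; an algorithm that processes a whole word per step can therefore hope to beat the one-character-at-a-time bound, which is the opportunity the abstract describes.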
The VLBA Imaging and Polarimetry Survey at 5 GHz
We present the first results of the VLBA Imaging and Polarimetry Survey
(VIPS), a 5 GHz VLBI survey of 1,127 sources with flat radio spectra. Through
automated data reduction and imaging routines, we have produced publicly
available I, Q, and U images and have detected polarized flux density from 37%
of the sources. We have also developed an algorithm to use each source's I
image to automatically classify it as a point-like source, a core-jet, a
compact symmetric object (CSO) candidate, or a complex source. The mean ratio
of the polarized to total 5 GHz flux density for VIPS sources with detected
polarized flux density ranges from 1% to 20% with a median value of about 5%.
We have also found significant evidence that the directions of the jets in
core-jet systems tend to be perpendicular to the electric vector position
angles (EVPAs). The data is consistent with a scenario in which ~24% of the
polarized core-jets have EVPAs that are anti-aligned with the directions of
their jet components and which have a substantial amount of Faraday rotation.
In addition to these initial results, plans for future follow-up observations
are discussed.
Comment: 36 pages, 3 tables, 13 figures; accepted for publication in Ap
Shape and blocking effects on odd-even mass differences and rotational motion of nuclei
Nuclear shapes and odd-nucleon blockings strongly influence the odd-even
differences of nuclear masses. When such effects are taken into account, the
determination of the pairing strength is modified, resulting in larger pair
gaps. The modified pairing strength leads to an improved self-consistent
description of moments of inertia and backbending frequencies, with no
additional parameters.
Comment: 7 pages, 3 figures, subm to PR
Can inflationary models of cosmic perturbations evade the secondary oscillation test?
We consider the consequences of an observed Cosmic Microwave Background (CMB)
temperature anisotropy spectrum containing no secondary oscillations. While
such a spectrum is generally considered to be a robust signature of active
structure formation, we show that such a spectrum can be produced by
(very unusual) inflationary models or other passive evolution models. However,
we show that for all these passive models the characteristic oscillations would
show up in other observable spectra. Our work shows that when CMB polarization
and matter power spectra are taken into account, secondary oscillations are
indeed a signature of even these very exotic passive models. We construct a
measure of the observability of secondary oscillations in a given experiment,
and show that even with foregrounds both the MAP and Planck satellites should be
able to distinguish between models with and without oscillations. Thus we
conclude that inflationary and other passive models cannot evade the
secondary oscillation test.
Comment: Final version accepted for publication in PRD. Minor improvements
have been made to the discussion and new data has been included. The
conclusions are unchanged
Mean-field description of ground-state properties of drip-line nuclei. (I) Shell-correction method
A shell-correction method is applied to nuclei far from the beta stability
line and its suitability to describe effects of the particle continuum is
discussed. The sensitivity of predicted locations of one- and two-particle drip
lines to details of the macroscopic-microscopic model is analyzed.
Comment: 22 REVTeX pages, 13 uuencoded postscript figures available upon
request
Dynamics of earthquake nucleation process represented by the Burridge-Knopoff model
The dynamics of the earthquake nucleation process are studied on the basis of the
one-dimensional Burridge-Knopoff (BK) model obeying the rate- and
state-dependent friction (RSF) law. We investigate the properties of the model
at each stage of the nucleation process, including the quasi-static initial
phase, the unstable acceleration phase and the high-speed rupture phase or a
mainshock. Two kinds of nucleation lengths L_sc and L_c are identified and
investigated. The nucleation length L_sc and the initial phase exist only for a
weak frictional instability regime, while the nucleation length L_c and the
acceleration phase exist for both weak and strong instability regimes. Both
L_sc and L_c are found to be determined by the model parameters, the frictional
weakening parameter and the elastic stiffness parameter, hardly dependent on
the size of an ensuing mainshock. The sliding velocity is extremely slow in the
initial phase up to L_sc, of the order of the pulling speed of the plate, while it
reaches a detectable level at a certain stage of the acceleration phase. The
continuum limits of the results are discussed. The continuum limit of the BK
model lies in the weak frictional instability regime, so that a mature
homogeneous fault under the RSF law is always accompanied by the quasi-static
nucleation process. Duration times of each stage of the nucleation process are
examined. The relation to the elastic continuum model and implications for real
seismicity are discussed.
Comment: Title changed. Changes mainly in abstract and in section 1. To appear
in European Physical Journal
Machine Learning in Automated Text Categorization
The automated categorization (or classification) of texts into predefined
categories has witnessed a booming interest in the last ten years, due to the
increased availability of documents in digital form and the ensuing need to
organize them. In the research community the dominant approach to this problem
is based on machine learning techniques: a general inductive process
automatically builds a classifier by learning, from a set of preclassified
documents, the characteristics of the categories. The advantages of this
approach over the knowledge engineering approach (consisting in the manual
definition of a classifier by domain experts) are very good effectiveness,
considerable savings in terms of expert manpower, and straightforward
portability to different domains. This survey discusses the main approaches to
text categorization that fall within the machine learning paradigm. We will
discuss in detail issues pertaining to three different problems, namely
document representation, classifier construction, and classifier evaluation.
Comment: Accepted for publication in ACM Computing Surveys
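The inductive process described in this abstract, building a classifier from preclassified documents, can be illustrated with a very small sketch. It uses a bag-of-words document representation and a multinomial Naive Bayes learner with add-one smoothing; that learner is chosen here purely as an assumption for illustration, since the abstract does not single out any one method. The names `train_naive_bayes` and `classify` and the toy training data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labelled_docs):
    """Learn per-category statistics from (text, category) pairs.

    Document representation: lower-cased, whitespace-split bag of words.
    """
    doc_counts = Counter()               # documents seen per category
    word_counts = defaultdict(Counter)   # word frequencies per category
    vocabulary = set()
    for text, category in labelled_docs:
        words = text.lower().split()
        doc_counts[category] += 1
        word_counts[category].update(words)
        vocabulary.update(words)
    return doc_counts, word_counts, vocabulary


def classify(text, model):
    """Return the category with the highest smoothed log-probability."""
    doc_counts, word_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in sorted(doc_counts):
        total_words = sum(word_counts[category].values())
        score = math.log(doc_counts[category] / total_docs)
        for word in text.lower().split():
            if word in vocabulary:  # ignore words never seen in training
                count = word_counts[category][word]
                # Add-one (Laplace) smoothing avoids log(0).
                score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Toy preclassified documents (hypothetical data).
training = [
    ("the match ended in a draw", "sport"),
    ("the team won the cup", "sport"),
    ("the market fell sharply", "finance"),
    ("shares and bonds rallied", "finance"),
]
model = train_naive_bayes(training)
```

Calling `classify("the team won the match", model)` assigns a category learned entirely from the labelled examples, with no hand-written rules. Classifier evaluation, the third problem the abstract lists, would require a held-out set of labelled documents and is not covered by this sketch.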