TopSig: Topology Preserving Document Signatures
Performance comparisons between File Signatures and Inverted Files for text
retrieval have previously shown several significant shortcomings of file
signatures relative to inverted files. The inverted file approach underpins
most state-of-the-art search engine algorithms, such as Language and
Probabilistic models. It has been widely accepted that traditional file
signatures are inferior alternatives to inverted files. This paper describes
TopSig, a new approach to the construction of file signatures. Many advances in
semantic hashing and dimensionality reduction have been made in recent times,
but these have not yet been linked to general-purpose, signature-file-based
search engines. This paper introduces a different signature file approach that
builds upon and extends these recent advances. We are able to demonstrate
significant improvements in the performance of signature file based indexing
and retrieval, performance that is comparable to that of state of the art
inverted file based systems, including Language models and BM25. These findings
suggest that file signatures offer a viable alternative to inverted files in
suitable settings. From a theoretical perspective, they position the file
signature model in the class of Vector Space retrieval models.
Comment: 12 pages, 8 figures, CIKM 201
Analyze Large Multidimensional Datasets Using Algebraic Topology
This paper presents an efficient algorithm for extracting knowledge from high-dimensional, high-complexity datasets using algebraic topology, namely simplicial complexes. Based on the concept of isomorphism of relations, our method turns a relational table into a geometric object (a simplicial complex is a polyhedron). Conceptually, association rule searching thus becomes a geometric traversal problem. By leveraging the core concepts behind simplicial complexes, we use a technique that is new to computer science, improves performance over existing methods, and uses far less memory. It was designed and developed with a strong emphasis on scalability, reliability, and extensibility. This paper also investigates the possibility of Hadoop integration and the challenges that come with the framework.
A practical guide to computer simulations
Here practical aspects of conducting research via computer simulations are
discussed. The following issues are addressed: software engineering,
object-oriented software development, programming style, macros, make files,
scripts, libraries, random numbers, testing, debugging, data plotting, curve
fitting, finite-size scaling, information retrieval, and preparing
presentations.
Because of the limited space, usually only short introductions to the
specific areas are given and references to more extensive literature are cited.
All code examples are in C/C++.
Comment: 69 pages, with permission of Wiley-VCH, see http://www.wiley-vch.de
(some screenshots are of poor quality due to arXiv size restrictions). A
comprehensively extended version will appear in spring 2009 as a book at
World Scientific, see http://www.worldscibooks.com/physics/6988.htm
Bayes Merging of Multiple Vocabularies for Scalable Image Retrieval
The Bag-of-Words (BoW) representation is widely used in recent
state-of-the-art image retrieval work. Typically, multiple vocabularies are
generated to correct quantization artifacts and improve recall. However, this
practice suffers from vocabulary correlation, i.e., overlap among
different vocabularies. Vocabulary correlation leads to an over-counting of the
indexed features in the overlapped area, or the intersection set, thus
compromising the retrieval accuracy. In order to address the correlation
problem while preserving the benefit of high recall, this paper proposes a Bayes
merging approach to down-weight the indexed features in the intersection set.
By explicitly modeling the correlation problem from a probabilistic view, a
joint similarity at both the image and feature levels is estimated for the indexed
features in the intersection set.
We evaluate our method through extensive experiments on three benchmark
datasets. Although simple, Bayes merging applies well to various merging
tasks and consistently improves on the baselines for multi-vocabulary merging.
Moreover, Bayes merging is efficient in terms of both time and memory cost, and
yields competitive performance compared with the state-of-the-art methods.
Comment: 8 pages, 7 figures, 6 tables, accepted to CVPR 201
A Short Travel for Neutrinos in Large Extra Dimensions
Neutrino oscillations successfully explain the flavor transitions observed in
neutrinos produced in natural sources like the center of the Sun and the Earth's
atmosphere, and also in man-made sources like reactors and accelerators.
These oscillations are driven by two mass-squared differences, solar and
atmospheric, at the sub-eV scale. However, longstanding anomalies at
short baselines might imply the existence of new oscillation frequencies at the
eV scale and the possibility that the corresponding sterile state(s) mix with the three
active neutrinos. One of the many future neutrino programs expected to
provide a final word on this issue is the Short-Baseline Neutrino Program (SBN)
at Fermilab. In this letter, we consider a specific model of Large Extra
Dimensions (LED) which provides interesting signatures of oscillation of extra
sterile states. We start by re-creating the sensitivity analyses for sterile
neutrinos in the 3+1 scenario, previously done by the SBN collaboration, by
simulating neutrino events in the three SBN detectors for both muon neutrino
disappearance and electron neutrino appearance. Then, we implement neutrino
oscillations as predicted in the LED model and perform a
sensitivity analysis for the LED parameters. Finally, we study the power of SBN
to discriminate between the two models, the 3+1 and the LED. We find
that SBN is sensitive to the oscillations predicted in the LED model and has
the potential to constrain the LED parameter space better than any other
oscillation experiment, for . In case SBN observes a
departure from the three active neutrino framework, it also has the power of
discriminating between sterile oscillations predicted in the 3+1 framework and
the LED ones.
Comment: 21 pages, 6 figures, 2 tables
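To make the 3+1 channels named in the abstract above concrete, here is a minimal sketch of the standard two-flavor short-baseline approximation, in which a single eV-scale mass-squared difference dominates. It is not the analysis code of the paper: the function names, the parameter values, and the approximate detector baselines are illustrative assumptions, and the LED model itself is not implemented.

```python
import math

# Assumed, approximate SBN baselines in km (not stated in the abstract).
ASSUMED_BASELINES_KM = {"near": 0.11, "middle": 0.47, "far": 0.60}


def appearance_probability(sin2_2theta, dm2_ev2, baseline_km, energy_gev):
    """Two-flavor approximation for nu_mu -> nu_e appearance.

    P = sin^2(2 theta) * sin^2(1.267 * dm2 [eV^2] * L [km] / E [GeV])
    """
    phase = 1.267 * dm2_ev2 * baseline_km / energy_gev
    return sin2_2theta * math.sin(phase) ** 2


def disappearance_probability(sin2_2theta, dm2_ev2, baseline_km, energy_gev):
    """Two-flavor approximation for nu_mu survival (1 minus the oscillated part)."""
    return 1.0 - appearance_probability(sin2_2theta, dm2_ev2, baseline_km, energy_gev)


if __name__ == "__main__":
    # Hypothetical parameter point, chosen only for illustration.
    for name, length in ASSUMED_BASELINES_KM.items():
        p = appearance_probability(0.01, 1.0, length, 0.7)
        print(f"{name}: P(appearance) = {p:.6f}")
```

Comparing the same parameter point across the three baselines is what gives a multi-detector program its sensitivity: the oscillation phase grows linearly with L/E, so the near detector mostly constrains the unoscillated flux while the far detectors see the signal.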
Exploiting boundary states of imperfect spin chains for high-fidelity state transfer
We study transfer of a quantum state through XX spin chains with static
imperfections. We combine the two standard approaches for state transfer based
on (i) modulated couplings between neighboring spins throughout the spin chain
and (ii) weak coupling of the outermost spins to an unmodulated spin chain. The
combined approach allows us to design spin chains with modulated couplings and
localized boundary states, permitting high-fidelity state transfer in the
presence of random static imperfections of the couplings. The modulated
couplings are explicitly obtained from an exact algorithm using the close
relation between tridiagonal matrices and orthogonal polynomials [Linear
Algebr. Appl. 21, 245 (1978)]. The implemented algorithm and a graphical user
interface for constructing spin chains with boundary states (spinGUIn) are
provided as Supplemental Material.
Comment: 7 pages, 3 figures + spinGUIn description and Matlab files
iepsolve.m, spinGUIn.fig, spinGUIn.
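Approach (i) in the abstract above, modulated nearest-neighbor couplings, can be illustrated with a small numerical sketch. This is not the authors' algorithm (their couplings come from an exact inverse method and include boundary states). It assumes a well-known coupling profile, J_n = sqrt(n(N - n)), restricted to the single-excitation subspace of an XX chain, where the Hamiltonian reduces to a tridiagonal hopping matrix. The function names and the simple Runge-Kutta integrator are choices made for this illustration.

```python
import math


def modulated_couplings(n_sites):
    """Assumed coupling profile J_n = sqrt(n * (N - n)) for n = 1..N-1."""
    return [math.sqrt(n * (n_sites - n)) for n in range(1, n_sites)]


def apply_hopping(couplings, psi):
    """Apply the tridiagonal single-excitation Hamiltonian (zero diagonal)."""
    out = [0j] * len(psi)
    for k, j in enumerate(couplings):
        out[k] += j * psi[k + 1]
        out[k + 1] += j * psi[k]
    return out


def transfer_fidelity(couplings, t_final, steps=2000):
    """Evolve an excitation from the first site; return |amplitude on last site|^2."""
    psi = [0j] * (len(couplings) + 1)
    psi[0] = 1 + 0j
    dt = t_final / steps

    def deriv(state):
        # Schroedinger equation with hbar = 1: d(psi)/dt = -i H psi
        return [-1j * x for x in apply_hopping(couplings, state)]

    for _ in range(steps):  # classical fourth-order Runge-Kutta step
        k1 = deriv(psi)
        k2 = deriv([p + 0.5 * dt * k for p, k in zip(psi, k1)])
        k3 = deriv([p + 0.5 * dt * k for p, k in zip(psi, k2)])
        k4 = deriv([p + dt * k for p, k in zip(psi, k3)])
        psi = [p + dt / 6 * (a + 2 * b + 2 * c + d)
               for p, a, b, c, d in zip(psi, k1, k2, k3, k4)]
    return abs(psi[-1]) ** 2


if __name__ == "__main__":
    n_sites = 4
    t = math.pi / 2  # transfer time for this coupling normalization
    print("modulated:", transfer_fidelity(modulated_couplings(n_sites), t))
    print("uniform:  ", transfer_fidelity([1.0] * (n_sites - 1), t))
```

Static imperfections of the kind studied in the paper could be explored by perturbing each entry of the coupling list by a small random factor before calling `transfer_fidelity`.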