34,152 research outputs found
Lattice score based data cleaning for phrase-based statistical machine translation
Statistical machine translation relies heavily
on parallel corpora to train its models
for translation tasks. While more and
more bilingual corpora are readily available,
the quality of the sentence pairs
should be taken into consideration. This
paper presents a novel lattice score-based
data cleaning method to select proper sentence
pairs from the ones extracted from a
bilingual corpus by the sentence alignment
methods. The proposed method is carried
out as follows: firstly, an initial phrasebased
model is trained on the full sentencealigned
corpus; then for each of the sentence
pairs in the corpus, word alignments
are used to create anchor pairs and sourceside
lattices; thirdly, based on the translation
model, target-side phrase networks
are expanded on the lattices and Viterbi
searching is used to find approximated decoding
results; finally, BLEU score thresholds
are used to filter out the low-score
sentence pairs for the data cleaning purpose.
Our experiments on the FBIS corpus
showed improvements of BLEU score
from 23.78 to 24.02 in Chinese-English
Faster tuple lattice sieving using spherical locality-sensitive filters
To overcome the large memory requirement of classical lattice sieving
algorithms for solving hard lattice problems, Bai-Laarhoven-Stehl\'{e} [ANTS
2016] studied tuple lattice sieving, where tuples instead of pairs of lattice
vectors are combined to form shorter vectors. Herold-Kirshanova [PKC 2017]
recently improved upon their results for arbitrary tuple sizes, for example
showing that a triple sieve can solve the shortest vector problem (SVP) in
dimension in time , using a technique similar to
locality-sensitive hashing for finding nearest neighbors.
In this work, we generalize the spherical locality-sensitive filters of
Becker-Ducas-Gama-Laarhoven [SODA 2016] to obtain space-time tradeoffs for near
neighbor searching on dense data sets, and we apply these techniques to tuple
lattice sieving to obtain even better time complexities. For instance, our
triple sieve heuristically solves SVP in time . For
practical sieves based on Micciancio-Voulgaris' GaussSieve [SODA 2010], this
shows that a triple sieve uses less space and less time than the current best
near-linear space double sieve.Comment: 12 pages + references, 2 figures. Subsumed/merged into Cryptology
ePrint Archive 2017/228, available at https://ia.cr/2017/122
Scaling law in target-hunting processes
We study the hunting process for a target, in which the hunter tracks the
goal by smelling odors it emits. The odor intensity is supposed to decrease
with the distance it diffuses. The Monte Carlo experiment is carried out on a
2-dimensional square lattice. Having no idea of the location of the target, the
hunter determines its moves only by random attempts in each direction. By
sorting the searching time in each simulation and introducing a variable to
reflect the sequence of searching time, we obtain a curve with a wide plateau,
indicating a most probable time of successfully finding out the target. The
simulations reveal a scaling law for the searching time versus the distance to
the position of the target. The scaling exponent depends on the sensitivity of
the hunter. Our model may be a prototype in studying such the searching
processes as various foods-foraging behavior of the wild animals.Comment: 7 figure
Target-searching on the percolation
We study target-searching processes on a percolation, on which a hunter
tracks a target by smelling odors it emits. The odor intensity is supposed to
be inversely proportional to the distance it propagates. The Monte Carlo
simulation is performed on a 2-dimensional bond-percolation above the
threshold. Having no idea of the location of the target, the hunter determines
its moves only by random attempts in each direction. For lager percolation
connectivity , it reveals a scaling law for the searching time
versus the distance to the position of the target. The scaling exponent is
dependent on the sensitivity of the hunter. For smaller , the scaling law is
broken and the probability of finding out the target significantly reduces. The
hunter seems trapped in the cluster of the percolation and can hardly reach the
goal.Comment: 5 figure
Parallelization of the Wolff Single-Cluster Algorithm
A parallel [open multiprocessing (OpenMP)] implementation of the Wolff single-cluster algorithm has been developed and tested for the three-dimensional (3D) Ising model. The developed procedure is generalizable to other lattice spin models and its effectiveness depends on the specific application at hand. The applicability of the developed methodology is discussed in the context of the applications, where a sophisticated shuffling scheme is used to generate pseudorandom numbers of high quality, and an iterative method is applied to find the critical temperature of the 3D Ising model with a great accuracy. For the lattice with linear size L=1024, we have reached the speedup about 1.79 times on two processors and about 2.67 times on four processors, as compared to the serial code. According to our estimation, the speedup about three times on four processors is reachable for the O(n) models with n ≥ 2. Furthermore, the application of the developed OpenMP code allows us to simulate larger lattices due to greater operative (shared) memory available
Solving the Shortest Vector Problem in Lattices Faster Using Quantum Search
By applying Grover's quantum search algorithm to the lattice algorithms of
Micciancio and Voulgaris, Nguyen and Vidick, Wang et al., and Pujol and
Stehl\'{e}, we obtain improved asymptotic quantum results for solving the
shortest vector problem. With quantum computers we can provably find a shortest
vector in time , improving upon the classical time
complexity of of Pujol and Stehl\'{e} and the of Micciancio and Voulgaris, while heuristically we expect to find a
shortest vector in time , improving upon the classical time
complexity of of Wang et al. These quantum complexities
will be an important guide for the selection of parameters for post-quantum
cryptosystems based on the hardness of the shortest vector problem.Comment: 19 page
- …