Natural clustering: the modularity approach
We show that modularity, a quantity introduced in the study of networked
systems, can be generalized and used in the clustering problem as an indicator
for the quality of the solution. The introduction of this measure arises very
naturally in the case of clustering algorithms that are rooted in Statistical
Mechanics and use the analogy with a physical system. Comment: 11 pages, 5 figure enlarged version
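The abstract does not spell out the generalized measure, so the sketch below shows only the standard network modularity that it starts from, Q = sum over communities of (e_c / m) - (d_c / 2m)^2, where m is the number of edges, e_c the number of edges inside community c, and d_c the total degree of its nodes. The function name and the edge-list input format are assumptions made for illustration, not the paper's method.

```python
def modularity(edges, community):
    """Standard network modularity of a partition (illustrative sketch).

    edges: list of (u, v) pairs of an undirected, unweighted graph.
    community: dict mapping each node to a community label.
    """
    m = len(edges)
    internal = {}       # edges with both endpoints in the same community
    total_degree = {}   # summed node degree per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        # Each edge contributes one unit of degree to each endpoint.
        total_degree[cu] = total_degree.get(cu, 0) + 1
        total_degree[cv] = total_degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in total_degree.items()
    )


# Two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}
print(modularity(edges, two_groups))
print(modularity(edges, one_group))
```

Splitting the graph into its two triangles scores higher than lumping all nodes together, which is the sense in which modularity indicates the quality of a solution.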
Classical diffusion in double-delta-kicked particles
We investigate the classical chaotic diffusion of atoms subjected to
pairs of closely spaced pulses ('kicks') from standing waves of light (the
2δ-KP). Recent experimental studies with cold atoms implied an
underlying classical diffusion of a type very different from the well-known
paradigm of Hamiltonian chaos, the Standard Map.
The kicks in each pair are separated by a small time interval which, together with the kick strength, characterizes the transport.
Phase space for the 2δ-KP is partitioned into momentum 'cells' partially
separated by momentum-trapping regions where diffusion is slow. We present here
an analytical derivation of the classical diffusion for a 2δ-KP
including all important correlations which were used to analyze the
experimental data.
We find a new asymptotic regime of 'hindered' diffusion:
while for the Standard Map the diffusion rate oscillates about the uncorrelated rate, we find
analytically that the 2δ-KP can equal, but never diffuses faster than,
a random walk rate.
We argue this is due to the destruction of the important classical
'accelerator modes' of the Standard Map.
We analyze the experimental regime, where
quantum localisation lengths are affected by fractal
cell boundaries. We find an approximate asymptotic diffusion rate, in correspondence to a regime in the Standard Map
associated with 'golden-ratio' cantori. Comment: 14 pages, 10 figures, error in equation in appendix corrected
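As a point of reference for the comparison with the Standard Map, the sketch below iterates the usual single-kick map p' = p + K sin(x), x' = x + p' and estimates the momentum diffusion rate relative to the uncorrelated random-walk value K^2/2. The symbol K for the kick strength, the function names, and the ensemble parameters are assumptions made for illustration. The code does not implement the double-kick system or the analytical derivation described in the abstract.

```python
import math
import random


def standard_map_step(x, p, K):
    """One kick of the Standard Map; x is kept in [0, 2*pi)."""
    p_new = p + K * math.sin(x)
    x_new = (x + p_new) % (2 * math.pi)
    return x_new, p_new


def diffusion_ratio(K, n_kicks=50, n_traj=2000, seed=0):
    """Estimate <(p_t - p_0)^2> / t and divide by the uncorrelated rate K^2/2."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_traj):
        x = rng.uniform(0.0, 2 * math.pi)
        p0 = rng.uniform(0.0, 2 * math.pi)
        p = p0
        for _ in range(n_kicks):
            x, p = standard_map_step(x, p, K)
        total += (p - p0) ** 2
    rate = total / (n_traj * n_kicks)
    return rate / (K ** 2 / 2)


# A ratio near 1 means random-walk diffusion; deviations reflect kick correlations.
print(diffusion_ratio(10.0))
```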
Optical evidence for a spin-filter effect in the charge transport of
We have measured the optical reflectivity
as a function of temperature between 1.5 and 300 K
and in external magnetic fields up to 7 T. The slope at the onset of the
plasma edge feature increases with decreasing temperature and
increasing field but the plasma edge itself does not exhibit the remarkable
blue shift that is observed in the binary compound. The analysis of
the magnetic field dependence of the low temperature optical conductivity
spectrum confirms the previously observed exponential decrease of the
electrical resistivity upon increasing, field-induced bulk magnetization at
constant temperature. In addition, the individual exponential magnetization
dependences of the plasma frequency and scattering rate are extracted from the
optical data. Comment: submitted to Phys. Rev. Lett.
Detecting Singleton Review Spammers Using Semantic Similarity
Online reviews have increasingly become a very important resource for
consumers when making purchases. However, it is becoming more and more difficult
for people to make well-informed buying decisions without being deceived by
fake reviews. Prior works on the opinion spam problem mostly considered
classifying fake reviews using behavioral user patterns. They focused on
prolific users who write more than a couple of reviews, discarding one-time
reviewers. The number of singleton reviewers, however, is expected to be high for
many review websites. While behavioral patterns are effective when dealing with
elite users, for one-time reviewers, the review text needs to be exploited. In
this paper we tackle the problem of detecting fake reviews written by the same
person using multiple names, posting each review under a different name. We
propose two methods to detect similar reviews and show the results generally
outperform the vectorial similarity measures used in prior works. The first
method extends the semantic similarity between words to the review level. The
second method is based on topic modeling and exploits the similarity of the
reviews' topic distributions using two models: bag-of-words and
bag-of-opinion-phrases. The experiments were conducted on reviews from three
different datasets: Yelp (57K reviews), Trustpilot (9K reviews) and the Ott dataset
(800 reviews). Comment: 6 pages, WWW 201
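For context, a vectorial similarity measure of the kind the abstract compares against can be sketched as cosine similarity over bag-of-words counts. This is an assumed baseline for illustration only: the tokenizer and the function names are hypothetical, and the code does not implement the semantic or topic-model methods the paper proposes.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(review_a, review_b):
    """Cosine of the angle between two bag-of-words count vectors."""
    va, vb = bag_of_words(review_a), bag_of_words(review_b)
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty review shares nothing with any other
    return dot / (norm_a * norm_b)


print(cosine_similarity("good food", "good service"))
print(cosine_similarity("great food", "terrible service"))
```

Two reviews that say the same thing in different words score zero under this measure, which is the gap that word-level semantic similarity is meant to close.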
Beyond homozygosity mapping: family-control analysis based on Hamming distance for prioritizing variants in exome sequencing
A major challenge in current exome sequencing in autosomal recessive (AR) families is the lack of an effective method to prioritize single-nucleotide variants (SNVs). AR families are generally too small for linkage analysis, and length of homozygous regions is unreliable for identification of causative variants. Various common filtering steps usually result in a list of candidate variants that cannot be narrowed down further or ranked. To prioritize shortlisted SNVs we consider each homozygous candidate variant together with a set of SNVs flanking it. We compare the resulting array of genotypes between an affected family member and a number of control individuals and argue that, in a family, differences between the family member and controls should be larger for a pathogenic variant and SNVs flanking it than for a random variant. We assess differences between arrays in two individuals by the Hamming distance and develop a suitable test statistic, which is expected to be large for a causative variant and flanking SNVs. We prioritize candidate variants based on this statistic. Applying our approach to six patients with known pathogenic variants, we found these to be in the top 2 to 10 percentiles of ranks.
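The comparison of genotype arrays can be sketched as follows. The genotype coding (0, 1, 2 for the number of alternate alleles), the function names, and the use of a plain mean over controls are assumptions made for illustration. The abstract does not define the actual test statistic, so this is not the authors' statistic.

```python
def hamming_distance(genotypes_a, genotypes_b):
    """Number of positions at which two equal-length genotype arrays differ."""
    if len(genotypes_a) != len(genotypes_b):
        raise ValueError("genotype arrays must have the same length")
    return sum(1 for a, b in zip(genotypes_a, genotypes_b) if a != b)


def mean_distance_to_controls(affected, controls):
    """Average Hamming distance between an affected individual and each control."""
    return sum(hamming_distance(affected, c) for c in controls) / len(controls)


# Candidate variant plus flanking SNVs; the affected individual is homozygous throughout.
affected = [2, 2, 2, 2, 2]
controls = [
    [0, 1, 2, 0, 1],
    [2, 2, 2, 1, 0],
]
print(mean_distance_to_controls(affected, controls))
```

A larger value indicates that the affected individual's genotypes around the candidate differ more from the controls, which is the direction the abstract expects for a causative variant.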