Detecting Sockpuppets in Deceptive Opinion Spam
This paper explores the problem of sockpuppet detection in deceptive opinion
spam using authorship attribution and verification approaches. Two methods are
explored. The first is a feature subsampling scheme that uses the KL-Divergence
on stylistic language models of an author to find discriminative features. The
second is a transduction scheme, spy induction, that leverages the diversity of
authors in the unlabeled test set by sending a set of spies (positive samples)
from the training set to retrieve hidden samples in the unlabeled test set
using nearest and farthest neighbors. Experiments using ground truth sockpuppet
data show the effectiveness of the proposed schemes.
Comment: 18 pages, Accepted at CICLing 2017, 18th International Conference on
Intelligent Text Processing and Computational Linguistics
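The feature-selection idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes unigram language models with additive smoothing, and the helper names (`unigram_model`, `kl_contributions`, `top_features`) and the toy sentences are hypothetical. Each vocabulary item is ranked by its term in the KL divergence between an author's model and a background model.

```python
import math
from collections import Counter

def unigram_model(tokens, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over a fixed vocabulary (assumed model)."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def kl_contributions(p, q):
    """Per-feature terms p(w) * log(p(w) / q(w)); their sum is KL(p || q)."""
    return {w: p[w] * math.log(p[w] / q[w]) for w in p}

def top_features(author_tokens, background_tokens, k=3):
    """Return the k features with the largest KL terms, plus the total divergence."""
    vocab = sorted(set(author_tokens) | set(background_tokens))
    p = unigram_model(author_tokens, vocab)
    q = unigram_model(background_tokens, vocab)
    terms = kl_contributions(p, q)
    ranked = sorted(vocab, key=lambda w: terms[w], reverse=True)
    return ranked[:k], sum(terms.values())

# Toy data, invented for illustration only.
author = "honestly the room was honestly great and honestly clean".split()
background = "the room was great and the staff was nice and clean".split()
features, kl = top_features(author, background, k=1)
print(features, kl)
```

In this toy case the author's overused word dominates the ranking; a real system would presumably use richer stylistic features than raw unigrams.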
Direct x-ray response of self-scanning photodiode arrays
Self-scanning photodiode arrays were tested for their ability to measure the spatial distribution of low-energy x rays in a wavelength-dispersive spectrometer. X-ray spectral sensitivity was measured with a calibrated dc source of nearly monochromatic characteristic x rays with photon energies in the range of 1.5 to 8 keV. Photodiode response was found to be linear with x-ray flux. Exposure to large doses of copper radiation did not affect sensitivity. A mathematical model that describes the experimental data is presented. It was found that spatial resolving power was lowered by the dispersal of photogenerated charges. This effect was investigated with collimated beams and is described with a formula that predicts the loss of diode signals.
An Integrated XRF/XRD Instrument for Mars Exobiology and Geology Experiments
By employing an integrated x-ray instrument on a future Mars mission, data obtained will greatly augment those returned by Viking; details characterizing the past and present environment on Mars and those relevant to the possibility of the origin and evolution of life will be acquired. A combined x-ray fluorescence/x-ray diffraction (XRF/XRD) instrument was breadboarded and demonstrated to accommodate important exobiology and geology experiment objectives outlined for MESUR and future Mars missions. Among others, primary objectives for the exploration of Mars include the intense study of local areas on Mars to establish the chemical, mineralogical, and petrological character of different components of the surface material; to determine the distribution, abundance, and sources and sinks of volatile materials, including an assessment of the biologic potential, now and during past epochs; and to establish the global chemical and physical characteristics of the Martian surface. The XRF/XRD breadboard instrument identifies and quantifies soil surface elemental, mineralogical, and petrological characteristics and acquires data necessary to address questions on volatile abundance and distribution. Additionally, the breadboard is able to characterize the biogenic element constituents of soil samples, providing information on the biologic potential of the Mars environment. Preliminary breadboard experiments confirmed the fundamental instrument design approach and measurement performance.
Algorithmic statistics: forty years later
Algorithmic statistics has two different (and almost orthogonal) motivations.
From the philosophical point of view, it tries to formalize how statistics
works and why some statistical models are better than others. After this notion
of a "good model" is introduced, a natural question arises: is it possible that
for some piece of data there is no good model? If yes, how often do these bad
("non-stochastic") data appear "in real life"?
Another, more technical motivation comes from algorithmic information theory.
In this theory a notion of complexity of a finite object (the amount of
information in this object) is introduced; it assigns to every object some
number, called its algorithmic complexity (or Kolmogorov complexity).
Algorithmic statistics provides a more fine-grained classification: for each
finite object some curve is defined that characterizes its behavior. It turns
out that several different definitions give (approximately) the same curve.
In this survey we try to provide an exposition of the main results in the
field (including full proofs for the most important ones), as well as some
historical comments. We assume that the reader is familiar with the main
notions of algorithmic information (Kolmogorov complexity) theory.
Comment: Missing proofs added
Algorithmic statistics revisited
The mission of statistics is to provide adequate statistical hypotheses
(models) for observed data. But what is an "adequate" model? To answer this
question, one needs to use the notions of algorithmic information theory. It
turns out that for every data string one can naturally define a
"stochasticity profile", a curve that represents a trade-off between the
complexity of a model and its adequacy. This curve has four different equivalent
definitions in terms of (1) randomness deficiency, (2) minimal description
length, (3) position in the lists of simple strings and (4) Kolmogorov
complexity with decompression time bounded by the busy beaver function. We
present a survey of the corresponding definitions and results relating them to
each other.
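For orientation, the first of these definitions can be written out as follows. This is a sketch in commonly used notation, not quoted from the survey: $C$ denotes Kolmogorov complexity, $A$ is a finite set (a model) containing the string $x$, and the symbols $d$, $P_x$, $\alpha$ and $\beta$ are assumed here for illustration.

```latex
% Randomness deficiency of x as an element of a finite set A containing x
d(x \mid A) = \log_2 |A| - C(x \mid A)

% Profile of x: the (complexity, deficiency) pairs achievable by some model A
P_x = \{ (\alpha, \beta) : \exists A \ni x,\ C(A) \le \alpha,\ d(x \mid A) \le \beta \}
```

A small deficiency means $x$ looks like a typical element of $A$; the boundary of $P_x$ is the trade-off curve between model complexity and adequacy that the abstract describes.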
Spatial heterogeneity and irreversible vegetation change in semi-arid grazing systems
Recent theoretical studies have shown that spatial redistribution of surface water may explain the occurrence of patterns of alternating vegetated and degraded patches in semiarid grasslands. These results implied, however, that spatial redistribution processes cannot explain the collapse of production on coarser scales observed in these systems. We present a spatially explicit vegetation model to investigate possible mechanisms explaining irreversible vegetation collapse on coarse spatial scales. The model results indicate that the dynamics of vegetation on coarse scales are determined by the interaction of two spatial feedback processes. Loss of plant cover in a certain area results in increased availability of water in remaining vegetated patches through run-on of surface water, promoting within-patch plant production. Hence, spatial redistribution of surface water creates negative feedback between reduced plant cover and increased plant growth in remaining vegetation. Reduced plant cover, however, results in focusing of herbivore grazing in the remaining vegetation. Hence, redistribution of herbivores creates positive feedback between reduced plant cover and increased losses due to grazing in remaining vegetated patches, leading to collapse of the entire vegetation. This may explain irreversible vegetation shifts in semiarid grasslands on coarse spatial scales.
The inverse Laplace transform as the ultimate tool for transverse mass spectra
New high-statistics data from the second generation of ultrarelativistic
heavy-ion experiments open up new possibilities in terms of data analysis. To
fully utilize the potential, we propose to analyze the transverse mass spectra
of hadrons using the inverse Laplace transform. The problems with its inherent
ill-definedness can be overcome, and several applications in other fields like
biology, chemistry or optics have already shown its feasibility. Moreover, the
method also promises to deliver upper bounds on the total information content
of the spectra, which is of great importance for all other means of analysis.
Here we compute several Laplace inversions from different thermal scenarios,
both analytically and numerically, to test the efficiency of the method.
In particular, the case of a two-component structure, related to a possible
first-order phase transition to a quark-gluon plasma, is investigated more
closely, and it
is shown that at least a signal to noise ratio of is necessary to
resolve two individual components.
Comment: 13 pages (PostScript, including figures), BNL-NTHES
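The role of the inverse Laplace transform can be made concrete with a short worked form. This is an illustrative sketch, not taken from the paper: it assumes a purely exponential thermal shape and ignores any kinematic prefactors, and the symbols $m$ (transverse mass), $\beta$ (inverse slope parameter), $\rho$, $a_i$ and $\beta_i$ are assumptions introduced here.

```latex
% A spectrum written as a superposition of exponential (thermal) components
f(m) = \int_0^\infty \rho(\beta)\, e^{-\beta m}\, d\beta

% Two-component case: the weight function is a sum of two delta peaks
\rho(\beta) = a_1\, \delta(\beta - \beta_1) + a_2\, \delta(\beta - \beta_2)
\quad\Longrightarrow\quad
f(m) = a_1 e^{-\beta_1 m} + a_2 e^{-\beta_2 m}
```

Recovering $\rho$ from a measured $f$ is the inverse Laplace problem. Because exponentials with nearby slopes look nearly identical over a finite range of $m$, separating $\beta_1$ from $\beta_2$ requires low noise, which is consistent with the abstract's statement that a minimum signal-to-noise ratio is needed.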