Ranking influential spreaders is an ill-defined problem
Finding influential spreaders of information and disease in networks is an
important theoretical problem, and one of considerable recent interest. It has
been almost exclusively formulated as a node-ranking problem -- methods for
identifying influential spreaders rank nodes according to how influential they
are. In this work, we show that the ranking approach does not necessarily work:
the set of the n most influential nodes depends on the number n of nodes in the set.
Therefore, the set of the n most important nodes to vaccinate does not need to
have any node in common with the set of the n + 1 most important nodes. We propose
a method for quantifying the extent and impact of this phenomenon, and show
that it is common in both empirical and model networks.
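The size dependence described in this abstract can be illustrated on a toy graph. The sketch below is not the paper's method: it assumes a deliberately simple influence proxy (the number of nodes that are seeds or direct neighbours of a seed) instead of an epidemic model, and the 11-node graph and all function names are made up for illustration. Under that assumption, the best single seed and the best pair of seeds share no node.

```python
from itertools import combinations

# Hypothetical toy graph: hub "c" touches six bridge nodes, while "a" and "b"
# each touch three of those bridges plus one private leaf.
EDGES = [
    ("c", "x1"), ("c", "x2"), ("c", "x3"),
    ("c", "y1"), ("c", "y2"), ("c", "y3"),
    ("a", "x1"), ("a", "x2"), ("a", "x3"), ("a", "a_leaf"),
    ("b", "y1"), ("b", "y2"), ("b", "y3"), ("b", "b_leaf"),
]


def build_adjacency(edges):
    """Return an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def coverage(adj, seeds):
    """Assumed influence proxy: seeds plus their direct neighbours."""
    reached = set(seeds)
    for s in seeds:
        reached |= adj[s]
    return len(reached)


def best_seed_set(adj, k):
    """Brute-force the size-k seed set with the largest coverage."""
    candidates = combinations(sorted(adj), k)
    return set(max(candidates, key=lambda s: coverage(adj, s)))


ADJ = build_adjacency(EDGES)
best_one = best_seed_set(ADJ, 1)   # the hub "c" covers 7 of 11 nodes
best_two = best_seed_set(ADJ, 2)   # "a" and "b" together cover 10 of 11
print(best_one, best_two, best_one & best_two)
```

Here no single ranking of nodes can produce both optimal sets, since the top-ranked node for size 1 is absent from the optimal set of size 2.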
VizSeq: A Visual Analysis Toolkit for Text Generation Tasks
Automatic evaluation of text generation tasks (e.g. machine translation, text
summarization, image captioning and video description) usually relies heavily
on task-specific metrics, such as BLEU and ROUGE. They, however, are abstract
numbers and are not perfectly aligned with human assessment. This suggests
inspecting detailed examples as a complement to identify system error patterns.
In this paper, we present VizSeq, a visual analysis toolkit for instance-level
and corpus-level system evaluation on a wide variety of text generation tasks.
It supports multimodal sources and multiple text references, providing
visualization in Jupyter notebook or a web app interface. It can be used
locally or deployed onto public servers for centralized data hosting and
benchmarking. It covers the most common n-gram based metrics, accelerated with
multiprocessing, and also provides the latest embedding-based metrics such as
BERTScore.
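As a rough illustration of what an n-gram based metric with multiple text references computes, the sketch below implements clipped (modified) n-gram precision, the building block of BLEU. It is a standalone example using only the Python standard library; it does not use or reproduce the VizSeq API, and the function names are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(hypothesis, references, n):
    """Clipped n-gram precision of a hypothesis against several references.

    Each hypothesis n-gram is credited at most as many times as it appears
    in the single reference that contains it most often.
    """
    hyp_counts = Counter(ngrams(hypothesis.split(), n))
    if not hyp_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref.split(), n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    matched = sum(min(count, max_ref[gram]) for gram, count in hyp_counts.items())
    return matched / sum(hyp_counts.values())


refs = ["the cat is on the mat", "there is a cat on the mat"]
degenerate = "the the the the the the the"
print(clipped_precision(degenerate, refs, 1))  # 2 of 7 unigrams are credited
```

The degenerate hypothesis shows why a single number can mislead: it shares a frequent word with the references yet is not a usable output, which is the kind of case that instance-level inspection is meant to surface.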
The position profiles of order cancellations in an emerging stock market
Order submission and cancellation are two constituent actions of stock
trading behaviors in order-driven markets. Order submission dynamics has been
extensively studied for different markets, while order cancellation dynamics is
less understood. There are two positions associated with a cancellation, that
is, the price level in the limit-order book (LOB) and the position in the queue
at each price level. We study the profiles of these two order cancellation
positions through rebuilding the limit-order book using the order flow data of
23 liquid stocks traded on the Shenzhen Stock Exchange in the year 2003. We
find that the profiles of relative price levels where cancellations occur obey
a log-normal distribution. After normalizing the relative price level by
removing the factor of order numbers stored at the price level, we find that
the profiles exhibit a power-law scaling behavior on the right tails for both
buy and sell orders. When focusing on the order cancellation positions in the
queue at each price level, we find that the profiles increase rapidly in the
front of the queue, and then fluctuate around a constant value till the end of
the queue. These profiles are similar for different stocks. In addition, the
profiles of cancellation positions can be fitted by an exponential function for
both buy and sell orders. These two kinds of cancellation profiles seem
universal for different stocks investigated and exhibit minor asymmetry between
buy and sell orders. Our empirical findings shed new light on the order
cancellation dynamics and pose constraints on the construction of order-driven
stock market models.
Comment: 17 pages, 6 figures and 6 tables
User-Entity Differential Privacy in Learning Natural Language Models
In this paper, we introduce a novel concept of user-entity differential
privacy (UeDP) to provide formal privacy protection simultaneously to both
sensitive entities in textual data and data owners in learning natural language
models (NLMs). To preserve UeDP, we developed a novel algorithm, called
UeDP-Alg, optimizing the trade-off between privacy loss and model utility with
a tight sensitivity bound derived from seamlessly combining user and sensitive
entity sampling processes. An extensive theoretical analysis and evaluation
show that our UeDP-Alg outperforms baseline approaches in model utility under
the same privacy budget consumption on several NLM tasks, using benchmark
datasets.
Comment: Accepted at IEEE BigData 202
Improving a Named Entity Recognizer Trained on Noisy Data with a Few Clean Instances
To achieve state-of-the-art performance, one still needs to train NER models
on large-scale, high-quality annotated data, an asset that is both costly and
time-intensive to accumulate. In contrast, real-world applications often resort
to massive low-quality labeled data through non-expert annotators via
crowdsourcing and external knowledge bases via distant supervision as a
cost-effective alternative. However, these annotation methods result in noisy
labels, which in turn lead to a notable decline in performance. Hence, we
propose to denoise the noisy NER data with guidance from a small set of clean
instances. Along with the main NER model we train a discriminator model and use
its outputs to recalibrate the sample weights. The discriminator is capable of
detecting both span and category errors with different discriminative prompts.
Results on public crowdsourcing and distant supervision datasets show that the
proposed method can consistently improve performance with a small guidance set.
Comment: 14 pages
Iron oxide-modified nanoporous geopolymers for arsenic removal from ground water
Composite materials of hierarchically porous geopolymer and amorphous hydrous ferric oxide were produced and characterized as a new, potentially cost-effective arsenic adsorbent. The arsenic removal capabilities of the iron (hydr)oxide (HFO) media were evaluated using batch reactor experiments and laboratory-scale continuous flow experiments. Rapid Small-Scale Column Tests (RSSCT) were employed to mimic a scaled-up packed bed reactor, and the toxicity characteristic leaching procedure (TCLP) test of the arsenic-adsorbed solid material was carried out to investigate the mechanical robustness of the adsorbent. The best performing media, which contained ~20 wt% Fe, could remove over 95 µg of arsenic per gram of dry media from an arsenic-only water matrix. The role of the high porosity in the arsenic adsorption characteristics was further quantified in conjunction with the accessibility of the adsorption sites. The new hierarchically porous geopolymer-based composites were shown to be a good candidate for cost-effective removal of arsenic from contaminated water under realistic conditions, owing to their favorable adsorption capacity and very low leachability.
Evidence for a New Excitation at the Interface Between a High-Tc Superconductor and a Topological Insulator
High-temperature superconductors exhibit a wide variety of novel excitations.
If contacted with a topological insulator, the lifting of spin rotation
symmetry in the surface states can lead to the emergence of unconventional
superconductivity and novel particles. In pursuit of this possibility, we
fabricated high critical-temperature (Tc ~ 85 K) superconductor/topological
insulator (Bi2Sr2CaCu2O8+delta/Bi2Te2Se) junctions. Below 75 K, a zero-bias
conductance peak (ZBCP) emerges in the differential conductance spectra of this
junction. The magnitude of the ZBCP is suppressed at the same rate for magnetic
fields applied parallel or perpendicular to the junction. Furthermore, it can
still be observed and does not split up to at least 8.5 T. The temperature and
magnetic field dependence of the excitation we observe appears to fall outside
the known paradigms for a ZBCP.
A reflection-based localized surface plasmon resonance fiber-optic probe for biochemical sensing
We report the fabrication and characterization of an optical fiber biochemical sensing probe based on localized surface plasmon resonance (LSPR) and spectra reflection. An ordered array of gold nanodots was fabricated on the optical fiber end facet using electron-beam lithography (EBL). We experimentally demonstrated for the first time the blue shift of the LSPR scattering spectrum with respect to the LSPR extinction spectrum, which had been predicted theoretically. High sensitivity [195.72 nm/refractive index unit (RIU)] of this sensor for detecting changes in the bulk refractive index has been demonstrated. The label-free affinity sensing capability for biomolecules has also been demonstrated using biotin and streptavidin as the receptor and the analyte.
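To make the reported sensitivity figure concrete, the short sketch below converts a change in bulk refractive index into an expected resonance wavelength shift. It assumes a linear response over the range of interest, which the abstract does not state; the function name and the example index change of 0.01 RIU are illustrative only.

```python
# Bulk refractive index sensitivity reported in the abstract, in nm per RIU.
SENSITIVITY_NM_PER_RIU = 195.72


def expected_shift_nm(delta_n):
    """Expected LSPR peak shift (nm) for a bulk index change delta_n (RIU).

    Assumes the shift scales linearly with the index change.
    """
    return SENSITIVITY_NM_PER_RIU * delta_n


# An index change of 0.01 RIU corresponds to a shift of about 1.96 nm.
print(round(expected_shift_nm(0.01), 4))
```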