
    Ranking influential spreaders is an ill-defined problem

    Finding influential spreaders of information and disease in networks is an important theoretical problem, and one of considerable recent interest. It has been almost exclusively formulated as a node-ranking problem -- methods for identifying influential spreaders rank nodes according to how influential they are. In this work, we show that the ranking approach does not necessarily work: the set of most influential nodes depends on the number of nodes in the set. Therefore, the set of n most important nodes to vaccinate does not need to have any node in common with the set of n+1 most important nodes. We propose a method for quantifying the extent and impact of this phenomenon, and show that it is common in both empirical and model networks.
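
The non-nested behaviour described in this abstract can be illustrated with a small brute-force sketch. Everything below is an assumption made for illustration: the nine-node graph is hypothetical, and "influence" is approximated by one-hop coverage (seeds plus their neighbours) rather than by the spreading or vaccination models the paper uses.

```python
from itertools import combinations

# Hypothetical toy graph (not from the paper): hubs A and B each have three
# leaves, and a bridge node C touches two leaves of each hub.
EDGES = [
    ("A", "a1"), ("A", "a2"), ("A", "a3"),
    ("B", "b1"), ("B", "b2"), ("B", "b3"),
    ("C", "a1"), ("C", "a2"), ("C", "b1"), ("C", "b2"),
]


def build_adjacency(edges):
    """Return an undirected adjacency map {node: set of neighbours}."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def coverage(adjacency, seeds):
    """Assumed influence proxy: number of nodes that are seeds or adjacent to a seed."""
    covered = set(seeds)
    for seed in seeds:
        covered |= adjacency[seed]
    return len(covered)


def best_set(adjacency, size):
    """Exhaustively search all seed sets of the given size for the largest coverage."""
    nodes = sorted(adjacency)
    return max(combinations(nodes, size), key=lambda s: coverage(adjacency, s))


ADJ = build_adjacency(EDGES)
best_one = best_set(ADJ, 1)  # the single best seed
best_two = best_set(ADJ, 2)  # the best pair of seeds
print(best_one, best_two)
```

In this toy graph the best single seed is the bridge node, while the best pair consists of the two hubs, so the two optimal sets share no node. A ranking that places the bridge node first therefore cannot produce the optimal pair.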

    VizSeq: A Visual Analysis Toolkit for Text Generation Tasks

    Automatic evaluation of text generation tasks (e.g. machine translation, text summarization, image captioning and video description) usually relies heavily on task-specific metrics, such as BLEU and ROUGE. They, however, are abstract numbers and are not perfectly aligned with human assessment. This suggests inspecting detailed examples as a complement to identify system error patterns. In this paper, we present VizSeq, a visual analysis toolkit for instance-level and corpus-level system evaluation on a wide variety of text generation tasks. It supports multimodal sources and multiple text references, providing visualization in a Jupyter notebook or a web app interface. It can be used locally or deployed onto public servers for centralized data hosting and benchmarking. It covers most common n-gram based metrics accelerated with multiprocessing, and also provides the latest embedding-based metrics such as BERTScore.
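
As a rough illustration of what an n-gram based metric with multiple references computes, here is a minimal sketch of clipped n-gram precision, the building block of BLEU. It is not VizSeq's API: the function names are hypothetical, tokenization is assumed to be plain whitespace splitting, and the brevity penalty and geometric averaging of full BLEU are omitted.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(hypothesis, references, n):
    """Clipped n-gram precision of one hypothesis against several references.

    Each hypothesis n-gram is credited at most as many times as it occurs
    in the single reference that contains it most often.
    """
    hyp_counts = Counter(ngrams(hypothesis.split(), n))
    if not hyp_counts:
        return 0.0
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref.split(), n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    matched = sum(min(count, max_ref_counts[gram])
                  for gram, count in hyp_counts.items())
    return matched / sum(hyp_counts.values())


refs = ["the cat is on the mat", "there is a cat on the mat"]
print(clipped_precision("the the the the the the the", refs, 1))
print(clipped_precision("the cat sat", refs, 2))
```

The first call shows why clipping matters: a degenerate hypothesis that repeats a common word is credited only for the two occurrences found in the best-matching reference, not for all seven.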

    The position profiles of order cancellations in an emerging stock market

    Order submission and cancellation are two constituent actions of stock trading behaviors in order-driven markets. Order submission dynamics has been extensively studied for different markets, while order cancellation dynamics is less understood. There are two positions associated with a cancellation, that is, the price level in the limit-order book (LOB) and the position in the queue at each price level. We study the profiles of these two order cancellation positions through rebuilding the limit-order book using the order flow data of 23 liquid stocks traded on the Shenzhen Stock Exchange in the year 2003. We find that the profiles of relative price levels where cancellations occur obey a log-normal distribution. After normalizing the relative price level by removing the factor of order numbers stored at the price level, we find that the profiles exhibit a power-law scaling behavior on the right tails for both buy and sell orders. When focusing on the order cancellation positions in the queue at each price level, we find that the profiles increase rapidly in the front of the queue, and then fluctuate around a constant value till the end of the queue. These profiles are similar for different stocks. In addition, the profiles of cancellation positions can be fitted by an exponent function for both buy and sell orders. These two kinds of cancellation profiles seem universal for different stocks investigated and exhibit minor asymmetry between buy and sell orders. Our empirical findings shed new light on the order cancellation dynamics and pose constraints on the construction of order-driven stock market models. Comment: 17 pages, 6 figures and 6 tables

    User-Entity Differential Privacy in Learning Natural Language Models

    In this paper, we introduce a novel concept of user-entity differential privacy (UeDP) to provide formal privacy protection simultaneously to both sensitive entities in textual data and data owners in learning natural language models (NLMs). To preserve UeDP, we developed a novel algorithm, called UeDP-Alg, optimizing the trade-off between privacy loss and model utility with a tight sensitivity bound derived from seamlessly combining user and sensitive entity sampling processes. An extensive theoretical analysis and evaluation show that our UeDP-Alg outperforms baseline approaches in model utility under the same privacy budget consumption on several NLM tasks, using benchmark datasets. Comment: Accepted at IEEE BigData 202

    Improving a Named Entity Recognizer Trained on Noisy Data with a Few Clean Instances

    To achieve state-of-the-art performance, one still needs to train NER models on large-scale, high-quality annotated data, an asset that is both costly and time-intensive to accumulate. In contrast, real-world applications often resort to massive low-quality labeled data through non-expert annotators via crowdsourcing and external knowledge bases via distant supervision as a cost-effective alternative. However, these annotation methods result in noisy labels, which in turn lead to a notable decline in performance. Hence, we propose to denoise the noisy NER data with guidance from a small set of clean instances. Along with the main NER model we train a discriminator model and use its outputs to recalibrate the sample weights. The discriminator is capable of detecting both span and category errors with different discriminative prompts. Results on public crowdsourcing and distant supervision datasets show that the proposed method can consistently improve performance with a small guidance set. Comment: 14 pages

    Iron oxide-modified nanoporous geopolymers for arsenic removal from ground water

    Composite materials of hierarchically porous geopolymer and amorphous hydrous ferric oxide were produced and characterized as a new, potentially cost-effective arsenic adsorbent. The arsenic removal capabilities of the iron (hydr)oxide (HFO) media were evaluated using batch reactor experiments and laboratory-scale continuous flow experiments. Rapid Small-Scale Column Tests (RSSCT) were employed to mimic a scaled-up packed bed reactor, and the toxicity characteristic leaching procedure (TCLP) test of the arsenic-adsorbed solid material was carried out to investigate the mechanical robustness of the adsorbent. The best-performing media, which contained ~20 wt% Fe, could remove over 95 µg of arsenic per gram of dry media from an arsenic-only water matrix. The role of the high porosity in arsenic adsorption characteristics was further quantified in conjunction with accessibility of the adsorption sites. The new hierarchically porous geopolymer-based composites were shown to be a good candidate for cost-effective removal of arsenic from contaminated water under realistic conditions owing to their favorable adsorption capacity and very low leachability.

    Evidence for a New Excitation at the Interface Between a High-Tc Superconductor and a Topological Insulator

    High-temperature superconductors exhibit a wide variety of novel excitations. If contacted with a topological insulator, the lifting of spin rotation symmetry in the surface states can lead to the emergence of unconventional superconductivity and novel particles. In pursuit of this possibility, we fabricated high critical-temperature (Tc ~ 85 K) superconductor/topological insulator (Bi2Sr2CaCu2O8+δ/Bi2Te2Se) junctions. Below 75 K, a zero-bias conductance peak (ZBCP) emerges in the differential conductance spectra of this junction. The magnitude of the ZBCP is suppressed at the same rate for magnetic fields applied parallel or perpendicular to the junction. Furthermore, it can still be observed and does not split up to at least 8.5 T. The temperature and magnetic field dependence of the excitation we observe appears to fall outside the known paradigms for a ZBCP.

    A reflection-based localized surface plasmon resonance fiber-optic probe for biochemical sensing

    We report the fabrication and characterization of an optical fiber biochemical sensing probe based on localized surface plasmon resonance (LSPR) and spectra reflection. An ordered array of gold nanodots was fabricated on the optical fiber end facet using electron-beam lithography (EBL). We experimentally demonstrated for the first time the blue shift of the LSPR scattering spectrum with respect to the LSPR extinction spectrum, which had been predicted theoretically. High sensitivity [195.72 nm/refractive index unit (RIU)] of this sensor for detecting changes in the bulk refractive indices has been demonstrated. The label-free affinity bio-molecules sensing capability has also been demonstrated using biotin and streptavidin as the receptor and the analyte.
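
To give a feel for the reported bulk sensitivity figure, the sketch below converts a refractive-index change into an expected resonance shift. It assumes the response is linear over the range of interest, which the abstract does not state, and the index change used in the example is hypothetical.

```python
# Bulk sensitivity quoted in the abstract, in nm per refractive index unit (RIU).
SENSITIVITY_NM_PER_RIU = 195.72


def expected_peak_shift_nm(delta_n, sensitivity=SENSITIVITY_NM_PER_RIU):
    """Linear estimate of the resonance shift: shift = sensitivity * delta_n.

    Assumption: the sensor responds linearly to bulk index changes.
    """
    return sensitivity * delta_n


# Hypothetical example: the surrounding medium's index rises by 0.01 RIU.
print(round(expected_peak_shift_nm(0.01), 4))
```

Under that linearity assumption, an index change of 0.01 RIU corresponds to a shift of roughly 2 nm.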