A Hybrid Model for Monolingual and Multilingual Toxic Comment Detection
Social media provides a public and convenient platform for people to communicate. However, it is also open to hateful behavior and toxic comments. Social networks such as Facebook, Twitter, and many others have been working on developing effective toxic comment detection methods to provide better service. A monolingual language model focuses on a single language and provides high detection accuracy. A multilingual language model provides better generalization performance. To improve the effectiveness of detecting toxic comments in multiple languages, we propose a hybrid model that fuses a monolingual model and a multilingual model. We use labeled data to fine-tune the monolingual pre-trained model. For the multilingual pre-trained model, we first apply masked language modeling on unlabeled data as a semi-supervised fine-tuning step, and then fine-tune the model on labeled data. In this way, we can fully utilize the large amount of unlabeled data, reduce dependence on labeled comment data, and improve the effectiveness of detection. We also design several comparative experiments. The results demonstrate the effectiveness and advantage of our proposed model, especially compared to the XLM-RoBERTa multilingual fine-tuning model.
Diagnostic Evaluation of Policy-Gradient-Based Ranking
Learning-to-rank has been intensively studied and has shown significantly increasing value in a wide range of domains, such as web search, recommender systems, dialogue systems, machine translation, and even computational biology, to name a few. In light of recent advances in neural networks, there has been a strong and continuing interest in exploring how to deploy popular techniques, such as reinforcement learning and adversarial learning, to solve ranking problems. However, armed with these popular techniques, most studies tend to show only how effective a new method is. A comprehensive comparison between techniques and an in-depth analysis of their deficiencies are somewhat overlooked. This paper is motivated by the observation that recent ranking methods based on either reinforcement learning or adversarial learning boil down to policy-gradient-based optimization. Based on widely used benchmark collections with complete information (where relevance labels are known for all items), such as MSLRWEB30K and Yahoo-Set1, we thoroughly investigate the extent to which policy-gradient-based ranking methods are effective. On one hand, we analytically identify the pitfalls of policy-gradient-based ranking. On the other hand, we experimentally compare a wide range of representative methods. The experimental results echo our analysis and show that policy-gradient-based ranking methods are, by a large margin, inferior to many conventional ranking methods. Regardless of whether we use reinforcement learning or adversarial learning, the failures are largely attributable to gradient estimation based on sampled rankings, which diverge significantly from ideal rankings. In particular, the larger the number of documents per query and the more fine-grained the ground-truth labels, the more policy-gradient-based ranking suffers. Careful examination of this weakness is highly recommended for developing enhanced methods based on policy gradient.
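The gradient estimation from sampled rankings described in this abstract can be illustrated with a minimal sketch. It assumes a Plackett-Luce sampling policy, DCG as the reward, and a plain REINFORCE estimator; these choices, the toy labels, and all function names are illustrative assumptions, not the paper's implementation.

```python
import math
import random

def dcg(labels_in_rank_order):
    """Discounted cumulative gain with the common 2^label - 1 gain."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(labels_in_rank_order))

def sample_ranking(scores, rng):
    """Sample a permutation from a Plackett-Luce model (assumed policy)."""
    remaining = list(range(len(scores)))
    ranking = []
    while remaining:
        weights = [math.exp(scores[i]) for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        ranking.append(pick)
        remaining.remove(pick)
    return ranking

def log_prob_grad(scores, ranking):
    """Gradient of log P(ranking | scores) with respect to each score."""
    grad = [0.0] * len(scores)
    remaining = list(ranking)
    for doc in ranking:
        z = sum(math.exp(scores[i]) for i in remaining)
        grad[doc] += 1.0
        for i in remaining:
            grad[i] -= math.exp(scores[i]) / z
        remaining.remove(doc)
    return grad

def reinforce_gradient(scores, labels, n_samples, rng):
    """Monte Carlo policy-gradient estimate: mean of reward * grad log-prob."""
    grad = [0.0] * len(scores)
    sampled_dcgs = []
    for _ in range(n_samples):
        ranking = sample_ranking(scores, rng)
        reward = dcg([labels[i] for i in ranking])
        sampled_dcgs.append(reward)
        g = log_prob_grad(scores, ranking)
        for i in range(len(scores)):
            grad[i] += reward * g[i] / n_samples
    return grad, sampled_dcgs

rng = random.Random(0)
labels = [4, 0, 2, 1, 0, 3]      # hypothetical graded relevance labels
scores = [0.0] * len(labels)     # untrained model: uniform scores
grad, sampled_dcgs = reinforce_gradient(scores, labels, 200, rng)
ideal_dcg = dcg(sorted(labels, reverse=True))
mean_sampled_dcg = sum(sampled_dcgs) / len(sampled_dcgs)
print(f"ideal DCG = {ideal_dcg:.3f}, mean sampled DCG = {mean_sampled_dcg:.3f}")
```

Comparing the mean DCG of the sampled rankings with the ideal DCG gives a rough feel for how far sampled rankings sit from the ideal one; the gap generally widens as the number of documents or label grades grows.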
BigVideo: A Large-scale Video Subtitle Translation Dataset for Multimodal Machine Translation
We present a large-scale video subtitle translation dataset, BigVideo, to
facilitate the study of multi-modality machine translation. Compared with the
widely used How2 and VaTeX datasets, BigVideo is more than 10 times larger,
consisting of 4.5 million sentence pairs and 9,981 hours of videos. We also
introduce two deliberately designed test sets to verify the necessity of visual
information: Ambiguous with the presence of ambiguous words, and Unambiguous in
which the text context is self-contained for translation. To better model the
common semantics shared across texts and videos, we introduce a contrastive
learning method in the cross-modal encoder. Extensive experiments on
BigVideo show that: a) Visual information consistently improves the NMT model
in terms of BLEU, BLEURT, and COMET on both Ambiguous and Unambiguous test
sets. b) Visual information helps disambiguation, compared to the strong text
baseline on terminology-targeted scores and human evaluation. The dataset and our
implementations are available at https://github.com/DeepLearnXMU/BigVideo-VMT.
Comment: Accepted to ACL 2023 Findings
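The abstract above mentions a contrastive learning method for aligning text and video semantics but does not give its form. As a minimal sketch, assuming an InfoNCE-style objective over paired text and video embeddings (the loss form, the temperature value, and the toy vectors are all assumptions, not the BigVideo implementation):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def info_nce(text_vecs, video_vecs, temperature=0.1):
    """Average cross-entropy of matching text i to video i within the batch."""
    text = [normalize(v) for v in text_vecs]
    video = [normalize(v) for v in video_vecs]
    total = 0.0
    for i, t in enumerate(text):
        logits = [dot(t, v) / temperature for v in video]
        m = max(logits)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[i]
    return total / len(text)

# Hypothetical 2-D embeddings: aligned pairs vs. deliberately mismatched pairs.
text_vecs = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = info_nce(text_vecs, [[1.0, 0.0], [0.0, 1.0]])
shuffled_loss = info_nce(text_vecs, [[0.0, 1.0], [1.0, 0.0]])
print(aligned_loss, shuffled_loss)
```

The loss is small when each sentence embedding is closest to its own video embedding and large when the pairs are mismatched, which is the behavior a cross-modal contrastive objective is meant to reward.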
Combining Context and Knowledge Representations for Chemical-Disease Relation Extraction
Automatically extracting the relationships between chemicals and diseases is
of great importance to various areas of biomedical research and health
care. Biomedical experts have built many large-scale knowledge bases (KBs) to
advance the development of biomedical research. KBs contain huge amounts of
structured information about entities and relationships, and therefore play a
pivotal role in chemical-disease relation (CDR) extraction. However, previous
research has paid less attention to the prior knowledge existing in KBs. This
paper proposes a neural network-based attention model (NAM) for CDR extraction,
which makes full use of context information in documents and prior knowledge in
KBs. For a pair of entities in a document, an attention mechanism is employed
to select important context words with respect to the relation representations
learned from KBs. Experiments on the BioCreative V CDR dataset show that
combining context and knowledge representations through the attention
mechanism can significantly improve CDR extraction performance while
achieving results comparable with state-of-the-art systems.
Comment: Published in IEEE/ACM Transactions on Computational Biology and
Bioinformatics, 11 pages, 5 figures
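The attention step described in this abstract, selecting context words with respect to a relation representation learned from a KB, can be sketched as follows. This is a minimal illustration assuming dot-product scoring and softmax weighting; the scoring function, the vectors, and the function name are assumptions, not the NAM architecture itself.

```python
import math

def attention_pool(context_vecs, relation_vec):
    """Weight context word vectors by their similarity to a relation vector."""
    scores = [sum(c * r for c, r in zip(vec, relation_vec))
              for vec in context_vecs]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(context_vecs[0])
    pooled = [sum(w * vec[d] for w, vec in zip(weights, context_vecs))
              for d in range(dim)]
    return weights, pooled

# Hypothetical 2-D embeddings for three context words and one KB relation.
context_vecs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
relation_vec = [2.0, 0.0]
weights, pooled = attention_pool(context_vecs, relation_vec)
print(weights, pooled)
```

Context words whose vectors align with the relation representation receive larger weights, so they dominate the pooled context representation.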
Stationary shapes of deformable particles moving at low Reynolds numbers
Lecture Notes of the Summer School "Microswimmers - From Single Particle
Motion to Collective Behaviour", organised by the DFG Priority Programme SPP
1726 (Forschungszentrum Jülich, 2015).
Comment: Pages C7.1-16 of G. Gompper et al. (ed.), Microswimmers - From Single
Particle Motion to Collective Behaviour, Lecture Notes of the DFG SPP 1726
Summer School 2015, Forschungszentrum Jülich GmbH, Schriften des
Forschungszentrums Jülich, Reihe Key Technologies, Vol 110, ISBN
978-3-95806-083-
Sensing remote nuclear spins
Sensing single nuclear spins is a central challenge in magnetic resonance
based imaging techniques. Although different methods and especially diamond
defect based sensing and imaging techniques in principle have shown sufficient
sensitivity, signals from single nuclear spins are usually too weak to be
distinguished from background noise. Here, we present the detection and
identification of remote single C-13 nuclear spins embedded in nuclear spin
baths surrounding a single electron spin of a nitrogen-vacancy centre in
diamond. With dynamical decoupling control of the centre electron spin, the
weak magnetic field ~10 nT from a single nuclear spin located ~3 nm from the
centre with hyperfine coupling as weak as ~500 Hz is amplified and detected.
The quantum nature of the coupling is confirmed, and the precise position and
vector components of the nuclear field are determined. Given the distance over
which nuclear magnetic fields can be detected, the technique marks a firm step
towards imaging, detecting and controlling nuclear spin species external to the
diamond sensor.
Synthesis, self-assembly, and immunological activity of α-galactose-functionalized dendron–lipid amphiphiles
Nanoassemblies presenting multivalent displays of biologically active carbohydrates are of significant interest for a wide array of biomedical applications ranging from drug delivery to immunotherapy. In this study, glycodendron–lipid hybrids were developed as a new and tunable class of dendritic amphiphiles. A modular synthesis was used to prepare dendron–lipid hybrids comprising distearylglycerol and 0 through 4th generation polyester dendrons with peripheral protected amines. Following deprotection of the amines, an isothiocyanate derivative of C-linked α-galactose (α-Gal) was conjugated to the dendron peripheries, affording amphiphiles with 1 to 16 α-Gal moieties. Self-assembly in water through a solvent exchange process resulted in vesicles for the 0 through 2nd generation systems and micelles for the 3rd and 4th generation systems. The critical aggregation concentrations decreased with increasing dendron generation, suggesting that the effects of increasing molar mass dominated over the effects of increasing the hydrophilic weight fraction. The binding of the assemblies to Griffonia simplicifolia Lectin I (GSL 1), a protein with specificity for α-Gal, was studied by quantifying the binding of fluorescently labeled assemblies to GSL 1-coated beads. It was found that binding was enhanced for amphiphiles containing higher generation dendrons. Despite their substantial structural differences from the natural ligands for the CD1d receptor, the glycodendron–lipid hybrids were capable of stimulating invariant natural killer T (iNKT) cells, a class of innate-like T cells that recognize lipid and glycolipid antigens presented by CD1d and that are implicated in a wide range of diseases and conditions including but not limited to infectious diseases, diabetes and cancer.
Tuning the Properties of ZnO, Hematite, and Ag Nanoparticles by Adjusting the Surface Charge
Nanomaterials have become a central focus of scientific research and technological development over the last decade due to their broad applications in a variety of physicochemical and biological fields, including lasers,[1] solar cells,[2] catalysts,[3] sensors,[4–6] biological labels,[7] drug delivery,[8,9] and cancer therapy.[10–13] Controlling the size and/or shape of nanoparticles (NPs) has been widely used to modify and improve NP properties for designated applications.[1,6,11,14–19] Recently, it has been found that adjusting the surface charge (SC) can be an effective method to modify the cytotoxicity, cellular uptake, and specificity of targeting of NPs.[9,10,12,20–22] Electrons and/or other electrical charges play an essential role in many key material properties, such as electrostatic interactions, photoluminescence (PL), magnetism, plasmon properties, chemical bonds, and related chemical properties.