
395 research outputs found

Approximating Word Ranking and Negative Sampling for Word Embedding

Author: Guo Guibing
Ouyang Shichang
Wang Xingwei
Yuan Fajie
Publication venue: 'International Joint Conferences on Artificial Intelligence'
Publication date: 01/07/2018

CBOW (Continuous Bag-Of-Words) is one of the most commonly used techniques to generate word embeddings in various NLP tasks. However, it fails to reach optimal performance because it treats all positive words uniformly and draws negative words from a simple sampling distribution. To resolve these issues, we propose OptRank, which optimizes word ranking and approximates negative sampling to improve word embeddings. Specifically, we first formalize word embedding as a ranking problem. Then, we weight the positive words by their ranks so that highly ranked words have more importance, and adopt a dynamic sampling strategy to select informative negative words. In addition, an approximation method is designed to compute word ranks efficiently. Empirical experiments show that OptRank consistently outperforms its counterparts on a benchmark dataset with different sampling scales, especially when the sampled subset is small. The code and datasets can be obtained from https://github.com/ouououououou/OptRank
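The abstract above does not give OptRank's formulas, so the following is only a rough sketch of the two ideas it names: weighting positive words by rank, and picking an informative negative word dynamically. The `1/log2(1 + rank)` decay, the function names, and the toy scores are assumptions made for illustration; they are not taken from the paper or its repository.

```python
import math
import random


def rank_weight(rank: int) -> float:
    """Weight for a positive word, where rank 1 is the top-ranked word.

    Assumed decay (hypothetical): 1 / log2(1 + rank), so higher-ranked
    words receive larger weights.
    """
    if rank < 1:
        raise ValueError("rank must be >= 1")
    return 1.0 / math.log2(1 + rank)


def dynamic_negative(scores, positives, n_candidates, rng):
    """Pick one negative word with a generic dynamic sampling scheme.

    Draw up to n_candidates non-positive words uniformly at random, then
    return the candidate the current model scores highest, on the
    assumption that a high-scoring negative is the most informative.
    """
    pool = [word for word in scores if word not in positives]
    candidates = rng.sample(pool, min(n_candidates, len(pool)))
    return max(candidates, key=lambda word: scores[word])


# Toy model scores for one context (hypothetical values).
toy_scores = {"cat": 0.9, "dog": 0.8, "car": 0.1, "tree": 0.4}
negative = dynamic_negative(toy_scores, {"cat"}, 2, random.Random(0))
```

With a small `n_candidates` the choice stays cheap but noisy; with a large one it approaches always picking the hardest negative.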

Crossref

Enlighten

Manipulation of Abro1 Localization in U2OS Cells

Author: Liu Shichang
Lopez Sophia
Wang Bin
Publication venue: OpenWorks @ MD Anderson
Publication date: 09/08/2022


OpenWorks @ MD Anderson

Mechanical Turk-based Experiment vs Laboratory-based Experiment: A Case Study on the Comparison of Semantic Transparency Rating Data

Author: Chan Angel
Huang Chu-Ren
Wang Shichang
Yao Yao
Publication date: 01/01/2015

In this paper, we conducted semantic transparency rating experiments using both the traditional laboratory-based method and the crowdsourcing-based method. Then we compared the rating data obtained from these two experiments. We observed very strong correlation coefficients for both overall semantic transparency rating data and constituent semantic transparency data (rho > 0.9), which means the two experiments may yield comparable data and the crowdsourcing-based experiment is a feasible alternative to the laboratory-based experiment in linguistic studies. We also observed a scale shrinkage phenomenon in both experiments: the actual scale of the rating results cannot cover the ideal scale [0, 1]; both ends of the actual scale shrink towards the center. However, the scale shrinkage of the crowdsourcing-based experiment is stronger than that of the laboratory-based experiment, which makes the rating results obtained in these two experiments not directly comparable. In order to make the results directly comparable, we explored two data transformation algorithms, z-score transformation and adjusted normalization, to unify the scales. We also investigated the uncertainty of semantic transparency judgment among raters and found that it had a regular relation with semantic transparency magnitude, which may further reveal a general cognitive mechanism of human judgment.
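As an illustration of how a z-score transformation can put two differently shrunk rating scales on a common footing, here is a minimal sketch. The rating values are made up, and the function name is hypothetical; the abstract does not define "adjusted normalization", so that method is not shown.

```python
from statistics import mean, pstdev


def z_scores(ratings):
    """Standardize ratings to zero mean and unit (population) standard deviation."""
    mu = mean(ratings)
    sigma = pstdev(ratings)
    if sigma == 0:
        raise ValueError("ratings have zero variance; z-scores are undefined")
    return [(r - mu) / sigma for r in ratings]


# Hypothetical mean ratings for the same three items in each experiment.
# The crowdsourced scale is shrunk more strongly towards the center.
lab_ratings = [0.2, 0.5, 0.8]
crowd_ratings = [0.35, 0.5, 0.65]

lab_z = z_scores(lab_ratings)
crowd_z = z_scores(crowd_ratings)
```

In this toy case both lists standardize to roughly [-1.22, 0.0, 1.22], so the difference in shrinkage no longer separates them.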

CiteSeerX

The Hong Kong Polytechnic University Pao Yue-kong Library

Waseda University Repository

Mechanical Turk-based Experiment vs Laboratory-based Experiment: A Case Study on the Comparison of Semantic Transparency Rating Data

Author: Wang Shichang
Huang Chu-Ren
Yao Yao
Chan Angel
Publication date: 01/01/2006

Have you thought of why you get tired or why you get hungry? Something in your body keeps track of time. It is almost like you have a clock that tells you all those things. And indeed, in the suprachiasmatic region of our hypothalamus reside cells which each act like an oscillator, and together form a coherent circadian rhythm to help our body keep track of time. In fact, such circadian clocks are not limited to mammals but can be found in many organisms, including single-celled organisms, reptiles, and birds. The study of such rhythms constitutes a field of biology, chronobiology, and forms the background for my research and this thesis. Pioneers of chronobiology, Pittendrigh and Aschoff, studied biological clocks from an input-output view, across a range of organisms, by observing and analyzing their overt activity in response to stimuli such as light. Their study was made without recourse to knowledge of the biological underpinnings of the circadian pacemaker. The advent of the new biology has now made it possible to "break open the box" and identify biological feedback systems comprised of gene transcription and protein translation as the core mechanism of a biological clock. My research has focused on a simple transcription-translation clock model which nevertheless possesses many of the features of a circadian pacemaker, including its entrainability by light. This model consists of two nonlinear, coupled, delayed differential equations. Light pulses can reset the phase of this clock, whereas constant light of different intensities can speed it up or slow it down. This latter property is a signature property of circadian clocks and is referred to in chronobiology as "Aschoff's rule". The discussion in this thesis focuses on developing a connection to, and an understanding of, how constant light affects this clock model.
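The abstract does not state the model's equations, so the sketch below only illustrates what integrating two nonlinear, coupled, delayed differential equations can look like. The equation form (delayed mRNA driving protein production, protein repressing mRNA production), every parameter value, and the function name are assumptions chosen for illustration, and light input is left out entirely.

```python
def simulate_clock(t_end=100.0, dt=0.01, tau=4.0,
                   r_m=1.0, r_p=1.0, q_m=0.2, q_p=0.1,
                   k=1.0, n=2, m=3):
    """Forward-Euler integration of a hypothetical two-variable delayed feedback loop.

    Assumed form (not taken from the thesis):
        dM/dt = r_m / (1 + (P(t) / k)**n) - q_m * M(t)
        dP/dt = r_p * M(t - tau)**m      - q_p * P(t)
    The history before t = 0 is held constant at the initial mRNA level.
    """
    steps = int(round(t_end / dt))
    lag = int(round(tau / dt))  # delay expressed in time steps
    mrna = [0.1] * (steps + 1)
    protein = [0.1] * (steps + 1)
    for i in range(steps):
        # Look up the delayed mRNA level, falling back to the constant history.
        m_delayed = mrna[i - lag] if i >= lag else mrna[0]
        d_mrna = r_m / (1.0 + (protein[i] / k) ** n) - q_m * mrna[i]
        d_protein = r_p * m_delayed ** m - q_p * protein[i]
        mrna[i + 1] = mrna[i] + dt * d_mrna
        protein[i + 1] = protein[i] + dt * d_protein
    return mrna, protein


mrna_trace, protein_trace = simulate_clock(t_end=50.0)
```

Forward Euler is used only to keep the delay bookkeeping visible; a dedicated delay differential equation solver would be the more careful choice for quantitative work.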

Publikationer från Linköpings universitet

Waseda University Repository

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Detect and remove watermark in deep neural networks via generative adversarial networks

Author: Liu Weiqiang
Sun Shichang
Wang Haoqi
Wang Jian
Xue Mingfu
Zhang Yushu
Publication date: 15/06/2021

Deep neural networks (DNN) have achieved remarkable performance in various fields. However, training a DNN model from scratch requires a lot of computing resources and training data. It is difficult for most individual users to obtain such computing resources and training data. Model copyright infringement has become an emerging problem in recent years. For instance, pre-trained models may be stolen or abused by illegal users without the authorization of the model owner. Recently, many works on protecting the intellectual property of DNN models have been proposed. In these works, embedding watermarks into a DNN based on a backdoor is one of the widely used methods. However, when the DNN model is stolen, the backdoor-based watermark may face the risk of being detected and removed by an adversary. In this paper, we propose a scheme to detect and remove watermarks in deep neural networks via generative adversarial networks (GAN). We demonstrate that backdoor-based DNN watermarks are vulnerable to the proposed GAN-based watermark removal attack. The proposed attack method includes two phases. In the first phase, we use the GAN and a few clean images to detect and reverse the watermark in the DNN model. In the second phase, we fine-tune the watermarked DNN based on the reversed backdoor images. Experimental evaluations on the MNIST and CIFAR10 datasets demonstrate that the proposed method can effectively remove about 98% of the watermark in DNN models, as the watermark retention rate falls from 100% to less than 2% after applying the proposed attack. Meanwhile, the proposed attack hardly affects the model's performance: the test accuracy of the watermarked DNN on the MNIST and CIFAR10 datasets drops by less than 1% and 3%, respectively.
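The abstract reports a watermark retention rate without defining it. One plausible reading, assumed here, is the fraction of watermark trigger inputs that the model still assigns to the watermark's target label. The function name and the toy predictions below are hypothetical.

```python
def watermark_retention_rate(predictions, target_label):
    """Fraction of trigger-input predictions that still equal the watermark target label.

    Assumed definition (not confirmed by the abstract): 1.0 means the
    watermark is fully intact, 0.0 means it no longer fires at all.
    """
    if not predictions:
        raise ValueError("predictions must not be empty")
    hits = sum(1 for label in predictions if label == target_label)
    return hits / len(predictions)


# Toy predicted labels on 50 trigger inputs, with target label 7.
before_attack = [7] * 50
after_attack = [7] + [3] * 49

rate_before = watermark_retention_rate(before_attack, 7)
rate_after = watermark_retention_rate(after_attack, 7)
```

Under this definition the toy data gives a rate of 1.0 before the attack and 0.02 after it.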

arXiv.org e-Print Archive

Robust Backdoor Attacks against Deep Neural Networks in Real Physical World

Author: He Can
Liu Weiqiang
Sun Shichang
Wang Jian
Xue Mingfu
Publication date: 15/07/2021

Deep neural networks (DNN) have been widely deployed in various applications. However, many studies have indicated that DNNs are vulnerable to backdoor attacks. The attacker can create a hidden backdoor in the target DNN model and trigger malicious behaviors by submitting a specific backdoor instance. However, almost all existing backdoor works have focused on the digital domain, while few studies investigate backdoor attacks in the real physical world. Under a variety of physical constraints, the performance of backdoor attacks in the real physical world is severely degraded. In this paper, we propose a robust physical backdoor attack method, PTB (physical transformations for backdoors), to implement backdoor attacks against deep learning models in the real physical world. Specifically, in the training phase, we perform a series of physical transformations on the injected backdoor instances at each round of model training, so as to simulate the various transformations that a backdoor may experience in the real world, thus improving its physical robustness. Experimental results on a state-of-the-art face recognition model show that, compared with backdoor methods without PTB, the proposed attack method can significantly improve the performance of backdoor attacks in the real physical world. Under various complex physical conditions, by injecting only a very small ratio (0.5%) of backdoor instances, the attack success rate of physical backdoor attacks with the PTB method on VGGFace is 82%, while the attack success rate of backdoor attacks without the proposed PTB method is lower than 11%. Meanwhile, the normal performance of the target DNN model is not affected.

arXiv.org e-Print Archive