650 research outputs found

Common physical mechanism for integer and fractional quantum Hall effects

Authors: Li Kang, Long Shuming, Wang Jianhua, Yuan Yi
Publication date: 24/01/2012

Integer and fractional quantum Hall effects have been studied with different physical models and explained by different physical mechanisms. In this paper, the common physical mechanism for integer and fractional quantum Hall effects is studied, and a new unified formulation of the two effects is presented. Firstly, we introduce a 2-dimensional ideal electron gas model in the presence of a strong magnetic field in the symmetric gauge, and a transverse electric field $\varepsilon_2$ is also introduced to balance the Lorentz force. Secondly, the Pauli equation is solved, and the wave functions and energy levels are given explicitly. Thirdly, after the degeneracy density of the 2-dimensional ideal electron gas system is calculated, the Hall resistance of the system is obtained, where the quantum Hall number $\nu$ is introduced. It is found that the newly defined $\nu$, called the filling factor in the literature, is related to the radial quantum number $n$ and the angular quantum number $|m|$; different $n$ and $|m|$ correspond to different $\nu$. This provides a unified explanation for integer and fractional quantum Hall effects. It is predicted that more new cases of fractional quantum Hall effects exist without the concept of fractional charge.

Comment: Latex, 9 page

arXiv.org e-Print Archive

meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting

Authors: Ma Shuming, Ren Xuancheng, Sun Xu, Wang Houfeng
Publication date: 10/03/2019

We propose a simple yet effective technique for neural network learning. The forward propagation is computed as usual. In back propagation, only a small subset of the full gradient is computed to update the model parameters. The gradient vectors are sparsified in such a way that only the top-$k$ elements (in terms of magnitude) are kept. As a result, only $k$ rows or columns (depending on the layout) of the weight matrix are modified, leading to a linear reduction ($k$ divided by the vector dimension) in the computational cost. Surprisingly, experimental results demonstrate that we can update only 1-4% of the weights at each back propagation pass. This does not result in a larger number of training iterations. More interestingly, the accuracy of the resulting models is actually improved rather than degraded, and a detailed analysis is given. The code is available at https://github.com/lancopku/meProp

Comment: Accepted by the 34th International Conference on Machine Learning (ICML 2017
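The top-k sparsification described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation (that is in the linked repository); the function names `top_k_sparsify` and `sparse_weight_grad` are hypothetical, and the sketch assumes a single linear layer y = W x with plain Python lists in place of tensors.

```python
def top_k_sparsify(grad, k):
    """Keep the k largest-magnitude entries of grad and zero the rest."""
    if k <= 0:
        return [0.0] * len(grad)
    # Indices of the k entries with the largest absolute value.
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    keep = set(order[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]


def sparse_weight_grad(x, grad_out, k):
    """Weight gradient for y = W x using only the top-k entries of grad_out.

    Returns {row_index: row_gradient}. Rows whose output gradient was
    dropped are absent, so at most k rows of W would be updated.
    (Hypothetical helper for illustration only.)
    """
    sparse = top_k_sparsify(grad_out, k)
    return {i: [g * xj for xj in x] for i, g in enumerate(sparse) if g != 0.0}


if __name__ == "__main__":
    x = [1.0, 2.0]
    grad_out = [0.1, -2.0, 0.5, 3.0]
    print(top_k_sparsify(grad_out, 2))
    print(sparse_weight_grad(x, grad_out, 2))
```

With k = 2 and an output dimension of 4, only two of the four rows receive a gradient, which is the "k divided by the vector dimension" cost reduction the abstract refers to.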

arXiv.org e-Print Archive

Bag-of-Words as Target for Neural Machine Translation

Authors: Lin Junyang, Ma Shuming, Sun Xu, Wang Yizhong
Publication date: 13/05/2018

A sentence can be translated into more than one correct sentence. However, most existing neural machine translation models use only one of the correct translations as the target, and the other correct sentences are penalized as incorrect during training. Since most of the correct translations of a sentence share a similar bag-of-words, it is possible to distinguish the correct translations from the incorrect ones by the bag-of-words. In this paper, we propose an approach that uses both the sentences and the bag-of-words as targets in the training stage, in order to encourage the model to generate potentially correct sentences that do not appear in the training set. We evaluate our model on a Chinese-English translation dataset, and experiments show that our model outperforms strong baselines by a BLEU score of 4.55.

Comment: accepted by ACL 201

arXiv.org e-Print Archive

Theoretical basis for the unification of the integer and the fractional quantum Hall effects

Authors: Li Kang, Long Shuming, Wang Jianhua, Yuan Yi
Publication date: 07/07/2011

This paper intends to provide a theoretical basis for the unification of the integer and the fractional quantum Hall effects. Guided by the concepts and theories of quantum mechanics, and using the solution of the Pauli equation in a magnetic field under the symmetric gauge, we present the wave functions and energy levels of single electrons and the expectation value of the electron's spatial extent. After invoking a non-interacting dilute gas system, the product of single-electron wave functions is used to construct the wave functions of the N-electron gas system in a magnetic field. Then the expectation value of the system's motion area and the electron surface density are obtained. In this way, a unified explanation of the integer and the fractional quantum Hall effects is formulated without the help of the concept of fractional charge.

Comment: 10 pages, 1 figur

arXiv.org e-Print Archive

Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization

Authors: Lin Junyang, Ma Shuming, Sun Xu, Wang Houfeng
Publication date: 13/05/2018

Most current abstractive text summarization models are based on the sequence-to-sequence model (Seq2Seq). The source content of social media is long and noisy, so it is difficult for Seq2Seq to learn an accurate semantic representation. Compared with the source content, the annotated summary is short and well written. Moreover, it shares the same meaning as the source content. In this work, we supervise the learning of the representation of the source content with that of the summary. In implementation, we regard a summary autoencoder as an assistant supervisor of Seq2Seq. Following previous work, we evaluate our model on a popular Chinese social media dataset. Experimental results show that our model achieves state-of-the-art performance on the benchmark dataset.

Comment: accepted by ACL 201

arXiv.org e-Print Archive

Conditional Fault Diagnosis of Bubble Sort Graphs under the PMC Model

Authors: Wang Jian, Xu Jun-Ming, Xu Xirong, Zhou Shuming
Publication date: 19/04/2012

As the size of a multiprocessor system increases, processor failure is inevitable, and fault identification in such a system is crucial for reliable computing. Fault diagnosis is the process of identifying faulty processors in a multiprocessor system through testing. For practical fault diagnosis systems, the probability that all neighboring processors of a processor are faulty simultaneously is very small, and the conditional diagnosability, which is a new metric for evaluating the fault tolerance of such systems, assumes that no faulty set contains all neighbors of any processor in the system. This paper shows that the conditional diagnosability of bubble sort graphs $B_n$ under the PMC model is $4n-11$ for $n \geq 4$, which is about four times its ordinary diagnosability under the PMC model.
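As a rough numerical check of the "about four times" claim, the sketch below evaluates the stated formula 4n-11. The abstract does not give the ordinary diagnosability; the value n-1 used here is an assumption (the vertex degree of the bubble sort graph B_n), and all function names are hypothetical.

```python
def conditional_diagnosability(n):
    """Conditional diagnosability of B_n under the PMC model, per the abstract."""
    if n < 4:
        raise ValueError("the formula 4n - 11 is stated only for n >= 4")
    return 4 * n - 11


def ordinary_diagnosability(n):
    """ASSUMPTION, not from the abstract: taken to be n - 1, the degree of B_n."""
    return n - 1


def ratio_to_ordinary(n):
    """Ratio of conditional to (assumed) ordinary diagnosability."""
    return conditional_diagnosability(n) / ordinary_diagnosability(n)


if __name__ == "__main__":
    for n in (4, 10, 100):
        print(n, conditional_diagnosability(n), round(ratio_to_ordinary(n), 2))
```

Under that assumption the ratio approaches 4 only as n grows, so "about four times" is best read as an asymptotic statement.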

arXiv.org e-Print Archive

Microblog Hashtag Generation via Encoding Conversation Contexts

Authors: King Irwin, Li Jing, Lyu Michael R., Shi Shuming, Wang Yue
Publication date: 18/05/2019

Automatic hashtag annotation plays an important role in content understanding for microblog posts. To date, progress made in this field has been restricted to phrase selection from limited candidates, or word-level hashtag discovery using topic models. Unlike previous work, which considers hashtags to be inseparable, our work is the first effort to annotate hashtags with a novel sequence generation framework that views the hashtag as a short sequence of words. Moreover, to address the data sparsity issue in processing short microblog posts, we propose to jointly model the target posts and the conversation contexts initiated by them with bidirectional attention. Extensive experimental results on two large-scale datasets, newly collected from English Twitter and Chinese Weibo, show that our model significantly outperforms state-of-the-art models based on classification. Further studies demonstrate our ability to effectively generate rare and even unseen hashtags, which is not possible for most existing methods.

Comment: NAACL 2019 (10 pages

arXiv.org e-Print Archive

Improving Semantic Relevance for Sequence-to-Sequence Learning of Chinese Social Media Text Summarization

Authors: Li Wenjie, Ma Shuming, Su Qi, Sun Xu, Wang Houfeng, Xu Jingjing
Publication date: 08/06/2017

Current Chinese social media text summarization models are based on an encoder-decoder framework. Although the generated summaries are literally similar to the source texts, they have low semantic relevance. In this work, our goal is to improve semantic relevance between source texts and summaries for Chinese social media summarization. We introduce a Semantic Relevance Based neural model to encourage high semantic similarity between texts and summaries. In our model, the source text is represented by a gated attention encoder, while the summary representation is produced by a decoder. In addition, the similarity score between the representations is maximized during training. Our experiments show that the proposed model outperforms baseline systems on a social media corpus.

Comment: Accepted by AC

arXiv.org e-Print Archive

Exploiting Sentential Context for Neural Machine Translation

Authors: Shi Shuming, Tu Zhaopeng, Wang Longyue, Wang Xing
Publication date: 04/06/2019

In this work, we present novel approaches to exploit sentential context for neural machine translation (NMT). Specifically, we first show that a shallow sentential context, extracted from the top encoder layer only, can improve translation performance by contextualizing the encoding representations of individual words. Next, we introduce a deep sentential context, which aggregates the sentential context representations from all the internal layers of the encoder to form a more comprehensive context representation. Experimental results on the WMT14 English-to-German and English-to-French benchmarks show that our model consistently improves performance over the strong TRANSFORMER model (Vaswani et al., 2017), demonstrating the necessity and effectiveness of exploiting sentential context for NMT.

Comment: Accepted by ACL 201

arXiv.org e-Print Archive

Skeleton-to-Response: Dialogue Generation Guided by Retrieval Memory

Authors: Bi Victoria, Cai Deng, Lam Wai, Liu Xiaojiang, Shi Shuming, Tu Zhaopeng, Wang Yan
Publication date: 28/02/2020

For dialogue response generation, traditional generative models generate responses solely from input queries. Such models rely on insufficient information for generating a specific response, since a given query could be answered in multiple ways. Consequently, those models tend to output generic and dull responses, impeding the generation of informative utterances. Recently, researchers have attempted to fill the information gap by exploiting information retrieval techniques. When generating a response for a current query, similar dialogues retrieved from the entire training data are considered as an additional knowledge source. While this may harvest massive information, the generative models could be overwhelmed, leading to undesirable performance. In this paper, we propose a new framework which exploits retrieval results via a skeleton-then-response paradigm. First, a skeleton is generated by revising the retrieved responses. Then, a novel generative model uses both the generated skeleton and the original query for response generation. Experimental results show that our approaches significantly improve the diversity and informativeness of the generated responses.

Comment: accepted to NAACL201

arXiv.org e-Print Archive