
    Extrapolation in NLP

    We argue that extrapolation to examples outside the training space will often be easier for models that capture global structures than for those that merely maximise their local fit to the training data. We show that this is true for two popular models: the Decomposable Attention Model and word2vec.
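    The claim can be made concrete with a toy numerical analogue (not the paper's NLP experiments): a fit that matches the global trend of the data extrapolates well, while one that merely maximises local fit degrades sharply outside the training interval. Everything below is invented for illustration.

```python
import numpy as np

# Toy analogue of "global structure vs. local fit": data with a globally
# linear trend, fit by a linear model and by a high-degree polynomial.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 20)
y_train = 2.0 * x_train + 0.05 * rng.standard_normal(20)

x_test = np.linspace(1.5, 2.5, 20)   # outside the training space
y_test = 2.0 * x_test

linear = np.polyfit(x_train, y_train, deg=1)   # captures the global trend
wiggly = np.polyfit(x_train, y_train, deg=9)   # near-perfect local fit

for name, coeffs in [("linear", linear), ("degree-9", wiggly)]:
    mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"{name}: extrapolation MSE = {mse:.3g}")
```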

    Iso-vector and Iso-scalar Tensor Charges of the Nucleon from Lattice QCD

    We present results for the iso-vector and flavor-diagonal tensor charges $g_T^{u-d}$, $g_T^{u}$, $g_T^{d}$, and $g_T^{s}$ needed to probe novel tensor interactions at the TeV scale in neutron and nuclear $\beta$-decays and the contribution of the quark electric dipole moment (EDM) to the neutron EDM. The lattice QCD calculations were done using nine ensembles of gauge configurations generated by the MILC collaboration using the HISQ action with 2+1+1 dynamical flavors. These ensembles span three lattice spacings $a \approx 0.06$, $0.09$ and $0.12$ fm and three quark masses corresponding to the pion masses $M_\pi \approx 130$, $220$ and $310$ MeV. Using estimates from these ensembles, we quantify all systematic uncertainties and perform a simultaneous extrapolation in the lattice spacing, volume and light quark masses for the connected contributions. The final estimate of the connected nucleon (proton) tensor charge for the iso-vector combination is $g_T^{u-d} = 1.020(76)$ in the $\overline{\mathrm{MS}}$ scheme at 2 GeV. The additional disconnected quark loop contributions needed for the flavor-diagonal matrix elements are calculated using a stochastic estimator employing the truncated solver method with the all-mode-averaging technique. We find that the size of the disconnected contribution is smaller than the statistical error in the connected contribution. This allows us to bound the disconnected contribution and include it as an additional uncertainty in the flavor-diagonal charges. After a continuum extrapolation, we find $g_T^{u} = 0.774(66)$, $g_T^{d} = -0.233(28)$ and $g_T^{u+d} = 0.541(67)$. The strangeness tensor charge, which can make a significant contribution to the neutron EDM due to the large ratio $m_s/m_{u,d}$, is $g_T^{s} = 0.008(9)$ in the continuum limit.
    Comment: Final published version
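    To make the extrapolation step concrete, here is a minimal sketch of a simultaneous continuum-chiral-finite-volume fit in the spirit described above. The ansatz, ensemble parameters, and data values are all invented for illustration; they are not the paper's.

```python
import numpy as np
from scipy.optimize import curve_fit

# Illustrative simultaneous extrapolation in lattice spacing a, pion mass
# M_pi, and volume (via M_pi * L). Ansatz and numbers are hypothetical.
def ansatz(X, c0, c1, c2, c3):
    a, Mpi, MpiL = X
    return (c0 + c1 * a                        # leading discretisation effect
               + c2 * Mpi**2                   # leading chiral behaviour
               + c3 * Mpi**2 * np.exp(-MpiL))  # finite-volume correction

# Hypothetical ensembles: a [fm], M_pi [MeV], M_pi*L, g_T, error
data = np.array([
    [0.12, 310, 4.5, 1.10, 0.03],
    [0.12, 220, 4.3, 1.08, 0.03],
    [0.09, 310, 4.5, 1.07, 0.02],
    [0.09, 220, 4.4, 1.05, 0.02],
    [0.09, 130, 3.9, 1.04, 0.04],
    [0.06, 310, 4.2, 1.05, 0.02],
    [0.06, 220, 4.1, 1.04, 0.03],
    [0.06, 135, 3.7, 1.03, 0.05],
])
X = (data[:, 0], data[:, 1], data[:, 2])
popt, pcov = curve_fit(ansatz, X, data[:, 3], sigma=data[:, 4],
                       absolute_sigma=True)

# Physical point: a -> 0, M_pi -> 135 MeV, infinite volume (M_pi*L large)
print("g_T at the physical point:", ansatz((0.0, 135.0, 1e3), *popt))
```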

    Quark number susceptibilities from HTL-resummed thermodynamics

    We compute analytically the diagonal quark number susceptibilities for a quark-gluon plasma at finite temperature and zero chemical potential, and compare with recent lattice results. The calculation uses the approximately self-consistent resummation of hard thermal and dense loops that we have developed previously. For temperatures between $1.5\,T_c$ and $5\,T_c$, our results follow the same trend as the lattice data, but exceed them in magnitude by about 5-10%. We also compute the lowest-order contribution, of order $\alpha_s^3 \log(1/\alpha_s)$, to the off-diagonal susceptibility. This contribution, which is not part of our self-consistent calculation, is numerically small, but not small enough to be compatible with a recent lattice simulation.
    Comment: 13 pages, 5 figures, uses elsart.cls; v2: minor corrections; v3: sign in eq. (1) corrected
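    For reference, the susceptibilities in question are standardly defined as second derivatives of the pressure with respect to the quark chemical potentials, evaluated at vanishing chemical potential:

```latex
% Quark number susceptibilities (standard conventions assumed above):
\chi_{ij} \;=\; \left.\frac{\partial^2 p(T,\mu)}{\partial\mu_i\,\partial\mu_j}\right|_{\mu=0},
\qquad
\chi_{ii}\ \text{(diagonal)}, \quad \chi_{i\neq j}\ \text{(off-diagonal)}.
```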

    Nucleon average quark momentum fraction with $N_\mathrm{f}=2+1$ Wilson fermions

    We report on an analysis of the average quark momentum fraction of the nucleon and related quantities using $N_\mathrm{f}=2+1$ Wilson fermions. Computations are performed on four CLS ensembles covering three values of the lattice spacing at pion masses down to $M_\pi \approx 200\,\mathrm{MeV}$. Several source-sink separations ($\sim 1.0\,\mathrm{fm}$ to $\sim 1.4\,\mathrm{fm}$) are used to assess the excited-state contamination. To gain further insight, the generalized pencil-of-functions approach has been implemented to reduce the excited-state contamination in the relevant two- and three-point functions. Preliminary results are shown for the isovector nucleon charges from vector, axial-vector and tensor derivative (twist-2) operators.
    Comment: 8 pages, 3 figures, 2 tables. Talk presented at the 35th International Symposium on Lattice Field Theory, 18-24 June 2017, Granada, Spain
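    As an illustration of how such matrix elements are commonly extracted, the sketch below forms the standard three-point/two-point correlator ratio, whose plateau in the insertion time gives the bare matrix element. All energies, amplitudes, and contamination parameters are invented; this is not the CLS analysis code.

```python
import numpy as np

# Synthetic correlators: ground state plus one excited state, mimicking
# the excited-state contamination discussed above (parameters invented).
def ratio(c3pt, c2pt, tsep):
    """R(t) = C_3pt(t, tsep) / C_2pt(tsep); plateau ~ bare matrix element."""
    return c3pt / c2pt[tsep]

tsep, t = 14, np.arange(1, 14)
E0, dE, M0 = 0.45, 0.35, 0.20   # hypothetical energies / matrix element
c2pt = np.exp(-E0 * np.arange(32)) * (1 + 0.4 * np.exp(-dE * np.arange(32)))
c3pt = np.exp(-E0 * tsep) * (
    M0 + 0.1 * (np.exp(-dE * t) + np.exp(-dE * (tsep - t))))  # contamination

R = ratio(c3pt, c2pt, tsep)
print("plateau estimate near the midpoint:", R[len(R) // 2])
```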

    QCD calculations of $B \to \pi, K$ form factors with higher-twist corrections

    We update QCD calculations of $B \to \pi, K$ form factors at large hadronic recoil by including the subleading-power corrections from the higher-twist $B$-meson light-cone distribution amplitudes (LCDAs) up to twist-six accuracy, and the strange-quark mass effects at leading power in $\Lambda/m_b$ from the twist-two $B$-meson LCDA $\phi_B^{+}(\omega, \mu)$. The higher-twist corrections from both the two-particle and three-particle $B$-meson LCDAs are computed from the light-cone QCD sum rules (LCSR) at tree level. In particular, we construct the local duality model for the twist-five and -six $B$-meson LCDAs, in agreement with the corresponding asymptotic behaviours at small quark and gluon momenta, employing the QCD sum rules in heavy quark effective theory at leading order in $\alpha_s$. The strange-quark mass effects in the semileptonic $B \to K$ form factors yield the leading-power contribution in the heavy quark expansion, consistent with the power-counting analysis in soft-collinear effective theory, and they are also computed from the LCSR approach due to the appearance of rapidity singularities. We further explore the phenomenological aspects of the semileptonic $B \to \pi \ell \nu$ decays and the rare exclusive processes $B \to K \nu \nu$, including the determination of the CKM matrix element $|V_{ub}|$, the normalized differential $q^2$ distributions, and precision observables defined by ratios of branching fractions for the above-mentioned two channels in the same intervals of $q^2$.
    Comment: 36 pages, 9 figures
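    For context, the $|V_{ub}|$ determination mentioned above rests on the standard differential rate for $B \to \pi \ell \nu$ in the massless-lepton limit, where $f_+(q^2)$ is the vector form factor computed in the paper:

```latex
% Massless-lepton limit; lambda is the Kallen function.
\frac{d\Gamma}{dq^2}\bigl(B \to \pi \ell \nu\bigr)
  = \frac{G_F^2\,|V_{ub}|^2}{192\,\pi^3\,m_B^3}\,
    \lambda^{3/2}(q^2)\,\bigl|f_+(q^2)\bigr|^2,
\qquad
\lambda(q^2) = \bigl(m_B^2 + m_\pi^2 - q^2\bigr)^2 - 4\,m_B^2\,m_\pi^2 .
```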

    Forgetting Exceptions is Harmful in Language Learning

    We show that in language learning, contrary to received wisdom, keeping exceptional training instances in memory can be beneficial for generalization accuracy. We investigate this phenomenon empirically on a selection of benchmark natural language processing tasks: grapheme-to-phoneme conversion, part-of-speech tagging, prepositional-phrase attachment, and base noun phrase chunking. In a first series of experiments we combine memory-based learning with training set editing techniques, in which instances are edited based on their typicality and class prediction strength. Results show that editing exceptional instances (with low typicality or low class prediction strength) tends to harm generalization accuracy. In a second series of experiments we compare memory-based learning and decision-tree learning methods on the same selection of tasks, and find that decision-tree learning often performs worse than memory-based learning. Moreover, the decrease in performance can be linked to the degree of abstraction from exceptions (i.e., pruning or eagerness). We provide explanations for both results in terms of the properties of the natural language processing tasks and the learning algorithms.
    Comment: 31 pages, 7 figures, 10 tables. Uses 11pt, fullname, a4wide tex styles. Pre-print version of article to appear in Machine Learning 11:1-3, Special Issue on Natural Language Learning. Figures on page 22 slightly compressed to avoid page overload
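    A minimal sketch of this kind of editing experiment follows, using k-NN as the memory-based learner and dropping training instances misclassified by their own neighbours as a crude proxy for low class-prediction strength. The dataset and editing rule are stand-ins for illustration, not the paper's benchmarks or algorithms.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Memory-based (k-NN) learning with and without editing "exceptional"
# training instances (illustrative setup only).
X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

full = KNeighborsClassifier(n_neighbors=3).fit(X_tr, y_tr)
print("keep exceptions:", full.score(X_te, y_te))

# Edit: drop training instances whose neighbours (excluding themselves)
# vote for a different class -- a stand-in for low prediction strength.
nn = KNeighborsClassifier(n_neighbors=4).fit(X_tr, y_tr)   # 4 = self + 3
neigh = nn.kneighbors(X_tr, return_distance=False)[:, 1:]  # drop self
votes = np.array([np.bincount(y_tr[i], minlength=10).argmax() for i in neigh])
keep = votes == y_tr
edited = KNeighborsClassifier(n_neighbors=3).fit(X_tr[keep], y_tr[keep])
print("edit exceptions:", edited.score(X_te, y_te))
```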

    Building Program Vector Representations for Deep Learning

    Deep learning has made significant breakthroughs in various fields of artificial intelligence. Its advantages include the ability to capture highly complicated features and the weak involvement of human engineering. However, it is still virtually impossible to use deep learning to analyze programs, since deep architectures cannot be trained effectively with pure backpropagation. In this pioneering paper, we propose the "coding criterion" to build program vector representations, which are a prerequisite for applying deep learning to program analysis. Our representation learning approach directly makes deep learning a reality in this new field. We evaluate the learned vector representations both qualitatively and quantitatively. We conclude, based on the experiments, that the coding criterion is successful in building program representations. To evaluate whether deep learning is beneficial for program analysis, we feed the representations to deep neural networks, and achieve higher accuracy in the program classification task than "shallow" methods such as logistic regression and the support vector machine. This result confirms the feasibility of using deep learning to analyze programs, and gives preliminary evidence of its success in this new field. We believe deep learning will become an outstanding technique for program analysis in the near future.
    Comment: This paper was submitted to ICSE'1
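    The quantitative evaluation described above can be sketched as follows; synthetic vectors stand in for the learned program representations (the real ones come from the paper's "coding criterion" over AST nodes, which is not reproduced here), and a small neural network is compared against logistic regression. All dimensions and numbers are invented.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic "program vectors": class-dependent clusters in feature space.
rng = np.random.default_rng(0)
n, dim, n_classes = 2000, 30, 4
y = rng.integers(0, n_classes, size=n)
centers = rng.standard_normal((n_classes, dim))
X = centers[y] + 0.8 * rng.standard_normal((n, dim))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# "Shallow" baseline vs. a small deep network on the same representations.
shallow = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
deep = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500,
                     random_state=0).fit(X_tr, y_tr)

print("logistic regression:", shallow.score(X_te, y_te))
print("deep neural network:", deep.score(X_te, y_te))
```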