3,228 research outputs found

    Generating Synthetic Data for Task-Oriented Semantic Parsing with Hierarchical Representations

    Modern conversational AI systems support natural language understanding for a wide variety of capabilities. While a majority of these tasks can be accomplished using a simple, flat representation of intents and slots, more sophisticated capabilities require complex hierarchical representations supported by semantic parsing. State-of-the-art semantic parsers are trained using supervised learning with data labeled according to a hierarchical schema, which might be costly to obtain or not readily available for a new domain. In this work, we explore the possibility of generating synthetic data for neural semantic parsing using a pretrained denoising sequence-to-sequence model (i.e., BART). Specifically, we first extract masked templates from the existing labeled utterances, and then fine-tune BART to generate synthetic utterances conditioned on the extracted templates. Finally, we use an auxiliary parser (AP) to filter the generated utterances. The AP guarantees the quality of the generated data. We show the potential of our approach when evaluating on the Facebook TOP dataset for the navigation domain. Comment: Workshop on Structured Prediction for NLP, EMNLP 202
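
    As a rough illustration of the template-extraction step, the sketch below masks slot values in a bracketed, TOP-style annotation. The example annotation, the `<mask>` token, and the `extract_masked_template` helper are assumptions made for this example; the abstract does not specify how the authors build their templates.

```python
import re

MASK = "<mask>"  # assumed placeholder token, not taken from the paper

# Matches an innermost slot such as "[SL:DESTINATION the airport ]":
# a slot label followed by text that contains no nested brackets.
_LEAF_SLOT = re.compile(r"\[SL:(\w+)\s+[^\[\]]+?\s*\]")


def extract_masked_template(annotated: str) -> str:
    """Replace the text of every innermost slot with a mask token."""
    return _LEAF_SLOT.sub(lambda m: f"[SL:{m.group(1)} {MASK} ]", annotated)


# Hypothetical annotated utterance in a bracketed intent/slot format.
example = "[IN:GET_DIRECTIONS Directions to [SL:DESTINATION the airport ] ]"
print(extract_masked_template(example))
```

    A generator fine-tuned on such pairs would then be asked to fill the masked positions with new slot values, and a separate parser could discard outputs whose parse disagrees with the template.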

    Adatom Doping-Enriched Geometric and Electronic Properties of Pristine Graphene: a Method to Modify the Band Gap

    We have investigated how the concentration and distribution of adatoms affect the geometric and electronic properties of graphene. Our calculations are based on first-principles density functional theory and reveal various types of π-bonding. The energy band structure of this doped graphene material may be explored experimentally by employing angle-resolved photoemission spectroscopy (ARPES) for electronic band structure measurements and scanning tunneling spectroscopy (STS) for the density of states (DOS), both of which have been calculated and reported in this paper. Our calculations show that such adatom doping is responsible for the destruction or appearance of the Dirac cone structure. Comment: 15 pages, 4 figures

    Protein identification with deep learning: from abc to xyz

    Proteins are the main workhorses of biological functions in a cell, a tissue, or an organism. Identification and quantification of proteins in a given sample, e.g. a cell type under normal/disease conditions, are fundamental tasks for the understanding of human health and disease. In this paper, we present DeepNovo, a deep learning-based tool to address the problem of protein identification from tandem mass spectrometry data. The idea was first proposed in the context of de novo peptide sequencing [1], in which convolutional neural networks and recurrent neural networks were applied to predict the amino acid sequence of a peptide from its spectrum, a task similar to generating a caption from an image. We further develop DeepNovo to perform sequence database search, the main technique for peptide identification, which greatly benefits from numerous existing protein databases. We combine two modules, de novo sequencing and database search, into a single deep learning framework for peptide identification, and integrate the de Bruijn graph assembly technique to offer a complete solution for reconstructing protein sequences from tandem mass spectrometry data. This paper describes a comprehensive protocol of DeepNovo for protein identification, including training neural network models, dynamic programming search, database querying, estimation of false discovery rate, and de Bruijn graph assembly. Training and testing data, model implementations, and comprehensive tutorials in the form of IPython notebooks are available in our GitHub repository (https://github.com/nh2tran/DeepNovo).
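
    The false discovery rate step mentioned above can be illustrated with the widely used target-decoy approach. Whether DeepNovo uses exactly this estimator is not stated in the abstract; the `estimate_fdr` helper and the toy scores below are assumptions for illustration only.

```python
from typing import List, Tuple

# Each match is (score, is_decoy); higher scores mean more confident matches.
Match = Tuple[float, bool]


def estimate_fdr(matches: List[Match], threshold: float) -> float:
    """Estimate FDR as (#decoy hits) / (#target hits) at or above a score threshold."""
    accepted = [is_decoy for score, is_decoy in matches if score >= threshold]
    decoys = sum(accepted)
    targets = len(accepted) - decoys
    if targets == 0:
        # No target hits: the ratio is undefined unless nothing was accepted at all.
        return 0.0 if decoys == 0 else float("inf")
    return decoys / targets


# Toy data: three target hits and two decoy hits with made-up scores.
toy_matches = [(0.95, False), (0.90, False), (0.85, True), (0.80, False), (0.40, True)]
print(estimate_fdr(toy_matches, threshold=0.8))
```

    In practice the threshold is usually lowered until the estimated rate reaches a chosen limit, and only matches above that threshold are reported.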

    Rich Essential Properties of Si-Doped Graphene

    A theoretical framework based on first-principles calculations is developed to fully explore the dramatic changes in essential properties caused by silicon-atom chemical modifications of monolayer graphene. For the Si-chemisorbed and Si-substituted graphenes, the guest-atom-diversified geometric structures, the Si- and C-dominated energy bands, the magnetic moments, the charge transfers, the spatial charge densities, the spin distribution configurations, and the van Hove singularities in the atom- and orbital-projected density of states are investigated thoroughly through detailed evaluations and analyses. Such fundamental properties are sufficient to determine the critical physical and chemical pictures, in which the accurate multi-orbital hybridizations are very useful in comprehending the diverse phenomena, e.g., the C- and Si-co-dominated energy bands, the semiconducting or metallic behaviors, and the existence/absence of Dirac-cone band structures. This developing model could be generalized to other emergent layered materials.

    π-bonding-dominated energy gaps in graphene oxides

    Chemical bondings of graphene oxides with oxygen concentrations from 1% to 50% are investigated using first-principles calculations. Energy gaps are mainly determined by the competition of orbital hybridizations in C-C, O-O, and C-O bonds. They are very sensitive to changes in oxygen concentration and distribution. There exist five types of π bonding during the variation from full to vanishing adsorption, namely the complete termination, the partial suppression, the 1D bonding, the deformed planar bonding, and the well-behaved one. They can account for the finite-gap and gapless characteristics, corresponding to O-concentrations of >25% and <3%, respectively. The feature-rich chemical bondings dominate the band structures and density of states, leading to diverse electronic properties. Comment: 19 pages, 4 figures

    Concentration-Diversified Magnetic and Electronic Properties of Halogen-Adsorbed Silicene

    Diverse magnetic and electronic properties of halogen-adsorbed silicene are investigated within the first-principles theoretical framework, including the adatom-diversified geometric structures, the atom-dominated energy bands, the spatial spin density distributions, the spatial charge density distributions and their variations, and the spin- and orbital-projected density of states. Such physical quantities are sufficient to identify similar and different features in the double-side and single-side adsorptions. The former belongs to the concentration-dependent finite-gap semiconductors or p-type metals, while the latter displays valence energy bands, with or without spin splitting, that intersect the Fermi level. Both adsorption types show the halogen-related weakly dispersed bands at deep energies, the adatom-modified middle-energy sigma bands, and the recovery of low-energy pi bands as the halogen concentration decreases. Such feature-rich band structures can be verified by angle-resolved photoemission spectroscopy experiments.

    Chemical Bondings Induced Rich Electronic Properties of Oxygen Absorbed Few-layer Graphenes

    Electronic properties of graphene oxides enriched by the strong chemical bondings are investigated using first-principles calculations. They are very sensitive to changes in the number of graphene layers, the stacking configuration, and the distribution of oxygen. The feature-rich electronic structures exhibit the destruction or distortion of the Dirac cone, the opening of a band gap, anisotropic energy dispersions, O- and (C,O)-dominated energy dispersions, and extra critical points. All the few-layer graphene oxides are semi-metals except for the semiconducting monolayer ones. For the former, the distorted Dirac-cone structures and the O-dominated energy bands near the Fermi level are revealed simultaneously. The orbital-projected density of states (DOS) has many special structures mainly coming from a composite energy band and from the parabolic and partially flat ones. The DOS and spatial charge distributions clearly indicate the critical bondings in O-O, C-O and C-C bonds, which are responsible for the diversified properties.

    DeepNovoV2: Better de novo peptide sequencing with deep learning

    Personalized cancer vaccines are envisioned as the next-generation rational cancer immunotherapy. The key step in developing personalized therapeutic cancer vaccines is to identify tumor-specific neoantigens that are on the surface of tumor cells. A promising method for this is de novo peptide sequencing from mass spectrometry data. In this paper we introduce DeepNovoV2, the state-of-the-art model for peptide sequencing. In DeepNovoV2, a spectrum is directly represented as a set of (m/z, intensity) pairs; therefore, it does not suffer from the accuracy-speed/memory trade-off problem. The model combines an order-invariant network structure (T-Net) and recurrent neural networks, and provides a complete end-to-end training and prediction framework to sequence patterns of peptides. Our experiments on a wide variety of data from different species show that DeepNovoV2 outperforms previous state-of-the-art methods, achieving 13.01-23.95% higher accuracy at the peptide level.
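
    To illustrate why treating a spectrum as a set of (m/z, intensity) pairs calls for an order-invariant network, the toy sketch below encodes each peak independently and pools with an element-wise maximum, so shuffling the peaks leaves the output unchanged. This is not the DeepNovoV2 T-Net; `encode_peak` and `pool_spectrum` are hypothetical helpers with hand-picked rather than learned features, and the peak values are made up.

```python
import math
import random
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def encode_peak(peak: Peak) -> List[float]:
    """Map one peak to a small feature vector (hand-picked, not learned)."""
    mz, intensity = peak
    return [intensity, intensity * math.sin(mz / 100.0), intensity * math.cos(mz / 100.0)]


def pool_spectrum(peaks: List[Peak]) -> List[float]:
    """Element-wise max over per-peak features: a symmetric, order-invariant pooling."""
    features = [encode_peak(p) for p in peaks]
    return [max(column) for column in zip(*features)]


# Made-up peaks; the same set in a different order must give the same vector.
spectrum = [(114.09, 0.2), (228.13, 1.0), (341.21, 0.6)]
shuffled = spectrum[:]
random.shuffle(shuffled)
print(pool_spectrum(spectrum) == pool_spectrum(shuffled))
```

    Because each peak is encoded on its own and the pooling is symmetric, no binning of the m/z axis into a fixed-length vector is needed in this toy setup.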

    Diversified essential properties in halogenated graphenes

    The significant halogenation effects on the essential properties of graphene are investigated by the first-principles method. The geometric structures, electronic properties, and magnetic configurations are greatly diversified under the various halogen adsorptions. Fluorination, with its strong multi-orbital chemical bondings, can create a buckled graphene structure, while the other halogenations do not change the planar σ bonding in the presence of single-orbital hybridization. Electronic structures consist of the carbon-, adatom- and (carbon, adatom)-dominated energy bands. All halogenated graphenes are hole-doped metals, except that fluorinated systems are middle-gap semiconductors at sufficiently high concentration. Moreover, metallic ferromagnetism is revealed in certain adatom distributions. The unusual hybridization-induced features are clearly evidenced in many van Hove singularities of the density of states. The structure- and adatom-enriched essential properties are compared with the measured results, and potential applications are also discussed. Comment: arXiv admin note: substantial text overlap with arXiv:1702.0203

    Fundamental properties of transition-metals-adsorbed graphene

    The properties of transition metal (T)-doped graphene systems are investigated using the first-principles method. The detailed calculations cover the bond length, position and height of adatoms, binding energy, atom-dominated band structure, adatom-induced free carrier density as well as energy gap, spin-density distributions, spatial charge distribution, and atom-, orbital- and spin-projected density of states (DOS). The magnetic configurations are clearly identified from the total magnetic moments, spin-split energy bands, spin-density distributions and spin-decomposed DOS. Moreover, the single- or multi-orbital hybridizations in T-C, T-T, and C-C bonds can be accurately deduced from careful analyses of the above-mentioned physical quantities. They are responsible for the optimal geometric structure, the unusual electronic properties, and the diverse magnetic properties. All the doped systems are metals except for the low-concentration Ni-doped ones, which show semiconducting behavior. In contrast, ferromagnetism is exhibited in various Fe/Co-concentrations but only under high Ni-concentrations. Our theoretical predictions are compared with available experimental data, and potential applications are also discussed. Comment: 37 pages, 7 figures