Generating Synthetic Data for Task-Oriented Semantic Parsing with Hierarchical Representations
Modern conversational AI systems support natural language understanding for a
wide variety of capabilities. While a majority of these tasks can be
accomplished using a simple and flat representation of intents and slots, more
sophisticated capabilities require complex hierarchical representations
supported by semantic parsing. State-of-the-art semantic parsers are trained
using supervised learning with data labeled according to a hierarchical schema
which might be costly to obtain or not readily available for a new domain. In
this work, we explore the possibility of generating synthetic data for neural
semantic parsing using a pretrained denoising sequence-to-sequence model (i.e.,
BART). Specifically, we first extract masked templates from the existing
labeled utterances, and then fine-tune BART to generate synthetic utterances
conditioning on the extracted templates. Finally, we use an auxiliary parser
(AP) to filter the generated utterances. The AP guarantees the quality of the
generated data. We show the potential of our approach when evaluating on the
Facebook TOP dataset for the navigation domain.
Comment: Workshop on Structured Prediction for NLP, EMNLP 202
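The three steps named in the abstract above (template extraction, generation, and filtering) can be illustrated with a minimal sketch. It covers only the first and last steps and rests on assumptions not stated in the abstract: annotations use the bracketed TOP style (`[IN:...`, `[SL:...`, `]`), a template is formed by masking the words directly inside each slot, and the filter keeps an utterance only when a parser returns the expected frame. The names `extract_template`, `filter_with_parser`, and the `parse` callable are hypothetical, and the BART fine-tuning step is omitted.

```python
from typing import Callable, Iterable, List

MASK = "<mask>"  # assumed mask token for the templates


def extract_template(annotated: str) -> str:
    """Turn a bracketed annotation into a masked plain-text template.

    Assumption: words directly inside a slot ([SL:...]) are replaced by one
    mask token per contiguous run; all other words are kept unchanged.
    """
    out: List[str] = []
    stack: List[str] = []   # labels of the currently open brackets
    fresh = True            # True right after any bracket opens or closes
    for tok in annotated.split():
        if tok.startswith("[IN:") or tok.startswith("[SL:"):
            stack.append(tok)
            fresh = True
        elif tok == "]":
            stack.pop()
            fresh = True
        elif stack and stack[-1].startswith("[SL:"):
            if fresh:       # emit a single mask for this run of slot words
                out.append(MASK)
                fresh = False
        else:
            out.append(tok)
    return " ".join(out)


def filter_with_parser(
    candidates: Iterable[str],
    expected_frame: str,
    parse: Callable[[str], str],
) -> List[str]:
    """Keep generated utterances whose predicted frame matches the expected one.

    `parse` stands in for an auxiliary parser; its interface is hypothetical.
    """
    return [u for u in candidates if parse(u) == expected_frame]


template = extract_template(
    "[IN:GET_DIRECTIONS Directions to [SL:DESTINATION the airport ] ]"
)
```

Here `template` is the string `Directions to <mask>`, which a generator could then be asked to fill in; the exact masking scheme used by the authors may differ.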
Adatom Doping-Enriched Geometric and Electronic Properties of Pristine Graphene: a Method to Modify the Band Gap
We have investigated the way in which the concentration and distribution of
adatoms affect the geometric and electronic properties of graphene. Our
first-principles calculations, based on density functional theory, reveal
various types of π-bonding. The energy band
structure of this doped graphene material may be explored experimentally by
employing angle-resolved photo-emission spectroscopy (ARPES) for electronic
band structure measurements and scanning tunneling spectroscopy (STS) for the
density-of-states (DOS) both of which have been calculated and reported in this
paper. Our calculations show that such adatom doping is responsible for the
destruction or appearance of the Dirac cone structure.
Comment: 15 pages, 4 figures
Protein identification with deep learning: from abc to xyz
Proteins are the main workhorses of biological functions in a cell, a tissue,
or an organism. Identification and quantification of proteins in a given
sample, e.g. a cell type under normal/disease conditions, are fundamental tasks
for the understanding of human health and disease. In this paper, we present
DeepNovo, a deep learning-based tool to address the problem of protein
identification from tandem mass spectrometry data. The idea was first proposed
in the context of de novo peptide sequencing [1] in which convolutional neural
networks and recurrent neural networks were applied to predict the amino acid
sequence of a peptide from its spectrum, a similar task to generating a caption
from an image. We further develop DeepNovo to perform sequence database search,
the main technique for peptide identification that greatly benefits from
numerous existing protein databases. We combine two modules, de novo sequencing
and database search, into a single deep learning framework for peptide
identification, and integrate a de Bruijn graph assembly technique to offer a
complete solution to reconstruct protein sequences from tandem mass
spectrometry data. This paper describes a comprehensive protocol of DeepNovo
for protein identification, including training neural network models, dynamic
programming search, database querying, estimation of false discovery rate, and
de Bruijn graph assembly. Training and testing data, model implementations, and
comprehensive tutorials in the form of IPython notebooks are available in our
GitHub repository (https://github.com/nh2tran/DeepNovo).
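One of the protocol steps listed in the abstract above is false discovery rate estimation. The abstract does not say how it is computed; a minimal sketch of the target-decoy approach commonly used in proteomics is given here as an assumption. The function name `fdr_at_threshold` and the `(score, is_decoy)` input layout are hypothetical and are not taken from the DeepNovo repository.

```python
from typing import Iterable, Tuple


def fdr_at_threshold(psms: Iterable[Tuple[float, bool]], threshold: float) -> float:
    """Estimate FDR as (#decoy matches) / (#target matches) at or above a score.

    Each item is (score, is_decoy). Returns 0.0 when no target match passes.
    """
    targets = 0
    decoys = 0
    for score, is_decoy in psms:
        if score >= threshold:
            if is_decoy:
                decoys += 1
            else:
                targets += 1
    return decoys / targets if targets else 0.0


# Toy peptide-spectrum matches: three targets and two decoys.
example = [(0.9, False), (0.8, False), (0.7, True), (0.6, False), (0.5, True)]
strict = fdr_at_threshold(example, 0.8)   # no decoy passes the threshold
loose = fdr_at_threshold(example, 0.6)    # one decoy against three targets
```

Raising the score threshold lowers the estimated FDR at the cost of fewer accepted identifications, which is the trade-off a search pipeline tunes.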
Rich Essential Properties of Si-Doped Graphene
A theoretical framework based on first-principles calculations is developed
to fully explore the dramatic changes in essential properties caused by
silicon-atom chemical modifications of monolayer graphene. For the
Si-chemisorption and Si-substituted graphenes, the guest-atom-diversified
geometric structures, the Si- and C-dominated energy bands, the magnetic
moments, the charge transfers, the spatial charge densities, the spin
distribution configurations, and the van Hove singularities in the atom- and
orbital-projected density of states are thoroughly investigated through detailed
evaluations and analyses. Such fundamental properties are sufficient for
determining the critical physical and chemical pictures, in which the accurate
multi-orbital hybridizations are very useful in comprehending the diverse
phenomena, e.g., the C- and Si-co-dominated energy bands, the semiconducting or
metallic behaviors, and the existence/absence of Dirac-cone band structures.
This developing model could be generalized to other emergent layered materials.
π-bonding-dominated energy gaps in graphene oxides
Chemical bondings of graphene oxides with oxygen concentrations from 1% to
50% are investigated using first-principles calculations. Energy gaps are
mainly determined by the competition of orbital hybridizations in C-C, O-O, and
C-O bonds. They are very sensitive to the changes in oxygen concentration and
distributions. There exist five types of π bondings during the variation
from the full to vanishing adsorptions, namely the complete termination, the
partial suppression, the 1D bonding, the deformed planar bonding, and the
well-behaved one. They can account for the finite and gapless characteristics,
corresponding to the O-concentrations of 25% and 3%, respectively. The
feature-rich chemical bondings dominate band structures and density of states,
leading to diverse electronic properties.
Comment: 19 pages, 4 figures
Concentration-Diversified Magnetic and Electronic Properties of Halogen-Adsorbed Silicene
Diverse magnetic and electronic properties of halogen-adsorbed silicene are
investigated by the first-principles theoretical framework, including the
adatom-diversified geometric structures, the atom-dominated energy bands, the
spatial spin density distributions, the spatial charge density distributions
and their variations, and the spin- and orbital-projected density of states.
Also, such physical quantities are sufficient to identify similar and different
features in the double-side and single-side adsorptions. The former belongs to
the concentration-dependent finite-gap semiconductors or p-type metals, while
the latter displays valence energy bands, with or without spin splitting,
intersecting the Fermi level. Both adsorption types show the
halogen-related weakly dispersed bands at deep energies, the adatom-modified
middle-energy sigma bands, and the recovery of low-energy pi bands during the
destruction of the halogen concentrations. Such feature-rich band structures
can be verified by angle-resolved photoemission spectroscopy experiments.
Chemical Bondings Induced Rich Electronic Properties of Oxygen Absorbed Few-layer Graphenes
Electronic properties of graphene oxides enriched by the strong chemical
bondings are investigated using first-principles calculations. They are very
sensitive to the changes in the number of graphene layers, stacking
configuration, and distribution of oxygen. The feature-rich electronic
structures exhibit the destruction or distortion of Dirac cone, opening of band
gap, anisotropic energy dispersions, O- and (C,O)-dominated energy dispersions,
and extra critical points. All the few-layer graphene oxides are semi-metals
except for the semiconducting monolayer ones. For the former, the distorted
Dirac-cone structures and the O-dominated energy bands near the Fermi level are
revealed simultaneously. The orbital-projected density of states (DOS) has
many special structures mainly coming from a composite energy band, the
parabolic and partially flat ones. The DOS and spatial charge distributions
clearly indicate the critical bondings in O-O, C-O and C-C bonds, being
responsible for the diversified properties.
DeepNovoV2: Better de novo peptide sequencing with deep learning
Personalized cancer vaccines are envisioned as the next generation rational
cancer immunotherapy. The key step in developing personalized therapeutic
cancer vaccines is to identify tumor-specific neoantigens that are on the
surface of tumor cells. A promising method for this is through de novo peptide
sequencing from mass spectrometry data. In this paper we introduce DeepNovoV2,
the state-of-the-art model for peptide sequencing. In DeepNovoV2, a spectrum is
directly represented as a set of (m/z, intensity) pairs, so it does not
suffer from the accuracy-speed/memory trade-off problem. The model combines an
order-invariant network structure (T-Net) and recurrent neural networks and
provides a complete end-to-end training and prediction framework to sequence
patterns of peptides. Our experiments on a wide variety of data from different
species show that DeepNovoV2 outperforms previous state-of-the-art methods,
achieving 13.01-23.95% higher accuracy at the peptide level.
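The abstract above describes a spectrum as a set of (m/z, intensity) pairs processed by an order-invariant network. A minimal sketch of what order invariance means in practice follows: each peak is mapped to a feature vector independently, and the vectors are combined with a symmetric pooling operation (here, element-wise max). The hand-written `peak_features` map is a hypothetical stand-in for a learned per-peak network and is not the T-Net architecture from the paper.

```python
import math
from typing import List, Sequence, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def peak_features(mz: float, intensity: float) -> List[float]:
    """Hypothetical fixed feature map applied to every peak independently."""
    return [
        intensity * math.sin(mz / 100.0),
        intensity * math.cos(mz / 100.0),
        intensity,
    ]


def encode_spectrum(peaks: Sequence[Peak]) -> List[float]:
    """Encode a set of peaks with element-wise max pooling over peak features.

    Max is symmetric in its arguments, so the result does not depend on the
    order in which the peaks are listed.
    """
    feats = [peak_features(mz, inten) for mz, inten in peaks]
    return [max(col) for col in zip(*feats)]


peaks = [(175.1, 0.4), (262.1, 1.0), (375.2, 0.7)]
same = encode_spectrum(peaks) == encode_spectrum(list(reversed(peaks)))
```

Because no binning of the m/z axis is involved, the cost of this representation grows with the number of peaks rather than with the chosen resolution, which is one way to read the trade-off the abstract mentions.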
Diversified essential properties in halogenated graphenes
The significant halogenation effects on the essential properties of graphene
are investigated by the first-principles method. The geometric structures,
electronic properties, and magnetic configurations are greatly diversified
under the various halogen adsorptions. Fluorination, with the strong
multi-orbital chemical bondings, can create the buckled graphene structure,
while the other halogenations do not change the planar σ bonding in the
presence of single-orbital hybridization. Electronic structures consist of the
carbon-, adatom- and (carbon, adatom)-dominated energy bands. All halogenated
graphenes belong to hole-doped metals except that fluorinated systems are
middle-gap semiconductors at sufficiently high concentration. Moreover, the
metallic ferromagnetism is revealed in certain adatom distributions. The
unusual hybridization-induced features are clearly evidenced in many van Hove
singularities of the density of states. The structure- and adatom-enriched
essential properties are compared with the measured results, and potential
applications are also discussed.
Comment: arXiv admin note: substantial text overlap with arXiv:1702.0203
Fundamental properties of transition-metals-adsorbed graphene
The revealing properties of transition metal (T)-doped graphene systems are
investigated with the use of the first-principles method. The detailed
calculations cover the bond length, position and height of adatoms, binding
energy, atom-dominated band structure, adatom-induced free carrier density as
well as energy gap, spin-density distributions, spatial charge distribution,
and atom-, orbital- and spin-projected density-of-states (DOS). The magnetic
configurations are clearly identified from the total magnetic moments,
spin-split energy bands, spin-density distributions and spin-decomposed DOS.
Moreover, the single- or multi-orbital hybridizations in T-C, T-T, and C-C
bonds can be accurately deduced from the careful analyses of the
above-mentioned physical quantities. They are responsible for the optimal
geometric structure, the unusual electronic properties, as well as the diverse
magnetic properties. All the doped systems are metals except for the
low-concentration Ni-doped ones with semiconducting behavior. In contrast,
ferromagnetism is exhibited in various Fe/Co-concentrations but only under high
Ni-concentrations. Our theoretical predictions are compared with available
experimental data, and potential applications are also discussed.
Comment: 37 pages, 7 figures