Water Soluble Derivatives of Camptothecin/Homocamptothecin
Camptothecin and homocamptothecin analogs and derivatives are provided incorporating alkylamine and polyalkylamine moieties
Long-Short-Range Message-Passing: A Physics-Informed Framework to Capture Non-Local Interaction for Scalable Molecular Dynamics Simulation
Computational simulation of chemical and biological systems using ab initio
molecular dynamics has been a challenge for decades. Researchers have
attempted to address the problem with machine learning and fragmentation-based
methods; however, the two approaches fail to give a satisfactory description of
long-range and many-body interactions, respectively. Inspired by
fragmentation-based methods, we propose the Long-Short-Range Message-Passing
(LSR-MP) framework as a generalization of the existing equivariant graph neural
networks (EGNNs) with the intent to incorporate long-range interactions
efficiently and effectively. We apply the LSR-MP framework to the recently
proposed ViSNet and demonstrate state-of-the-art results with up to
error reduction for molecules in MD22 and Chignolin datasets. Consistent
improvements to various EGNNs will also be discussed to illustrate the general
applicability and robustness of our LSR-MP framework
SoftCorrect: Error Correction with Soft Detection for Automatic Speech Recognition
Error correction in automatic speech recognition (ASR) aims to correct those
incorrect words in sentences generated by ASR models. Since recent ASR models
usually have low word error rate (WER), to avoid affecting originally correct
tokens, error correction models should only modify incorrect words, and
therefore detecting incorrect words is important for error correction. Previous
works on error correction either implicitly detect error words through
target-source attention or CTC (connectionist temporal classification) loss, or
explicitly locate specific deletion/substitution/insertion errors. However,
implicit error detection does not provide a clear signal about which tokens are
incorrect and explicit error detection suffers from low detection accuracy. In
this paper, we propose SoftCorrect with a soft error detection mechanism to
avoid the limitations of both explicit and implicit error detection.
Specifically, we first detect whether a token is correct or not through a
probability produced by a specially designed language model, and then design
a constrained CTC loss that only duplicates the detected incorrect tokens to
let the decoder focus on the correction of error tokens. Compared with implicit
error detection with CTC loss, SoftCorrect provides an explicit signal about which
words are incorrect and thus does not need to duplicate every token but only
incorrect tokens; compared with explicit error detection, SoftCorrect does not
detect specific deletion/substitution/insertion errors but just leaves it to
CTC loss. Experiments on AISHELL-1 and Aidatatang datasets show that
SoftCorrect achieves 26.1% and 9.4% CER reduction respectively, outperforming
previous works by a large margin, while retaining the fast speed of parallel
generation. Comment: AAAI 202
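The detect-then-duplicate idea in this abstract can be illustrated with a toy sketch. This is not the SoftCorrect implementation: the per-token probabilities, the 0.5 threshold, the copy count of 3, and the function name `duplicate_detected_tokens` are all assumptions chosen for illustration. It only shows how tokens flagged as likely incorrect could be duplicated while tokens judged correct pass through once.

```python
from typing import List


def duplicate_detected_tokens(
    tokens: List[str],
    probs: List[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
    copies: int = 3,         # assumed duplication count, not taken from the paper
) -> List[str]:
    """Repeat only the tokens whose correctness probability is below threshold."""
    if len(tokens) != len(probs):
        raise ValueError("tokens and probs must have the same length")
    expanded: List[str] = []
    for token, prob in zip(tokens, probs):
        # A low probability marks the token as a likely error, so it is duplicated
        # to give a CTC-style decoder room to substitute, insert, or delete.
        repeat = copies if prob < threshold else 1
        expanded.extend([token] * repeat)
    return expanded
```

For example, with tokens `["a", "b", "c"]` and probabilities `[0.9, 0.2, 0.8]`, only `"b"` falls below the assumed threshold and is repeated.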
FastCorrect 2: Fast Error Correction on Multiple Candidates for Automatic Speech Recognition
Error correction is widely used in automatic speech recognition (ASR) to
post-process the generated sentence, and can further reduce the word error rate
(WER). Although multiple candidates are generated by an ASR system through beam
search, current error correction approaches can only correct one sentence at a
time, failing to leverage the voting effect from multiple candidates to better
detect and correct error tokens. In this work, we propose FastCorrect 2, an
error correction model that takes multiple ASR candidates as input for better
correction accuracy. FastCorrect 2 adopts non-autoregressive generation for
fast inference, which consists of an encoder that processes multiple source
sentences and a decoder that generates the target sentence in parallel from the
adjusted source sentence, where the adjustment is based on the predicted
duration of each source token. However, there are some issues when handling
multiple source sentences. First, it is non-trivial to leverage the voting
effect from multiple source sentences since they usually vary in length. Thus,
we propose a novel alignment algorithm to maximize the degree of token
alignment among multiple sentences in terms of token and pronunciation
similarity. Second, the decoder can only take one adjusted source sentence as
input, while there are multiple source sentences. Thus, we develop a candidate
predictor to detect the most suitable candidate for the decoder. Experiments on
our in-house dataset and AISHELL-1 show that FastCorrect 2 can further reduce
the WER over the previous correction model with a single candidate by 3.2% and
2.6%, demonstrating the effectiveness of leveraging multiple candidates in ASR
error correction. FastCorrect 2 achieves better performance than the cascaded
re-scoring and correction pipeline and can serve as a unified post-processing
module for ASR. Comment: Findings of EMNLP 202
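The "voting effect" mentioned in this abstract can be illustrated with a minimal sketch. It assumes the candidates have already been aligned to equal length, which is the step the paper's alignment algorithm addresses and which is not reproduced here. The function name `majority_vote` is hypothetical, and a plain per-position majority vote stands in for the learned model purely for illustration.

```python
from collections import Counter
from typing import List


def majority_vote(aligned_candidates: List[List[str]]) -> List[str]:
    """Pick the most frequent token at each position of pre-aligned candidates."""
    if not aligned_candidates:
        return []
    length = len(aligned_candidates[0])
    if any(len(candidate) != length for candidate in aligned_candidates):
        raise ValueError("candidates must be aligned to the same length")
    voted: List[str] = []
    for column in zip(*aligned_candidates):
        # most_common(1) returns the token seen most often at this position;
        # ties go to the token that appears first among the candidates.
        voted.append(Counter(column).most_common(1)[0][0])
    return voted
```

With three aligned candidates where one contains a misrecognized token, the other two outvote it at that position.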
Carbon-Chain Molecules in Molecular Outflows and Lupus I Region--New Producing Region and New Forming Mechanism
Using the new equipment of the Shanghai Tian Ma Radio Telescope, we have
searched for carbon-chain molecules (CCMs) towards five outflow sources and six
Lupus I starless dust cores, including one region known to be characterized by
warm carbon-chain chemistry (WCCC), Lupus I-1 (IRAS 15398-3359), and one
TMC-1-like cloud, Lupus I-6 (Lupus-1A). Lines of HC3N J=2-1, HC5N J=6-5, HC7N
J=14-13, 15-14, 16-15 and C3S J=3-2 were detected in all the targets except in
the outflow source L1660 and the starless dust core Lupus I-3/4. The column
densities of nitrogen-bearing species range from 10 to 10
cm and those of CS are about 10 cm. Two outflow
sources, I20582+7724 and L1221, could be identified as new
carbon-chain--producing regions. Four of the Lupus I dust cores are newly
identified as early quiescent and dark carbon-chain--producing regions similar
to Lup I-6, which together with the WCCC source, Lup I-1, indicate that
carbon-chain-producing regions are common in Lupus I, which can be regarded as a
Taurus-like molecular cloud complex in our Galaxy. The column densities of C3S
are larger than those of HC7N in the three outflow sources I20582, L1221 and
L1251A. Shocked carbon-chain chemistry (SCCC) is proposed to explain the
abnormally high abundances of C3S compared with those of nitrogen-bearing CCMs.
Gas-grain chemical models support the idea that shocks can fuel the environment
of those sources with enough thus driving the generation of S-bearing
CCMs. Comment: 7 figures, 8 tables, accepted by MNRA
Possible Eliashberg-type superconductivity enhancement effects in a two-band superconductor MgB2 driven by narrow-band THz pulses
We study THz-driven condensate dynamics in epitaxial thin films of MgB2,
a prototype two-band superconductor (SC) with weak interband coupling. The
temperature- and excitation-density-dependent dynamics follow the behavior
predicted by the phenomenological bottleneck model for the single-gap SC,
implying adiabatic coupling between the two condensates on the ps timescale.
The amplitude of the THz-driven suppression of condensate density reveals an
unexpected decrease in pair-breaking efficiency with increasing temperature -
unlike in the case of optical excitation. The reduced pair-breaking efficiency
of narrow-band THz pulses, displaying a minimum near T, is
attributed to THz-driven, long-lived, non-thermal quasiparticle distribution,
resulting in Eliashberg-type enhancement of superconductivity, competing with
pair-breaking