24 research outputs found
Non-Autoregressive Neural Machine Translation with Enhanced Decoder Input
Non-autoregressive translation (NAT) models, which remove the dependence on
previous target tokens from the inputs of the decoder, achieve significant
inference speedup but at the cost of inferior accuracy compared to
autoregressive translation (AT) models. Previous work shows that the quality of
the inputs of the decoder is important and largely impacts the model accuracy.
In this paper, we propose two methods to enhance the decoder inputs so as to
improve NAT models. The first one directly leverages a phrase table generated
by conventional SMT approaches to translate source tokens to target tokens,
which are then fed into the decoder as inputs. The second one transforms
source-side word embeddings to target-side word embeddings through
sentence-level alignment and word-level adversary learning, and then feeds the
transformed word embeddings into the decoder as inputs. Experimental results
show that our method largely outperforms the NAT baseline~\citep{gu2017non} by
BLEU scores on the WMT14 English-German task and BLEU scores on the WMT16
English-Romanian task. Comment: AAAI 201
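As a rough illustration of the first method in the abstract above, the sketch below maps source tokens to target tokens with a toy phrase table. The table contents, the function name `translate_with_phrase_table`, the greedy longest-match strategy, and the copy-through fallback for unknown tokens are all assumptions made for illustration; the abstract does not specify how the lookup is performed.

```python
def translate_with_phrase_table(source_tokens, phrase_table, max_phrase_len=3):
    """Greedy longest-match lookup (an assumed strategy, not the paper's).

    phrase_table maps a tuple of source tokens to a list of target tokens.
    Source tokens with no matching entry are copied through unchanged.
    """
    output = []
    i = 0
    while i < len(source_tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_phrase_len, len(source_tokens) - i), 0, -1):
            phrase = tuple(source_tokens[i:i + n])
            if phrase in phrase_table:
                output.extend(phrase_table[phrase])
                i += n
                break
        else:
            # No phrase of any length matched at position i.
            output.append(source_tokens[i])
            i += 1
    return output


# Hypothetical toy table; a real one would come from an SMT toolkit.
toy_table = {("guten", "morgen"): ["good", "morning"], ("welt",): ["world"]}
decoder_input = translate_with_phrase_table(
    ["guten", "morgen", "welt", "!"], toy_table
)
```

In this sketch `decoder_input` holds target-side tokens that could be embedded and fed to a NAT decoder in place of copied source tokens.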
Fine-Tuning by Curriculum Learning for Non-Autoregressive Neural Machine Translation
Non-autoregressive translation (NAT) models remove the dependence on previous
target tokens and generate all target tokens in parallel, resulting in
significant inference speedup but at the cost of inferior translation accuracy
compared to autoregressive translation (AT) models. Considering that AT models
have higher accuracy and are easier to train than NAT models, and both of them
share the same model configurations, a natural idea to improve the accuracy of
NAT models is to transfer a well-trained AT model to an NAT model through
fine-tuning. However, since AT and NAT models differ greatly in training
strategy, straightforward fine-tuning does not work well. In this work, we
introduce curriculum learning into fine-tuning for NAT. Specifically, we design
a curriculum in the fine-tuning process to progressively switch the training
from autoregressive generation to non-autoregressive generation. Experiments on
four benchmark translation datasets show that the proposed method achieves good
improvement (more than BLEU score) over previous NAT baselines in terms of
translation accuracy, and greatly speeds up (more than times) the inference
process over AT baselines. Comment: AAAI 202
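The idea of progressively switching from autoregressive to non-autoregressive training can be sketched with a simple pacing function. The linear schedule, the per-step random choice of training mode, and the names `nat_probability` and `sample_training_mode` are assumptions for illustration only; the abstract does not describe the curriculum actually used.

```python
import random


def nat_probability(step, total_steps):
    """Linear pacing (assumed): probability of training in NAT mode at a step."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    return min(1.0, max(0.0, step / total_steps))


def sample_training_mode(step, total_steps, rng):
    """Pick 'AT' or 'NAT' for this step according to the pacing function."""
    return "NAT" if rng.random() < nat_probability(step, total_steps) else "AT"


rng = random.Random(0)  # fixed seed so the schedule is reproducible
schedule = [sample_training_mode(s, 10, rng) for s in range(11)]
```

Early steps are trained mostly in AT mode and late steps mostly in NAT mode; the first step is always AT and the last is always NAT under this schedule.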
A new model to downscale urban and rural surface and air temperatures evaluated in Shanghai, China
A simple model, TsT2m (Surface Temperature and near surface air Temperature (at 2 m) model), is developed to downscale numerical model output (such as from ECMWF) to obtain higher temporal and spatial resolution surface and near surface air temperature. It is evaluated in Shanghai, China. Surface temperature (TS) and near surface air temperature (Ta) sub-models account for variations in land covers and their different thermal properties, resulting in spatial variations of surface and air temperature. The Net All Wave Radiation Parameterization (NARP) scheme is used to compute net all-wave radiation for the surface temperature sub-model, the Objective Hysteresis Model (OHM) is used to calculate the net storage heat fluxes, and the surface temperature is obtained by the force-restore method. The near surface air temperature sub-model considers the horizontal and vertical energy changes for a column of well mixed air above the surface. Modeled surface temperatures reproduce the general pattern of MODIS images well, while providing more detailed patterns of the surface urban heat island. However, the simulated surface temperatures capture the warmer urban land cover and are 10.3°C warmer on average than those derived from the coarser MODIS data. For other land cover types values are more similar. Downscaled, higher temporal and spatial resolution air temperatures are compared to observations at 110 Automatic Weather Stations across Shanghai. After downscaling with the TsT2m model, the average forecast accuracy of near surface air temperature is improved by about 20%. The scheme developed has considerable potential for prediction and mitigation of urban climate conditions, particularly for weather and climate services related to heat stress.
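The two surface-temperature building blocks named in the abstract above can be sketched as follows. The OHM storage heat flux is commonly written as ΔQ_S = a1·Q* + a2·∂Q*/∂t + a3, and force-restore schemes generally take the form dT_s/dt = C_force·G − C_restore·(T_s − T_deep). The function names, the explicit Euler time step, and all coefficient values below are hypothetical; the abstract does not give the exact formulation or parameters used in TsT2m.

```python
def ohm_storage_flux(q_star, dq_star_dt, a1, a2, a3):
    """OHM storage heat flux: a1*Q* + a2*dQ*/dt + a3.

    q_star is net all-wave radiation (W m^-2) and dq_star_dt its rate of
    change. The coefficients depend on land cover; values used in the
    example below are placeholders, not values from the paper.
    """
    return a1 * q_star + a2 * dq_star_dt + a3


def force_restore_step(t_surface, t_deep, forcing_flux, c_force, c_restore, dt):
    """One explicit Euler step of a generic force-restore equation.

    The tendency combines a forcing term driven by the flux into the
    ground and a term restoring the surface toward the deep temperature.
    """
    tendency = c_force * forcing_flux - c_restore * (t_surface - t_deep)
    return t_surface + dt * tendency


# Hypothetical inputs, for illustration only.
storage = ohm_storage_flux(q_star=400.0, dq_star_dt=50.0, a1=0.5, a2=0.3, a3=-30.0)
relaxed = force_restore_step(
    t_surface=300.0, t_deep=290.0, forcing_flux=0.0,
    c_force=1e-5, c_restore=1e-4, dt=60.0,
)
```

With zero forcing the surface temperature in this sketch simply relaxes toward the deep temperature, which is the "restore" half of the method.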
Insight-HXMT observations of Swift J0243.6+6124 during its 2017-2018 outburst
The recently discovered neutron star transient Swift J0243.6+6124 has been
monitored by the Hard X-ray Modulation Telescope (Insight-HXMT).
Based on the obtained data, we investigate the broadband spectrum of the source
throughout the outburst. We estimate the broadband flux of the source and
search for a possible cyclotron line in the broadband spectrum. No evidence of
line-like features is, however, found up to . In the absence of
any cyclotron line in its energy spectrum, we estimate the magnetic field of
the source based on the observed spin evolution of the neutron star by applying
two accretion torque models. In both cases, we get consistent results with
, and peak luminosity of which makes the source the first Galactic ultraluminous
X-ray source hosting a neutron star. Comment: published
Overview to the Hard X-ray Modulation Telescope (Insight-HXMT) Satellite
As China's first X-ray astronomical satellite, the Hard X-ray Modulation
Telescope (HXMT), which was dubbed as Insight-HXMT after the launch on June 15,
2017, is a wide-band (1-250 keV) slat-collimator-based X-ray astronomy
satellite with the capability of all-sky monitoring in 0.2-3 MeV. It was
designed to perform pointing, scanning and gamma-ray burst (GRB) observations
and, based on the Direct Demodulation Method (DDM), the image of the scanned
sky region can be reconstructed. Here we give an overview of the mission and
its progress, including payload, core sciences, ground calibration/facility,
ground segment, data archive, software, in-orbit performance, calibration,
background model, observations and some preliminary results. Comment: 29 pages, 40 figures, 6 tables, to appear in Sci. China-Phys. Mech.
Astron. arXiv admin note: text overlap with arXiv:1910.0443
Seismic Response Analysis of Steel–Concrete Composite Frame Structures with URSP Connectors
The uplift-restricted and slip-permitted (URSP) connector is a new type of connector used in steel–concrete composite structures that has been proven to improve the structural performance of negative moment regions. Since this connector changes the interface restraint between the slab and steel beam, there is an imperative to study the seismic performance of steel–concrete composite frame systems with this new type of connector. In this study, the dynamic behavior of composite frame structures with URSP connectors under seismic loads was numerically investigated. First, a beam–shell mixed model was used and complex interfaces of different connectors were considered while establishing a numerical model to conduct elasto–plastic time history analysis under various seismic loads. This numerical model was validated with the frame sub-assemblage experimental results of quasi-static cyclic tests. Second, the model analysis results of structures with URSP connectors were obtained and compared with those of traditional structures. Third, dynamic response results including roof displacement, inter-story displacement, and the distribution and failure modes of plastic hinges were analyzed and compared. The comparisons indicated that the arrangement of full-span URSP connectors had a non-negligible influence on the dynamic behavior of the systems. The arrangement increased the maximum inter-story displacement by 31.5% and induced adverse effects in certain cases, which is not suggested in the application of URSP connectors. The partial arrangement of URSP connectors had little influence on the dynamic behavior of the systems, and the frame systems still showed a good seismic performance, which was the same as the traditional composite structural system. These findings may promote the application of URSP connectors in composite structures
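Inter-story displacement, one of the response quantities compared in the abstract above, can be computed from floor displacements at a single time instant as sketched below. The function name, the input layout (floors listed from the bottom up, ground displacement taken as zero), and the example numbers are assumptions for illustration and are not taken from the study.

```python
def inter_story_drift_ratios(floor_displacements, story_heights):
    """Drift ratio of each story: relative lateral displacement / story height.

    floor_displacements lists the lateral displacement of each floor from
    the bottom up; the ground is assumed fixed (zero displacement).
    """
    if len(floor_displacements) != len(story_heights):
        raise ValueError("need one height per story")
    ratios = []
    lower = 0.0
    for displacement, height in zip(floor_displacements, story_heights):
        ratios.append((displacement - lower) / height)
        lower = displacement
    return ratios


# Hypothetical three-story frame: displacements and heights in metres.
ratios = inter_story_drift_ratios([0.01, 0.03, 0.04], [4.0, 3.0, 3.0])
max_ratio = max(abs(r) for r in ratios)
```

Applying this at every time step of a time-history analysis and taking the maximum gives the peak inter-story drift for each story.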
Analysis of Walking Accessibility of Park Green Space in Weidu District of Xuchang City Based on GIS
Taking the park green space in Weidu District of Xuchang City as the research object, this study uses the grid analysis function of a geographic information system (GIS) to analyze the spatial service area of park green space for pedestrian travel. The results show that, in the study area, fewer than 1/10 of residents can walk to a park within 10 minutes, and nearly 1/5 can reach the services of urban park green space within 20 minutes. Within a 30-minute walk, the accessible area of park green space is only 36.63%, concentrated mainly in the new urban area, which has many large urban parks. The old urban area has few urban parks and relatively poor accessibility. The results can provide a theoretical basis for optimizing the spatial structure of green space in Weidu District of Xuchang City.
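A grid-based walking accessibility analysis of the kind described in the abstract above can be sketched as a multi-source breadth-first search outward from park cells. The grid encoding, the 4-connected movement, the uniform walking speed, and the function names are assumptions for illustration. The sketch also reports the share of grid cells reached, not the share of residents, since no population data is assumed.

```python
from collections import deque


def walking_minutes(grid, cell_size_m, speed_m_per_min):
    """Walking time from every cell to its nearest park cell.

    grid is a list of equal-length strings: 'P' park, '.' walkable,
    '#' blocked. Unreachable and blocked cells are left as None.
    """
    rows, cols = len(grid), len(grid[0])
    minutes = [[None] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == "P":
                minutes[r][c] = 0.0
                queue.append((r, c))
    step = cell_size_m / speed_m_per_min  # minutes to cross one cell
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != "#" and minutes[nr][nc] is None):
                minutes[nr][nc] = minutes[r][c] + step
                queue.append((nr, nc))
    return minutes


def share_within(minutes, grid, limit_min):
    """Fraction of non-blocked cells within limit_min of a park."""
    walkable = [(r, c) for r, row in enumerate(grid)
                for c, ch in enumerate(row) if ch != "#"]
    reached = [1 for r, c in walkable
               if minutes[r][c] is not None and minutes[r][c] <= limit_min]
    return len(reached) / len(walkable)


# Hypothetical 3 x 4 study area: 100 m cells, 100 m/min walking speed.
toy_grid = ["P...", ".#..", "...."]
times = walking_minutes(toy_grid, cell_size_m=100.0, speed_m_per_min=100.0)
share_2min = share_within(times, toy_grid, limit_min=2.0)
```

Repeating `share_within` for 10, 20, and 30 minute limits would give service-area figures comparable in kind to those reported, subject to the simplifications noted.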
Influence of different seismic motion input modes on the performance of isolated structures with different seismic measures
To determine how different seismic motion input modes affect the performance of isolated structures with different seismic measures, two aspects are studied: near-fault seismic motion with velocity pulse input, and seismic motion input of different dimensions. Finite element models of traditional seismic and base-isolated frame structures with different aspect ratios are established. Actual near-fault strong earthquake records with forward directivity effects and slip velocity pulses are used as the seismic input for nonlinear dynamic analysis. With the dimension of the seismic motion input held fixed and the tensile–compression stiffness ratio as the variable, time-history analysis of the isolation performance of a high-rise isolated structure is carried out. The results show that, for structures with an aspect ratio H/B of 1, 2, 3, and 4, the smaller the aspect ratio, the better the damping effect. The dimension of the vibration input has less influence on the isolation performance of the isolation bearings; ordered from smallest to largest, it is: one-dimensional vibration input, two-dimensional vibration input, three-dimensional vibration input
Differential Function of Endogenous and Exogenous Abscisic Acid during Bacterial Pattern-Induced Production of Reactive Oxygen Species in Arabidopsis
Abscisic acid (ABA) plays important roles in positively or negatively regulating plant disease resistance to pathogens. Here, we reassess the role of endogenous and exogenous ABA by using: 35S::ABA2, a previously reported transgenic Arabidopsis line with increased endogenous ABA levels; aba2-1, a previously reported ABA2 mutant with reduced endogenous ABA levels; and exogenous application of ABA. We found that bacterial susceptibility promoted by exogenous ABA was suppressed in 35S::ABA2 plants. The 35S::ABA2 and aba2-1 plants displayed elevated and reduced levels, respectively, of bacterial flagellin peptide (flg22)-induced H2O2. Surprisingly, ABA pre-treatment reduced flg22-induced H2O2 generation. Exogenous, but not endogenous ABA, increased catalase activity. Loss of nicotinamide adenine dinucleotide phosphate oxidase genes, RBOHD and RBOHF, restored exogenous ABA-promoted bacterial susceptibility of 35S::ABA2 transgenic plants. In addition, endogenous and exogenous ABA had similar effects on callose deposition and salicylic acid (SA) signaling. These results reveal an underlying difference between endogenous and exogenous ABA in regulating plant defense responses. Given that some plant pathogens are able to synthesize ABA and affect endogenous ABA levels in plants, our results highlight the importance of reactive oxygen species in the dual function of ABA during plant-pathogen interactions