39 research outputs found

    KAT: A Knowledge Augmented Transformer for Vision-and-Language

    The primary focus of recent work with large-scale transformers has been on optimizing the amount of information packed into the model's parameters. In this work, we ask a different question: Can multimodal transformers leverage explicit knowledge in their reasoning? Existing, primarily unimodal, methods have explored approaches under the paradigm of knowledge retrieval followed by answer prediction, but leave open questions about the quality and relevance of the retrieved knowledge used, and how the reasoning processes over implicit and explicit knowledge should be integrated. To address these challenges, we propose a novel model, Knowledge Augmented Transformer (KAT), which achieves a strong state-of-the-art result (+6 points absolute) on the open-domain multimodal task of OK-VQA. Our approach integrates implicit and explicit knowledge in an end-to-end encoder-decoder architecture, while still jointly reasoning over both knowledge sources during answer generation. An additional benefit of explicit knowledge integration is seen in improved interpretability of model predictions in our analysis. Comment: Accepted by NAACL 202

    Lateral and Longitudinal Coordinated Control of Intelligent Vehicle Based on High-Precision Dynamics Model under High-Speed Limit Condition

    This study focuses on improving the trajectory tracking control for intelligent vehicles in high-speed and large-curvature limit conditions. To this end, a high-precision five-degree-of-freedom (5-DOF) dynamics model (HPM) that incorporates suspension characteristics is introduced. Furthermore, a coordinated lateral and longitudinal control system is developed. The lateral model predictive control (MPC) involves two crucial stages: initially, a desired trajectory with associated speed data is generated based on path curvature. Subsequently, using the high-precision 5-DOF dynamics model, an objective function is formulated to minimize the difference between the vehicle’s current state and the desired state. This process determines the optimal front wheel steering angle, taking into account vehicle positional constraints and steering limitations. Additionally, a double proportional–integral–derivative (PID) controller for longitudinal control adjusts the throttle and brake pressure based on real-time position and speed data, ensuring integrated control over both lateral and longitudinal movements. The effectiveness of this approach is confirmed through real vehicle testing and simulation. Results show that the high-precision 5-DOF dynamics model markedly enhances the accuracy of vehicle response modeling, and the coordinated control system successfully executes precise trajectory tracking. In extreme scenarios of high-speed and large curvature, the enhanced model substantially improves trajectory accuracy and driving stability, thus promoting safe vehicle operation
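
    The abstract above does not say how the double PID controller is structured. One common reading is a cascade, sketched below under that assumption: an outer loop turns position error into a speed correction, and an inner loop turns speed error into a throttle or brake command. The class, function names, gains, and output ranges are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller with a clamped output (illustrative)."""
    kp: float
    ki: float
    kd: float
    out_min: float
    out_max: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        # Accumulate the integral term and estimate the derivative
        # from the previous error sample.
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        raw = self.kp * error + self.ki * self.integral + self.kd * derivative
        # Clamp the output to the allowed range.
        return max(self.out_min, min(self.out_max, raw))


def longitudinal_command(pos_error, ref_speed, speed, outer, inner, dt):
    """Return (throttle, brake), each in [0, 1].

    Assumed cascade, not confirmed by the abstract:
    outer loop: position error (m) -> speed correction (m/s);
    inner loop: speed error (m/s) -> signed effort in [-1, 1].
    """
    speed_target = ref_speed + outer.step(pos_error, dt)
    effort = inner.step(speed_target - speed, dt)
    if effort >= 0.0:
        return effort, 0.0   # positive effort: throttle only
    return 0.0, -effort      # negative effort: brake only


# Hypothetical gains, chosen only for illustration.
outer = PID(kp=0.5, ki=0.0, kd=0.0, out_min=-2.0, out_max=2.0)
inner = PID(kp=0.2, ki=0.05, kd=0.0, out_min=-1.0, out_max=1.0)

throttle, brake = longitudinal_command(
    pos_error=1.0, ref_speed=20.0, speed=18.0, outer=outer, inner=inner, dt=0.05
)
print(throttle, brake)
```

    Splitting the signed effort into separate throttle and brake channels keeps the two actuators from being commanded at once. A real controller would also need anti-windup and actuator-specific scaling, which this sketch omits.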

    Evaluation of LAI Estimation of Mangrove Communities Using DLR and ELR Algorithms With UAV, Hyperspectral, and SAR Images

    The high-precision estimation of mangrove leaf area index (LAI) using a deep learning regression (DLR) algorithm requires a large amount of training sample data. However, it is difficult to collect a sufficient number of LAI field measurements in mangrove wetlands. To tackle this challenge, this paper proposed an approach for expanding training samples and quantitatively evaluated the performance of Deep Neural Network (DNN) and Transformer algorithms in estimating LAI for mangrove communities. This study also explored the effects of unmanned aerial vehicle (UAV) and Sentinel-2A multispectral, orbital hyperspectral (OHS), and GF-3 SAR images on LAI estimation of different mangrove communities. Finally, this paper evaluated the LAI estimation ability of ensemble learning regression (ELR) and DLR algorithms for mangrove communities. The results showed that: (1) The UAV images achieved better LAI estimation of different mangrove communities (R² = 0.5974–0.6186), and GF-3 SAR images were better for LAI estimation of Avicennia marina with high coverage (R² = 0.567). The optimal spectral range for estimating mangrove LAI in the optical images was 650–680 nm. (2) The ELR model outperformed the single base models and produced high-accuracy LAI estimation (R² = 0.5266–0.713) for different mangrove communities. (3) The average accuracy (R²) of the ELR model was higher than that of the DLR models by 0.0019–0.149, which demonstrated that the ELR model had a better capability (R² = 0.5865–0.6416) in LAI estimation. The Transformer-based LAI estimation of A. marina (R² = 0.6355) was better than that of the DNN model, while the DNN model produced higher accuracy for Kandelia candel (KC) (R² = 0.5577). (4) As the expansion ratio of the training sample increased (10–50%), the LAI estimation accuracy (R²) of the DNN and Transformer models for different mangrove communities increased by 0.1166–0.2037 and 0.1037–0.1644, respectively. For the same estimation accuracy, the sample enhancement method in this paper could reduce the number of field measurements by 20–40%

    SSIEGNOS: A New Asian Single Site Tropospheric Correction Model

    This paper proposes a new Asian single site tropospheric correction model called the Single Site Improved European Geostationary Navigation Overlay Service model (SSIEGNOS) by refining the European Geostationary Navigation Overlay Service (EGNOS) model at a single site. The performance of the SSIEGNOS model is analyzed. The results show that (1) the bias and root mean square (RMS) error of zenith tropospheric delay (ZTD) calculated from the EGNOS model are 0.12 cm and 5.87 cm, respectively; whereas those of the SSIEGNOS model are 0 cm and 2.52 cm, respectively. (2) The bias and RMS error show seasonal variation in the EGNOS model; however, little seasonal variation is observed in the SSIEGNOS model. (3) The RMS error decreases with increasing altitude or latitude in the two models; however, no such relationships were found in the bias. In addition, the annual predicted bias and RMS error in Asia are −0.08 cm and 3.14 cm for the SSIEGNOS model, respectively; however, the EGNOS and UNB3m (University of New Brunswick) models show comparable predicted results. Relative to the EGNOS model, the annual predicted bias and RMS error decreased by 55% and 48%, respectively, for the SSIEGNOS model
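
    The bias and RMS error quoted above are standard difference statistics. A minimal sketch of how they are usually computed follows, assuming the differences are taken as model ZTD minus a reference ZTD; the function names and sample values are made up for illustration and are not data from the paper.

```python
import math


def bias_and_rms(model_ztd, reference_ztd):
    """Return (bias, rms) of model-minus-reference differences, in input units."""
    if len(model_ztd) != len(reference_ztd) or not model_ztd:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [m - r for m, r in zip(model_ztd, reference_ztd)]
    bias = sum(diffs) / len(diffs)                            # mean difference
    rms = math.sqrt(sum(d * d for d in diffs) / len(diffs))   # root mean square
    return bias, rms


def relative_reduction(baseline, improved):
    """Percentage by which `improved` is smaller than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Made-up ZTD values in cm, for illustration only.
model = [240.0, 243.0, 238.0, 245.0]
reference = [241.0, 240.0, 240.0, 243.0]

bias, rms = bias_and_rms(model, reference)
print(f"bias = {bias:.2f} cm, RMS = {rms:.2f} cm")
```

    Applied to the single-site RMS errors quoted in the abstract (5.87 cm and 2.52 cm), `relative_reduction` gives about 57%. The 55% and 48% figures in the abstract refer to the annual predicted statistics, whose EGNOS baselines are not listed here.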

    A zenith tropospheric delay correction model based on the regional CORS network

    Tropospheric delay is a primary error source in earth observations and a variety of radio navigation technologies. In this paper, the relationship between zenith tropospheric delays and the elevation and longitude of stations is analyzed using the zenith tropospheric delay final products of International GNSS Service (IGS) stations from 2011. Two new models are proposed for estimating zenith tropospheric delays from regional CORS data without meteorological data. The proposed models are compared with the direct interpolation method and the remove-restore method using data from Guangxi CORS. The results show that the new models significantly improve the calculated precision. Finally, the root mean square (RMS) errors of the new models were used to estimate the surface precipitable water vapor (PWV) value at the CORS stations, which was determined to be less than 2 mm

    A New Vegetation Observable Derived from Spaceborne GNSS-R and Its Application to Vegetation Water Content Retrieval

    In this study, a new vegetation observable derived from spaceborne Global Navigation Satellite System-Reflectometry (GNSS-R) was developed. Firstly, a linear relationship between the Cyclone Global Navigation Satellite System (CYGNSS) reflectivity and soil moisture was derived based on the tau-omega (τ−ω) model. The intercept and slope of this linear function were associated with the vegetation properties. Moreover, the intercept is not affected by soil moisture and depends only on vegetation properties. Secondly, in validating the new observable, the intercept showed a significant correlation with vegetation water content (VWC), with a highest correlation coefficient of 0.742. Based on the intercept and slope, a linear model and an artificial neural network (ANN) model were established to retrieve VWC by combining geographical location and land cover information. The correlation coefficient and root-mean-square error (RMSE) of VWC retrieval based on the linear model were 0.795 and 2.155 kg/m², respectively. The correlation coefficient and RMSE for the ANN model were 0.940 and 1.392 kg/m², respectively. Compared with the linear model, the ANN model greatly improves the accuracy of global VWC retrieval, especially in areas with poor linear model retrieval results. Therefore, compared with conventional remote sensing techniques, spaceborne GNSS-R can provide a new and effective approach to global VWC monitoring
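
    The intercept and slope described above are the two parameters of a straight-line fit. As a minimal sketch, assuming reflectivity is regressed on soil moisture with ordinary least squares (the abstract does not state the fitting method), they could be estimated as follows. The function name, units, and sample values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired samples")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0.0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up samples for one grid cell: soil moisture (m3/m3) and
# reflectivity (dB). Units and values are assumptions for illustration.
soil_moisture = [0.10, 0.20, 0.30, 0.40]
reflectivity_db = [-18.0, -16.0, -14.0, -12.0]

intercept, slope = fit_line(soil_moisture, reflectivity_db)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

    In this reading, the per-cell intercept would serve as the vegetation observable, with the slope as a second vegetation-related feature.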

    Training Vision-Language Transformers from Captions Alone

    We show that Vision-Language Transformers can be learned without human labels (e.g., class labels or bounding boxes). Existing work, whether explicitly utilizing bounding boxes or patches, assumes that the visual backbone must first be trained on ImageNet class prediction before being integrated into a multimodal linguistic pipeline. We show that this is not necessary and introduce a new model, Vision-Language from Captions (VLC), built on top of Masked Auto-Encoders, that does not require this supervision. In fact, in a head-to-head comparison between ViLT, the current state-of-the-art patch-based vision-language transformer which is pretrained with supervised object classification, and our model, VLC, we find that our approach (1) outperforms ViLT on standard benchmarks, (2) provides more interpretable and intuitive patch visualizations, and (3) is competitive with many larger models that utilize ROIs trained on annotated bounding boxes

    Spatiotemporal Analysis of Regional Ionospheric TEC Prediction Using Multi-Factor NeuralProphet Model under Disturbed Conditions

    The ionospheric total electron content (TEC) is susceptible to factors such as solar and geomagnetic activities, which enhance its non-stationary and nonlinear characteristics and aggravate the impact on radio communications. In this study, based on the NeuralProphet hybrid prediction framework, a regional ionospheric TEC prediction model considering multiple factors (multi-factor NeuralProphet model, MF-NPM) was constructed by taking solar activity index, geomagnetic activity index, geographic coordinates, and IGS GIM data as input parameters. Data from 2009 to 2013 were used to train the model to forecast regional ionospheric TEC at different latitudes during the solar maximum phase (2014) and geomagnetic storms by sliding 1 day. To verify the prediction performance of the MF-NPM, a multi-factor long short-term memory neural network (LSTMNN) model was also constructed for comparative analysis. At the same time, the TEC prediction results of the two models were compared with the IGS GIM and CODE 1-day predicted GIM products (COPG_P1). The results show that the MF-NPM achieves good prediction performance. In the solar maximum phase (2014), the RMSE and relative accuracy (RA) of the MF-NPM are 2.33 TECU and 93.75%, respectively, which are 0.77 and 1.87 TECU and 1.91% and 6.68% better than those of LSTMNN and COPG_P1. During the geomagnetic storm, the RMSE and RA of TEC prediction results based on the MF-NPM are 3.12 TECU and 92.86%, respectively, which are improved by 1.25 and 2.30 TECU and 2.38% and 7.24% compared with LSTMNN and COPG_P1. Furthermore, the MF-NPM also achieves better performance in low–mid latitudes

    Ingestion of GNSS-Derived ZTD and PWV for Spatial Interpolation of PM2.5 Concentration in Central and Southern China

    With the increasing application of global navigation satellite system (GNSS) technology in the field of meteorology, satellite-derived zenith tropospheric delay (ZTD) and precipitable water vapor (PWV) data have been used to explore the spatial coverage pattern of PM2.5 concentrations. In this study, PM2.5 concentration data obtained from 340 PM2.5 ground stations in south-central China were used to analyze the spatial and temporal variation patterns of PM2.5 in the region from the perspectives of time series variations and spatial distribution characteristics, and six types of PM2.5 interpolation models were established. (1) Correlation analysis, exploratory regression, and geographical detector methods showed that the GNSS-derived PWV and ZTD were negatively correlated with PM2.5, and that their significances and contributions to the spatial analysis were good. (2) Three types of suitable variable combinations were selected for modeling through a collinearity diagnosis, and six types of models (geographically weighted regression (GWR), geographically weighted regression kriging (GWRK), geographically weighted regression-empirical Bayesian kriging (GWR-EBK), multiscale geographically weighted regression (MGWR), multiscale geographically weighted regression kriging (MGWRK), and multiscale geographically weighted regression-empirical Bayesian kriging (MGWR-EBK)) were constructed. The overall R² of the GWR-EBK model construction was the best (annual: 0.962, winter: 0.966, spring: 0.926, summer: 0.873, and autumn: 0.908), and the interpolation accuracy of the GWR-EBK model constructed by inputting ZTD was the best overall, with an average RMSE of 3.22 μg/m³ recorded, while the GWR-EBK model constructed by inputting PWV had the highest interpolation accuracy in winter, with an RMSE of 4.5 μg/m³ recorded; these values were 2.17% and 4.26% higher than the RMSE values of the other two types of models (ZTD and temperature) in winter, respectively. (3) Introducing the empirical Bayesian kriging method to interpolate the residuals of the models (GWR and MGWR) and then correct their original interpolation results was the most effective, and the accuracy improvement percentage was better than that of the ordinary kriging method. The average improvement ratios of the GWRK and GWR-EBK models compared with the GWR model were 5.04% and 14.74%, respectively, and the average improvement ratios of the MGWRK and MGWR-EBK models compared with the MGWR model were 2.79% and 12.66%, respectively. (4) Elevation intervals and provinces were classified, and the influence of the elevation and the spatial distribution of the plane on the accuracy of the PM2.5 regional model was discussed. The experiments showed that the accuracy of the constructed regional model decreased as the elevation increased. The accuracies of the models in representing Henan, Hubei and Hunan provinces were lower than those of the models in representing Guangdong and Guangxi provinces