25 research outputs found

    A Pre-trained Data Deduplication Model based on Active Learning

    In the era of big data, the issue of data quality has become increasingly prominent. One of the main challenges is duplicate data, which can arise from repeated entry or the merging of multiple data sources. These "dirty data" problems can significantly limit the effective application of big data. To address data deduplication, we propose a pre-trained deduplication model based on active learning, which is the first work to use active learning for deduplication at the semantic level. The model is built on a pre-trained Transformer and fine-tuned to solve deduplication as a sequence-to-classification task. It is the first to integrate the Transformer with active learning in an end-to-end architecture that selects the most valuable data for training the deduplication model, and the first to employ the R-Drop method for data augmentation on each round of labeled data, which reduces the cost of manual labeling and improves the model's performance. Experimental results demonstrate that our proposed model outperforms the previous state of the art (SOTA) for deduplicated data identification, achieving up to a 28% improvement in Recall score on benchmark datasets.
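The selection step in active learning can be illustrated with a minimal sketch. The abstract does not say which acquisition criterion the model uses, so entropy-based uncertainty sampling is an assumption here, and the function names and probabilities are hypothetical. The idea is to send the record pairs whose predicted duplicate probability is closest to 0.5 to a human annotator first.

```python
import math


def binary_entropy(p):
    """Entropy (in nats) of a prediction with duplicate-probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def select_for_labeling(probabilities, budget):
    """Return the indices of the `budget` most uncertain predictions."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: binary_entropy(probabilities[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical duplicate probabilities for five unlabeled record pairs.
predicted = [0.98, 0.52, 0.10, 0.45, 0.80]
chosen = select_for_labeling(predicted, budget=2)
print(chosen)  # indices of the pairs to label in this round
```

In a full loop, the chosen pairs would be labeled, added to the training set, and the classifier fine-tuned again before the next round.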

    Efficacy of Difenoconazole Emulsifiable Concentrate with Ionic Liquids against Cucumbers Powdery Mildew

    Among eight ionic liquids (ILs) examined, 1-n-butyl-4-methyl-pyridinium bromide (BMPyBr, 5) was selected in this study as an appropriate alternative to benzene homologs and derivatives in a 10 wt% water-insoluble difenoconazole emulsifiable concentrate (EC). Moreover, 10 wt% difenoconazole EC with BMPyBr (5) exhibited the same efficacy as 10 wt% difenoconazole wettable powder (WP) against powdery mildew on cucumbers under field conditions. High-performance liquid chromatography (HPLC) showed that difenoconazole EC with BMPyBr (5) had excellent stability at 268 K and 327 K after 14 days. Therefore, ILs can be considered promising environment-friendly adjuvants for pesticides that are commercially processed as EC formulations.

    WFTNet: Exploiting Global and Local Periodicity in Long-term Time Series Forecasting

    Recent CNN- and Transformer-based models have tried to use frequency and periodicity information for long-term time series forecasting. However, most existing work is based on the Fourier transform, which cannot capture fine-grained and local frequency structure. In this paper, we propose a Wavelet-Fourier Transform Network (WFTNet) for long-term time series forecasting. WFTNet uses both Fourier and wavelet transforms to extract comprehensive temporal-frequency information from the signal, where the Fourier transform captures the global periodic patterns and the wavelet transform captures the local ones. Furthermore, we introduce a Periodicity-Weighted Coefficient (PWC) to adaptively balance the importance of global and local frequency patterns. Extensive experiments on various time series datasets show that WFTNet consistently outperforms other state-of-the-art baselines.
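To make the "global periodic pattern" idea concrete, the sketch below finds the dominant period of a series from its discrete Fourier transform. It is not the WFTNet implementation. It covers only the Fourier (global) side, omits the wavelet branch and the PWC, and uses a hypothetical function name and a synthetic series.

```python
import cmath
import math


def dominant_period(signal):
    """Return the period (in samples) of the strongest DFT component, or None."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the constant offset
    best_k, best_amp = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        if abs(coeff) > best_amp:
            best_k, best_amp = k, abs(coeff)
    if best_k == 0:
        return None  # constant series: no periodic component found
    return n / best_k


# Synthetic series: a strong 12-step cycle plus a weaker 4-step cycle.
series = [
    math.sin(2 * math.pi * t / 12) + 0.3 * math.sin(2 * math.pi * t / 4)
    for t in range(96)
]
period = dominant_period(series)
print(period)
```

A single global spectrum like this cannot say where in time a short-lived cycle occurs, which is the gap the abstract assigns to the wavelet transform.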

    Bitter gourd has the highest azoxystrobinon residue after open field application on four cucurbit vegetables.

    The goal of this study was to select a representative cucurbit vegetable crop that contained the highest residue levels of the pesticide azoxystrobinon. To do this, we used open field application of azoxystrobinon in four cucurbit crops (cucumber, zucchini, bitter gourd, and loofah) in Beijing, Shandong, and Anhui. Liquid chromatography-mass spectrometry/mass spectrometry (LC-MS/MS) with selected reaction monitoring was used to determine azoxystrobinon levels in each of the selected cucurbit vegetables. The azoxystrobinon limit of detection was 0.005 mg/kg for all samples. Recoveries of azoxystrobinon ranged from 94.2% to 107.1% at spiked levels of 0.005-0.5 mg/kg. In field trials, the half-life of azoxystrobinon in each of the four cucurbit crops was within the range of 1.4-3.1 d. Based on these results, we recommend that bitter gourd be selected as a representative cucurbit vegetable for future studies of azoxystrobinon. The obtained residue data were also assessed for dietary risk, and the results indicated that there is no chronic dietary risk for any of the four selected cucurbit vegetables. The recommended maximum residue limit (MRL) of azoxystrobinon in this subgroup was 0.2 mg/kg.
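The reported half-life range can be turned into a rough dissipation estimate. The abstract does not state the kinetic model, so the sketch below assumes first-order decay, C(t) = C0 * exp(-k t) with k = ln 2 / half-life. The function names are hypothetical and the output is illustrative, not data from the study.

```python
import math


def rate_constant(half_life_days):
    """First-order rate constant k (per day) for a given half-life."""
    return math.log(2) / half_life_days


def fraction_remaining(half_life_days, days):
    """Fraction of the initial residue left after `days` (assumed first-order)."""
    return math.exp(-rate_constant(half_life_days) * days)


# The two ends of the half-life range reported in the abstract (1.4-3.1 d).
for half_life in (1.4, 3.1):
    left = fraction_remaining(half_life, days=7)
    print(f"half-life {half_life} d: {left:.1%} of the residue left after 7 d")
```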

    TRRNet: tiered relation reasoning for compositional visual question answering

    Compositional visual question answering requires reasoning over both semantic and geometric object relations. We propose a novel tiered reasoning method that dynamically selects object-level candidates based on language representations and generates robust pairwise relations within the selected candidate objects. The proposed tiered relation reasoning method is compatible with the majority of existing visual reasoning frameworks, leading to significant performance improvement with very little extra computational cost. Moreover, we propose a policy network that decides the appropriate reasoning steps based on question complexity and current reasoning status. In experiments, our model achieves state-of-the-art performance on two VQA datasets. AI Singapore; Ministry of Education (MOE); National Research Foundation (NRF). Accepted version. This research was supported by the National Research Foundation Singapore under its AI Singapore Programme (Award Number: AISG-RP-2018-003) and the MOE Tier-1 research grants: RG28/18 (S) and RG22/19 (S). F. Lv's participation is supported by the National Natural Science Foundation of China (No. 11829101 and 11931014). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of the National Research Foundation, Singapore.

    Weakly-supervised cross-domain road scene segmentation via multi-level curriculum adaptation

    Semantic segmentation, which aims to acquire pixel-level understanding of images, is among the key components of computer vision. Training a good segmentation model for real-world images usually requires a huge amount of time and labor to obtain sufficient pixel-level annotations beforehand. To avoid this nontrivial burden, one can use simulators to automatically generate synthetic images that inherently contain full pixel-level annotations and use them to train a segmentation model for real-world images. However, training with synthetic images usually does not lead to good performance because of the domain difference between the synthetic images (i.e., the source domain) and the real-world images (i.e., the target domain). To deal with this issue, a number of unsupervised domain adaptation (UDA) approaches have been proposed, in which no labeled real-world images are available. Unlike those methods, in this work we make a pioneering attempt to use easy-to-collect image-level annotations for target images to improve the performance of cross-domain segmentation. Specifically, we leverage those image-level annotations to construct curriculums for the domain adaptation problem. The curriculums describe multi-level properties of the target domain, including label distributions over full images, local regions and single pixels. Since image annotations are 'weak' labels compared to pixel annotations for segmentation, we call this new problem weakly-supervised cross-domain segmentation. Comprehensive experiments on the GTA5 -> Cityscapes and SYNTHIA -> Cityscapes settings demonstrate the effectiveness of our method over the existing state-of-the-art baselines. This work was supported in part by the Major Project for New Generation of AI under Grant 2018AAA0100400; in part by the National Natural Science Foundation of China under Grant 11829101, Grant 11931014, and Grant 61772118; and in part by the Fundamental Research Funds for the Central Universities of China under Grant JBK1806002.