29 research outputs found

    ProMix: Combating Label Noise via Maximizing Clean Sample Utility

    Full text link
    The ability to train deep neural networks under label noise is appealing, as imperfectly annotated data are relatively cheap to obtain. State-of-the-art approaches are based on semi-supervised learning (SSL), which selects small-loss examples as clean and then applies SSL techniques for boosted performance. However, the selection step mostly yields a medium-sized, decent-enough clean subset, overlooking a rich set of additional clean samples. In this work, we propose ProMix, a novel noisy-label learning framework that attempts to maximize the utility of clean samples for boosted performance. Key to our method is a matched high-confidence selection technique that selects examples whose predictions are both highly confident and matched with their given labels. Combined with small-loss selection, our method achieves a precision of 99.27 and a recall of 98.22 in detecting clean samples on the CIFAR-10N dataset. Based on such a large set of clean data, ProMix improves the best baseline method by +2.67% on CIFAR-10N and +1.61% on CIFAR-100N. The code and data are available at https://github.com/Justherozen/ProMix. Comment: Winner of the 1st Learning and Mining with Noisy Labels Challenge in IJCAI-ECAI 2022 (an informal technical report).
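    As a rough illustration of the selection logic described above, the following sketch (assuming PyTorch; the threshold values and the exact combination rule are illustrative assumptions, not ProMix's published procedure) unions matched high-confidence selection with small-loss selection:

    import torch

    def select_clean(probs: torch.Tensor, labels: torch.Tensor, losses: torch.Tensor,
                     conf_thresh: float = 0.95, small_loss_frac: float = 0.5) -> torch.Tensor:
        # probs:  (N, C) softmax outputs; labels: (N,) given (possibly noisy) labels;
        # losses: (N,) per-sample cross-entropy. Returns a boolean clean-sample mask.
        conf, pred = probs.max(dim=1)
        # Matched high-confidence rule: prediction agrees with the given label
        # and the model is confident about it.
        matched_high_conf = (pred == labels) & (conf >= conf_thresh)
        # Classic small-loss rule: keep the fraction of samples with the lowest loss.
        k = max(1, int(small_loss_frac * losses.numel()))
        small_loss = torch.zeros_like(matched_high_conf)
        small_loss[losses.topk(k, largest=False).indices] = True
        # Taking the union is what lets the clean set grow beyond the small-loss subset.
        return matched_high_conf | small_loss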

    GaitFormer: Revisiting Intrinsic Periodicity for Gait Recognition

    Full text link
    Gait recognition aims to distinguish different walking patterns by analyzing video-level human silhouettes, rather than relying on appearance information. Previous research on gait recognition has primarily focused on extracting local or global spatial-temporal representations, while overlooking the intrinsic periodic features of gait sequences, which, when fully utilized, can significantly enhance performance. In this work, we propose a plug-and-play strategy, called Temporal Periodic Alignment (TPA), which leverages the periodic nature and fine-grained temporal dependencies of gait patterns. The TPA strategy comprises two key components. The first is Adaptive Fourier-transform Position Encoding (AFPE), which adaptively converts features and discrete-time signals into embeddings that are sensitive to periodic walking patterns. The second is the Temporal Aggregation Module (TAM), which separates embeddings into trend and seasonal components and extracts meaningful temporal correlations to identify primary components while filtering out random noise. We present a simple and effective baseline method for gait recognition based on the TPA strategy. Extensive experiments conducted on three popular public datasets (CASIA-B, OU-MVLP, and GREW) demonstrate that our proposed method achieves state-of-the-art performance on multiple benchmarks.
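    To make the trend/seasonal split performed by TAM concrete, here is a minimal sketch assuming a simple moving-average decomposition over the temporal axis (an assumption for illustration; the paper's actual AFPE and TAM operators are not reproduced here):

    import torch
    import torch.nn.functional as F

    def decompose(seq: torch.Tensor, kernel: int = 5):
        # seq: (batch, time, channels) gait embeddings.
        # Trend = moving average over time; seasonal = residual, which is where
        # the periodic walking cues that TPA exploits would live.
        x = seq.transpose(1, 2)                                  # (B, C, T) for 1-D pooling
        pad = kernel // 2
        x = F.pad(x, (pad, pad), mode="replicate")
        trend = F.avg_pool1d(x, kernel_size=kernel, stride=1).transpose(1, 2)
        seasonal = seq - trend
        return trend, seasonal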

    Controllable Textual Inversion for Personalized Text-to-Image Generation

    Full text link
    Recent large-scale generative modeling has attained unprecedented performance, especially in producing high-fidelity images driven by text prompts. Textual inversion (TI), alongside the text-to-image model backbones, has been proposed as an effective technique for personalizing generation when the prompts contain user-defined, unseen, or long-tail concept tokens. Despite that, we find and show that the deployment of TI remains full of "dark magics" -- to name a few, the harsh requirement of additional datasets, arduous human effort in the loop, and a lack of robustness. In this work, we propose a much-enhanced version of TI, dubbed Controllable Textual Inversion (COTI), which resolves all the aforementioned problems and in turn delivers a robust, data-efficient, and easy-to-use framework. The core of COTI is a theoretically guided loss objective instantiated with a comprehensive and novel weighted scoring mechanism, encapsulated by an active-learning paradigm. Extensive results show that COTI significantly outperforms prior TI-related approaches, with a 26.05 decrease in FID score and a 23.00% boost in R-precision. Comment: 10 pages, 6 figures, 2 tables. Project Page: https://github.com/jnzju/COT
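    The active-learning loop around the weighted scoring mechanism can be pictured roughly as follows (a generic sketch; the scoring terms, weights, and budget are placeholders, not COTI's actual implementation):

    def active_selection_round(pool, score, budget):
        # Rank every candidate image in the unlabeled pool by its weighted score
        # and move the top `budget` candidates into the TI training set.
        ranked = sorted(pool, key=score, reverse=True)
        return ranked[:budget], ranked[budget:]

    # Toy usage: hash-based scores stand in for the weighted scoring mechanism.
    pool = ["img_%03d" % i for i in range(100)]
    toy_score = lambda h: (hash(h) % 1000) / 1000.0
    training_set = []
    for _ in range(3):                       # three acquisition rounds
        chosen, pool = active_selection_round(pool, toy_score, budget=8)
        training_set.extend(chosen)          # the TI embedding would then be tuned on training_set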

    FreeAL: Towards Human-Free Active Learning in the Era of Large Language Models

    Full text link
    Collecting high-quality labeled data for model training is notoriously time-consuming and labor-intensive for various NLP tasks. While numerous solutions, such as active learning for small language models (SLMs) and prevalent in-context learning in the era of large language models (LLMs), have been proposed and alleviate the labeling burden to some extent, their performance is still subject to human intervention. How to reduce annotation cost in the LLM era remains underexplored. To bridge this gap, we revolutionize traditional active learning and propose FreeAL, an innovative collaborative learning framework that interactively distills and filters task-specific knowledge from LLMs. During collaborative training, an LLM serves as an active annotator that inculcates its coarse-grained knowledge, while a downstream SLM acts as a student that filters out high-quality in-context samples and feeds them back to the LLM for subsequent label refinement. Extensive experiments on eight benchmark datasets demonstrate that FreeAL largely enhances zero-shot performance for both the SLM and the LLM without any human supervision. The code is available at https://github.com/Justherozen/FreeAL. Comment: Accepted to EMNLP 2023 (Main Conference).
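    The collaboration between the LLM annotator and the SLM student can be sketched as the loop below (the llm_annotate and train_slm callables are hypothetical interfaces introduced for illustration, not FreeAL's code):

    from typing import Callable, List, Tuple

    def freeal_style_loop(texts: List[str],
                          llm_annotate: Callable[[List[str], List[Tuple[str, int]]], List[int]],
                          train_slm: Callable[[List[str], List[int]], Callable[[str], Tuple[int, float]]],
                          rounds: int = 3, demo_conf: float = 0.9) -> List[int]:
        demos: List[Tuple[str, int]] = []
        labels = llm_annotate(texts, demos)          # round 0: zero-shot coarse annotation
        for _ in range(rounds):
            predict = train_slm(texts, labels)       # distill coarse knowledge into the SLM
            demos = []                               # SLM filters high-confidence demonstrations
            for t in texts:
                p, c = predict(t)                    # (predicted_label, confidence)
                if c >= demo_conf:
                    demos.append((t, p))
            labels = llm_annotate(texts, demos)      # LLM refines labels with the demos in context
        return labels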

    PEO-Store: Practical and Economical Oblivious Store with Peer-to-Peer Delegation

    Get PDF
    The growing popularity of cloud storage has brought attention to the critical need to prevent information leakage through cloud access patterns. To this end, recent efforts have extended Oblivious RAM (ORAM) to the cloud environment in the form of the Oblivious Store. However, its effectiveness is limited by its impracticality: it relies on probabilistic encryption with fake accesses to obfuscate the access pattern, and the security requirements of conventional obliviousness designs prevent cloud providers from improving storage utilization by removing data that is redundant across users. We therefore propose a practical Oblivious Store, PEO-Store, which integrates the obliviousness property into the cloud while removing redundancy without compromising security. Unlike conventional schemes, PEO-Store randomly selects a delegate for each client to communicate with the cloud, breaking the link between a valid access-pattern sequence and a specific client. Each client encrypts their data and shares it with selected delegates, who act as intermediaries with the cloud provider. This design leverages non-interactive zero-knowledge-based redundancy detection, discrete-logarithm-based key sharing, and secure time-based delivery proofs to protect access-pattern privacy and to accurately identify and remove redundancy in the cloud. A theoretical proof demonstrates that the probability of linking a valid access pattern to a specific user is negligible in our design. Experimental results show that PEO-Store outperforms state-of-the-art methods, achieving up to 3 times higher average throughput and saving 74% of storage space.
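    The delegate-selection step that breaks the client-to-access-pattern link could look roughly like this (an illustrative fragment only; the zero-knowledge redundancy detection, key sharing, and delivery proofs described above are omitted):

    import secrets

    def pick_delegate(client_id: str, peers: list) -> str:
        # Draw a delegate uniformly at random from the other peers. Because the
        # delegate, not the client, talks to the cloud, the provider cannot tie
        # an observed access-pattern sequence back to a specific client.
        candidates = [p for p in peers if p != client_id]
        return secrets.choice(candidates)

    # Example: the client forwards its already-encrypted block via the chosen delegate.
    delegate = pick_delegate("client-7", ["client-1", "client-2", "client-7", "client-9"])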

    HeteroYARN: A Heterogeneous FPGA-Accelerated Architecture Based on YARN

    No full text

    A Strategy for Preparative Separation of 10 Lignans from Justicia procumbens L. by High-Speed Counter-Current Chromatography

    No full text
    Ten compounds, including three lignan glycosides and seven lignans, were purified from Justicia procumbens L. in 8 h using an efficient strategy based on high-speed counter-current chromatography (HSCCC). A two-phase solvent system composed of petroleum ether–ethyl acetate–methanol–H2O (1:0.7:1:0.7, v/v) was first employed to separate the crude extract (320 mg), from which 19.3 mg of justicidin B (f), 10.8 mg of justicidin A (g), 13.9 mg of 6′-hydroxyjusticidin C (h), 7.7 mg of justicidin E (i), and 6.3 mg of lignan J1 (j) were obtained, along with 91.3 mg of an enriched mixture of compounds a–e. The enriched mixture (91.3 mg) was further separated using a solvent system consisting of petroleum ether–ethyl acetate–methanol–H2O (3:3.8:3:3.8, v/v), yielding 12.1 mg of procumbenoside E (a), 7.6 mg of diphyllin-1-O-β-d-apiofuranoside (b), 7.4 mg of diphyllin (c), 8.3 mg of 6′-hydroxy justicidin B (d), and 7.9 mg of diphyllin acetyl apioside (e). The purities of the 10 components were all above 94%, and their structures were identified by NMR and ESI-MS spectra. The results demonstrate that the HSCCC-based strategy for the separation of lignans and their glycosides is efficient and rapid.
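    For orientation, the quoted masses add up as follows (simple arithmetic on the figures reported above, not additional data from the study):

    \[
    \underbrace{19.3 + 10.8 + 13.9 + 7.7 + 6.3}_{\text{first run, purified}} = 58.0~\text{mg}, \qquad
    \underbrace{12.1 + 7.6 + 7.4 + 8.3 + 7.9}_{\text{second run, purified}} = 43.3~\text{mg},
    \]
    \[
    \frac{58.0 + 43.3}{320} \approx 31.7\% \quad \text{overall mass yield of purified lignans from the 320 mg crude extract.}
    \]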