74 research outputs found

    ProMix: Combating Label Noise via Maximizing Clean Sample Utility

    The ability to train deep neural networks under label noise is appealing, as imperfectly annotated data are relatively cheap to obtain. State-of-the-art approaches are based on semi-supervised learning (SSL): they select small-loss examples as clean and then apply SSL techniques for boosted performance. However, the selection step mostly yields a medium-sized, decent-enough clean subset, which overlooks a rich set of clean samples. In this work, we propose a novel noisy-label learning framework, ProMix, that attempts to maximize the utility of clean samples for boosted performance. Key to our method is a matched high-confidence selection technique that selects examples whose predictions are highly confident and match their given labels. Combined with small-loss selection, our method achieves a precision of 99.27 and a recall of 98.22 in detecting clean samples on the CIFAR-10N dataset. Based on such a large set of clean data, ProMix improves on the best baseline method by +2.67% on CIFAR-10N and +1.61% on CIFAR-100N. The code and data are available at https://github.com/Justherozen/ProMix. Comment: Winner of the 1st Learning and Mining with Noisy Labels Challenge in IJCAI-ECAI 2022 (an informal technical report).
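
    The selection rule described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the ProMix implementation: the function names, the `small_loss_fraction` and `conf_threshold` parameters, and the toy logits are all assumptions made for the example, and the repository linked above remains the authoritative reference.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_clean(logits_batch, noisy_labels,
                 small_loss_fraction=0.5, conf_threshold=0.95):
    """Return sorted indices of samples treated as clean.

    Union of two criteria (both thresholds are hypothetical defaults):
      1. small-loss: the fraction of samples with the lowest cross-entropy
         against their given (possibly noisy) label;
      2. matched high-confidence: the predicted class equals the given
         label and its probability is at least conf_threshold.
    """
    probs = [softmax(row) for row in logits_batch]
    losses = [-math.log(max(p[y], 1e-12)) for p, y in zip(probs, noisy_labels)]

    # Criterion 1: keep the k samples with the smallest loss.
    k = int(len(losses) * small_loss_fraction)
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    small_loss = set(order[:k])

    # Criterion 2: confident prediction that agrees with the given label.
    matched = set()
    for i, (p, y) in enumerate(zip(probs, noisy_labels)):
        top = max(p)
        if top >= conf_threshold and p.index(top) == y:
            matched.add(i)

    return sorted(small_loss | matched)


def precision_recall(selected, is_clean):
    """Precision and recall of the selection against ground-truth flags."""
    chosen = set(selected)
    true_pos = sum(1 for i in chosen if is_clean[i])
    total_clean = sum(1 for flag in is_clean if flag)
    precision = true_pos / len(chosen) if chosen else 0.0
    recall = true_pos / total_clean if total_clean else 0.0
    return precision, recall


# Toy data: four samples, three classes. Sample 2 carries a wrong label.
toy_logits = [[5.0, 0.0, 0.0], [4.0, 0.0, 0.0], [0.0, 5.0, 0.0], [1.0, 0.0, 0.0]]
toy_labels = [0, 0, 2, 0]
toy_is_clean = [True, True, False, True]

toy_selected = select_clean(toy_logits, toy_labels, small_loss_fraction=0.25)
toy_precision, toy_recall = precision_recall(toy_selected, toy_is_clean)
```

    In the toy data, the small-loss criterion alone keeps only sample 0, and the matched high-confidence criterion adds sample 1, which illustrates how the second rule can enlarge the clean set without admitting the mislabeled sample.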

    FreeAL: Towards Human-Free Active Learning in the Era of Large Language Models

    Collecting high-quality labeled data for model training is notoriously time-consuming and labor-intensive for various NLP tasks. While many solutions, such as active learning for small language models (SLMs) and the prevalent in-context learning in the era of large language models (LLMs), have been proposed and alleviate the labeling burden to some extent, their performance still depends on human intervention. How to reduce the annotation cost in the LLM era remains underexplored. To bridge this gap, we revolutionize traditional active learning and propose an innovative collaborative learning framework, FreeAL, to interactively distill and filter task-specific knowledge from LLMs. During collaborative training, an LLM serves as an active annotator that imparts its coarse-grained knowledge, while a downstream SLM acts as a student that filters out high-quality in-context samples to feed back to the LLM for subsequent label refinement. Extensive experiments on eight benchmark datasets demonstrate that FreeAL largely enhances the zero-shot performance of both the SLM and the LLM without any human supervision. The code is available at https://github.com/Justherozen/FreeAL. Comment: Accepted to EMNLP 2023 (Main conference).

    Controllable Textual Inversion for Personalized Text-to-Image Generation

    Recent large-scale generative modeling has attained unprecedented performance, especially in producing high-fidelity images driven by text prompts. Textual inversion (TI), alongside text-to-image model backbones, has been proposed as an effective technique for personalizing generation when the prompts contain user-defined, unseen or long-tail concept tokens. Despite that, we find and show that the deployment of TI remains full of "dark-magics" -- to name a few, the harsh requirement of additional datasets, arduous human effort in the loop and a lack of robustness. In this work, we propose a much-enhanced version of TI, dubbed Controllable Textual Inversion (COTI), which resolves all the aforementioned problems and in turn delivers a robust, data-efficient and easy-to-use framework. The core of COTI is a theoretically guided loss objective instantiated with a comprehensive and novel weighted scoring mechanism, encapsulated in an active-learning paradigm. Extensive results show that COTI significantly outperforms prior TI-related approaches, with a 26.05 decrease in the FID score and a 23.00% boost in R-precision. Comment: 10 pages, 6 figures, 2 tables. Project Page: https://github.com/jnzju/COT

    Broadband and continuous wave pumped second-harmonic generation from microfiber coated with layered GaSe crystal

    The conversion efficiency of second-harmonic (SH) generation in optical fibers is significantly limited by the extremely weak second-order nonlinearity of fused silica, so pulsed pump lasers with high peak power are widely employed. Here, we propose a simple strategy to efficiently realize broadband and continuous-wave (CW) pumped SH generation by transferring a crystalline GaSe coating onto a microfiber with a phase-matching diameter. In the experiment, an efficiency of up to 0.08% W⁻¹ mm⁻¹ is reached for a C-band pump laser. This efficiency is high enough to support not only SH generation at a single frequency pumped by a CW laser, but also multi-frequency mixing driven by three CW light sources. Moreover, a broadband SH spectrum is achieved under pumping by a superluminescent light-emitting diode source with a 79.3 nm bandwidth. The proposed scheme provides a beneficial method for enhancing various nonlinear parametric processes and for developing quasi-monochromatic or broadband CW light sources in new wavelength regions.

    An exploration of the correlations between seven psychiatric disorders and the risks of breast cancer, breast benign tumors and breast inflammatory diseases: Mendelian randomization analyses

    Background: Previous observational studies have shown that certain psychiatric disorders may be linked to breast cancer risk; however, little is understood about the relationships between mental disorders and the wider range of breast diseases. This study investigates whether mental disorders influence the risks of overall breast cancer, the two subtypes of breast cancer (ER+ and ER-), benign breast tumors and inflammatory breast diseases. Methods: Genome-wide association study (GWAS) data for seven psychiatric disorders (schizophrenia, major depressive disorder, bipolar disorder, post-traumatic stress disorder, panic disorder, obsessive-compulsive disorder and anorexia nervosa) were selected from the Psychiatric Genomics Consortium (PGC) and the UK Biobank, and single-nucleotide polymorphisms (SNPs) significantly linked to these mental disorders were identified as instrumental variables. GWAS data for breast diseases came from the Breast Cancer Association Consortium (BCAC) and the FinnGen consortium. We performed two-sample Mendelian randomization (MR) analyses and multivariable MR analyses to assess the effects of these SNPs on various breast diseases. Both heterogeneity and pleiotropy were evaluated by sensitivity analyses. Results: When the GWAS data for psychiatric disorders were derived from the PGC, schizophrenia significantly increased the risks of overall breast cancer (two-sample MR: OR 1.05, 95% CI [1.03-1.07], p = 3.84 × 10⁻⁶; multivariable MR: OR 1.06, 95% CI [1.04-1.09], p = 2.34 × 10⁻⁶), ER+ breast cancer (OR 1.05, 95% CI [1.02-1.07], p = 5.94 × 10⁻⁵) and ER- breast cancer (two-sample MR: OR 1.04, 95% CI [1.01-1.07], p = 0.006; multivariable MR: OR 1.06, 95% CI [1.02-1.10], p = 0.001). Major depressive disorder, in contrast, showed a significant positive association only with overall breast cancer (OR 1.12, 95% CI [1.04-1.20], p = 0.003), and only in the two-sample MR analysis, not in the multivariable MR analysis. No significant correlations were found for the remaining mental illnesses and breast diseases. With the data from the UK Biobank, schizophrenia did not significantly increase the risk of breast cancer. Conclusions: The correlation between schizophrenia and breast cancer found in this study may be a false positive caused by underlying horizontal pleiotropy rather than a true cause-and-effect relationship. More prospective studies are still needed to determine definitive links between mental illnesses and breast diseases.
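
    For readers unfamiliar with how a two-sample MR analysis turns per-SNP summary statistics into an odds ratio with a confidence interval, the sketch below shows the fixed-effect inverse-variance weighted (IVW) estimator, one common choice. The abstract does not state which estimator the authors used, so IVW is an assumption here, and the function name and the input numbers are made up for illustration rather than taken from the study.

```python
import math


def ivw_odds_ratio(beta_exposure, beta_outcome, se_outcome, z=1.96):
    """Fixed-effect IVW estimate from per-SNP summary statistics.

    beta_exposure: SNP effects on the exposure (e.g. a psychiatric disorder)
    beta_outcome:  SNP effects on the outcome, on the log-odds scale
    se_outcome:    standard errors of the outcome effects
    Returns (odds_ratio, ci_lower, ci_upper, log_or, se_log_or).
    """
    numerator = 0.0
    denominator = 0.0
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        numerator += bx * by / (se * se)
        denominator += bx * bx / (se * se)

    log_or = numerator / denominator          # pooled causal effect
    se_log_or = math.sqrt(1.0 / denominator)  # its standard error

    odds_ratio = math.exp(log_or)
    ci_lower = math.exp(log_or - z * se_log_or)
    ci_upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, ci_lower, ci_upper, log_or, se_log_or


# Hypothetical summary statistics for two SNPs (not data from the study).
example = ivw_odds_ratio(
    beta_exposure=[0.1, 0.2],
    beta_outcome=[0.005, 0.01],
    se_outcome=[0.01, 0.01],
)
```

    With only two instruments the interval is wide; published analyses pool many SNPs and complement IVW with sensitivity analyses for heterogeneity and pleiotropy, as the abstract notes.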

    Photochemical route for synthesizing atomically dispersed palladium catalysts

    This work was completed over more than three years through the joint effort of several research groups inside and outside the university. At our university, the groups of Nanfeng Zheng and Gang Fu collaborated closely and were responsible for catalyst synthesis, characterization, catalytic testing and mechanistic studies; Researcher Lin Gu of the Institute of Physics, Chinese Academy of Sciences, was mainly responsible for the aberration-corrected transmission electron microscopy studies of the catalyst; and the group of Peng Zhang at Dalhousie University, Canada, took part in the synchrotron X-ray absorption spectroscopy studies. The first and second authors, Pengxin Liu and Yun Zhao, are both PhD students at our university. Abstract: Atomically dispersed noble metal catalysts often exhibit high catalytic performance, but the metal loading density must be kept low (usually below 0.5%) to avoid the formation of metal nanoparticles through sintering. We report a photochemical strategy to fabricate a stable atomically dispersed palladium–titanium oxide catalyst (Pd1/TiO2) on ethylene glycolate (EG)–stabilized ultrathin TiO2 nanosheets containing Pd up to 1.5%. The Pd1/TiO2 catalyst exhibited high catalytic activity in hydrogenation of C=C bonds, exceeding that of surface Pd atoms on commercial Pd catalysts by a factor of 9. No decay in the activity was observed for 20 cycles. More importantly, the Pd1/TiO2-EG system could activate H2 in a heterolytic pathway, leading to a catalytic enhancement in hydrogenation of aldehydes by a factor of more than 55. Supported by Ministry of Science and Technology of China grant 2015CB932303; National Natural Science Foundation of China grants 21420102001, 21131005, 21390390, 21133004, 21373167, 21573178, and 21333008; a NSERC CGS Alexander Graham Bell scholarship (D.M.C.); and a NSERC Discovery grant (P.Z.).

    Ultrastable Atomic Copper Nanosheets for Selective Electrochemical Reduction of Carbon Dioxide

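    Metallic copper surfaces are easily oxidized by air, so copper nanomaterials are extremely unstable in air, and preparing atomically thick two-dimensional copper nanosheets has long been a challenging problem in nanomaterials research. The group of Professor Nanfeng Zheng at the College of Chemistry and Chemical Engineering, Xiamen University, developed an effective method for preparing stable ultrathin two-dimensional copper-based nanomaterials and applied them to the selective electrocatalytic reduction of carbon dioxide. The study also found that the synthesized hybrid nanomaterial can selectively reduce carbon dioxide and water electrochemically to syngas (a mixture of carbon monoxide and hydrogen) of tunable composition, and that at relatively low reduction potentials it reduces carbon dioxide to carbon monoxide with high selectivity (a faradaic efficiency of up to 92%). Copper-based nanomaterials perform excellently in the electrochemical reduction of carbon dioxide, but the products are unusually diverse and selectivity is very difficult to control. The strategy used in this work, greatly improving electrocatalytic selectivity through simple surface coordination modification, offers a new approach to the design of carbon dioxide reduction electrocatalysts. The work was carried out under the guidance of Professor Nanfeng Zheng, in collaboration with the group of Professor Gang Fu and Professor Peng Zhang of Dalhousie University, Canada. The first author is Lei Dai, a PhD student at the College of Chemistry and Chemical Engineering; master's student Qing Qin and PhD students Pei Wang and Xiaojing Zhao, among others, took part in the work. Abstract: The electrochemical conversion of CO2 and H2O into syngas using renewably generated electricity is an attractive approach to simultaneously achieve chemical fixation of CO2 and storage of renewable energy. Developing cost-effective catalysts for selective electroreduction of CO2 into CO is essential to the practical application of the approach. We report a simple synthetic strategy for the preparation of ultrathin Cu/Ni(OH)2 nanosheets as an excellent cost-effective catalyst for the electrochemical conversion of CO2 and H2O into tunable syngas under low overpotentials. These hybrid nanosheets with a Cu(0)-enriched surface behave like noble metal nanocatalysts in both air stability and catalysis. Uniquely, Cu(0) within the nanosheets is stable against air oxidation for months because of the presence of formate on their surface. With the presence of atomically thick ultrastable Cu nanosheets, the hybrid Cu/Ni(OH)2 nanosheets display both excellent activity and selectivity in the electroreduction of CO2 to CO. At a low overpotential of 0.39 V, the nanosheets provide a current density of 4.3 mA/cm2 with a CO faradaic efficiency of 92%. No decay in the current is observed for more than 22 hours. The catalysts developed in this work are promising for building low-cost CO2 electrolyzers to produce CO. We thank the beamline BL14W1 (Shanghai Synchrotron Radiation Facility), where the X-ray absorption spectroscopy measurements were carried out, for providing the beam time. This work was supported by the Ministry of Science and Technology of China (2017YFA0207302 and 2015CB93230) and the National Natural Science Foundation of China (21731005, 21420102001, 21333008).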

    SoccerNet 2023 Challenges Results

    Peer reviewed. The SoccerNet 2023 challenges were the third annual video understanding challenges organized by the SoccerNet team. For this third edition, the challenges were composed of seven vision-based tasks split into three main themes. The first theme, broadcast video understanding, is composed of three high-level tasks related to describing events occurring in the video broadcasts: (1) action spotting, focusing on retrieving all timestamps related to global actions in soccer, (2) ball action spotting, focusing on retrieving all timestamps related to changes of state of the soccer ball, and (3) dense video captioning, focusing on describing the broadcast with natural language and anchored timestamps. The second theme, field understanding, relates to the single task of (4) camera calibration, focusing on retrieving the intrinsic and extrinsic camera parameters from images. The third and last theme, player understanding, is composed of three low-level tasks related to extracting information about the players: (5) re-identification, focusing on retrieving the same players across multiple views, (6) multiple object tracking, focusing on tracking players and the ball through unedited video streams, and (7) jersey number recognition, focusing on recognizing the jersey numbers of players from tracklets. Compared to the previous editions of the SoccerNet challenges, tasks (2), (3) and (7) are novel, including new annotations and data, task (4) was enhanced with more data and annotations, and task (6) now focuses on end-to-end approaches. More information on the tasks, challenges, and leaderboards is available at https://www.soccer-net.org. Baselines and development kits can be found at https://github.com/SoccerNet