
    Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards

    Foundation models are first pre-trained on vast unsupervised datasets and then fine-tuned on labeled data. Reinforcement learning, notably from human feedback (RLHF), can further align the network with the intended usage. Yet the imperfections in the proxy reward may hinder the training and lead to suboptimal results; the diversity of objectives in real-world tasks and human opinions exacerbates the issue. This paper proposes embracing the heterogeneity of diverse rewards by following a multi-policy strategy. Rather than focusing on a single a priori reward, we aim for Pareto-optimal generalization across the entire space of preferences. To this end, we propose rewarded soup, first specializing multiple networks independently (one for each proxy reward) and then interpolating their weights linearly. This succeeds empirically because we show that the weights remain linearly connected when fine-tuned on diverse rewards from a shared pre-trained initialization. We demonstrate the effectiveness of our approach for text-to-text (summarization, Q&A, helpful assistant, review), text-image (image captioning, text-to-image generation, visual grounding, VQA), and control (locomotion) tasks. We hope to enhance the alignment of deep models, and how they interact with the world in all its diversity.
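
The linear weight interpolation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `interpolate_weights`, the representation of weights as plain dictionaries of float lists, and the toy values are all assumptions made for illustration. It only shows the idea of taking a convex combination of the weights of several networks that share the same architecture.

```python
from typing import Dict, List, Sequence

# Hypothetical representation: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def interpolate_weights(policies: Sequence[Weights],
                        coeffs: Sequence[float]) -> Weights:
    """Return the coefficient-weighted average of several weight sets.

    Each entry of `policies` stands for one network fine-tuned on its own
    proxy reward; `coeffs` are interpolation coefficients summing to 1.
    """
    if not policies:
        raise ValueError("at least one set of weights is required")
    if len(policies) != len(coeffs):
        raise ValueError("need exactly one coefficient per set of weights")
    if abs(sum(coeffs) - 1.0) > 1e-9:
        raise ValueError("coefficients must sum to 1")

    names = policies[0].keys()
    for weights in policies:
        if weights.keys() != names:
            raise ValueError("all networks must share the same parameters")

    soup: Weights = {}
    for name in names:
        size = len(policies[0][name])
        # Element-wise weighted sum across all networks.
        soup[name] = [
            sum(c * w[name][i] for c, w in zip(coeffs, policies))
            for i in range(size)
        ]
    return soup


# Toy example: two "networks" with made-up weights.
theta_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
theta_b = {"layer.weight": [3.0, 0.0], "layer.bias": [1.0]}
soup = interpolate_weights([theta_a, theta_b], [0.25, 0.75])
```

Varying the coefficients between the two extremes traces a family of interpolated networks, one per preference trade-off, without any additional training.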

    Photography-based taxonomy is inadequate, unnecessary, and potentially harmful for biological sciences

    The question whether taxonomic descriptions naming new animal species without type specimen(s) deposited in collections should be accepted for publication by scientific journals and allowed by the Code has already been discussed in Zootaxa (Dubois & Nemésio 2007; Donegan 2008, 2009; Nemésio 2009a–b; Dubois 2009; Gentile & Snell 2009; Minelli 2009; Cianferoni & Bartolozzi 2016; Amorim et al. 2016). This question was again raised in a letter supported by 35 signatories published in the journal Nature (Pape et al. 2016) on 15 September 2016. On 25 September 2016, the following rebuttal (strictly limited to 300 words as per the editorial rules of Nature) was submitted to Nature, which on 18 October 2016 refused to publish it. As we think this problem is a very important one for zoological taxonomy, this text is published here exactly as submitted to Nature, followed by the list of the 493 taxonomists and collection-based researchers who signed it in the short time span from 20 September to 6 October 2016.

    A simulation methodology for establishing IR-drop-induced clock jitter for high precision timing ASICs

    The combination of 3D tracking and high-precision timing measurements has been identified by the European Committee for Future Accelerators as a fundamental requirement to increase detection capabilities for future applications. Among others, an on-chip high-quality clock is a key factor determining the overall resolution of timing ASICs. However, in large and dense chips, power-grid drops can severely affect the non-deterministic jitter of the clock, limiting performance. This contribution presents a simulation framework based on commercial tools to derive power supply-induced jitter, providing a pre-silicon methodology to assess its impact on timing indeterminism. The flow is presented together with practical examples and results.

    Exploring cetacean stranding pattern in light of variation in at-sea encounter rate and fishing activity : lessons from time surveys in the south Bay of Biscay (East-Atlantic; France)

    To date, the scarcity of year-round and long-term programmes integrating multi-dimensional data has hindered the development of a good understanding of cetacean mortality worldwide. This study uses data from: 1) standardised shipboard surveys (1980-2002), 2) standardised stranding surveys (1980-2002) and 3) landings of fishing fleets (2000-2002) in the Bay of Biscay. It investigates the correlations between stranding, at-sea encounter rate and the fishing index for three common cetaceans: common dolphin (Delphinus delphis), bottlenose dolphin (Tursiops truncatus) and long-finned pilot whale (Globicephala melas). At the monthly scale, a seasonal stranding pattern significantly congruent with the at-sea encounter rate and the fishing index for D. delphis is revealed. At the inter-annual scale, stranding and at-sea encounter rates are shown to be correlated (p = 0.013-0.044 according to species) and significantly increasing in abundance. Temporal variation in the ratio between individuals seen alive at sea and those stranded shows no significant trend suggesting that stranding is better explained by at-sea abundance than by the fishing index. Managers can use these findings to re-evaluate the relative contribution of by-catch fisheries in the context of observed changes of at-sea cetacean abundance and the link to oceano-climatic changes and other anthropogenic causes.

    Dynamic Range of Hepatitis C Virus RNA Quantification with the Cobas Ampliprep-Cobas Amplicor HCV Monitor v2.0 Assay

    Accurate quantification of hepatitis C virus (HCV) RNA is needed in clinical practice to decide whether to continue or stop pegylated interferon-α-ribavirin combination therapy at week 12 of treatment for patients with chronic hepatitis C. Currently the HCV RNA quantification assay most widely used worldwide is the Amplicor HCV Monitor v2.0 assay (Roche Molecular Systems, Pleasanton, Calif.). The HCV RNA extraction step can be automated in the Cobas Ampliprep device. In this work, we show that the dynamic range of HCV RNA quantification of the Cobas Ampliprep/Cobas Amplicor HCV Monitor v2.0 procedure is 600 to 200,000 HCV RNA IU/ml (2.8 to 5.3 log IU/ml), which does not cover the full range of HCV RNA levels in infected patients. Any sample containing more than 200,000 IU/ml (5.3 log IU/ml) must thus be retested after dilution for accurate quantification. These results emphasize the need for commercial HCV RNA quantification assays with a broader range of linear quantification, such as real-time PCR-based assays
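
The relationship between the IU/ml limits and their log values quoted in this abstract, and the dilution rule it implies, can be checked with a short sketch. The range limits come from the abstract; the function names, the classification labels, and the example dilution factor are hypothetical, and multiplying a diluted reading by its dilution factor is assumed here as the usual back-calculation rather than a step the abstract spells out.

```python
import math

# Dynamic range reported in the abstract, in HCV RNA IU/ml.
LOWER_IU_ML = 600.0
UPPER_IU_ML = 200_000.0


def to_log10(iu_per_ml: float) -> float:
    """Convert a viral load in IU/ml to log10 IU/ml."""
    return math.log10(iu_per_ml)


def classify(measured_iu_ml: float) -> str:
    """Say whether a reading falls inside the reported dynamic range."""
    if measured_iu_ml < LOWER_IU_ML:
        return "below range"
    if measured_iu_ml > UPPER_IU_ML:
        return "above range: retest after dilution"
    return "within range"


def undiluted_load(measured_iu_ml: float, dilution_factor: float) -> float:
    """Back-calculate the original load from a diluted sample (assumed rule)."""
    return measured_iu_ml * dilution_factor


# The limits reproduce the log values quoted in the abstract.
lower_log = round(to_log10(LOWER_IU_ML), 1)  # 2.8
upper_log = round(to_log10(UPPER_IU_ML), 1)  # 5.3

# Hypothetical sample: diluted 1:10 and measured at 150,000 IU/ml.
original = undiluted_load(150_000.0, 10.0)
```

In this made-up case the diluted reading sits within range, while the back-calculated load of 1,500,000 IU/ml would have exceeded the upper limit if measured undiluted.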