469 research outputs found

    Confronting Reward Model Overoptimization with Constrained RLHF

    Large language models are typically aligned with human preferences by optimizing reward models (RMs) fitted to human feedback. However, human preferences are multi-faceted, and it is increasingly common to derive reward from a composition of simpler reward models which each capture a different aspect of language quality. This itself presents a challenge, as it is difficult to appropriately weight these component RMs when combining them. Compounding this difficulty, because any RM is only a proxy for human evaluation, this process is vulnerable to overoptimization, wherein past a certain point, accumulating higher reward is associated with worse human ratings. In this paper, we perform, to our knowledge, the first study on overoptimization in composite RMs, showing that correlation between component RMs has a significant effect on the locations of these points. We then introduce an approach to solve this issue using constrained reinforcement learning as a means of preventing the agent from exceeding each RM's threshold of usefulness. Our method addresses the problem of weighting component RMs by learning dynamic weights, naturally expressed by Lagrange multipliers. As a result, each RM stays within the range at which it is an effective proxy, improving evaluation performance. Finally, we introduce an adaptive method using gradient-free optimization to identify and optimize towards these points during a single run.
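
    The idea of dynamic weights expressed by Lagrange multipliers can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the learning rate, and the choice of an unweighted sum of component rewards as the base objective are all assumptions made for illustration. Each multiplier grows while its reward model's score exceeds an assumed usefulness threshold and decays (clipped at zero) otherwise.

```python
from typing import List


def update_multipliers(lambdas: List[float], rewards: List[float],
                       thresholds: List[float], lr: float = 0.1) -> List[float]:
    """One projected dual-ascent step (hypothetical helper).

    lambda_i increases while reward_i is above its threshold,
    decreases while it is below, and is never allowed to go negative.
    """
    return [max(0.0, lam + lr * (r - t))
            for lam, r, t in zip(lambdas, rewards, thresholds)]


def penalized_reward(rewards: List[float], lambdas: List[float],
                     thresholds: List[float]) -> float:
    """Lagrangian-style training signal (assumed form).

    Sum of component rewards minus the multiplier-weighted threshold
    excess, so the effective weight on component i is (1 - lambda_i).
    """
    base = sum(rewards)
    penalty = sum(lam * (r - t)
                  for lam, r, t in zip(lambdas, rewards, thresholds))
    return base - penalty


if __name__ == "__main__":
    lambdas = [0.0, 0.5]
    rewards = [1.2, 0.3]        # made-up component RM scores
    thresholds = [1.0, 1.0]     # made-up usefulness thresholds
    lambdas = update_multipliers(lambdas, rewards, thresholds, lr=0.5)
    print(lambdas, penalized_reward(rewards, lambdas, thresholds))
```

    In this toy form, a component that has passed its threshold is progressively down-weighted, which is one simple way to read "learning dynamic weights"; the policy update itself is omitted.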

    Spontaneous central venous thrombosis and shunt occlusion following peritoneovenous shunt placement for intractable ascites

    A 43-year-old man had a peritoneovenous shunt inserted for the treatment of chylous ascites secondary to myelofibrosis. Despite being on anticoagulation for superior mesenteric vein thrombosis, he developed shunt dysfunction within two weeks of insertion. Superior venacavography showed multiple filling defects in the right axillary vein, no filling of the right brachiocephalic and right subclavian veins, and thrombotic occlusion of the internal jugular veins bilaterally. The shunt was removed 11 days after insertion, and there was extensive thrombosis of the venous end of the shunt and the compressible pump chamber. Shunt thrombosis is known to occur but remains a rare complication, with 87% of such obstructions being due to a thrombus at the tip of the venous end of the shunt. Extensive thrombosis of the shunt (as in the present case) is very rare.

    When the Love Hormone Leads to Violence: Oxytocin Increases Intimate Partner Violence Inclinations Among High Trait Aggressive People

    This is the author's final draft. Copyright 2014 SAGE Publications. Does oxytocin influence intimate partner violence (IPV)? Clues from prior research suggest that oxytocin increases prosocial behavior, but this effect is reversed among people with aggressive tendencies or in situations involving defensive aggression. Animal research also indicates that oxytocin plays a central role in defensive maternal aggression (i.e., protecting pups from intruders). Among highly aggressive people, a boost of oxytocin may cause them to use aggression toward close others as a means of maintaining their relationship. Adopting an interactionist approach, we predicted that oxytocin would increase IPV inclinations, but this effect would be limited to people high in trait physical aggression. In a double-blind, placebo-controlled, between-subject experiment, participants varying in trait physical aggression received either 24 international units of oxytocin or a placebo. Following two provocation tasks, participants rated the probability that they would engage in various aggressive behaviors (e.g., slapping, throwing an object that could hurt) toward a romantic partner. Oxytocin increased IPV inclinations, but this effect was limited to participants prone to physical aggression. These data offer the first evidence that IPV inclinations have a biological basis in a combination of oxytocin and trait physical aggressiveness.

    Conditional Allocation of Control Rights in Venture Capital Finance

    When a young entrepreneurial firm matures, it is often necessary to replace the founding entrepreneur with a professional manager. This replacement decision can be affected by the private benefits of control enjoyed by the entrepreneur, which gives rise to a conflict of interest between the entrepreneur and the venture capitalist. We show that a combination of convertible securities and contingent control rights can be used to resolve this conflict efficiently. This contractual arrangement is frequently observed in venture capital finance.

    A 2km-size asteroid challenging the rubble-pile spin barrier – A case for cohesion

    The rubble pile spin barrier is an upper limit on the rotation rate of asteroids larger than ~200-300 m. Among thousands of asteroids with diameters larger than ~300 m, only a handful are known to rotate faster than 2.0 h, and all are in the sub-km range (≤0.6 km). Here we present photometric measurements suggesting that (60716) 2000 GD65, an S-complex, inner-main belt asteroid with a relatively large diameter of 2.3 (+0.6/−0.7) km, completes one rotation in 1.9529 ± 0.0002 h. Its unique diameter and rotation period allow us to examine scenarios about asteroid internal structure and evolution: a rubble pile bound only by gravity; a rubble pile with strong cohesion; a monolithic structure; an asteroid experiencing mass shedding; an asteroid experiencing YORP spin-up/down; and an asteroid with a unique octahedron shape that results in a four-peak lightcurve and a 3.9 h period. We find that the most likely scenario includes a lunar-like cohesion that can prevent (60716) 2000 GD65 from disrupting without requiring a monolithic structure or a unique shape. Due to the uniqueness of (60716) 2000 GD65, we suggest that most asteroids typically have smaller cohesion than that of lunar regolith. Keywords: Asteroids; Asteroids, rotation; Rotational dynamics; Photometry. United States. National Aeronautics and Space Administration (Grant NNX12AL26G)
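
    For context on the spin barrier, the textbook estimate for a strengthless, spherical body can be sketched as follows. Setting centrifugal acceleration at the equator equal to surface gravity gives a critical period P = sqrt(3π / (Gρ)). The bulk densities used below are assumed values for illustration only; the abstract does not state a density, and the function name is hypothetical.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2


def critical_period_hours(density_kg_m3: float) -> float:
    """Shortest rotation period (hours) at which a cohesionless sphere
    of the given bulk density still holds material at its equator."""
    period_seconds = math.sqrt(3.0 * math.pi / (G * density_kg_m3))
    return period_seconds / 3600.0


if __name__ == "__main__":
    observed_hours = 1.9529  # rotation period quoted in the abstract
    for rho in (2000.0, 2500.0, 3000.0):  # assumed bulk densities
        p_crit = critical_period_hours(rho)
        print(f"rho={rho:.0f} kg/m^3: P_crit={p_crit:.2f} h, "
              f"observed faster than limit: {observed_hours < p_crit}")
```

    Under this idealized model, a body spinning faster than the critical period for its density needs something beyond self-gravity, such as cohesion, to stay intact. Real asteroids are not spheres, so this is only an order-of-magnitude guide.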