    Active teacher selection for reinforcement learning from human feedback

    Reinforcement learning from human feedback (RLHF) enables machine learning systems to learn objectives from human feedback. A core limitation of these systems is their assumption that all feedback comes from a single human teacher, despite querying a range of distinct teachers. We propose the Hidden Utility Bandit (HUB) framework to model differences in teacher rationality, expertise, and costliness, formalizing the problem of learning from multiple teachers. We develop a variety of solution algorithms and apply them to two real-world domains: paper recommendation systems and COVID-19 vaccine testing. We find that the Active Teacher Selection (ATS) algorithm outperforms baseline algorithms by actively selecting when and which teacher to query. The HUB framework and ATS algorithm demonstrate the importance of leveraging differences between teachers to learn accurate reward models, facilitating future research on active teacher selection for robust reward modeling.

    The DNA repair component Metnase regulates Chk1 stability

    Chk1 both arrests replication forks and enhances repair of DNA damage by phosphorylation of downstream effectors. Metnase (also termed SETMAR) is a SET histone methylase and transposase nuclease protein that promotes both DNA double strand break (DSB) repair and re-start of stalled replication forks. We previously found that Chk1 phosphorylation of Metnase on S495 enhanced its DNA DSB repair activity but decreased its ability to re-start stalled replication forks. Here we show that phosphorylated Metnase feeds back to increase the half-life of Chk1. Chk1 half-life is regulated by DDB1 targeting it to Cul4A for ubiquitination and destruction. Metnase decreases Chk1 interaction with DDB1, and decreases Chk1 ubiquitination. These data define a novel pathway for Chk1 regulation, whereby a target of Chk1, Metnase, feeds back to amplify Chk1 stability, and therefore enhance replication fork arrest.

    Metnase promotes restart and repair of stalled and collapsed replication forks

    Metnase is a human protein with methylase (SET) and nuclease domains that is widely expressed, especially in proliferating tissues. Metnase promotes non-homologous end-joining (NHEJ), and knockdown causes mild hypersensitivity to ionizing radiation. Metnase also promotes plasmid and viral DNA integration, and topoisomerase IIα (TopoIIα)-dependent chromosome decatenation. NHEJ factors have been implicated in the replication stress response, and TopoIIα has been proposed to relax positive supercoils in front of replication forks. Here we show that Metnase promotes cell proliferation, but it does not alter cell cycle distributions, or replication fork progression. However, Metnase knockdown sensitizes cells to replication stress and confers a marked defect in restart of stalled replication forks. Metnase promotes resolution of phosphorylated histone H2AX, a marker of DNA double-strand breaks at collapsed forks, and it co-immunoprecipitates with PCNA and RAD9, a member of the PCNA-like RAD9–HUS1–RAD1 intra-S checkpoint complex. Metnase also promotes TopoIIα-mediated relaxation of positively supercoiled DNA. Metnase is not required for RAD51 focus formation after replication stress, but Metnase knockdown cells show increased RAD51 foci in the presence or absence of replication stress. These results establish Metnase as a key factor that promotes restart of stalled replication forks, and implicate Metnase in the repair of collapsed forks

    Orbital-selective Kondo lattice and enigmatic f electrons emerging from inside the antiferromagnetic phase of a heavy fermion.

    Novel electronic phenomena frequently form in heavy-fermions because of the mutual localized and itinerant nature of f-electrons. On the magnetically ordered side of the heavy-fermion phase diagram, f-moments are expected to be localized and decoupled from the Fermi surface. It remains ambiguous whether a Kondo lattice can develop inside the magnetically ordered phase. Using spectroscopic imaging with a scanning tunneling microscope, complemented by neutron scattering, x-ray absorption spectroscopy, and dynamical mean field theory, we probe the electronic states in antiferromagnetic USb2. We visualize a large gap in the antiferromagnetic phase within which Kondo hybridization develops below ~80 K. Our calculations indicate the antiferromagnetism and Kondo lattice to reside predominantly on different f-orbitals, promoting orbital selectivity as a new conception into how these phenomena coexist in heavy-fermions. Finally, at 45 K, we find a novel first order-like transition through abrupt emergence of nontrivial 5f-electronic states that may resemble the "hidden-order" phase of URu2Si2.