13 research outputs found

    Proximal Point Imitation Learning

    Full text link
This work develops new algorithms with rigorous efficiency guarantees for infinite horizon imitation learning (IL) with linear function approximation without restrictive coherence assumptions. We begin with the minimax formulation of the problem and then outline how to leverage classical tools from optimization, in particular, the proximal-point method (PPM) and dual smoothing, for online and offline IL, respectively. Thanks to PPM, we avoid the nested policy evaluation and cost updates for online IL appearing in the prior literature. In particular, we do away with the conventional alternating updates by optimizing a single convex and smooth objective over both cost and Q-functions. When solved inexactly, we relate the optimization errors to the suboptimality of the recovered policy. As an added bonus, by re-interpreting PPM as dual smoothing with the expert policy as a center point, we also obtain an offline IL algorithm enjoying theoretical guarantees in terms of required expert trajectories. Finally, we achieve convincing empirical performance for both linear and neural network function approximation.
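The proximal-point update at the heart of this abstract can be illustrated on a generic strongly convex objective. The sketch below is ours, not the paper's IL algorithm: it applies the exact PPM step to a quadratic f(x) = ½xᵀAx − bᵀx, where each iteration solves argminᵧ f(y) + ‖y − x‖²/(2η) in closed form.

```python
import numpy as np

# Minimal proximal-point method (PPM) sketch on a strongly convex
# quadratic f(x) = 0.5 x^T A x - b^T x.  Illustrative only: the paper
# applies PPM to a convex objective over cost and Q-functions.
def proximal_point(A, b, x0, eta=1.0, iters=50):
    x = x0.copy()
    n = len(b)
    for _ in range(iters):
        # Exact prox step: argmin_y f(y) + ||y - x||^2 / (2*eta)
        # reduces to the linear system (A + I/eta) y = b + x/eta.
        x = np.linalg.solve(A + np.eye(n) / eta, b + x / eta)
    return x

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_star = np.linalg.solve(A, b)            # true minimizer of f
x_hat = proximal_point(A, b, np.zeros(2))
```

The prox step is a contraction toward the minimizer, which is why PPM tolerates inexact inner solves, the property the abstract exploits when relating optimization error to policy suboptimality.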

    Distributed Extra-gradient with Optimal Complexity and Communication Guarantees

    Full text link
We consider monotone variational inequality (VI) problems in multi-GPU settings where multiple processors/workers/clients have access to local stochastic dual vectors. This setting includes a broad range of important problems from distributed convex minimization to min-max problems and games. Extra-gradient, which is a de facto algorithm for monotone VI problems, has not been designed to be communication-efficient. To this end, we propose a quantized generalized extra-gradient (Q-GenX), which is an unbiased and adaptive compression method tailored to solve VIs. We provide an adaptive step-size rule, which adapts to the respective noise profiles at hand, achieves a fast rate of O(1/T) under relative noise and an order-optimal O(1/√T) under absolute noise, and show that distributed training accelerates convergence. Finally, we validate our theoretical results by providing real-world experiments and training generative adversarial networks on multiple GPUs. Comment: International Conference on Learning Representations (ICLR 2023).
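The base extra-gradient scheme that Q-GenX builds on can be sketched on the simplest monotone VI, the bilinear saddle problem min_x max_y xy with operator F(x, y) = (y, −x). This is our illustrative sketch of plain extra-gradient only; the paper's contribution (quantized communication and adaptive step sizes) is not reproduced here.

```python
import numpy as np

# Extra-gradient sketch for the monotone VI induced by the bilinear
# saddle problem min_x max_y x*y, with operator F(x, y) = (y, -x).
# Plain gradient descent-ascent diverges here; the extrapolation
# ("look-ahead") step is what makes extra-gradient converge.
def extra_gradient(F, z0, gamma=0.1, iters=2000):
    z = z0.copy()
    for _ in range(iters):
        z_half = z - gamma * F(z)      # extrapolation step
        z = z - gamma * F(z_half)      # update at the look-ahead point
    return z

F = lambda z: np.array([z[1], -z[0]])
z = extra_gradient(F, np.array([1.0, 1.0]))
# z approaches the unique equilibrium (0, 0)
```

In the distributed setting of the paper, each worker would compute a local stochastic estimate of F and communicate it in compressed form; the iteration structure above stays the same.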

    DiGress: Discrete Denoising diffusion for graph generation

    Full text link
This work introduces DiGress, a discrete denoising diffusion model for generating graphs with categorical node and edge attributes. Our model defines a diffusion process that progressively edits a graph with noise (adding or removing edges, changing the categories), and a graph transformer network that learns to revert this process. With these two ingredients in place, we reduce distribution learning over graphs to a simple sequence of classification tasks. We further improve sample quality by proposing a new Markovian noise model that preserves the marginal distribution of node and edge types during diffusion, and by adding auxiliary graph-theoretic features derived from the noisy graph at each diffusion step. Finally, we propose a guidance procedure for conditioning the generation on graph-level features. Overall, DiGress achieves state-of-the-art performance on both molecular and non-molecular datasets, with up to 3x validity improvement on a dataset of planar graphs. In particular, it is the first model that scales to the large GuacaMol dataset containing 1.3M drug-like molecules without using a molecule-specific representation such as SMILES or fragments. Comment: 22 pages. Preprint, under review.
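A marginal-preserving Markovian noise model of the kind the abstract describes can be sketched for a single categorical attribute. The construction below is a common one and the names (a, m) are ours: mixing the identity with the data marginals m gives a transition matrix whose stationary distribution is m, so diffusing data drawn from m leaves the type marginals unchanged.

```python
import numpy as np

# Sketch of a marginal-preserving categorical noise transition:
# Q = a * I + (1 - a) * 1 m^T, where m is the empirical marginal
# distribution of node/edge types.  Each row of Q is a distribution,
# and m^T Q = m, so the type marginals are preserved under diffusion.
def noise_transition(marginal, alpha):
    K = len(marginal)
    return alpha * np.eye(K) + (1 - alpha) * np.ones((K, 1)) * marginal

m = np.array([0.5, 0.3, 0.2])      # hypothetical type marginals
Q = noise_transition(m, alpha=0.7)
rows_valid = np.allclose(Q.sum(axis=1), 1.0)   # rows are distributions
marginal_kept = np.allclose(m @ Q, m)          # m is stationary
```

Because the noisy graph stays distributed like real data at the attribute level, the denoising network at each step faces a sequence of ordinary classification problems, which is the reduction the abstract refers to.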

    CD40L and IL-4 stimulation of acute lymphoblastic leukemia cells results in upregulation of mRNA level of FLICE--an important component of apoptosis.

    Get PDF
The use of cancer vaccines based on dendritic cells (DC) presenting tumor antigens can be a promising tool in the treatment of leukemia. The functional characteristics of leukemia-derived DC remain to be elucidated. CD40 promotes survival, proliferation and differentiation of normal B cells, and CD40 triggering has been used to enhance the poor antigen-presenting capacity of leukemic B cells. Since it is still unclear whether or not CD40 ligation drives neoplastic B cells to apoptosis, we assessed the mRNA expression of FLICE, FAS, FADD and TRADD, important components of the apoptosis machinery, using real-time PCR in acute lymphoblastic leukemia (ALL) cells before and after CD40 and IL-4 stimulation. ALL cells stimulated with CD40L/IL-4 expressed a dendritic cell phenotype at the mRNA and protein levels (upregulation of the main costimulatory and adhesion molecules, as noted by real-time RT-PCR and flow cytometry); they also expressed higher amounts of mRNA for FLICE, TRADD and FADD after CD40L/IL-4 stimulation. However, the difference between cells cultured with CD40L/IL-4 and with medium alone was statistically significant only for FLICE. In conclusion, we showed upregulation of important elements of apoptosis at the mRNA level in ALL cells after CD40 ligation.

    MEDITRON-70B: Scaling Medical Pretraining for Large Language Models

    Full text link
Large language models (LLMs) can potentially democratize access to medical knowledge. While many efforts have been made to harness and improve LLMs' medical knowledge and reasoning capacities, the resulting models are either closed-source (e.g., PaLM, GPT-4) or limited in scale (<= 13B parameters), which restricts their abilities. In this work, we improve access to large-scale medical LLMs by releasing MEDITRON: a suite of open-source LLMs with 7B and 70B parameters adapted to the medical domain. MEDITRON builds on Llama-2 (through our adaptation of Nvidia's Megatron-LM distributed trainer), and extends pretraining on a comprehensively curated medical corpus, including selected PubMed articles, abstracts, and internationally recognized medical guidelines. Evaluations using four major medical benchmarks show significant performance gains over several state-of-the-art baselines before and after task-specific finetuning. Overall, MEDITRON achieves a 6% absolute performance gain over the best public baseline in its parameter class and 3% over the strongest baseline we finetuned from Llama-2. Compared to closed-source LLMs, MEDITRON-70B outperforms GPT-3.5 and Med-PaLM and is within 5% of GPT-4 and 10% of Med-PaLM-2. We release our code for curating the medical pretraining corpus and the MEDITRON model weights to drive open-source development of more capable medical LLMs.

    Prognostic impact of combined fludarabine, treosulfan and mitoxantrone resistance profile in childhood acute myeloid leukemia

    Get PDF
Background: The role of cellular drug resistance in childhood acute myeloid leukemia (AML) has not yet been established. The aim of the study was to analyze the clinical value of ex vivo drug resistance in pediatric AML. Patients and Methods: A cohort of 90 children with de novo AML was assayed for drug resistance profile by the 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) assay, and a prognostic model of in vitro drug sensitivity was analyzed. Results: Children who relapsed during follow-up showed higher in vitro resistance of leukemic blasts to most of the drugs tested, except for cytarabine, cladribine, vincristine, mercaptopurine and thioguanine. A combined in vitro drug resistance profile to fludarabine, treosulfan and mitoxantrone (FTM score) was defined, and it had independent prognostic significance for disease-free survival in pediatric AML. Conclusion: The combined fludarabine, treosulfan and mitoxantrone resistance profile may possibly be used for better stratification of children with AML or indicate the necessity for additional therapy.

    Effect of metal buffer layer and thermal annealing on HfOx-based ReRAMs

    Full text link
In this paper, we investigate different methods and approaches to improve the electrical characteristics of Pt/HfOx/TiN ReRAM devices. We discuss the improvement of the ReRAM electrical characteristics after the insertion of a Hf and Ti buffer layer. As a result, the resistance window increases more than 10 times, and the set and reset voltages decrease in both absolute value and variability. Furthermore, we show the influence of an annealing step at different temperatures on the forming voltage and HRS of the Pt/HfOx/Hf/TiN memory devices. Considering the importance of achieving high-density memory, we demonstrate the possibility of multi-level resistance states in the fabricated devices by controlling the enforced compliance current. In addition, we show the endurance characteristic of the fabricated memories and their error rate. Finally, we report the transient behavior of the memory devices, investigating the device speed and switching mechanism.