96 research outputs found

    Presynaptic stochasticity improves energy efficiency and helps alleviate the stability-plasticity dilemma

    When an action potential arrives at a synapse, there is a large probability that no neurotransmitter is released. Surprisingly, simple computational models suggest that these synaptic failures enable information processing at lower metabolic costs. However, these models only consider information transmission at single synapses, ignoring the remainder of the neural network as well as its overall computational goal. Here, we investigate how synaptic failures affect the energy efficiency of models of entire neural networks that solve a goal-driven task. We find that presynaptic stochasticity and plasticity improve energy efficiency and show that the network allocates most energy to a sparse subset of important synapses. We demonstrate that stabilising these synapses helps to alleviate the stability-plasticity dilemma, thus connecting a presynaptic notion of importance to a computational role in lifelong learning. Overall, our findings present a set of hypotheses for how presynaptic plasticity and stochasticity contribute to sparsity, energy efficiency and improved trade-offs in the stability-plasticity dilemma.
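
    The network models the abstract refers to can be illustrated with a minimal sketch: each synapse transmits an incoming signal only with some release probability, and metabolic cost is tied to the expected number of successful releases. The Python sketch below is an illustration under these assumptions; the class name, the uniform 0.5 release probability and the energy term are made up for the example and are not the authors' implementation.

    import numpy as np

    rng = np.random.default_rng(0)

    class StochasticSynapseLayer:
        """Toy layer in which every synapse transmits with its own release
        probability; failed releases contribute no transmission energy."""

        def __init__(self, n_in, n_out):
            self.w = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))
            self.p_release = np.full((n_in, n_out), 0.5)  # presynaptic release probabilities

        def forward(self, x):
            # Sample synaptic failures: a synapse only transmits with prob. p_release.
            transmitted = rng.random(self.w.shape) < self.p_release
            return x @ (self.w * transmitted)

        def expected_energy(self, x):
            # Assumed metabolic cost: expected releases, weighted by presynaptic drive.
            return float((np.abs(x) @ self.p_release).sum())

    layer = StochasticSynapseLayer(n_in=8, n_out=4)
    x = rng.normal(size=8)
    print("stochastic output:", layer.forward(x))
    print("approximate energy cost:", layer.expected_energy(x))

    In a model of this kind, learning the release probabilities alongside the weights is what would let the network concentrate energy on a sparse subset of important synapses, as the abstract describes.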

    A contrastive rule for meta-learning

    Meta-learning algorithms leverage regularities that are present across a set of tasks to speed up and improve the performance of a subsidiary learning process. Recent work on deep neural networks has shown that prior gradient-based learning of meta-parameters can greatly improve the efficiency of subsequent learning. Here, we present a biologically plausible meta-learning algorithm based on equilibrium propagation. Instead of explicitly differentiating the learning process, our contrastive meta-learning rule estimates meta-parameter gradients by executing the subsidiary process more than once. This avoids reversing the learning dynamics in time and computing second-order derivatives. In spite of this, and unlike previous first-order methods, our rule recovers an arbitrarily accurate meta-parameter update given enough compute. We establish theoretical bounds on its performance and present experiments on a set of standard benchmarks and neural network architectures.
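
    For readers who want a concrete picture, the sketch below shows one way a contrastive, equilibrium-propagation style meta-learning rule can look for a toy regression problem: the inner process is run once freely and once nudged towards the outer (validation) objective, and the meta-gradient is read off from the difference between the two solutions. The quadratic regulariser centred at the meta-parameter, the choice of outer loss and all hyperparameters are assumptions made for this illustration, not the paper's setup.

    import numpy as np

    rng = np.random.default_rng(1)

    # Toy regression task split into an inner (training) and outer (validation) set.
    X_tr, X_val = rng.normal(size=(20, 5)), rng.normal(size=(20, 5))
    w_true = rng.normal(size=5)
    y_tr, y_val = X_tr @ w_true, X_val @ w_true

    def grad_train(w):
        return X_tr.T @ (X_tr @ w - y_tr) / len(X_tr)

    def grad_val(w):
        return X_val.T @ (X_val @ w - y_val) / len(X_val)

    def run_inner(theta, lam, beta=0.0, steps=2000, lr=0.05):
        """Subsidiary learning process: gradient descent on the training loss plus a
        quadratic regulariser centred at the meta-parameter theta; beta > 0 nudges
        the dynamics towards the outer (validation) objective."""
        w = theta.copy()
        for _ in range(steps):
            w -= lr * (grad_train(w) + lam * (w - theta) + beta * grad_val(w))
        return w

    lam, beta = 1.0, 0.01
    theta = np.zeros(5)                      # meta-parameter: centre of the regulariser
    w_free = run_inner(theta, lam)           # first ("free") run of the inner process
    w_nudged = run_inner(theta, lam, beta)   # second ("nudged") run
    meta_grad = (lam / beta) * (w_free - w_nudged)  # contrastive meta-gradient estimate
    theta = theta - 0.1 * meta_grad          # one meta-update
    print("contrastive meta-gradient estimate:", meta_grad)

    Making beta smaller while running the inner process to convergence makes the estimate more accurate, which mirrors the abstract's claim that the rule recovers an arbitrarily accurate meta-parameter update given enough compute.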

    Learning where to learn: Gradient sparsity in meta and continual learning

    Finding neural network weights that generalize well from small datasets is difficult. A promising approach is to learn a weight initialization such that a small number of weight changes results in low generalization error. We show that this form of meta-learning can be improved by letting the learning algorithm decide which weights to change, i.e., by learning where to learn. We find that patterned sparsity emerges from this process, with the pattern of sparsity varying on a problem-by-problem basis. This selective sparsity results in better generalization and less interference in a range of few-shot and continual learning problems. Moreover, we find that sparse learning also emerges in a more expressive model where learning rates are meta-learned. Our results shed light on an ongoing debate on whether meta-learning can discover adaptable features and suggest that learning by sparse gradient descent is a powerful inductive bias for meta-learning systems.
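
    The core mechanism, meta-learning a per-weight pattern of which parameters the inner loop is allowed to change, can be sketched in a few lines. The toy quadratic task, the hard thresholding of the mask and the function names below are assumptions for illustration; the outer loop that actually meta-optimises the initialisation and the mask (e.g. by backpropagating through adaptation) is omitted.

    import numpy as np

    rng = np.random.default_rng(2)

    def task_grad(w, task):
        A, b = task                          # toy quadratic task: 0.5 * ||A w - b||^2
        return A.T @ (A @ w - b)

    def adapt(w_init, mask_logits, task, inner_lr=0.01, steps=5):
        """Inner loop in the spirit of 'learning where to learn': only the weights
        selected by the meta-learned sparsity pattern change during adaptation."""
        mask = (mask_logits > 0).astype(float)   # hard, per-weight update mask
        w = w_init.copy()
        for _ in range(steps):
            w -= inner_lr * mask * task_grad(w, task)
        return w

    d = 6
    w_init = rng.normal(size=d)              # meta-learned initialisation
    mask_logits = rng.normal(size=d)         # meta-learned "where to learn" pattern
    task = (rng.normal(size=(10, d)), rng.normal(size=10))
    w_adapted = adapt(w_init, mask_logits, task)
    print("weights allowed to change:", np.flatnonzero(mask_logits > 0))
    print("weights actually changed: ", np.flatnonzero(w_adapted != w_init))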

    Random initialisations performing above chance and how to find them

    Neural networks trained with stochastic gradient descent (SGD) starting from different random initialisations typically find functionally very similar solutions, raising the question of whether there are meaningful differences between different SGD solutions. Entezari et al. recently conjectured that despite different initialisations, the solutions found by SGD lie in the same loss valley after taking into account the permutation invariance of neural networks. Concretely, they hypothesise that any two solutions found by SGD can be permuted such that the linear interpolation between their parameters forms a path without significant increases in loss. Here, we use a simple but powerful algorithm to find such permutations, which allows us to obtain direct empirical evidence that the hypothesis is true in fully connected networks. Strikingly, we find that two networks already live in the same loss valley at the time of initialisation, and averaging their random but suitably permuted initialisations performs significantly above chance. In contrast, for convolutional architectures, our evidence suggests that the hypothesis does not hold. Especially in a large learning rate regime, SGD seems to discover diverse modes. Comment: NeurIPS 2022, 14th Annual Workshop on Optimization for Machine Learning (OPT2022).
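
    The key operation behind such experiments is aligning the hidden units of two networks by a permutation before interpolating their parameters. Below is a minimal numpy/scipy sketch of a weight-matching step for a single hidden layer; the similarity measure, the random toy weights and the variable names are illustrative assumptions, and evaluating the loss along the interpolation path on real data (to test for a barrier) is not reproduced here.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    rng = np.random.default_rng(3)

    # Two one-hidden-layer nets with weights W1 (input -> hidden) and W2 (hidden -> output).
    n_in, n_hidden, n_out = 4, 16, 3
    def random_net():
        return {"W1": rng.normal(size=(n_hidden, n_in)),
                "W2": rng.normal(size=(n_out, n_hidden))}

    net_a, net_b = random_net(), random_net()

    # Weight matching: find the permutation of net_b's hidden units that best
    # matches net_a's, by solving a linear assignment problem.
    similarity = net_a["W1"] @ net_b["W1"].T + net_a["W2"].T @ net_b["W2"]
    _, perm = linear_sum_assignment(-similarity)   # maximise total similarity
    net_b_aligned = {"W1": net_b["W1"][perm], "W2": net_b["W2"][:, perm]}

    # Linear interpolation between the aligned parameter sets; the hypothesis is
    # that the loss along this path shows no significant barrier.
    def interpolate(na, nb, alpha):
        return {k: (1.0 - alpha) * na[k] + alpha * nb[k] for k in na}

    midpoint = interpolate(net_a, net_b_aligned, 0.5)
    print("hidden-unit permutation:", perm)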

    CAG-Repeat RNA Hairpin Folding and Recruitment to Nuclear Speckles with a Pivotal Role of ATP as a Cosolute

    A hallmark of Huntington’s disease (HD) is a prolonged polyglutamine sequence in the huntingtin protein and, correspondingly, an expanded cytosine, adenine, and guanine (CAG) triplet repeat region in the mRNA. A majority of studies investigating disease pathology have been concerned with the toxic huntingtin protein, but the mRNA has moved into focus due to its recruitment to RNA foci and emerging novel therapeutic approaches targeting the mRNA. A characteristic feature of CAG-RNA is that it forms a stable hairpin in vitro, which seems to be crucial for specific protein interactions. Using in-cell folding experiments, we show that the CAG-RNA is largely destabilized in cells compared to dilute buffer solutions but remains folded in the cytoplasm and nucleus. Surprisingly, we found the same folding stability in the nucleoplasm and in nuclear speckles under physiological conditions, suggesting that CAG-RNA does not undergo a conformational transition upon recruitment to the nuclear speckles. We found that the metabolite adenosine triphosphate (ATP) plays a crucial role in promoting CAG-RNA unfolding, enabling its recruitment to nuclear speckles and preserving its mobility. Using in vitro experiments and molecular dynamics simulations, we found that the ATP effects can be attributed to a direct interaction of ATP with the nucleobases of the CAG-RNA rather than ATP acting as “a fuel” for helicase activity. ATP-driven changes in CAG-RNA homeostasis could be disease-relevant, since mitochondrial function is affected as HD progresses, leading to a decline in cellular ATP levels.

    Hemodynamic impact of isobaric levobupivacaine versus hyperbaric bupivacaine for subarachnoid anesthesia in patients aged 65 and older undergoing hip surgery

    Background: Altered hemodynamics, and in particular arterial hypotension, is the most prevalent adverse effect of subarachnoid anesthesia. The objective of the study was to determine the role of local anesthetic selection in spinal anesthesia-induced hypotension in elderly patients. We conducted a descriptive, observational pilot study to assess the hemodynamic impact of subarachnoid anesthesia with isobaric levobupivacaine versus hyperbaric bupivacaine for hip fracture surgery.
    Description: One hundred twenty ASA status I-IV patients aged 65 and older undergoing hip fracture surgery were enrolled. The primary objective of our study was to compare hemodynamic effects based on systolic blood pressure (SBP), diastolic blood pressure (DBP), heart rate (HR) and hemoglobin (Hb) values, and respiratory effects based on peripheral oxygen saturation (SpO2%) values. The secondary objective was to assess potential adverse events with the use of levobupivacaine versus bupivacaine. Assessments were performed preoperatively, at 30 minutes into surgery, at the end of anesthesia, and at 48 hours and 6 months after surgery. Among intraoperative events, the incidence of hypotension was statistically significantly higher (p < 0.05) in group BUPI (38.3%) than in group LEVO (13.3%). SBP and DBP decreased (p < 0.05) at 30 minutes intraoperatively (19% in group BUPI versus 17% in group LEVO). SpO2% increased at 30 minutes after anesthesia onset (1% in group BUPI versus 1.5% in group LEVO). HR decreased at 30 minutes after anesthesia onset (5% in group BUPI versus 9% in group LEVO). Hb decreased from operating room (OR) admission to the end of anesthesia (9.3% in group BUPI versus 12.5% in group LEVO). The incidence of red blood cell (RBC) transfusion was 13.3% in group BUPI versus 31.7% in group LEVO; this difference was statistically significant. Among postoperative events, the incidence of congestive heart failure (CHF) was significantly higher in group BUPI (8.3%). At 6 months after anesthesia, no differences were found.
    Conclusions: Given the hemodynamic stability and lower incidence of intraoperative hypotension observed, levobupivacaine could be the agent of choice for subarachnoid anesthesia in elderly patients.

    Mutations and Deregulation of Ras/Raf/MEK/ERK and PI3K/PTEN/Akt/mTOR Cascades Which Alter Therapy Response

    The Ras/Raf/MEK/ERK and PI3K/PTEN/Akt/mTOR cascades are often activated by genetic alterations in upstream signaling molecules such as receptor tyrosine kinases (RTK). Certain components of these pathways (RAS, NF1, BRAF, MEK1, DUSP5, PP2A, PIK3CA, PIK3R1, PIK3R4, PIK3R5, IRS4, AKT, NFKB1, MTOR, PTEN, TSC1, and TSC2) may also be activated or inactivated by mutations or epigenetic silencing. Upstream mutations in one signaling pathway, or even in downstream components of the same pathway, can alter the sensitivity of the cells to certain small molecule inhibitors. These pathways have profound effects on proliferative, apoptotic and differentiation pathways. Dysregulation of components of these cascades can contribute to resistance to other pathway inhibitors, chemotherapeutic drug resistance, and premature aging, as well as other diseases. This review will first describe these pathways and discuss how genetic mutations and epigenetic alterations can result in resistance to various inhibitors.

    Would I have gotten that reward? Long-term credit assignment by counterfactual contribution analysis

    To make reinforcement learning more sample-efficient, we need better credit assignment methods that measure an action's influence on future rewards. Building upon Hindsight Credit Assignment (HCA), we introduce Counterfactual Contribution Analysis (COCOA), a new family of model-based credit assignment algorithms. Our algorithms achieve precise credit assignment by measuring the contribution of actions towards obtaining subsequent rewards, quantified through a counterfactual query: "Would the agent still have reached this reward if it had taken another action?" We show that measuring contributions w.r.t. rewarding states, as is done in HCA, results in spurious estimates of contributions, causing HCA to degrade towards the high-variance REINFORCE estimator in many relevant environments. Instead, we measure contributions w.r.t. rewards or learned representations of the rewarding objects, resulting in gradient estimates with lower variance. We run experiments on a suite of problems specifically designed to evaluate long-term credit assignment capabilities. By using dynamic programming, we measure ground-truth policy gradients and show that the improved performance of our new model-based credit assignment methods is due to lower bias and variance compared to HCA and common baselines. Our results demonstrate how modeling action contributions towards rewarding outcomes can be leveraged for credit assignment, opening a new path towards sample-efficient reinforcement learning.
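
    The counterfactual weighting idea can be conveyed with a heavily simplified sketch: instead of crediting an action with the full return, each later reward is scaled by an estimated contribution coefficient that captures how much the action mattered for obtaining it. The coefficient form suggested in the docstring, the function names and the toy usage below are assumptions for illustration only, not COCOA's actual estimator.

    import numpy as np

    def contribution_policy_gradient(trajectory, logp_grad, contribution):
        """Illustrative contribution-weighted policy gradient.

        trajectory:    list of (state, action, reward) tuples
        logp_grad:     logp_grad(s, a) -> gradient of log pi(a|s) w.r.t. policy params
        contribution:  contribution(s, a, r) -> estimated contribution of taking `a`
                       in `s` towards later obtaining reward `r` (e.g. a hindsight
                       ratio; the exact form is an assumption here).
        """
        grad = None
        for t, (s, a, _) in enumerate(trajectory):
            # Credit this action only with later rewards, each scaled by its
            # estimated contribution coefficient rather than taken at face value.
            credited = sum(contribution(s, a, r) * r for (_, _, r) in trajectory[t + 1:])
            term = logp_grad(s, a) * credited
            grad = term if grad is None else grad + term
        return grad

    # Toy usage with placeholder estimates; all quantities are stand-ins.
    traj = [("s0", 0, 0.0), ("s1", 1, 1.0), ("s2", 0, 0.0)]
    dummy_logp_grad = lambda s, a: np.ones(3) * (a + 1)
    dummy_contribution = lambda s, a, r: 0.5
    print(contribution_policy_gradient(traj, dummy_logp_grad, dummy_contribution))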