
    Why do networks have inhibitory/negative connections?

    Why do brains have inhibitory connections? Why do deep networks have negative weights? We propose an answer from the perspective of representation capacity. We believe representing functions is the primary role of both (i) the brain in natural intelligence, and (ii) deep networks in artificial intelligence. Our answer to why there are inhibitory/negative weights is: to learn more functions. We prove that, in the absence of negative weights, neural networks with non-decreasing activation functions are not universal approximators. While this may be an intuitive result to some, to the best of our knowledge, there is no formal theory, in either machine learning or neuroscience, that demonstrates why negative weights are crucial in the context of representation capacity. Further, we provide insights into the geometric properties of the representation space that non-negative deep networks cannot represent. We expect these insights will yield a deeper understanding of more sophisticated inductive priors imposed on the distribution of weights that lead to more efficient biological and machine learning. Comment: ICCV2023 camera-ready.
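
    The core claim is easy to check numerically: composing non-negative linear maps with non-decreasing activations yields a function that is non-decreasing in every input coordinate, so such a network can never fit a decreasing target like f(x) = -x. Below is a minimal sketch of that check in Python (my own illustration, not code from the paper; the architecture is arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)

    def relu(x):
        return np.maximum(x, 0.0)  # a non-decreasing activation

    # Random two-layer MLP with all weights constrained to be >= 0.
    W1 = rng.uniform(0, 1, size=(16, 1)); b1 = rng.normal(size=16)
    W2 = rng.uniform(0, 1, size=(1, 16)); b2 = rng.normal(size=1)

    def net(x):
        # Non-negative linear maps composed with monotone activations
        # yield a function that is monotone in each input coordinate.
        return W2 @ relu(W1 @ x + b1) + b2

    xs = np.linspace(-3, 3, 200)
    ys = np.array([net(np.array([x]))[0] for x in xs])

    # Outputs never decrease as the input increases, so no such network
    # can approximate a decreasing target such as f(x) = -x.
    assert np.all(np.diff(ys) >= -1e-9)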

    Omnidirectional Transfer for Quasilinear Lifelong Learning

    In biological learning, data are used to improve performance not only on the current task, but also on previously encountered and as yet unencountered tasks. In contrast, classical machine learning starts from a blank slate, or tabula rasa, using data only for the single task at hand. While typical transfer learning algorithms can improve performance on future tasks, their performance on prior tasks degrades upon learning new tasks (a phenomenon called catastrophic forgetting). Many recent approaches for continual or lifelong learning have attempted to maintain performance given new tasks. But striving to avoid forgetting sets the goal unnecessarily low: the goal of lifelong learning, whether biological or artificial, should be to improve performance on all tasks (including past and future) with any new data. We propose omnidirectional transfer learning algorithms, which include two special cases of interest: decision forests and deep networks. Our key insight is the development of the omni-voter layer, which ensembles representations learned independently on all tasks to jointly decide how to proceed on any given new data point, thereby improving performance on both past and future tasks. Our algorithms demonstrate omnidirectional transfer in a variety of simulated and real data scenarios, including tabular data, image data, spoken data, and adversarial tasks. Moreover, they do so with quasilinear space and time complexity.
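
    A minimal sketch of the omni-voter idea follows (my simplification, not the authors' released code): each task trains its own representation, old representations are never overwritten, and every representation votes on every new point. For brevity the sketch assumes all tasks share one label space and uses random forests as the per-task representers; the paper's algorithms additionally fit per-task voters on top of every representation.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    class OmniVoterEnsemble:
        def __init__(self):
            self.representers = []  # one independently trained forest per task

        def add_task(self, X, y):
            # Old representations are never modified, so learning a new task
            # cannot catastrophically degrade performance on earlier ones.
            self.representers.append(
                RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y))

        def predict_proba(self, X):
            # Omni-voter layer: every task's representation votes, and the
            # decider averages the posteriors across all representers.
            return np.mean([r.predict_proba(X) for r in self.representers], axis=0)

        def predict(self, X):
            return np.argmax(self.predict_proba(X), axis=1)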

    Deep discriminative to kernel generative modeling

    The debate between discriminative and generative modeling runs deep, in the study of both artificial and natural intelligence. In our view, both camps have complementary value, so we sought to combine them synergistically. Here, we propose a methodology to convert deep discriminative networks to kernel generative networks. We leveraged the fact that deep models, including both random forests and deep networks with affine activation functions, learn internal representations that are unions of polytopes, which lets us conceptualize both as generalized partitioning rules. From that perspective, we used foundational results on the relationship between histogram rules and kernel density estimators to obtain class-conditional kernel density estimators from the deep models. We then studied the trade-offs of this strategy in low-dimensional settings, both theoretically and empirically, as a first step towards understanding it. Theoretically, we show conditions under which our generative models are more efficient than the corresponding discriminative approaches. Empirically, when sample sizes are relatively high, the discriminative models tend to perform as well as or better than the generative models on discriminative metrics, such as classification rates and posterior calibration. However, when sample sizes are relatively low, the generative models outperform the discriminative ones even on discriminative metrics. Moreover, the generative models can also sample from the distribution, obtain smoother posteriors, and extrapolate beyond the convex hull of the training data to handle out-of-distribution inputs more reasonably. Via human experiments, we illustrate that our kernel generative networks (Kragen) behave more like humans than deep discriminative networks do. We believe this approach may be an important step in unifying thinking and approaches across the discriminative-generative divide.
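
    A rough sketch of the conversion follows (my own reading of the recipe, not the paper's exact estimator): a trained forest supplies the partition, and Gaussian kernels are placed only at training points that share a leaf with the query, giving unnormalized class-conditional density estimates. The bandwidth h and all other constants are chosen purely for illustration.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def forest_class_kde(forest, X_train, y_train, X_query, h=0.3):
        leaves_train = forest.apply(X_train)  # leaf ids, shape (n_train, n_trees)
        leaves_query = forest.apply(X_query)  # leaf ids, shape (n_query, n_trees)
        classes = np.unique(y_train)
        dens = np.zeros((len(X_query), len(classes)))
        for qi in range(len(X_query)):
            for ci, c in enumerate(classes):
                per_tree = []
                for t in range(leaves_train.shape[1]):
                    # Kernels live only on same-leaf, same-class training points.
                    mask = (leaves_train[:, t] == leaves_query[qi, t]) & (y_train == c)
                    if mask.any():
                        d2 = np.sum((X_train[mask] - X_query[qi]) ** 2, axis=1)
                        per_tree.append(np.mean(np.exp(-d2 / (2 * h ** 2))))
                dens[qi, ci] = np.mean(per_tree) if per_tree else 0.0
        return dens  # unnormalized class-conditional densities

    # Example: a circular decision boundary; the origin should score high
    # for class 0 and a far-away point should score near zero everywhere.
    X = np.random.default_rng(2).normal(size=(300, 2))
    y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0).astype(int)
    rf = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)
    print(forest_class_kde(rf, X, y, np.array([[0.0, 0.0], [2.0, 2.0]])))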

    Prospective Learning: Back to the Future

    Research on both natural intelligence (NI) and artificial intelligence (AI) generally assumes that the future resembles the past: intelligent agents or systems (what we call 'intelligence') observe and act on the world, then use this experience to act on future experiences of the same kind. We call this 'retrospective learning'. For example, an intelligence may see a set of pictures of objects, along with their names, and learn to name them. A retrospective learning intelligence would merely be able to name more pictures of the same objects. We argue that this is not what true intelligence is about. In many real-world problems, both NIs and AIs will have to learn for an uncertain future. Both must update their internal models to be useful for future tasks, such as naming fundamentally new objects and using these objects effectively in a new context or to achieve previously unencountered goals. This ability to learn for the future we call 'prospective learning'. We articulate four relevant factors that jointly define prospective learning. Continual learning enables intelligences to remember those aspects of the past which they believe will be most useful in the future. Prospective constraints (including biases and priors) help an intelligence find general solutions that will be applicable to future problems. Curiosity motivates taking actions that inform future decision making, including in previously unmet situations. Causal estimation enables learning the structure of relations that guide choosing actions for specific outcomes, even when the specific action-outcome contingencies have never been observed before. We argue that a paradigm shift from retrospective to prospective learning will enable the communities that study intelligence to unite and overcome existing bottlenecks to more effectively explain, augment, and engineer intelligences.

    Guidance on mucositis assessment from the MASCC Mucositis Study Group and ISOO: an international Delphi study

    Background: Mucositis is a common and highly impactful side effect of conventional and emerging cancer therapies, and is thus the subject of intense investigation. Although common practice, mucositis assessment is heterogeneously adopted and poorly guided, impacting evidence synthesis and translation. The Multinational Association of Supportive Care in Cancer (MASCC) Mucositis Study Group (MSG) therefore aimed to establish expert recommendations for how existing mucositis assessment tools should be used, in clinical care and trial contexts, to improve the consistency of mucositis assessment. Methods: This study was conducted in two phases (January 2022–July 2023). The first phase involved a survey of MASCC-MSG members (January 2022–May 2022), capturing current practices, challenges, and preferences. These results then informed the second phase, in which a set of initial recommendations was prepared and refined using the Delphi method (February 2023–May 2023). Consensus was defined as agreement on a parameter by >80% of respondents. Findings: Seventy-two MASCC-MSG members completed the first phase of the study (37 females, 34 males, mainly oral care specialists). High variability was noted in the use of mucositis assessment tools, with a heavy reliance on clinician assessment compared to patient-reported outcome measures (clinician assessment 47% vs PROMs 3%; 37% used a combination). The World Health Organization (WHO) and Common Terminology Criteria for Adverse Events (CTCAE) scales were most commonly used to assess mucositis across multiple settings. Initial recommendations were reviewed by experienced MSG members, and following two rounds of Delphi surveys, consensus was achieved for 91 of 100 recommendations. For example, in patients receiving chemotherapy, the recommended tool for clinician assessment in clinical practice is WHO for oral mucositis (89.5% consensus), and WHO or CTCAE for gastrointestinal mucositis (85.7% consensus). The recommended PROM in clinical trials is OMD/WQ for oral mucositis (93.3% consensus), and PRO-CTCAE for gastrointestinal mucositis (83.3% consensus). Interpretation: These new recommendations provide much-needed guidance on mucositis assessment and may be applied in both clinical practice and research to streamline comparison and synthesis of global data sets, thus accelerating translation of new knowledge into clinical practice. Funding: No funding was received.