378 research outputs found

    Re-learning How to Teach

    This artifact addresses teacher education during the pandemic, a time when teachers have faced great challenges. Many of the student-teacher concepts that Dr. Kazemi and Dr. Ghousseini advocated for before the pandemic, such as ambitious teaching, rehearsals, and playful learning, are especially important right now. Teaching is more than just delivering content; it involves building relationships with students and their families, which is much more difficult in a pandemic world.

    Surgical Feature-Space Decomposition of LLMs: Why, When and How?

    Low-rank approximations of the weight and feature space can enhance the performance of deep learning models, whether by improving generalization or by reducing inference latency. However, there is no clear consensus yet on how, when and why these approximations are helpful for large language models (LLMs). In this work, we empirically study the efficacy of weight and feature space decomposition in transformer-based LLMs. We demonstrate that surgical decomposition not only provides critical insights into the trade-off between compression and language modelling performance, but also sometimes enhances the commonsense reasoning performance of LLMs. Our empirical analysis identifies specific network segments that intrinsically exhibit a low-rank structure. Furthermore, we extend our investigation to the implications of low-rank approximations for model bias. Overall, our findings offer a novel perspective on optimizing LLMs, presenting low-rank approximation not only as a tool for performance enhancement but also as a means to potentially rectify biases within these models. Our code is available at https://github.com/nyunAI/SFSD-LLM. Comment: Accepted at ACL 202
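The low-rank weight approximation discussed in this abstract can be sketched with a truncated SVD. This is an illustrative toy only, not the paper's surgical per-segment procedure; the matrix size and rank below are arbitrary assumptions.

```python
import numpy as np

def low_rank_approx(W: np.ndarray, r: int) -> np.ndarray:
    """Best rank-r approximation of W in the Frobenius norm (Eckart-Young)."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :r] * S[:r]) @ Vt[:r, :]

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))   # stand-in for one transformer weight matrix
W_8 = low_rank_approx(W, r=8)       # compressed surrogate of W

# Approximation error shrinks monotonically as the retained rank grows,
# which is the compression/quality trade-off the paper studies per segment.
err = lambda r: np.linalg.norm(W - low_rank_approx(W, r))
print(err(4) > err(16))  # True
```

The trade-off the abstract describes then amounts to choosing, per network segment, the smallest rank whose error is tolerable.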

    Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models

    Due to the substantial scale of Large Language Models (LLMs), the direct application of conventional compression methodologies proves impractical. The computational demands associated with even minimal gradient updates present challenges, particularly on consumer-grade hardware. This paper introduces an approach for the parametric and practical compression of LLMs based on reduced order modelling, which entails low-rank decomposition within the feature space and re-parameterization in the weight space. Notably, this compression technique operates layer-wise, obviating the need for a GPU and enabling the compression of billion-scale models within stringent memory and time constraints. Our method represents a significant advancement in model compression by leveraging matrix decomposition, demonstrating superior efficacy compared to the prevailing state-of-the-art structured pruning method. Comment: Brief technical report; code will be made available at https://github.com/transmuteAI/trailmet/tree/main/trailmet/algorithms/llm-ro
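A minimal sketch of the layer-wise idea, assuming a plain linear layer, a small calibration batch `X`, and an arbitrary rank `r` (the paper's actual reduced order modelling procedure is more involved): decompose the feature space with an SVD, then re-parameterize the weights as two thin factors. No gradients or GPU are needed.

```python
import numpy as np

def feature_space_compress(W: np.ndarray, X: np.ndarray, r: int):
    """Factor a linear layer W using the top-r basis of its feature space.

    X: calibration inputs, shape (n, d_in); W: weights, shape (d_in, d_out).
    Returns (W1, W2) with X @ W1 @ W2 ~= X @ W and r*(d_in + d_out) parameters
    instead of d_in*d_out.
    """
    Y = X @ W                                  # layer features on calibration data
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    V = Vt[:r].T                               # top-r feature directions
    return W @ V, V.T                          # re-parameterized weight pair

rng = np.random.default_rng(0)
X = rng.standard_normal((256, 64))
W = rng.standard_normal((64, 32))
W1, W2 = feature_space_compress(W, X, r=8)
print(W1.size + W2.size, W.size)  # 768 2048 -> fewer parameters per layer
```

Because each layer is handled independently from a small calibration batch, a billion-scale model can be processed one layer at a time under tight memory limits.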

    Exploring Multilingual Unseen Speaker Emotion Recognition: Leveraging Co-Attention Cues in Multitask Learning

    The advent of modern deep learning techniques has given rise to advancements in the field of Speech Emotion Recognition (SER). However, most systems in the field fail to generalize to speakers not seen during training. This study focuses on the challenges of multilingual SER, specifically on unseen speakers. We introduce CAMuLeNet, a novel architecture leveraging co-attention-based fusion and multitask learning to address this problem. Additionally, we benchmark the pretrained encoders of Whisper, HuBERT, Wav2Vec2.0, and WavLM using 10-fold leave-speaker-out cross-validation on five existing multilingual benchmark datasets (IEMOCAP, RAVDESS, CREMA-D, EmoDB, and CaFE), and release a novel dataset for SER in Hindi (BhavVani). CAMuLeNet shows an average improvement of approximately 8% over all benchmarks on unseen speakers under our cross-validation strategy. Comment: 5 pages, accepted to INTERSPEECH 2024. The first two authors contributed equally
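As a rough illustration of co-attention-style fusion: one feature stream attends over another via scaled dot-product cross-attention, and the result is fused with the original stream. The single-head form, dimensions, and stream names here are assumptions for illustration, not CAMuLeNet's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Cross-attend stream A (queries) over stream B (keys/values).

    A: (T_a, d), B: (T_b, d). Returns (T_a, d): each A frame becomes a
    convex combination of B frames, weighted by similarity.
    """
    d = A.shape[-1]
    scores = A @ B.T / np.sqrt(d)        # (T_a, T_b) similarity scores
    return softmax(scores, axis=-1) @ B  # (T_a, d) attended features

# Hypothetical example: fuse spectrogram features with features from a
# pretrained speech encoder (names and sizes are illustrative only).
rng = np.random.default_rng(0)
spec = rng.standard_normal((50, 64))     # 50 frames of spectrogram features
enc = rng.standard_normal((40, 64))      # 40 frames of encoder features
fused = np.concatenate([co_attention(spec, enc), spec], axis=-1)
print(fused.shape)  # (50, 128)
```

In a multitask setup, the fused representation would then feed several prediction heads (e.g. emotion plus auxiliary targets) trained jointly.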

    Beyond Uniform Scaling: Exploring Depth Heterogeneity in Neural Architectures

    Conventional scaling of neural networks typically involves designing a base network and growing its dimensions, such as width and depth, by predefined scaling factors. We introduce an automated scaling approach leveraging second-order loss landscape information. Our method is flexible towards skip connections, a mainstay in modern vision transformers. Our training-aware method jointly scales and trains transformers without additional training iterations. Motivated by the hypothesis that not all neurons need uniform depth complexity, our approach embraces depth heterogeneity. Extensive evaluations of DeiT-S on ImageNet100 show a 2.5% accuracy gain and a 10% parameter efficiency improvement over conventional scaling. Scaled networks also demonstrate superior performance when trained from scratch on small-scale datasets. We introduce the first intact scaling mechanism for vision transformers, a step towards efficient model scaling. Comment: Accepted at ICLR 2024 (Tiny Paper Track)

    Physical activity, suicidal ideation, suicide attempt and death among individuals with mental or other medical disorders: a systematic review of observational studies

    A growing body of research has demonstrated the potential role of physical activity as an intervention across mental and other medical disorders. However, the association between physical activity and suicidal ideation, attempts, and deaths has not been systematically appraised in clinical samples. We conducted a PRISMA 2020-compliant systematic review, searching MEDLINE, EMBASE, and PsycINFO for observational studies investigating the influence of physical activity on suicidal behaviour up to December 6, 2023. Of 116 eligible full-text studies, seven (n=141,691) were included. Depression was the most frequently studied mental condition (43%, k=3), followed by chronic pain as the most common other medical condition (29%, k=2). Two case-control studies examined suicide attempts and found an association between physical activity and a reduced frequency of such attempts. However, in studies examining suicidal ideation (k=3) or suicide deaths (k=2), no consistent associations with physical activity were observed. Overall, our systematic review found that physical activity may be linked to a lower frequency of suicide attempts in non-prospective studies involving individuals with mental disorders.