378 research outputs found
Re-learning How to Teach
This artifact addresses teacher education during the pandemic, a time of great challenges for teachers. Many of the student-teacher concepts that Dr. Kazemi and Dr. Ghousseini advocated before the pandemic, such as ambitious teaching, rehearsals, and playful learning, remain important right now. Teaching is more than just delivering content; it involves building relationships with students and their families, which is much more difficult in a pandemic world.
Surgical Feature-Space Decomposition of LLMs: Why, When and How?
Low-rank approximations of the weight and feature space can enhance the
performance of deep learning models, whether in terms of improving
generalization or reducing the latency of inference. However, there is no clear
consensus yet on \emph{how}, \emph{when} and \emph{why} these approximations
are helpful for large language models (LLMs). In this work, we empirically
study the efficacy of weight and feature space decomposition in
transformer-based LLMs. We demonstrate that surgical decomposition not only
provides critical insights into the trade-off between compression and language
modelling performance, but also sometimes enhances commonsense reasoning
performance of LLMs. Our empirical analysis identifies specific network
segments that intrinsically exhibit a low-rank structure. Furthermore, we
extend our investigation to the implications of low-rank approximations on
model bias. Overall, our findings offer a novel perspective on optimizing LLMs,
presenting the low-rank approximation not only as a tool for performance
enhancements, but also as a means to potentially rectify biases within these
models. Our code is available at
\href{https://github.com/nyunAI/SFSD-LLM}{GitHub}.
Comment: Accepted at ACL 2024
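To make the weight-space side of such decompositions concrete, here is a minimal, illustrative sketch of a truncated-SVD factorization of a single torch.nn.Linear layer. The function name and the fixed rank are assumptions for the example, not the paper's surgical procedure, which selects network segments based on empirical analysis.

```python
import torch

def low_rank_factorize(linear: torch.nn.Linear, rank: int) -> torch.nn.Sequential:
    """Replace one dense layer with two smaller ones via truncated SVD.

    Illustrative sketch: W (out x in) is approximated by U_r @ V_r, so the
    map x @ W.T becomes x @ V_r.T @ U_r.T with rank * (in + out) parameters.
    """
    W = linear.weight.data                      # shape (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]                # (out_features, rank)
    V_r = Vh[:rank, :]                          # (rank, in_features)

    down = torch.nn.Linear(linear.in_features, rank, bias=False)
    up = torch.nn.Linear(rank, linear.out_features, bias=linear.bias is not None)
    down.weight.data = V_r.contiguous()
    up.weight.data = U_r.contiguous()
    if linear.bias is not None:
        up.bias.data = linear.bias.data.clone()
    return torch.nn.Sequential(down, up)
```

The swap only saves parameters when rank < (in_features * out_features) / (in_features + out_features); the trade-off the abstract refers to is between how far the rank can be lowered and how much language modelling performance degrades.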
Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models
Due to the substantial scale of Large Language Models (LLMs), the direct
application of conventional compression methodologies proves impractical. The
computational demands associated with even minimal gradient updates present
challenges, particularly on consumer-grade hardware. This paper introduces an
innovative approach for the parametric and practical compression of LLMs based
on reduced order modelling, which entails low-rank decomposition within the
feature space and re-parameterization in the weight space. Notably, this
compression technique operates in a layer-wise manner, obviating the need for a
GPU device and enabling the compression of billion-scale models within
stringent constraints of both memory and time. Our method represents a
significant advancement in model compression by leveraging matrix
decomposition, demonstrating superior efficacy compared to the prevailing
state-of-the-art structured pruning method.
Comment: Brief technical report; Code will be made available at
https://github.com/transmuteAI/trailmet/tree/main/trailmet/algorithms/llm-ro
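As a rough, gradient-free illustration of feature-space compression with weight re-parameterization (a minimal sketch under assumed names and shapes, not the authors' exact reduced order modelling algorithm), one could project a layer's output features onto their top principal directions, estimated from a small calibration batch on CPU, and fold the projection into two smaller weight matrices:

```python
import torch

@torch.no_grad()
def rom_compress_layer(linear: torch.nn.Linear,
                       calib_inputs: torch.Tensor,
                       rank: int) -> torch.nn.Sequential:
    """Layer-wise feature-space compression sketch: find the top principal
    directions of the layer's output features on a calibration batch, then
    re-parameterize the weights as two low-rank factors. No gradients, no GPU.
    """
    Y = calib_inputs @ linear.weight.T          # (n_samples, out_features)
    if linear.bias is not None:
        Y = Y + linear.bias
    Y_centered = Y - Y.mean(dim=0, keepdim=True)
    _, _, Vh = torch.linalg.svd(Y_centered, full_matrices=False)
    P = Vh[:rank, :]                            # (rank, out_features)

    # Re-parameterize: x @ W.T -> (x @ (P @ W).T) @ P, i.e. W ~ P.T @ (P @ W).
    down = torch.nn.Linear(linear.in_features, rank, bias=False)
    up = torch.nn.Linear(rank, linear.out_features, bias=linear.bias is not None)
    down.weight.data = (P @ linear.weight).contiguous()   # (rank, in_features)
    up.weight.data = P.T.contiguous()                     # (out_features, rank)
    if linear.bias is not None:
        up.bias.data = linear.bias.data.clone()
    return torch.nn.Sequential(down, up)
```

Because each layer is processed independently from a small calibration batch, no backpropagation through the full model is needed, which is what makes layer-wise schemes of this kind feasible under tight memory and time budgets.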
Exploring Multilingual Unseen Speaker Emotion Recognition: Leveraging Co-Attention Cues in Multitask Learning
The advent of modern deep learning techniques has given rise to advancements in
the field of Speech Emotion Recognition (SER). However, most systems prevalent
in the field fail to generalize to speakers not seen during training. This
study focuses on handling challenges of multilingual SER, specifically on
unseen speakers. We introduce CAMuLeNet, a novel architecture leveraging
co-attention based fusion and multitask learning to address this problem.
Additionally, we benchmark pretrained encoders of Whisper, HuBERT, Wav2Vec2.0,
and WavLM using 10-fold leave-speaker-out cross-validation on five existing
multilingual benchmark datasets (IEMOCAP, RAVDESS, CREMA-D, EmoDB, and CaFE),
and release a novel dataset for SER in the Hindi language (BhavVani). CAMuLeNet
shows an average improvement of approximately 8% over all benchmarks on unseen
speakers, as determined by our cross-validation strategy.
Comment: 5 pages, accepted to INTERSPEECH 2024. The first two authors contributed equally.
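The evaluation protocol itself is easy to sketch: a 10-fold leave-speaker-out split can be built with scikit-learn's GroupKFold, grouping by speaker ID so that no test-fold speaker ever appears in that fold's training data. The arrays below are hypothetical placeholders, not the paper's pipeline:

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Hypothetical placeholder data: one row per utterance.
rng = np.random.default_rng(0)
features = rng.standard_normal((200, 768))   # e.g. pooled encoder embeddings
labels = rng.integers(0, 4, size=200)        # emotion class per utterance
speakers = np.repeat(np.arange(10), 20)      # exactly 10 distinct speaker IDs

# GroupKFold keeps each speaker's utterances out of the training split of
# the fold that tests on them, which is what "unseen speaker" evaluation means.
cv = GroupKFold(n_splits=10)
for train_idx, test_idx in cv.split(features, labels, groups=speakers):
    assert set(speakers[train_idx]).isdisjoint(speakers[test_idx])
    # ... train on train_idx, evaluate on the held-out speakers in test_idx ...
```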
Beyond Uniform Scaling: Exploring Depth Heterogeneity in Neural Architectures
Conventional scaling of neural networks typically involves designing a base
network and growing its dimensions, such as width and depth, by predefined
scaling factors. We introduce an automated scaling approach leveraging
second-order loss landscape information. Our method is flexible towards skip
connections, a mainstay in modern vision transformers. Our
training-aware method jointly scales and trains transformers without additional
training iterations. Motivated by the hypothesis that not all neurons need
uniform depth complexity, our approach embraces depth heterogeneity. Extensive
evaluations on DeiT-S with ImageNet100 show a 2.5% accuracy gain and 10%
parameter efficiency improvement over conventional scaling. Scaled networks
demonstrate superior performance when trained from scratch on small-scale
datasets. We introduce the first intact scaling mechanism for vision
transformers, a step towards efficient model scaling.
Comment: Accepted at ICLR 2024 (Tiny Papers Track)
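The abstract does not state which second-order quantity drives the scaling decisions, so the following is only one plausible reading, not the authors' method: a Hutchinson estimate of the Hessian trace per parameter group, a standard cheap curvature probe that could signal which blocks warrant extra depth.

```python
import torch

def hessian_trace_estimate(loss: torch.Tensor, params, n_probes: int = 8) -> float:
    """Hutchinson estimator of tr(H), where H is the Hessian of `loss`
    w.r.t. `params`: averages v^T H v over random Rademacher probes v.
    Needs only Hessian-vector products, never the full Hessian.
    """
    params = [p for p in params if p.requires_grad]
    grads = torch.autograd.grad(loss, params, create_graph=True)
    trace = 0.0
    for _ in range(n_probes):
        # Rademacher probes: entries are +1 or -1 with equal probability.
        vs = [torch.randint_like(p, high=2) * 2 - 1 for p in params]
        gv = sum((g * v).sum() for g, v in zip(grads, vs))
        hvs = torch.autograd.grad(gv, params, retain_graph=True)
        trace += sum((hv * v).sum().item() for hv, v in zip(hvs, vs))
    return trace / n_probes

# Hypothetical usage: rank transformer blocks by curvature to decide
# where added depth might pay off.
# traces = {name: hessian_trace_estimate(loss, block.parameters())
#           for name, block in model.named_children()}
```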
Physical activity, suicidal ideation, suicide attempt and death among individuals with mental or other medical disorders: a systematic review of observational studies
A growing body of research has demonstrated the potential role for physical activity as an intervention across mental and other medical disorders. However, the association between physical activity and suicidal ideation, attempts, and deaths has not been systematically appraised in clinical samples. We conducted a PRISMA 2020-compliant systematic review searching MEDLINE, EMBASE, and PsycINFO for observational studies investigating the influence of physical activity on suicidal behaviour up to December 6, 2023. Of 116 eligible full-text studies, seven (n=141,691) were included. Depression was the most frequently studied mental condition (43%, k=3), followed by chronic pain as the most common other medical condition (29%, k=2). Two case-control studies examined suicide attempts and found an association between physical activity and a reduced frequency of such attempts. However, in studies examining suicidal ideation (k=3) or suicide deaths (k=2), no consistent associations with physical activity were observed. Overall, our systematic review found that physical activity may be linked to a lower frequency of suicide attempts in non-prospective studies involving individuals with mental disorders.
