Learning Representations from EEG with Deep Recurrent-Convolutional Neural Networks
One of the challenges in modeling cognitive events from electroencephalogram
(EEG) data is finding representations that are invariant to inter- and
intra-subject differences, as well as to inherent noise associated with such
data. Herein, we propose a novel approach for learning such representations
from multi-channel EEG time-series, and demonstrate its advantages in the
context of a mental load classification task. First, we transform EEG activities
into a sequence of topology-preserving multi-spectral images, as opposed to
standard EEG analysis techniques that ignore such spatial information. Next, we
train a deep recurrent-convolutional network inspired by state-of-the-art video
classification to learn robust representations from the sequence of images. The
proposed approach is designed to preserve the spatial, spectral, and temporal
structure of EEG which leads to finding features that are less sensitive to
variations and distortions within each dimension. Empirical evaluation on the
cognitive load classification task demonstrated significant improvements in
classification accuracy over current state-of-the-art approaches in this field.
Comment: To be published as a conference paper at ICLR 201
Context Attentive Bandits: Contextual Bandit with Restricted Context
We consider a novel formulation of the multi-armed bandit model, which we
call the contextual bandit with restricted context, where only a limited number
of features can be accessed by the learner at every iteration. This novel
formulation is motivated by different online problems arising in clinical
trials, recommender systems and attention modeling. Herein, we adapt the
standard multi-armed bandit algorithm known as Thompson Sampling to take
advantage of our restricted context setting, and propose two novel algorithms,
called the Thompson Sampling with Restricted Context (TSRC) and the Windows
Thompson Sampling with Restricted Context (WTSRC), for handling stationary and
nonstationary environments, respectively. Our empirical results demonstrate
advantages of the proposed approaches on several real-life datasets.
Comment: IJCAI 201
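For orientation, the standard Thompson Sampling algorithm that the abstract says is being adapted can be sketched as follows. This is a minimal Beta-Bernoulli version without any context, not the authors' TSRC or WTSRC; the function name `thompson_sampling`, the reward probabilities, and the uniform Beta(1, 1) prior are illustrative assumptions.

```python
import random


def thompson_sampling(true_probs, rounds, seed=0):
    """Run Beta-Bernoulli Thompson Sampling and return pull counts per arm.

    `true_probs` are hypothetical reward probabilities, unknown to the learner
    and used here only to simulate rewards.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    pulls = [0] * n_arms

    for _ in range(rounds):
        # Draw one sample per arm from its Beta posterior (uniform prior assumed).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Play the arm whose sampled mean reward is largest.
        arm = max(range(n_arms), key=samples.__getitem__)
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    print(thompson_sampling([0.2, 0.8], rounds=2000))
```

In the restricted-context setting described above, the learner would additionally have to choose which limited subset of features to observe in each round; that step is not modeled in this sketch.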
LoRD: Low Rank Decomposition Of Monolingual Code LLMs For One-Shot Compression
Low Rank Decomposition of a matrix, that is, splitting a large matrix into a
product of two smaller matrices, offers a means for compression that reduces the
parameters of a model without sparsification, and hence delivers more speedup on modern
hardware. Moreover, unlike quantization, the compressed linear layers remain
fully differentiable and all the parameters trainable, while being able to
leverage the existing highly efficient kernels over floating point matrices. We
study the potential to compress Large Language Models (LLMs) for monolingual
code generation via Low Rank Decomposition (LoRD) and observe that ranks for
the linear layers in these models can be reduced by up to 39.58% with less than
1% increase in perplexity. We then use Low Rank Decomposition (LoRD) to
compress StarCoder 16B to 13.2B parameters with no drop and to 12.3B with
minimal drop in HumanEval Pass@1 score, in less than 10 minutes on a single
A100. The compressed models speed up inference by up to 22.35% with just a
single line of change in code over Hugging Face's implementation with the PyTorch
backend. Low Rank Decomposition (LoRD) models remain compatible with state-of-the-art
near-lossless quantization methods such as SpQR, which allows leveraging
further compression gains of quantization. Lastly, QLoRA over Low Rank
Decomposition (LoRD) model further reduces memory requirements by as much as
21.2% over vanilla QLoRA while offering similar gains from parameter efficient
fine tuning. Our work shows Low Rank Decomposition (LoRD) as a promising new
paradigm for LLM compression.
Comment: 9 page
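The parameter saving from splitting one weight matrix into two smaller factors can be checked with simple arithmetic. The sketch below assumes a bias-free linear layer of shape d_out x d_in replaced by factors of shape d_out x r and r x d_in; the 4096 x 4096 layer and rank 1024 are hypothetical values chosen for illustration, not figures from the paper.

```python
def dense_params(d_in: int, d_out: int) -> int:
    """Parameter count of a dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def low_rank_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameter count of two factors with shapes (d_out, rank) and (rank, d_in)."""
    return rank * (d_in + d_out)


def max_compressing_rank(d_in: int, d_out: int) -> int:
    """Largest rank for which the two factors hold fewer parameters than the dense matrix."""
    return (d_in * d_out - 1) // (d_in + d_out)


if __name__ == "__main__":
    d_in, d_out, rank = 4096, 4096, 1024  # hypothetical layer size and rank
    full = dense_params(d_in, d_out)
    factored = low_rank_params(d_in, d_out, rank)
    print(f"dense: {full}, factored: {factored}, ratio: {factored / full:.2f}")
    print(f"factorization saves parameters up to rank {max_compressing_rank(d_in, d_out)}")
```

Because both factors are ordinary dense matrices, the compressed layer is just two consecutive matrix multiplications, which is why existing floating-point kernels can be reused.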