5,698 research outputs found
The Capacity and Robustness Trade-off: Revisiting the Channel Independent Strategy for Multivariate Time Series Forecasting
Multivariate time series data comprises various channels of variables. The
multivariate forecasting models need to capture the relationship between the
channels to accurately predict future values. However, recently, there has been
an emergence of methods that employ the Channel Independent (CI) strategy.
These methods view multivariate time series data as separate univariate time
series and disregard the correlation between channels. Surprisingly, our
empirical results have shown that models trained with the CI strategy
outperform those trained with the Channel Dependent (CD) strategy, usually by a
significant margin. Nevertheless, the reasons behind this phenomenon have not
yet been thoroughly explored in the literature. This paper provides
comprehensive empirical and theoretical analyses of the characteristics of
multivariate time series datasets and the CI/CD strategy. Our results conclude
that the CD approach has higher capacity but often lacks robustness to
accurately predict distributionally drifted time series. In contrast, the CI
approach trades capacity for robust prediction. Practical measures inspired by
these analyses are proposed to address the capacity and robustness dilemma,
including a modified CD method called Predict Residuals with Regularization
(PRReg) that can surpass the CI strategy. We hope our findings can raise
awareness among researchers about the characteristics of multivariate time
series and inspire the construction of better forecasting models.
Comment: under review
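As a rough illustration of the capacity gap described in the abstract above, the sketch below fits two linear forecasters on the same windows. It is not the paper's code: the helper names (`make_windows`, `fit_cd`, `fit_ci`), the random-walk data, and the choice of a single weight matrix shared across channels for the CI variant are all assumptions made for this example.

```python
import numpy as np

def make_windows(series, lookback, horizon):
    """Slice a (T, C) series into inputs (N, lookback, C) and targets (N, horizon, C)."""
    n = series.shape[0] - lookback - horizon + 1
    X = np.stack([series[i:i + lookback] for i in range(n)])
    Y = np.stack([series[i + lookback:i + lookback + horizon] for i in range(n)])
    return X, Y

def fit_cd(X, Y):
    """Channel Dependent (assumed form): one linear map from all channels' history
    to all channels' future, so cross-channel weights are learned."""
    n = X.shape[0]
    A = X.reshape(n, -1)                      # (N, lookback * C)
    B = Y.reshape(n, -1)                      # (N, horizon * C)
    W, *_ = np.linalg.lstsq(A, B, rcond=None)
    return W

def fit_ci(X, Y):
    """Channel Independent (assumed form): every channel is treated as its own
    univariate sample and one shared linear map is fitted."""
    n, lookback, channels = X.shape
    horizon = Y.shape[1]
    A = X.transpose(0, 2, 1).reshape(n * channels, lookback)
    B = Y.transpose(0, 2, 1).reshape(n * channels, horizon)
    W, *_ = np.linalg.lstsq(A, B, rcond=None)
    return W

# Hypothetical toy data: three independent random walks.
rng = np.random.default_rng(0)
series = rng.standard_normal((200, 3)).cumsum(axis=0)
X, Y = make_windows(series, lookback=8, horizon=4)

W_cd = fit_cd(X, Y)
W_ci = fit_ci(X, Y)
print(W_cd.shape, W_cd.size)  # (24, 12) 288
print(W_ci.shape, W_ci.size)  # (8, 4) 32
```

Under these assumptions the CD model has nine times as many parameters as the CI model for three channels, which is one concrete sense in which CD has higher capacity; the example says nothing about robustness under distribution drift.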
How to Train Your MAML to Excel in Few-Shot Classification
Model-agnostic meta-learning (MAML) is arguably the most popular
meta-learning algorithm nowadays, given its flexibility to incorporate various
model architectures and to be applied to different problems. Nevertheless, its
performance on few-shot classification is far behind many recent algorithms
dedicated to the problem. In this paper, we point out several key facets of how
to train MAML to excel in few-shot classification. First, we find that a large
number of gradient steps are needed for the inner loop update, which
contradicts the common usage of MAML for few-shot classification. Second, we
find that MAML is sensitive to the permutation of class assignments in
meta-testing: for a few-shot task of N classes, there are exponentially many
ways to assign the learned initialization of the N-way classifier to the N
classes, leading to an unavoidably huge variance. Third, we investigate several
ways to achieve permutation invariance and find that learning a shared classifier
initialization for all the classes performs the best. On benchmark datasets
such as MiniImageNet and TieredImageNet, our approach, which we name
UNICORN-MAML, performs on a par with or even outperforms state-of-the-art
algorithms, while keeping the simplicity of MAML without adding any extra
sub-networks.
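To make the permutation issue in the abstract above concrete, the following sketch counts the ways the output slots of a classifier head can be matched to the classes of one few-shot task. It assumes every one-to-one assignment is allowed, so the count is N factorial, where N (the number of classes) and the function name are labels chosen for this example rather than taken from the paper.

```python
import math
from itertools import permutations

def count_class_assignments(num_classes):
    """Number of one-to-one assignments of classifier outputs to task classes."""
    return math.factorial(num_classes)

# Enumerate the assignments explicitly for a small 3-way task.
three_way = list(permutations(range(3)))
print(len(three_way))              # 6
print(count_class_assignments(5))  # 120
```

Even a 5-way task admits 120 assignments under this assumption, which suggests why accuracy measured with a single arbitrary assignment could vary widely.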
Unlocking the Transferability of Tokens in Deep Models for Tabular Data
Fine-tuning a pre-trained deep neural network has become a successful
paradigm in various machine learning tasks. However, such a paradigm becomes
particularly challenging with tabular data when there are discrepancies between
the feature sets of pre-trained models and the target tasks. In this paper, we
propose TabToken, a method that aims to enhance the quality of feature tokens
(i.e., embeddings of tabular features). TabToken allows for the utilization of
pre-trained models when the upstream and downstream tasks share overlapping
features, facilitating model fine-tuning even with limited training examples.
Specifically, we introduce a contrastive objective that regularizes the tokens,
capturing the semantics within and across features. During the pre-training
stage, the tokens are learned jointly with top-layer deep models such as
a transformer. In the downstream task, tokens of the shared features are kept
fixed while TabToken efficiently fine-tunes the remaining parts of the model.
TabToken not only enables knowledge transfer from a pre-trained model to tasks
with heterogeneous features, but also enhances the discriminative ability of
deep tabular models in standard classification and regression tasks.
Improved Noisy Student Training for Automatic Speech Recognition
Recently, a semi-supervised learning method known as "noisy student training"
has been shown to improve image classification performance of deep networks
significantly. Noisy student training is an iterative self-training method that
leverages augmentation to improve network performance. In this work, we adapt
and improve noisy student training for automatic speech recognition, employing
(adaptive) SpecAugment as the augmentation method. We find effective methods to
filter, balance and augment the data generated in between self-training
iterations. By doing so, we are able to obtain word error rates (WERs)
4.2%/8.6% on the clean/noisy LibriSpeech test sets by only using the clean 100h
subset of LibriSpeech as the supervised set and the rest (860h) as the
unlabeled set. Furthermore, we are able to achieve WERs 1.7%/3.4% on the
clean/noisy LibriSpeech test sets by using the unlab-60k subset of LibriLight
as the unlabeled set for LibriSpeech 960h. We are thus able to improve upon the
previous state-of-the-art clean/noisy test WERs achieved on LibriSpeech 100h
(4.74%/12.20%) and LibriSpeech (1.9%/4.1%).
Comment: 5 pages, 5 figures, 4 tables; v2: minor revisions, reference added
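The word error rates quoted in the abstract above can be turned into relative reductions with a short calculation. This is only arithmetic on the reported figures; the function name and the dictionary labels are invented for this sketch.

```python
def relative_reduction(previous_wer, new_wer):
    """Fraction by which new_wer improves on previous_wer."""
    return (previous_wer - new_wer) / previous_wer

# (previous state of the art, reported result), in percent WER, from the abstract.
reported = {
    "LibriSpeech 100h, clean": (4.74, 4.2),
    "LibriSpeech 100h, noisy": (12.20, 8.6),
    "LibriSpeech 960h, clean": (1.9, 1.7),
    "LibriSpeech 960h, noisy": (4.1, 3.4),
}

reductions = {name: relative_reduction(prev, new) for name, (prev, new) in reported.items()}
for name, value in reductions.items():
    print(f"{name}: {value:.1%} relative WER reduction")
```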
Kinematics of a Trinal-Branch Space Robotic Manipulator with Redundancy
This paper presents a trinal-branch space robotic manipulator with redundancy, motivated by harsh application environments such as the space station. One end-effector of the manipulator can be attached to the base, while the other two are controlled to accomplish tasks. The manipulator permits operation of science payloads during periods when astronauts may not be present. To provide a theoretical basis for kinematics optimization, dynamics optimization and fault-tolerant control, its inverse kinematics is analyzed using screw theory, and a unified formulation is established. Based on the closed-form solution of the spherical wrist, a simplified inverse kinematics is proposed. Computer simulation results demonstrate the validity of the proposed inverse kinematics.