24,191 research outputs found
Dynamic Sparsity Is Channel-Level Sparsity Learner
Sparse training has received an upsurging interest in machine learning due to
its tantalizing saving potential for the entire training process as well as
inference. Dynamic sparse training (DST), as a leading sparse training
approach, can train deep neural networks at high sparsity from scratch to match
the performance of their dense counterparts. However, most if not all DST prior
arts demonstrate their effectiveness on unstructured sparsity with highly
irregular sparse patterns, which receives limited support in common hardware.
This limitation hinders the usage of DST in practice. In this paper, we propose
Channel-aware dynamic sparse (Chase), which for the first time seamlessly
translates the promise of unstructured dynamic sparsity to GPU-friendly
channel-level sparsity (not fine-grained N:M or group sparsity) during one
end-to-end training process, without any ad-hoc operations. The resulting small
sparse networks can be directly accelerated by commodity hardware, without
using any particularly sparsity-aware hardware accelerators. This appealing
outcome is partially motivated by a hidden phenomenon of dynamic sparsity:
off-the-shelf unstructured DST implicitly involves biased parameter
reallocation across channels, with a large fraction of channels (up to 60%)
being sparser than others. By progressively identifying and removing these
channels during training, our approach translates unstructured sparsity to
channel-wise sparsity. Our experimental results demonstrate that Chase achieves
1.7 X inference throughput speedup on common GPU devices without compromising
accuracy with ResNet-50 on ImageNet. We release our codes in
https://github.com/luuyin/chase.Comment: Accepted by the 37th Conference on Neural Information Processing
Systems (NeurIPS 2023
Dependent Nonparametric Bayesian Group Dictionary Learning for online reconstruction of Dynamic MR images
In this paper, we introduce a dictionary learning based approach applied to
the problem of real-time reconstruction of MR image sequences that are highly
undersampled in k-space. Unlike traditional dictionary learning, our method
integrates both global and patch-wise (local) sparsity information and
incorporates some priori information into the reconstruction process. Moreover,
we use a Dependent Hierarchical Beta-process as the prior for the group-based
dictionary learning, which adaptively infers the dictionary size and the
sparsity of each patch; and also ensures that similar patches are manifested in
terms of similar dictionary atoms. An efficient numerical algorithm based on
the alternating direction method of multipliers (ADMM) is also presented.
Through extensive experimental results we show that our proposed method
achieves superior reconstruction quality, compared to the other state-of-the-
art DL-based methods
Sparsity-Promoting Bayesian Dynamic Linear Models
Sparsity-promoting priors have become increasingly popular over recent years
due to an increased number of regression and classification applications
involving a large number of predictors. In time series applications where
observations are collected over time, it is often unrealistic to assume that
the underlying sparsity pattern is fixed. We propose here an original class of
flexible Bayesian linear models for dynamic sparsity modelling. The proposed
class of models expands upon the existing Bayesian literature on sparse
regression using generalized multivariate hyperbolic distributions. The
properties of the models are explored through both analytic results and
simulation studies. We demonstrate the model on a financial application where
it is shown that it accurately represents the patterns seen in the analysis of
stock and derivative data, and is able to detect major events by filtering an
artificial portfolio of assets
Quality-based Multimodal Classification Using Tree-Structured Sparsity
Recent studies have demonstrated advantages of information fusion based on
sparsity models for multimodal classification. Among several sparsity models,
tree-structured sparsity provides a flexible framework for extraction of
cross-correlated information from different sources and for enforcing group
sparsity at multiple granularities. However, the existing algorithm only solves
an approximated version of the cost functional and the resulting solution is
not necessarily sparse at group levels. This paper reformulates the
tree-structured sparse model for multimodal classification task. An accelerated
proximal algorithm is proposed to solve the optimization problem, which is an
efficient tool for feature-level fusion among either homogeneous or
heterogeneous sources of information. In addition, a (fuzzy-set-theoretic)
possibilistic scheme is proposed to weight the available modalities, based on
their respective reliability, in a joint optimization problem for finding the
sparsity codes. This approach provides a general framework for quality-based
fusion that offers added robustness to several sparsity-based multimodal
classification algorithms. To demonstrate their efficacy, the proposed methods
are evaluated on three different applications - multiview face recognition,
multimodal face recognition, and target classification.Comment: To Appear in 2014 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR 2014
Structured Sparsity: Discrete and Convex approaches
Compressive sensing (CS) exploits sparsity to recover sparse or compressible
signals from dimensionality reducing, non-adaptive sensing mechanisms. Sparsity
is also used to enhance interpretability in machine learning and statistics
applications: While the ambient dimension is vast in modern data analysis
problems, the relevant information therein typically resides in a much lower
dimensional space. However, many solutions proposed nowadays do not leverage
the true underlying structure. Recent results in CS extend the simple sparsity
idea to more sophisticated {\em structured} sparsity models, which describe the
interdependency between the nonzero components of a signal, allowing to
increase the interpretability of the results and lead to better recovery
performance. In order to better understand the impact of structured sparsity,
in this chapter we analyze the connections between the discrete models and
their convex relaxations, highlighting their relative advantages. We start with
the general group sparse model and then elaborate on two important special
cases: the dispersive and the hierarchical models. For each, we present the
models in their discrete nature, discuss how to solve the ensuing discrete
problems and then describe convex relaxations. We also consider more general
structures as defined by set functions and present their convex proxies.
Further, we discuss efficient optimization solutions for structured sparsity
problems and illustrate structured sparsity in action via three applications.Comment: 30 pages, 18 figure
- …