LIPIcs, Volume 251, ITCS 2023, Complete Volume
Native Multi-Band Audio Coding within Hyper-Autoencoded Reconstruction Propagation Networks
Spectral sub-bands do not portray the same perceptual relevance. In audio
coding, it is therefore desirable to have independent control over each of the
constituent bands so that bitrate assignment and signal reconstruction can be
achieved efficiently. In this work, we present a novel neural audio coding
network that natively supports a multi-band coding paradigm. Our model extends
the idea of compressed skip connections in the U-Net-based codec, allowing for
independent control over both core and high band-specific reconstructions and
bit allocation. Our system reconstructs the full-band signal mainly from the
condensed core-band code, therefore exploiting and showcasing its bandwidth
extension capabilities to its fullest. Meanwhile, the low-bitrate high-band
code helps the high-band reconstruction similarly to MPEG audio codecs'
spectral bandwidth replication. MUSHRA tests show that the proposed model not
only improves the quality of the core band by explicitly assigning more bits to
it but also retains good quality in the high band.
Comment: Accepted to ICASSP 2023. For resources and examples, see
https://saige.sice.indiana.edu/research-projects/HARP-Net
Neural Vector Fields: Implicit Representation by Explicit Learning
Deep neural networks (DNNs) are now widely applied to 3D surface
reconstruction tasks, and such methods fall into two categories: those that
warp templates explicitly by moving vertices and those that represent 3D
surfaces implicitly as signed or unsigned distance functions.
Taking advantage of both the advanced explicit learning process and the
powerful representation ability of implicit functions, we propose a novel 3D
representation method, Neural Vector Fields (NVF). It not only adopts the
explicit learning process to manipulate meshes directly, but also leverages the
implicit representation of unsigned distance functions (UDFs) to break the
barriers in resolution and topology. Specifically, our method first predicts
the displacements from queries towards the surface and models the shapes as
Vector Fields. Rather than relying on network differentiation to obtain
direction fields, as most existing UDF-based methods do, the produced vector
fields encode both the distance and direction fields and mitigate the ambiguity
at "ridge" points, so that the calculation of direction fields is
straightforward and differentiation-free. The differentiation-free
characteristic enables us to further learn a shape codebook via Vector
Quantization, which encodes the cross-object priors, accelerates the training
procedure, and boosts model generalization on cross-category reconstruction.
Extensive experiments on surface reconstruction benchmarks indicate that
our method outperforms state-of-the-art methods in different evaluation
scenarios including watertight vs non-watertight shapes, category-specific vs
category-agnostic reconstruction, category-unseen reconstruction, and
cross-domain reconstruction. Our code is released at
https://github.com/Wi-sc/NVF.
Comment: Accepted by CVPR 2023. Video:
https://www.youtube.com/watch?v=GMXKoJfmHr
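The abstract above says that a displacement vector field encodes both the distance and the direction to the surface, so no network differentiation is needed. A minimal sketch of that idea follows, assuming an analytic unit sphere in place of a learned network; the function names `sphere_displacement` and `distance_and_direction` are hypothetical and are not taken from the NVF code base.

```python
import numpy as np


def sphere_displacement(queries, radius=1.0):
    """Displacement from each query to its nearest point on an origin-centered sphere.

    Stand-in for a learned vector field (assumption: NVF predicts this with a network).
    """
    norms = np.linalg.norm(queries, axis=1, keepdims=True)
    nearest = queries / norms * radius
    return nearest - queries


def distance_and_direction(displacement, eps=1e-12):
    """Recover unsigned distance and unit direction from displacements, with no gradients."""
    dist = np.linalg.norm(displacement, axis=1)
    direction = displacement / np.maximum(dist, eps)[:, None]
    return dist, direction


queries = np.array([[2.0, 0.0, 0.0], [0.0, 0.5, 0.0]])
disp = sphere_displacement(queries)
dist, direction = distance_and_direction(disp)
```

The unsigned distance is the norm of the displacement and the direction is its normalization, which is why this step needs no automatic differentiation.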
QuantEase: Optimization-based Quantization for Language Models -- An Efficient and Intuitive Algorithm
With the rising popularity of Large Language Models (LLMs), there has been an
increasing interest in compression techniques that enable their efficient
deployment. This study focuses on the Post-Training Quantization (PTQ) of LLMs.
Drawing from recent advances, our work introduces QuantEase, a layer-wise
quantization framework where individual layers undergo separate quantization.
The problem is framed as a discrete-structured non-convex optimization,
prompting the development of algorithms rooted in Coordinate Descent (CD)
techniques. These CD-based methods provide high-quality solutions to the
complex non-convex layer-wise quantization problems. Notably, our CD-based
approach features straightforward updates, relying solely on matrix and vector
operations, circumventing the need for matrix inversion or decomposition. We
also explore an outlier-aware variant of our approach, allowing for retaining
significant weights (outliers) with complete precision. Our proposal attains
state-of-the-art performance in terms of perplexity and zero-shot accuracy in
empirical evaluations across various LLMs and datasets, with relative
improvements up to 15% over methods such as GPTQ. Particularly noteworthy is
our outlier-aware algorithm's capability to achieve near or sub-3-bit
quantization of LLMs with an acceptable drop in accuracy, obviating the need
for non-uniform quantization or grouping techniques, improving upon methods
such as SpQR by up to two times in terms of perplexity
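To make the coordinate-descent idea in the abstract above concrete, here is a minimal sketch of cyclic coordinate descent on a layer-wise objective ||XW - XŴ||² with Ŵ restricted to a fixed grid. It is a generic illustration under stated assumptions (a uniform grid, round-to-nearest initialization, hypothetical function names), not the QuantEase implementation. It uses only matrix and vector operations, with no matrix inversion or decomposition.

```python
import numpy as np


def quantize_to_grid(values, grid):
    """Map each value to its nearest grid point."""
    idx = np.abs(values[..., None] - grid).argmin(axis=-1)
    return grid[idx]


def cd_quantize(X, W, grid, n_sweeps=5):
    """Cyclic coordinate descent on ||X W - X W_hat||_F^2 with W_hat entries on `grid`.

    X: (n, p) calibration inputs, W: (p, q) layer weights (assumed shapes).
    """
    sigma = X.T @ X                      # (p, p) Gram matrix
    W_hat = quantize_to_grid(W, grid)    # round-to-nearest starting point
    for _ in range(n_sweeps):
        for j in range(W.shape[0]):
            err = W_hat - W
            # Effect of all coordinates other than j on row j's 1-D quadratic.
            cross = sigma[j] @ err - sigma[j, j] * err[j]
            target = W[j] - cross / sigma[j, j]   # unconstrained minimizer
            W_hat[j] = quantize_to_grid(target, grid)
    return W_hat


def layer_loss(X, W, W_hat):
    return np.linalg.norm(X @ W - X @ W_hat) ** 2


rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8))
W = rng.normal(scale=0.5, size=(8, 4))
grid = np.linspace(-1.0, 1.0, 8)         # 8 levels, i.e. 3 bits (illustrative)

W_rtn = quantize_to_grid(W, grid)
W_cd = cd_quantize(X, W, grid)
```

Each update minimizes a one-dimensional quadratic over the grid, so the loss never increases from its round-to-nearest starting value.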
Neural Residual Radiance Fields for Streamably Free-Viewpoint Videos
The success of Neural Radiance Fields (NeRFs) in modeling and free-view
rendering of static objects has inspired numerous attempts on dynamic scenes.
Current techniques that use neural rendering for free-view videos (FVVs) are
either restricted to offline rendering or capable of processing only brief
sequences with minimal motion. In this paper, we present
a novel technique, Residual Radiance Field or ReRF, as a highly compact neural
representation to achieve real-time FVV rendering on long-duration dynamic
scenes. ReRF explicitly models the residual information between adjacent
timestamps in the spatial-temporal feature space, with a global
coordinate-based tiny MLP as the feature decoder. Specifically, ReRF employs a
compact motion grid along with a residual feature grid to exploit inter-frame
feature similarities. We show such a strategy can handle large motions without
sacrificing quality. We further present a sequential training scheme to
maintain the smoothness and the sparsity of the motion/residual grids. Based on
ReRF, we design a special FVV codec that achieves a compression rate of three
orders of magnitude, and we provide a companion ReRF player to support online
streaming of long-duration FVVs of dynamic scenes. Extensive experiments
demonstrate the effectiveness of ReRF for compactly representing dynamic
radiance fields, enabling an unprecedented free-viewpoint viewing experience in
speed and quality.
Comment: Accepted by CVPR 2023. Project page, see
https://aoliao12138.github.io/ReRF
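The abstract above describes modeling the residual between adjacent timestamps in a feature grid. A minimal sketch of the residual part follows, assuming plain NumPy arrays as feature grids and omitting the motion grid and the MLP decoder entirely; the function names and the `threshold` parameter are hypothetical.

```python
import numpy as np


def encode_residuals(frames, threshold=0.0):
    """Keep the first grid in full; for later frames keep only the change
    relative to the previous reconstruction (closed-loop residual coding)."""
    key = frames[0].copy()
    recon = key.copy()
    residuals = []
    for frame in frames[1:]:
        r = frame - recon
        r[np.abs(r) < threshold] = 0.0   # optional sparsification (assumption)
        residuals.append(r)
        recon = recon + r
    return key, residuals


def decode_frame(key, residuals, t):
    """Rebuild the grid at time t by accumulating residuals onto the key grid."""
    recon = key.copy()
    for r in residuals[:t]:
        recon = recon + r
    return recon


rng = np.random.default_rng(0)
f0 = rng.normal(size=(4, 4, 4))
f1 = f0.copy()
f1[0, 0, 0] += 1.0                       # small local change between frames
f2 = f1.copy()
f2[1, 1, 1] -= 0.5
frames = [f0, f1, f2]

key, residuals = encode_residuals(frames)
```

When adjacent frames differ only locally, each residual grid is mostly zeros, which is what makes it cheap to store or stream.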
Historical Burdens on Physics
When learning physics, one follows a track very similar to the historical path of the evolution of this science: one takes detours, overcomes superfluous obstacles and repeats mistakes, one learns inappropriate concepts and uses outdated methods. In the book, more than 200 articles present and analyze such obsolete concepts and methods. All articles have the same structure: 1. subject, 2. deficiencies, 3. origin, 4. disposal. The articles originally appeared as columns in various magazines. Accordingly, we tried to write them in an easily understandable way
Robust Brain MRI Image Classification with SIBOW-SVM
The majority of primary Central Nervous System (CNS) tumors in the brain are
among the most aggressive diseases affecting humans. Early detection of brain
tumor types, whether benign or malignant, glial or non-glial, is critical for
cancer prevention and treatment, ultimately improving human life expectancy.
Magnetic Resonance Imaging (MRI) stands as the most effective technique to
detect brain tumors by generating comprehensive brain images through scans.
However, human examination can be error-prone and inefficient due to the
complexity, size, and location variability of brain tumors. Recently, automated
classification techniques using machine learning (ML) methods, such as
Convolutional Neural Network (CNN), have demonstrated significantly higher
accuracy than manual screening, while maintaining low computational costs.
Nonetheless, deep learning-based image classification methods, including CNN,
face challenges in estimating class probabilities without proper model
calibration. In this paper, we propose a novel brain tumor image classification
method, called SIBOW-SVM, which integrates the Bag-of-Features (BoF) model with
SIFT feature extraction and weighted Support Vector Machines (wSVMs). This new
approach effectively captures hidden image features, enabling the
differentiation of various tumor types and accurate label predictions.
Additionally, the SIBOW-SVM is able to estimate the probabilities of images
belonging to each class, thereby providing high-confidence classification
decisions. We have also developed scalable and parallelizable algorithms to
facilitate the practical implementation of SIBOW-SVM for massive images. As a
benchmark, we apply the SIBOW-SVM to a public data set of brain tumor MRI
images containing four classes: glioma, meningioma, pituitary, and normal. Our
results show that the new method outperforms state-of-the-art methods,
including CNN