FreeCOS: Self-Supervised Learning from Fractals and Unlabeled Images for Curvilinear Object Segmentation
Curvilinear object segmentation is critical for many applications. However,
manually annotating curvilinear objects is very time-consuming and error-prone,
leaving too little annotated data available for existing supervised and domain
adaptation methods. This paper proposes a self-supervised
curvilinear object segmentation method that learns robust and distinctive
features from fractals and unlabeled images (FreeCOS). The key contributions
include a novel Fractal-FDA synthesis (FFS) module and a geometric information
alignment (GIA) approach. FFS generates curvilinear structures based on the
parametric Fractal L-system and integrates the generated structures into
unlabeled images to obtain synthetic training images via Fourier Domain
Adaptation. GIA reduces the intensity differences between the synthetic and
unlabeled images by comparing the intensity order of a given pixel to the
values of its nearby neighbors. Such image alignment can explicitly remove the
dependency on absolute intensity values and enhance the inherent geometric
characteristics which are common in both synthetic and real images. In
addition, GIA aligns features of synthetic and real images via the prediction
space adaptation loss (PSAL) and the curvilinear mask contrastive loss (CMCL).
Extensive experimental results on four public datasets, i.e., XCAD, DRIVE,
STARE and CrackTree demonstrate that our method outperforms the
state-of-the-art unsupervised methods, self-supervised methods and traditional
methods by a large margin. The source code of this work is available at
https://github.com/TY-Shi/FreeCOS.
Comment: Accepted by ICCV 202
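The Fourier Domain Adaptation step described above can be sketched in a few lines of numpy: the low-frequency amplitude spectrum of a real unlabeled image is pasted into the synthetic image's spectrum while the synthetic phase (and hence the fractal structures) is kept. This is a generic FDA sketch, not the authors' implementation; the function name and the `beta` band size are illustrative.

```python
import numpy as np

def fourier_domain_adaptation(synthetic, real, beta=0.05):
    """Swap the low-frequency amplitude of `synthetic` for that of `real`,
    keeping the synthetic phase so its curvilinear structures survive.
    `beta` sets the half-width of the swapped low-frequency square."""
    fft_syn = np.fft.fft2(synthetic)
    fft_real = np.fft.fft2(real)
    amp_syn, phase_syn = np.abs(fft_syn), np.angle(fft_syn)
    amp_real = np.abs(fft_real)

    # Centre the spectra so low frequencies sit in the middle.
    amp_syn = np.fft.fftshift(amp_syn)
    amp_real = np.fft.fftshift(amp_real)

    h, w = synthetic.shape
    b = int(min(h, w) * beta)
    ch, cw = h // 2, w // 2
    amp_syn[ch - b:ch + b, cw - b:cw + b] = amp_real[ch - b:ch + b, cw - b:cw + b]

    # Recombine the mixed amplitude with the synthetic phase.
    amp_syn = np.fft.ifftshift(amp_syn)
    adapted = np.fft.ifft2(amp_syn * np.exp(1j * phase_syn))
    return np.real(adapted)

rng = np.random.default_rng(0)
syn = rng.random((64, 64))
real = rng.random((64, 64))
out = fourier_domain_adaptation(syn, real)
```

The phase spectrum carries the geometry of the synthetic curvilinear structures, which is why only the amplitude is exchanged.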
Resource-Efficient Cooperative Online Scalar Field Mapping via Distributed Sparse Gaussian Process Regression
Cooperative online scalar field mapping is an important task for multi-robot
systems. Gaussian process regression is widely used to construct a map that
represents spatial information with confidence intervals. However, it struggles
to handle cooperative online mapping tasks because of its high computation and
communication costs. This letter proposes a resource-efficient
cooperative online field mapping method via distributed sparse Gaussian process
regression. A novel distributed online Gaussian process evaluation method is
developed such that robots can cooperatively evaluate and find observations of
sufficient global utility to reduce computation. The bounded errors of
distributed aggregation results are guaranteed theoretically, and the
performance of the proposed algorithms is validated by real online light
field mapping experiments.
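For readers unfamiliar with the underlying tool, standard Gaussian process regression on a 1-D scalar field (posterior mean plus a confidence interval) looks as follows. This is textbook GP regression with a squared-exponential kernel, not the letter's distributed sparse variant; the kernel hyperparameters are illustrative.

```python
import numpy as np

def rbf(a, b, length=1.0, var=1.0):
    """Squared-exponential kernel between two 1-D location arrays."""
    return var * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length**2)

def gp_predict(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean and standard deviation of the scalar field."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_train, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = rbf(x_test, x_test).diagonal() - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

x = np.linspace(0, 5, 20)   # robot observation locations
y = np.sin(x)               # observed scalar field values
mean, std = gp_predict(x, y, np.array([2.5]))
```

The O(n^3) Cholesky factorisation here is exactly the cost that sparse and distributed approximations aim to reduce.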
OTOv2: Automatic, Generic, User-Friendly
The existing model compression methods via structured pruning typically
require complicated multi-stage procedures. Each individual stage demands
substantial engineering effort and domain knowledge from end-users, which
prevents wider application to broader scenarios. We propose the second
generation of Only-Train-Once (OTOv2), which first automatically trains and
compresses a general DNN only once from scratch to produce a more compact model
with competitive performance without fine-tuning. OTOv2 is automatic and
pluggable into various deep learning applications, and requires minimal
engineering effort from the users. Methodologically, OTOv2 proposes two major
improvements: (i) Autonomy: automatically exploits the dependency of general
DNNs, partitions the trainable variables into Zero-Invariant Groups (ZIGs), and
constructs the compressed model; and (ii) Dual Half-Space Projected Gradient
(DHSPG): a novel optimizer to more reliably solve structured-sparsity problems.
Numerically, we demonstrate the generality and autonomy of OTOv2 on a variety
of model architectures such as VGG, ResNet, CARN, ConvNeXt, DenseNet and
StackedUnets, the majority of which cannot be handled by other methods without
extensive handcrafting efforts. Together with benchmark datasets including
CIFAR10/100, DIV2K, Fashion-MNIST, SVHN and ImageNet, its effectiveness is
validated by performing competitively with or even better than the state of
the art. The source code is available at
https://github.com/tianyic/only_train_once.
Comment: Published at ICLR 2023. Remark here that a few images of dependency
graphs cannot be included in arXiv due to exceeding the size limit.
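The Zero-Invariant Group idea can be illustrated on a single dense layer: grouping each output row with its bias means a zeroed group can simply be dropped without affecting the remaining outputs, yielding a genuinely smaller layer. This is a toy sketch of the ZIG concept, not OTOv2's automatic dependency analysis; the magnitude-based selection rule is illustrative.

```python
import numpy as np

def prune_zero_invariant_groups(weight, bias, keep_ratio=0.5):
    """Treat each output row of a dense layer together with its bias as
    one group; rank groups by norm and keep only the strongest ones."""
    norms = np.sqrt((weight**2).sum(axis=1) + bias**2)
    k = max(1, int(round(len(norms) * keep_ratio)))
    keep = np.sort(np.argsort(norms)[-k:])
    return weight[keep], bias[keep], keep

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 4))
b = rng.standard_normal(8)
W_small, b_small, kept = prune_zero_invariant_groups(W, b, keep_ratio=0.5)

# Surviving outputs are identical to the full layer's: the dropped
# groups were "zero-invariant" with respect to the kept outputs.
x = rng.standard_normal(4)
full = W @ x + b
assert np.allclose(full[kept], W_small @ x + b_small)
```

In a real network, determining which parameters must be grouped together requires the dependency analysis across layers that OTOv2 automates.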
LoRAShear: Efficient Large Language Model Structured Pruning and Knowledge Recovery
Large Language Models (LLMs) have transformed the landscape of artificial
intelligence, while their enormous size presents significant challenges in
terms of computational costs. We introduce LoRAShear, a novel efficient
approach to structurally prune LLMs and recover knowledge. Given general LLMs,
LoRAShear first creates dependency graphs over the LoRA modules to discover
minimally removable structures and analyze the knowledge distribution. It then
performs progressive structured pruning on the LoRA adaptors and enables inherent
knowledge transfer to better preserve the information in the redundant
structures. To recover the lost knowledge during pruning, LoRAShear
meticulously studies and proposes a dynamic fine-tuning scheme with dynamic
data adaptors to effectively narrow down the performance gap to the full
models. Numerical results demonstrate that by only using one GPU within a
couple of GPU days, LoRAShear effectively reduces the footprint of LLMs by 20%
with only 1.0% performance degradation, significantly outperforming the state
of the art. The source code will be available at
https://github.com/microsoft/lorashear
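As a loose illustration of structured pruning guided by LoRA adaptors, one can form the adapted weight W_eff = W + B @ A and drop its weakest output channels. The scoring rule and function below are hypothetical and greatly simplified; they are not LoRAShear's dependency-graph algorithm.

```python
import numpy as np

def lora_structured_prune(W, A, B, keep_ratio=0.75):
    """Merge a LoRA update into the base weight (W_eff = W + B @ A),
    score each output channel by its norm, and keep the strongest."""
    W_eff = W + B @ A
    scores = np.linalg.norm(W_eff, axis=1)
    k = max(1, int(round(W_eff.shape[0] * keep_ratio)))
    keep = np.sort(np.argsort(scores)[-k:])
    return W_eff[keep], keep

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 8))
B = rng.standard_normal((16, 2))   # LoRA "up" factor, rank 2
A = rng.standard_normal((2, 8))    # LoRA "down" factor
W_pruned, kept = lora_structured_prune(W, A, B)
```

In the actual method, removable structures must be consistent across all layers that share a dimension, which is why the dependency graphs are needed.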
Merging Experts into One: Improving Computational Efficiency of Mixture of Experts
Scaling the size of language models usually leads to remarkable advancements
in NLP tasks, but it often comes at the price of growing computational cost.
Although a sparse Mixture of Experts (MoE) can reduce the cost by activating a
small subset of parameters (e.g., one expert) for each input, its computation
escalates significantly as the number of activated experts increases, limiting
its practical utility. Can we retain the advantages of adding more experts
without substantially increasing the computational costs? In this paper, we
first demonstrate the superiority of selecting multiple experts and then
propose a computation-efficient approach called \textbf{\texttt{Merging Experts
into One}} (MEO), which reduces the computation cost to that of a single
expert. Extensive experiments show that MEO significantly improves
computational efficiency, e.g., FLOPs drop from 72.0G for vanilla MoE to 28.6G
(MEO). Moreover, we propose a token-level attention block that further enhances
the efficiency and performance of token-level MEO, e.g., 83.3\% (MEO) vs.
(vanilla MoE) average score on the GLUE benchmark. Our code will be released
at: \url{https://github.com/Shwai-He/MEO}.
Comment: EMNLP 2023 Main Conference (Oral)
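The saving behind merging experts is easiest to see with linear experts under uniform routing: summing the expert outputs equals a single product with the merged weight, so k matrix-vector products collapse into one. This toy demonstration relies on linearity; for real nonlinear FFN experts the merge is an approximation, which is what MEO addresses.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 16, 4
experts = [rng.standard_normal((d, d)) for _ in range(k)]
x = rng.standard_normal(d)

# Vanilla MoE with uniform routing: k matrix-vector products.
y_moe = sum(W @ x for W in experts) / k

# Merged expert: one weight average up front, then a single product.
W_merged = sum(experts) / k
y_meo = W_merged @ x

print(np.allclose(y_moe, y_meo))  # True
```

The merged weight can be precomputed once per routing decision, which is the source of the FLOPs reduction reported above.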
Integrating Stock Features and Global Information via Large Language Models for Enhanced Stock Return Prediction
The remarkable achievements and rapid advancements of Large Language Models
(LLMs) such as ChatGPT and GPT-4 have showcased their immense potential in
quantitative investment. Traders can effectively leverage these LLMs to analyze
financial news and predict stock returns accurately. However, integrating LLMs
into existing quantitative models presents two primary challenges: the
insufficient utilization of semantic information embedded within LLMs and the
difficulties in aligning the latent information within LLMs with pre-existing
quantitative stock features. We propose a novel framework consisting of two
components to surmount these challenges. The first component, the Local-Global
(LG) model, introduces three distinct strategies for modeling global
information. These approaches are grounded respectively on stock features, the
capabilities of LLMs, and a hybrid method combining the two paradigms. The
second component, Self-Correlated Reinforcement Learning (SCRL), focuses on
aligning the embeddings of financial news generated by LLMs with stock features
within the same semantic space. By implementing our framework, we have
demonstrated superior performance in Rank Information Coefficient and returns,
particularly compared to models relying only on stock features in the China
A-share market.
Comment: 8 pages, International Joint Conferences on Artificial Intelligence
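A minimal stand-in for aligning LLM news embeddings with quantitative stock features is a learned linear projection into the feature space. Here plain least squares replaces the paper's Self-Correlated Reinforcement Learning, and every array name and dimension is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_news, d_stock = 200, 64, 16
E_news = rng.standard_normal((n, d_news))    # LLM embeddings of news items
F_stock = rng.standard_normal((n, d_stock))  # quantitative stock features

# Least-squares projection mapping news embeddings into the
# stock-feature space, so both live in one semantic space.
P, *_ = np.linalg.lstsq(E_news, F_stock, rcond=None)
aligned = E_news @ P
```

Once aligned, the projected news signal can be concatenated with the stock features and fed to a downstream return predictor.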
Synthesis and Characterization of Structure-Controlled Micro-/Nanocomposite TiO2
A series of structure-controlled composite TiO2 fibers combining micro- and nanostructures (hereafter, micro-/nanocomposite) were fabricated using a combination of electrospinning and calcination, and their photocatalytic activities were investigated. Smooth microscale fibers were obtained by electrospinning a precursor solution containing tetrabutyl titanate and TiF4. TiO2 nanocrystals formed on the microfibers with the help of HF produced from the decomposition of TiF4 during calcination. The size and quantity of the TiO2 nanocrystals can be controlled by tuning the mass ratio of TiF4 in the sol-gel precursor solutions and the calcination time. The obtained micro-/nanocomposite TiO2 fibers were found to exhibit enhanced photocatalytic properties compared with the bare microfibers. These micro-/nanocomposite structures combine the advantages of both the nanocrystals and the microfibers, which will lead to new developments in photocatalysis.
Towards Automatic Neural Architecture Search within General Super-Networks
Existing neural architecture search (NAS) methods typically rely on
pre-specified super deep neural networks (super-networks) with handcrafted
search spaces beforehand. Such requirements make it challenging to extend these
methods to general scenarios without significant human expertise and manual
intervention. To overcome these limitations, we propose the third generation of
Only-Train-Once (OTOv3). OTOv3 is perhaps the first automated system that
trains general super-networks and produces high-performing sub-networks in the
one-shot manner without pretraining and fine-tuning. Technologically, OTOv3
delivers three noticeable contributions to minimize human efforts: (i)
automatic search space construction for general super-networks; (ii) a
Hierarchical Half-Space Projected Gradient (H2SPG) that leverages the
dependency graph to ensure the network validity during optimization and
reliably produces a solution with both high performance and hierarchical group
sparsity; and (iii) automatic sub-network construction based on the
super-network and the H2SPG solution. Numerically, we demonstrate the
effectiveness of OTOv3 on a variety of super-networks, including RegNet,
StackedUnets, SuperResNet, and DARTS, over benchmark datasets such as CIFAR10,
Fashion-MNIST, ImageNet, STL-10, and SVHN. The sub-networks computed by OTOv3
achieve competitive or even superior performance compared to the super-networks
and other state-of-the-art methods. The library will be released at
https://github.com/tianyic/only_train_once
TSE-GAN: strain elastography using generative adversarial network for thyroid disease diagnosis
Over the past 35 years, studies conducted worldwide have revealed a threefold increase in the incidence of thyroid cancer. Strain elastography is a new imaging technique for identifying benign and malignant thyroid nodules thanks to its sensitivity to tissue stiffness. However, this technique has certain limitations, particularly regarding standardization of the compression process, evaluation of results, and the simplifying assumptions used in commercial strain elastography modes. In this work, we propose a novel conditional generative adversarial network (TSE-GAN) for automatically generating thyroid strain elastograms, which adopts a global-to-local architecture to improve the extraction of multi-scale features and develops an adaptive deformable U-net structure in the sub-generator to apply effective deformation. Furthermore, we introduce a Lab-based loss function to induce the networks to generate realistic thyroid elastograms that conform to the probability distribution of the target domain. Qualitative and quantitative assessments are conducted on a clinical dataset provided by Shanghai Sixth People’s Hospital. Experimental results demonstrate that thyroid elastograms generated by the proposed TSE-GAN outperform state-of-the-art image translation methods in meeting the needs of clinical diagnostic applications and providing practical value.
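A Lab-space reconstruction loss of the kind described can be sketched with a plain numpy sRGB-to-Lab conversion followed by an L1 distance. This is an illustrative loss under standard D65 conversion constants, not the paper's exact formulation.

```python
import numpy as np

def rgb_to_lab(rgb):
    """Convert sRGB values in [0, 1] to CIE Lab (D65 white point)."""
    rgb = np.asarray(rgb, dtype=float)
    # sRGB gamma expansion to linear RGB.
    lin = np.where(rgb > 0.04045, ((rgb + 0.055) / 1.055) ** 2.4, rgb / 12.92)
    # Linear RGB -> XYZ (sRGB/D65 matrix), normalised by the white point.
    M = np.array([[0.4124564, 0.3575761, 0.1804375],
                  [0.2126729, 0.7151522, 0.0721750],
                  [0.0193339, 0.1191920, 0.9503041]])
    xyz = lin @ M.T
    xyz /= np.array([0.95047, 1.0, 1.08883])
    # XYZ -> Lab nonlinearity.
    f = np.where(xyz > 0.008856, np.cbrt(xyz), 7.787 * xyz + 16 / 116)
    L = 116 * f[..., 1] - 16
    a = 500 * (f[..., 0] - f[..., 1])
    b = 200 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)

def lab_l1_loss(pred_rgb, target_rgb):
    """Mean absolute difference between two images in Lab space."""
    return np.abs(rgb_to_lab(pred_rgb) - rgb_to_lab(target_rgb)).mean()
```

Computing the loss in a perceptually motivated space like Lab penalises differences more in line with human perception than a raw RGB distance.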