3D-U-SAM Network For Few-shot Tooth Segmentation in CBCT Images
Accurate representation of tooth position is extremely important in
treatment. 3D dental image segmentation is a widely used method, however
labelled 3D dental datasets are a scarce resource, leading to the problem of
small samples that this task faces in many cases. To this end, we address this
problem with a pretrained SAM and propose a novel 3D-U-SAM network for 3D
dental image segmentation. Specifically, in order to solve the problem of using
2D pre-trained weights on 3D datasets, we adopted a convolution approximation
method; in order to retain more details, we designed skip connections to fuse
features at all levels with reference to U-Net. The effectiveness of the
proposed method is demonstrated in ablation experiments, comparison
experiments, and sample-size experiments.

Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
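The abstract does not spell out its convolution approximation for reusing 2D pretrained weights in 3D. A common way to do this, shown here only as a hedged sketch (the paper's exact method may differ), is kernel inflation in the style of I3D: replicate each 2D kernel along a new depth axis and rescale so the response to a constant input is preserved.

```python
import numpy as np

def inflate_2d_kernel(w2d: np.ndarray, depth: int) -> np.ndarray:
    """Inflate a 2D conv kernel of shape (out, in, kh, kw) into a 3D kernel
    of shape (out, in, depth, kh, kw) by replicating along the new depth
    axis, then dividing by depth so the total kernel mass is unchanged."""
    w3d = np.repeat(w2d[:, :, None, :, :], depth, axis=2)
    return w3d / depth

# Example: inflate 3x3 kernels pretrained on 2D slices to 3x3x3 kernels.
w2d = np.random.randn(64, 3, 3, 3)
w3d = inflate_2d_kernel(w2d, depth=3)
print(w3d.shape)  # (64, 3, 3, 3, 3)
```

Summing the inflated kernel over its depth axis recovers the original 2D weights, which is what makes this a reasonable initialization for volumetric CBCT inputs.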
Self-Improving for Zero-Shot Named Entity Recognition with Large Language Models
Exploring the application of powerful large language models (LLMs) on the
fundamental named entity recognition (NER) task has drawn much attention
recently. This work aims to investigate the possibilities of pushing the
boundary of zero-shot NER with LLM via a training-free self-improving strategy.
We propose a self-improving framework, which utilizes an unlabeled corpus to
stimulate the self-learning ability of LLMs on NER. First, we use LLM to make
predictions on the unlabeled corpus and obtain the self-annotated data. Second,
we explore various strategies to select reliable samples from the
self-annotated dataset as demonstrations, considering the similarity, diversity
and reliability of demonstrations. Finally, we conduct inference for the test
query via in-context learning with the selected self-annotated demonstrations.
Through comprehensive experimental analysis, our study yielded the following
findings: (1) The self-improving framework further pushes the boundary of
zero-shot NER with LLMs, and achieves an obvious performance improvement; (2)
Iterative self-improving or naively increasing the size of unlabeled corpus
does not guarantee improvements; (3) There might still be space for improvement
via more advanced strategies for reliable entity selection.
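The demonstration-selection step above (choosing reliable self-annotated samples by similarity, diversity, and reliability) can be sketched as follows. This is a minimal illustration with hypothetical names and a simple lexical-overlap similarity; the paper explores several richer scoring strategies.

```python
from dataclasses import dataclass

@dataclass
class Demo:
    text: str
    entities: list      # self-annotated (span, type) pairs from the LLM
    confidence: float   # e.g., agreement rate across repeated LLM samples

def select_demonstrations(query: str, pool: list, k: int = 4,
                          min_conf: float = 0.7) -> list:
    """Keep demonstrations whose self-annotation confidence passes a
    threshold (reliability), then rank the survivors by a simple
    token-overlap similarity to the test query."""
    reliable = [d for d in pool if d.confidence >= min_conf]
    q_tokens = set(query.lower().split())
    def sim(d: Demo) -> float:
        t = set(d.text.lower().split())
        return len(q_tokens & t) / max(len(q_tokens | t), 1)
    return sorted(reliable, key=sim, reverse=True)[:k]
```

The selected demonstrations would then be formatted into the in-context prompt ahead of the test query.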
Scheduling Policies for Federated Learning in Wireless Networks
Motivated by the increasing computational capacity of wireless user
equipments (UEs), e.g., smart phones, tablets, or vehicles, as well as the
increasing concerns about sharing private data, a new machine learning model
has emerged, namely federated learning (FL), that allows a decoupling of data
acquisition and computation at the central unit. Unlike centralized learning
taking place in a data center, FL usually operates in a wireless edge network
where the communication medium is resource-constrained and unreliable. Due to
limited bandwidth, only a portion of UEs can be scheduled for updates at each
iteration. Due to the shared nature of the wireless medium, transmissions are
subject to interference and are not guaranteed. The performance of an FL system
in such a setting is not well understood. In this paper, an analytical model is
developed to characterize the performance of FL in wireless networks.
Particularly, tractable expressions are derived for the convergence rate of FL
in a wireless setting, accounting for effects from both scheduling schemes and
inter-cell interference. Using the developed analysis, the effectiveness of
three different scheduling policies, i.e., random scheduling (RS), round robin
(RR), and proportional fair (PF), is compared in terms of FL convergence rate.
It is shown that running FL with PF outperforms RS and RR if the network is
operating under a high signal-to-interference-plus-noise ratio (SINR)
threshold, while RR is preferable when the SINR threshold is low.
Moreover, the FL convergence rate decreases rapidly as the SINR threshold
increases, thus confirming the importance of compression and quantization of
the update parameters. The analysis also reveals a trade-off between the number
of scheduled UEs and subchannel bandwidth under a fixed amount of available
spectrum.
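The three scheduling policies compared above can be sketched in a few lines. This is an illustrative toy implementation under assumed interfaces (per-round instantaneous and average rates for PF), not the paper's analytical model.

```python
import random

def schedule(policy: str, n_ues: int, k: int, t: int,
             rates=None, avg_rates=None):
    """Select k of n_ues user equipments at round t.

    RS: uniform random subset each round.
    RR: fixed-size groups visited cyclically.
    PF: UEs with the highest instantaneous-to-average rate ratio.
    """
    if policy == "RS":
        return random.sample(range(n_ues), k)
    if policy == "RR":
        start = (t * k) % n_ues
        return [(start + i) % n_ues for i in range(k)]
    if policy == "PF":
        ratio = [rates[i] / max(avg_rates[i], 1e-9) for i in range(n_ues)]
        return sorted(range(n_ues), key=lambda i: ratio[i], reverse=True)[:k]
    raise ValueError(f"unknown policy: {policy}")
```

Note how RR guarantees every UE is visited within ceil(n_ues / k) rounds, while PF opportunistically favors UEs in good channel conditions, matching the high-SINR behavior reported in the abstract.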
How Well Do Text Embedding Models Understand Syntax?
Text embedding models have significantly contributed to advancements in
natural language processing by adeptly capturing semantic properties of textual
data. However, the ability of these models to generalize across a wide range of
syntactic contexts remains under-explored. In this paper, we first develop an
evaluation set, named SR, to scrutinize the capability for syntax
understanding of text embedding models from two crucial syntactic aspects:
Structural heuristics, and Relational understanding among concepts, as revealed
by the performance gaps in previous studies. Our findings reveal that existing
text embedding models have not sufficiently addressed these syntactic
understanding challenges, and such ineffectiveness becomes even more apparent
when evaluated against existing benchmark datasets. Furthermore, we conduct
rigorous analysis to unearth factors that lead to such limitations and examine
why previous evaluations fail to detect such ineffectiveness. Lastly, we
propose strategies to augment the generalization ability of text embedding
models in diverse syntactic scenarios. This study serves to highlight the
hurdles associated with syntactic generalization and provides pragmatic
guidance for boosting model performance across varied syntactic contexts.

Comment: Accepted to EMNLP-Findings 2023; datasets and code are released.
Biologically Plausible Sequence Learning with Spiking Neural Networks
Motivated by the celebrated discrete-time model of nervous activity outlined
by McCulloch and Pitts in 1943, we propose a novel continuous-time model, the
McCulloch-Pitts network (MPN), for sequence learning in spiking neural
networks. Our model has a local learning rule, such that the synaptic weight
updates depend only on the information directly accessible by the synapse. By
exploiting asymmetry in the connections between binary neurons, we show that
MPN can be trained to robustly memorize multiple spatiotemporal patterns of
binary vectors, generalizing the ability of the symmetric Hopfield network to
memorize static spatial patterns. In addition, we demonstrate that the model
can efficiently learn sequences of binary pictures as well as generative models
for experimental neural spike-train data. Our learning rule is consistent with
spike-timing-dependent plasticity (STDP), thus providing a theoretical ground
for the systematic design of biologically inspired networks with large and
robust long-range sequence storage capacity.

Comment: Accepted for publication in the Proceedings of the 34th AAAI
Conference on Artificial Intelligence (AAAI-20).
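The core idea of exploiting connection asymmetry for sequence memory has a classic sketch: an asymmetric Hebbian weight matrix that maps each stored pattern to its successor. The snippet below shows that textbook construction only as intuition; it is not the paper's continuous-time MPN rule.

```python
import numpy as np

def asymmetric_weights(patterns):
    """Asymmetric Hebbian weights W = (1/N) * sum_t x_{t+1} x_t^T over a
    sequence of binary (+/-1) patterns, so that W maps each pattern toward
    its successor (cf. the symmetric Hopfield rule, which stores static
    patterns)."""
    X = np.asarray(patterns, dtype=float)
    N = X.shape[1]
    return sum(np.outer(X[t + 1], X[t]) for t in range(len(X) - 1)) / N

def step(W, x):
    """One synchronous update: threshold the weighted input."""
    return np.sign(W @ x)
```

With mutually orthogonal patterns, a single update from pattern x_t lands exactly on x_{t+1}, which is the sequence-recall behavior the asymmetry buys.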
On the Effectiveness of Out-of-Distribution Data in Self-Supervised Long-Tail Learning
Though self-supervised learning (SSL) has been widely studied as a promising
technique for representation learning, it does not generalize well on
long-tailed datasets due to the majority classes dominating the feature space.
Recent work shows that the long-tailed learning performance could be boosted by
sampling extra in-domain (ID) data for self-supervised training, however,
large-scale ID data which can rebalance the minority classes are expensive to
collect. In this paper, we propose an alternative but easy-to-use and effective
solution, Contrastive with Out-of-distribution (OOD) data for Long-Tail
learning (COLT), which can effectively exploit OOD data to dynamically
re-balance the feature space. We empirically identify the counter-intuitive
usefulness of OOD samples in SSL long-tailed learning and design a novel SSL
method accordingly. Concretely, we first localize the 'head' and 'tail' samples
by assigning a tailness score to each OOD sample based on its neighborhoods in
the feature space. Then, we propose an online OOD sampling strategy to
dynamically re-balance the feature space. Finally, we enforce the model to be
capable of distinguishing ID and OOD samples by a distribution-level supervised
contrastive loss. Extensive experiments are conducted on various datasets and
several state-of-the-art SSL frameworks to verify the effectiveness of the
proposed method. The results show that our method significantly improves the
performance of SSL on long-tailed datasets by a large margin, and even
outperforms previous work which uses external ID data. Our code is available at
https://github.com/JianhongBai/COLT
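The tailness-scoring step described above (scoring each OOD sample by its neighborhoods in feature space) can be illustrated with a simple proxy: an OOD sample whose nearest ID neighbors are far away likely sits in a sparse, tail-like region. This is a hypothetical stand-in, not COLT's exact score.

```python
import numpy as np

def tailness_scores(ood_feats: np.ndarray, id_feats: np.ndarray,
                    k: int = 5) -> np.ndarray:
    """Score each OOD feature by the mean Euclidean distance to its k
    nearest ID features; larger means sparser neighborhood, i.e. more
    'tail'-like (illustrative proxy for the paper's neighborhood score)."""
    # Pairwise distances: (n_ood, n_id)
    d = np.linalg.norm(ood_feats[:, None, :] - id_feats[None, :, :], axis=-1)
    nearest = np.sort(d, axis=1)[:, :k]
    return nearest.mean(axis=1)
```

An online sampler could then preferentially draw high-scoring OOD samples each epoch to rebalance the feature space toward the minority classes.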
DIVOTrack: A Novel Dataset and Baseline Method for Cross-View Multi-Object Tracking in DIVerse Open Scenes
Cross-view multi-object tracking aims to link objects between frames and
camera views with substantial overlaps. Although cross-view multi-object
tracking has received increased attention in recent years, existing datasets
still have several issues, including 1) missing real-world scenarios, 2)
lacking diverse scenes, 3) owning a limited number of tracks, 4) comprising
only static cameras, and 5) lacking standard benchmarks, which hinder the
investigation and comparison of cross-view tracking methods. To solve the
aforementioned issues, we introduce DIVOTrack: a new cross-view multi-object
tracking dataset for DIVerse Open scenes with densely tracked pedestrians in
realistic and non-experimental environments. Our DIVOTrack has ten distinct
scenarios and 550 cross-view tracks, surpassing all cross-view multi-object
tracking datasets currently available. Furthermore, we provide a novel baseline
cross-view tracking method with a unified joint detection and cross-view
tracking framework named CrossMOT, which learns object detection, single-view
association, and cross-view matching with an all-in-one embedding model.
Finally, we present a summary of current methodologies and a set of standard
benchmarks with our DIVOTrack to provide a fair comparison and conduct a
comprehensive analysis of current approaches and our proposed CrossMOT. The
dataset and code are available at https://github.com/shengyuhao/DIVOTrack
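CrossMOT's cross-view matching relies on comparing learned appearance embeddings across cameras. A greatly simplified sketch of that matching step, with greedy assignment by cosine similarity (the real system learns the embeddings jointly with detection and single-view association), might look like:

```python
import numpy as np

def cross_view_match(emb_a: np.ndarray, emb_b: np.ndarray,
                     thresh: float = 0.5):
    """Greedily match detections between two camera views by cosine
    similarity of their embeddings. Rows of emb_a / emb_b are per-detection
    feature vectors; returns (index_in_a, index_in_b) pairs."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    sim = a @ b.T
    matches, used = [], set()
    for i in np.argsort(-sim.max(axis=1)):      # most confident rows first
        for j in np.argsort(-sim[i]):           # best candidate first
            if j not in used and sim[i, j] >= thresh:
                matches.append((int(i), int(j)))
                used.add(int(j))
                break
    return matches
```

In practice an optimal assignment (e.g., the Hungarian algorithm) is usually preferred over greedy matching; the threshold handles detections visible in only one view.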