71 research outputs found
Ord2Seq: Regarding Ordinal Regression as Label Sequence Prediction
Ordinal regression refers to classifying object instances into ordinal
categories. It has been widely studied in many scenarios, such as medical
disease grading, movie rating, etc. Existing methods have focused mainly on
learning inter-class ordinal relationships, but still struggle to distinguish
adjacent categories. In this paper, we propose a simple
sequence prediction framework for ordinal regression called Ord2Seq, which, for
the first time, transforms each ordinal category label into a special label
sequence and thus regards an ordinal regression task as a sequence prediction
process. In this way, we decompose an ordinal regression task into a series of
recursive binary classification steps, so as to subtly distinguish adjacent
categories. Comprehensive experiments show the effectiveness of distinguishing
adjacent categories for performance improvement, and our new approach exceeds
state-of-the-art performance in four different scenarios. Code is available
at https://github.com/wjh892521292/Ord2Seq.
Comment: Accepted by ICCV202
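The core label transformation can be sketched as follows. This is a minimal illustration of decomposing an ordinal label into a sequence of binary decisions; the threshold-style encoding here is an assumption for illustration, not necessarily the paper's exact sequence construction:

```python
def label_to_sequence(label: int, num_classes: int) -> list[int]:
    # Each position answers the binary question "is the label > t?"
    # for thresholds t = 0 .. num_classes - 2.
    return [1 if label > t else 0 for t in range(num_classes - 1)]

def sequence_to_label(seq: list[int]) -> int:
    # Decoding: count how many thresholds were passed.
    return sum(seq)
```

Predicting the sequence position by position turns one K-way ordinal decision into K-1 binary ones, each of which focuses on a single adjacent-category boundary.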
OneSeg: Self-learning and One-shot Learning based Single-slice Annotation for 3D Medical Image Segmentation
As deep learning methods continue to improve medical image segmentation
performance, data annotation is still a big bottleneck due to the
labor-intensive and time-consuming burden on medical experts, especially for 3D
images. To significantly reduce annotation efforts while attaining competitive
segmentation accuracy, we propose a self-learning and one-shot learning based
framework for 3D medical image segmentation by annotating only one slice of
each 3D image. Our approach takes two steps: (1) self-learning of a
reconstruction network to learn semantic correspondence among 2D slices within
3D images, and (2) representative selection of single slices for one-shot
manual annotation and propagating the annotated data with the well-trained
reconstruction network. Extensive experiments verify that our new framework
achieves performance comparable to fully supervised methods with less than 1%
of the data annotated, and generalizes well on several out-of-distribution
testing sets.
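Step (2)'s representative slice selection could look roughly like the sketch below, where `slice_features` stands in for per-slice embeddings from the self-trained reconstruction network (a hypothetical interface; the paper's actual selection criterion may differ):

```python
import numpy as np

def select_representative_slice(slice_features: np.ndarray) -> int:
    # slice_features: (num_slices, feat_dim) embeddings of the 2D slices
    # of one 3D volume. Pick the slice whose embedding is closest to the
    # feature-space centroid as the single slice to annotate manually.
    center = slice_features.mean(axis=0)
    dists = np.linalg.norm(slice_features - center, axis=1)
    return int(np.argmin(dists))
```

The learned semantic correspondence would then propagate the one annotated mask from this slice to the remaining slices of the volume.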
ME-GAN: Learning Panoptic Electrocardio Representations for Multi-view ECG Synthesis Conditioned on Heart Diseases
Electrocardiogram (ECG) is a widely used non-invasive diagnostic tool for
heart diseases. Many studies have devised ECG analysis models (e.g.,
classifiers) to assist diagnosis. As an upstream task, researchers have built
generative models to synthesize ECG data, which are beneficial for providing
training samples, protecting privacy, and reducing annotation effort. However,
previous generative methods for ECG typically neither synthesized multi-view
data nor accounted for heart disease conditions. In this paper, we propose a novel
disease-aware generative adversarial network for multi-view ECG synthesis
called ME-GAN, which attains panoptic electrocardio representations conditioned
on heart diseases and projects the representations onto multiple standard views
to yield ECG signals. Since ECG manifestations of heart diseases are often
localized in specific waveforms, we propose a new "mixup normalization" to
inject disease information precisely into suitable locations. In addition, we
propose a view discriminator to revert disordered ECG views into a
pre-determined order, supervising the generator to obtain ECG representing
correct view characteristics. Besides, a new metric, rFID, is presented to
assess the quality of the synthesized ECG signals. Comprehensive experiments
verify that our ME-GAN performs well on multi-view ECG signal synthesis with
trustworthy disease manifestations.
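One way to picture the "mixup normalization" idea is a conditional normalization whose scale is mixed with a disease-derived embedding at chosen locations. The sketch below is an assumption about the general shape, not the paper's exact formulation; `disease_embed` and `mix_weight` are hypothetical names:

```python
import numpy as np

def mixup_normalization(x, disease_embed, mix_weight, eps=1e-5):
    # x: (channels, length) feature map of one ECG sample.
    # disease_embed: (channels,) per-channel scale derived from the
    # disease condition (hypothetical stand-in for the paper's encoder).
    mu = x.mean(axis=1, keepdims=True)
    sigma = x.std(axis=1, keepdims=True)
    normed = (x - mu) / (sigma + eps)
    # Mix disease-conditioned statistics into the normalized features;
    # mix_weight controls how strongly the disease signal is injected.
    gamma = (1 - mix_weight) * 1.0 + mix_weight * disease_embed[:, None]
    return gamma * normed
```

Applying a larger `mix_weight` only at the feature locations corresponding to disease-relevant waveforms would inject the condition precisely where it matters.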
Text2Tree: Aligning Text Representation to the Label Tree Hierarchy for Imbalanced Medical Classification
Deep learning approaches exhibit promising performance on various text
tasks. However, they still struggle with medical text classification, since
samples are often extremely imbalanced and scarce. Different from existing
mainstream approaches that focus on supplementary semantics with external
medical information, this paper aims to rethink the data challenges in medical
texts and present a novel framework-agnostic algorithm called Text2Tree that
only utilizes internal label hierarchy in training deep learning models. We
embed the ICD code tree structure of labels into cascade attention modules for
learning hierarchy-aware label representations. Two new learning schemes,
Similarity Surrogate Learning (SSL) and Dissimilarity Mixup Learning (DML), are
devised to boost text classification by reusing and distinguishing samples of
other labels following the label representation hierarchy, respectively.
Experiments on authoritative public datasets and real-world medical records
show that our approach stably achieves superior performance over classical and
advanced imbalanced classification methods.
Comment: EMNLP 2023 Findings. Code: https://github.com/jyansir/Text2Tre
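A simple way to exploit a label tree like ICD is to score label similarity by shared ancestry. The sketch below uses lowest-common-ancestor depth as a surrogate similarity; this is an illustrative choice, not the paper's SSL/DML objectives:

```python
def lca_depth(path_a, path_b):
    # Each path is the list of node ids from the tree root down to a label.
    depth = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        depth += 1
    return depth

def label_similarity(path_a, path_b):
    # Deeper shared ancestry -> more similar labels; normalized so that
    # identical paths score 1.0.
    return 2 * lca_depth(path_a, path_b) / (len(path_a) + len(path_b))
```

Such a tree-aware similarity lets samples of related rare labels reinforce each other while clearly dissimilar labels are pushed apart.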
GCL: Gradient-Guided Contrastive Learning for Medical Image Segmentation with Multi-Perspective Meta Labels
Since annotating medical images for segmentation tasks typically incurs
high costs, it is highly desirable to design annotation-efficient
methods to alleviate the annotation burden. Recently, contrastive learning has
exhibited a great potential in learning robust representations to boost
downstream tasks with limited labels. In medical imaging scenarios, ready-made
meta labels (i.e., specific attribute information of medical images) inherently
reveal semantic relationships among images, which have been used to define
positive pairs in previous work. However, the multi-perspective semantics
revealed by various meta labels are usually incompatible and can incur
intractable "semantic contradiction" when combining different meta labels. In
this paper, we tackle the issue of "semantic contradiction" in a
gradient-guided manner using our proposed Gradient Mitigator method, which
systematically unifies multi-perspective meta labels to enable a pre-trained
model to attain a better high-level semantic recognition ability. Moreover, we
emphasize that the fine-grained discrimination ability is vital for
segmentation-oriented pre-training, and develop a novel method called Gradient
Filter to dynamically screen pixel pairs with the most discriminating power
based on the magnitude of gradients. Comprehensive experiments on four medical
image segmentation datasets verify that our new method GCL: (1) learns
informative image representations and considerably boosts segmentation
performance with limited labels, and (2) shows promising generalizability on
out-of-distribution datasets.
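The Gradient Filter idea, keeping only the pixel pairs whose loss gradients are largest, can be sketched as a top-k selection; this is a simplified stand-in for the paper's dynamic screening:

```python
import numpy as np

def gradient_filter(pair_grads: np.ndarray, keep_ratio: float) -> np.ndarray:
    # pair_grads: (num_pairs,) gradient magnitudes of the per-pair losses.
    # Keep the fraction keep_ratio of pairs with the largest gradients,
    # i.e. the pairs with the most discriminating power.
    k = max(1, int(len(pair_grads) * keep_ratio))
    idx = np.argsort(pair_grads)[::-1][:k]
    return np.sort(idx)
```

Only the selected pairs would then contribute to the contrastive pre-training loss, focusing capacity on fine-grained discrimination.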
PoCo: A Self-Supervised Approach via Polar Transformation Based Progressive Contrastive Learning for Ophthalmic Disease Diagnosis
Automatic ophthalmic disease diagnosis on fundus images is important in
clinical practice. However, due to complex fundus textures and limited
annotated data, developing an effective automatic method for this problem is
still challenging. In this paper, we present a self-supervised method via polar
transformation based progressive contrastive learning, called PoCo, for
ophthalmic disease diagnosis. Specifically, we inject the polar
transformation into contrastive learning to 1) make contrastive learning
pre-training faster and more stable and 2) naturally capture task-free
and rotation-related textures, which provides insights for disease recognition
on fundus images. As a benefit, simple translation-invariant convolution
on the transformed images can equivalently replace complex rotation-invariant
and sector convolution on the raw images. After that, we develop a progressive
contrastive learning method to efficiently utilize large unannotated images and
a novel progressive hard negative sampling scheme to gradually reduce the
negative sample number for efficient training and performance enhancement.
Extensive experiments on three public ophthalmic disease datasets show that our
PoCo achieves state-of-the-art performance with good generalization ability,
validating that our method can reduce annotation efforts and provide reliable
diagnosis. Code is available at \url{https://github.com/wjh892521292/PoCo}.
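The polar transformation itself can be sketched with plain array resampling; under this transform, a rotation of the input around the image centre becomes a shift along the angular axis. Nearest-neighbour sampling here is a simplification:

```python
import numpy as np

def to_polar(img: np.ndarray, n_r: int = 32, n_theta: int = 64) -> np.ndarray:
    # Nearest-neighbour polar resampling of a 2D image around its centre.
    h, w = img.shape
    cy, cx = (h - 1) / 2, (w - 1) / 2
    rmax = min(cy, cx)
    rs = np.linspace(0, rmax, n_r)
    ts = np.linspace(0, 2 * np.pi, n_theta, endpoint=False)
    ys = np.clip(np.round(cy + rs[:, None] * np.sin(ts[None, :])).astype(int), 0, h - 1)
    xs = np.clip(np.round(cx + rs[:, None] * np.cos(ts[None, :])).astype(int), 0, w - 1)
    # Output shape (n_r, n_theta): rotation of img becomes a roll along axis 1,
    # so ordinary translation-invariant convolution handles rotated textures.
    return img[ys, xs]
```

This is why, after the transform, standard convolutions can replace rotation-invariant or sector convolutions on the raw fundus images.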
Doctor Imitator: Hand-Radiography-based Bone Age Assessment by Imitating Scoring Methods
Bone age assessment is challenging in clinical practice due to its
complicated assessment process. Current automatic bone age assessment
methods were designed with little consideration of the diagnostic logistics
and thus may yield uninterpretable hidden states and outputs. Consequently,
doctors can find it hard to cooperate with such models harmoniously because it
is difficult to check the correctness of the model predictions. In this work,
we propose a new graph-based deep learning framework for bone age assessment
with hand radiographs, called Doctor Imitator (DI). The architecture of DI is
designed to learn the diagnostic logistics of doctors using the scoring methods
(e.g., the Tanner-Whitehouse method) for bone age assessment. Specifically, the
convolutions of DI capture the local features of the anatomical regions of
interest (ROIs) on hand radiographs and predict the ROI scores with our proposed
Anatomy-based Group Convolution, and the scores are summed up for bone age
prediction. Besides,
we develop a novel Dual Graph-based Attention module to compute
patient-specific attention for ROI features and context attention for ROI
scores. As far as we know, DI is the first automatic bone age assessment
framework following the scoring methods without fully supervised hand
radiographs. Experiments on hand radiographs with only bone age supervision
verify that DI can achieve excellent performance with sparse parameters and
provide more interpretability.
Comment: Original Title: "Doctor Imitator: A Graph-based Bone Age Assessment
Framework Using Hand Radiographs"
@inproceedings{chen2020doctor,
  title={Doctor imitator: A graph-based bone age assessment framework using hand radiographs},
  author={Chen, Jintai and Yu, Bohan and Lei, Biwen and Feng, Ruiwei and Chen, Danny Z and Wu, Jian},
  booktitle={MICCAI},
  year={2020}
}
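The score-and-sum structure, one scorer per anatomical ROI whose scores add up to the bone age estimate, can be sketched as below. Per-ROI linear scorers stand in for the Anatomy-based Group Convolution; all names are illustrative:

```python
import numpy as np

def predict_bone_age(roi_features, roi_weights, roi_biases):
    # roi_features: (num_rois, feat_dim), one feature vector per anatomical
    # ROI on the hand radiograph. Each ROI has its own linear scorer,
    # mirroring how scoring methods rate each region separately.
    scores = np.einsum('rf,rf->r', roi_features, roi_weights) + roi_biases
    # The per-ROI scores are summed into the final bone age prediction,
    # which keeps each intermediate score inspectable by a doctor.
    return scores, float(scores.sum())
```

Exposing the per-ROI scores is what makes this structure easier for doctors to check than a single opaque regression output.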
Clean air for some: Unintended spillover effects of regional air pollution policies
China has enacted a number of ambitious pollution control policies to mitigate air pollution in urban areas. Unintended side effects of these policies on other environmental policy arenas and regions have largely been ignored. To bridge this gap, we use a multiregional input-output model in combination with an atmospheric chemical transport model to simulate clean air policy scenarios and evaluate their environmental impacts on primary PM2.5 and secondary precursor emissions, as well as CO2 emissions and water consumption, in the target region, and the spillover effects on other regions. Our results show that the reduction in primary PM2.5 and secondary precursor emissions in the target regions comes at the cost of increasing emissions, especially in neighboring provinces. Similarly, co-benefits of lower CO2 emissions and reduced water consumption in the target region are achieved at the expense of higher impacts elsewhere, through outsourcing production to less developed regions in China.
Making Pre-trained Language Models Great on Tabular Prediction
The transferability of deep neural networks (DNNs) has made significant
progress in image and language processing. However, due to the heterogeneity
among tables, such DNN benefits are still far from being well exploited in
tabular data prediction (e.g., regression or classification tasks). Condensing
knowledge from diverse domains, language models (LMs) possess the capability to
comprehend feature names from various tables, potentially serving as versatile
learners in transferring knowledge across distinct tables and diverse
prediction tasks, but their discrete text representation space is inherently
incompatible with numerical feature values in tables. In this paper, we present
TP-BERTa, a specifically pre-trained LM for tabular data prediction.
Concretely, a novel relative magnitude tokenization converts scalar numerical
feature values to finely discrete, high-dimensional tokens, and an
intra-feature attention approach integrates feature values with the
corresponding feature names. Comprehensive experiments demonstrate that our
pre-trained TP-BERTa achieves leading performance among tabular DNNs and is
competitive with Gradient Boosted Decision Tree models in the typical tabular
data regime.
Comment: Accepted to ICLR 2024 as a spotlight presentation (Notable Top 5%).
The OpenReview link is https://openreview.net/forum?id=anzIzGZuLi, and code
will be available at https://github.com/jyansir/tp-bert
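The idea of relative magnitude tokenization, mapping continuous feature values to discrete token ids by magnitude, can be sketched with simple binning. The bin edges here are an illustrative assumption; the paper's tokenization is more elaborate:

```python
import numpy as np

def magnitude_tokenize(values: np.ndarray, bin_edges: np.ndarray) -> np.ndarray:
    # Map each scalar feature value to a discrete magnitude-token id by
    # binning, so numerical values can live in an LM's token vocabulary.
    return np.searchsorted(bin_edges, values, side='right')
```

Each resulting token id can then be embedded and attended to alongside the embedded feature name, which is the role of the intra-feature attention described above.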
Mind's Mirror: Distilling Self-Evaluation Capability and Comprehensive Thinking from Large Language Models
Large language models (LLMs) have achieved remarkable advancements in the
field of natural language processing. However, the sheer scale and
computational demands of these models present formidable challenges when
considering their practical deployment in resource-constrained contexts. While
techniques such as chain-of-thought (CoT) distillation have displayed promise
in distilling LLMs into small language models (SLMs), there is a risk that
distilled SLMs may still carry over flawed reasoning or hallucinations
inherited from their LLM counterparts. To address these issues, we propose a
twofold methodology: First, we introduce a novel method for distilling the
self-evaluation capability inherent in LLMs into SLMs, which aims to mitigate
the adverse effects of erroneous reasoning and reduce hallucinations. Second,
we advocate for a comprehensive distillation process that incorporates multiple
distinct chain-of-thought and self-evaluation paradigms and ensures a more
holistic and robust knowledge transfer into SLMs. Experiments on three NLP
benchmarks demonstrate that our method significantly improves the performance
of distilled SLMs and sheds light on the path towards developing smaller models
closely aligned with human cognition.
Comment: 13 pages, 5 figure
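The twofold distillation data can be pictured as training targets that pair a teacher LLM's chain-of-thought with its self-evaluation of that reasoning; the packing format below is purely illustrative:

```python
def build_distillation_example(question: str, cot: str, self_eval: str, answer: str) -> dict:
    # Pack the teacher's chain-of-thought and its self-evaluation into one
    # training target for the small model, so the SLM learns to both
    # reason and critique its own reasoning.
    target = (f"Reasoning: {cot}\n"
              f"Self-evaluation: {self_eval}\n"
              f"Answer: {answer}")
    return {"input": question, "target": target}
```

Collecting such examples across multiple distinct reasoning paths per question is what makes the transfer "comprehensive" in the sense described above.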