53 research outputs found
Boosting Unsupervised Contrastive Learning Using Diffusion-Based Data Augmentation From Scratch
Unsupervised contrastive learning methods have recently seen significant
improvements, particularly through data augmentation strategies that aim to
produce robust and generalizable representations. However, prevailing data
augmentation methods, whether hand designed or based on foundation models, tend
to rely heavily on prior knowledge or external data. This dependence often
compromises their effectiveness and efficiency. Furthermore, the applicability
of most existing data augmentation strategies is limited when transitioning to
other research domains, especially science-related data. This limitation stems
from the paucity of prior knowledge and labeled data available in these
domains. To address these challenges, we introduce DiffAug, a novel and
efficient Diffusion-based data Augmentation technique. DiffAug aims to ensure
that the augmented and original data share a smoothed latent space, which is
achieved through diffusion steps. Uniquely, unlike traditional methods, DiffAug
first mines sufficient prior semantic knowledge about the neighborhood. This
provides a constraint to guide the diffusion steps, eliminating the need for
labels, external data/models, or prior knowledge. Designed as an
architecture-agnostic framework, DiffAug provides consistent improvements.
Specifically, it improves image classification and clustering accuracy by
1.6% to 4.5%. When applied to biological data, DiffAug improves performance by up
to 10.1%, with an average improvement of 5.8%. DiffAug shows good performance
in both vision and biological domains.
Comment: arXiv admin note: text overlap with arXiv:2302.07944 by other author
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN
Masked image modeling (MIM), an emerging self-supervised pre-training method,
has shown impressive success across numerous downstream vision tasks with
Vision transformers (ViTs). Its underlying idea is simple: a portion of the
input image is randomly masked out and then reconstructed via the pre-text
task. However, the working principle behind MIM is not well explained, and
previous studies insist that MIM primarily works for the Transformer family but
is incompatible with CNNs. In this paper, we first study interactions among
patches to understand what knowledge is learned and how it is acquired via the
MIM task. We observe that MIM essentially teaches the model to learn better
middle-order interactions among patches and extract more generalized features.
Based on this fact, we propose an Architecture-Agnostic Masked Image Modeling
framework (AMIM), which is compatible with both Transformers and CNNs in a
unified way. Extensive experiments on popular benchmarks show that our AMIM
learns better representations without explicit design and endows the backbone
model with the stronger capability to transfer to various downstream tasks for
both Transformers and CNNs.
Comment: Preprint under review (update reversion). The source code will be
released in https://github.com/Westlake-AI/openmixu
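The masking step described in the abstract above (a portion of the input image is randomly masked out before reconstruction) can be illustrated with a minimal sketch. This is not the AMIM implementation: the function names, the 14 x 14 patch grid, and the 0.75 mask ratio are assumptions chosen for illustration, and the reconstruction network is omitted entirely.

```python
import random


def random_patch_mask(grid_h, grid_w, mask_ratio, seed=None):
    """Return a grid_h x grid_w grid of booleans; True marks a masked patch.

    Hypothetical helper: the grid size and mask ratio are illustrative
    assumptions, not values taken from the abstract.
    """
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be in [0, 1]")
    rng = random.Random(seed)
    num_patches = grid_h * grid_w
    num_masked = int(round(mask_ratio * num_patches))
    # Sample patch indices without replacement so the ratio is exact.
    masked = set(rng.sample(range(num_patches), num_masked))
    return [
        [(row * grid_w + col) in masked for col in range(grid_w)]
        for row in range(grid_h)
    ]


def apply_mask(patches, mask, fill=0.0):
    """Replace every masked patch value with `fill`, leaving the rest unchanged."""
    return [
        [fill if is_masked else value for value, is_masked in zip(p_row, m_row)]
        for p_row, m_row in zip(patches, mask)
    ]


if __name__ == "__main__":
    mask = random_patch_mask(14, 14, mask_ratio=0.75, seed=0)
    patches = [[1.0] * 14 for _ in range(14)]
    masked_patches = apply_mask(patches, mask)
    visible = sum(value == 1.0 for row in masked_patches for value in row)
    print(f"visible patches: {visible} of {14 * 14}")
```

A model trained with this kind of pretext task would receive the masked patches as input and be asked to reconstruct the original values at the masked positions.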
Align Yourself: Self-supervised Pre-training for Fine-grained Recognition via Saliency Alignment
Self-supervised contrastive learning has demonstrated great potential in
learning visual representations. Despite their success on various downstream
tasks such as image classification and object detection, self-supervised
pre-training for fine-grained scenarios is not fully explored. In this paper,
we first point out that current contrastive methods are prone to memorizing
background/foreground texture and therefore have a limitation in localizing the
foreground object. Analysis suggests that learning to extract discriminative
texture information and localization are equally crucial for self-supervised
pre-training in fine-grained scenarios. Based on our findings, we introduce
cross-view saliency alignment (CVSA), a contrastive learning framework that
first crops and swaps saliency regions of images as a novel view generation and
then guides the model to localize on the foreground object via a cross-view
alignment loss. Extensive experiments on four popular fine-grained
classification benchmarks show that CVSA significantly improves the learned
representation.
Comment: The second version of CVSA. 10 pages, 4 figure
Transcriptome analysis to identify candidate genes related to mammary gland development of Bactrian camel (Camelus bactrianus)
Introduction: The demand for camel milk, which has unique therapeutic properties, is increasing. The mammary gland is the organ in mammals responsible for the production and quality of milk. However, few studies have investigated the genes or pathways related to mammary gland growth and development in Bactrian camels. This study aimed to compare the morphological changes in mammary gland tissue and transcriptome expression profiles between young and adult female Bactrian camels and to explore the potential candidate genes and signaling pathways related to mammary gland development.
Methods: Three 2-year-old female camels and three 5-year-old adult female camels were maintained in the same environment. The parenchyma of the mammary gland tissue was sampled from the camels using percutaneous needle biopsy. Morphological changes were observed using hematoxylin-eosin staining. High-throughput RNA sequencing was performed using the Illumina HiSeq platform to analyze changes in the transcriptome between young and adult camels. Functional enrichment, pathway enrichment, and protein–protein interaction networks were also analyzed. Gene expression was verified using quantitative real-time polymerase chain reaction (qRT-PCR).
Results: Histomorphological analysis showed that the mammary ducts and mammary epithelial cells in adult female camels were greatly developed and differentiated from those in young camels. Transcriptome analysis showed that 2,851 differentially expressed genes were obtained in the adult camel group compared to the young camel group, of which 1,420 were upregulated, 1,431 were downregulated, and 2,419 encoded proteins. Functional enrichment analysis revealed that the upregulated genes were significantly enriched for 24 pathways, including the Hedgehog signaling pathway, which is closely related to mammary gland development. The downregulated genes were significantly enriched for seven pathways, among which the Wnt signaling pathway was significantly related to mammary gland development. The protein–protein interaction network sorted the nodes according to the degree of gene interaction and identified nine candidate genes: PRKAB2, PRKAG3, PLCB4, BTRC, GLI1, WIF1, DKK2, FZD3, and WNT4. The expression of fifteen genes randomly detected by qRT-PCR showed results consistent with those of the transcriptome analysis.
Discussion: Preliminary findings indicate that the Hedgehog, Wnt, oxytocin, insulin, and steroid biosynthesis signaling pathways have important effects on mammary gland development in dairy camels. Given the importance of these pathways and the interconnections of the involved genes, the genes in these pathways should be considered potential candidate genes. This study provides a theoretical basis for elucidating the molecular mechanisms associated with mammary gland development and milk production in Bactrian camels.
Fetal Brain Tissue Annotation and Segmentation Challenge Results
In-utero fetal MRI is emerging as an important tool in the diagnosis and
analysis of the developing human brain. Automatic segmentation of the
developing fetal brain is a vital step in the quantitative analysis of prenatal
neurodevelopment both in the research and clinical context. However, manual
segmentation of cerebral structures is time-consuming and prone to error and
inter-observer variability. Therefore, we organized the Fetal Tissue Annotation
(FeTA) Challenge in 2021 in order to encourage the development of automatic
segmentation algorithms on an international level. The challenge utilized FeTA
Dataset, an open dataset of fetal brain MRI reconstructions segmented into
seven different tissues (external cerebrospinal fluid, grey matter, white
matter, ventricles, cerebellum, brainstem, deep grey matter). 20 international
teams participated in this challenge, submitting a total of 21 algorithms for
evaluation. In this paper, we provide a detailed analysis of the results from
both a technical and clinical perspective. All participants relied on deep
learning methods, mainly U-Nets, with some variability present in the network
architecture, optimization, and image pre- and post-processing. The majority of
teams used existing medical imaging deep learning frameworks. The main
differences between the submissions were the fine tuning done during training,
and the specific pre- and post-processing steps performed. The challenge
results showed that almost all submissions performed similarly. Four of the top
five teams used ensemble learning methods. However, one team's algorithm
performed significantly better than the other submissions, and consisted of an
asymmetrical U-Net network architecture. This paper provides a first of its
kind benchmark for future automatic multi-tissue segmentation algorithms for
the developing human brain in utero.
Comment: Results from FeTA Challenge 2021, held at MICCAI; Manuscript
submitte
Long-Term Exposure to Ambient Fine Particles and Heart Rate in Northwestern China: Findings from 1.8 Million Adults of the Kashgar Prospective Cohort Study (KPCS)
Elevated heart rate (HR) can be hypothesized to be involved in the pathways by which ambient air pollution, especially fine particulate matter (PM2.5), causes cardiovascular morbidity and mortality. However, evidence concerning long-term PM2.5 exposure and HR is still limited. Therefore, in this study, we assessed the associations of PM2.5 with HR levels and tachycardia prevalence and explored potential modifiers of the associations. We used baseline data of 1,802,207 adults from the Kashgar Prospective Cohort Study (KPCS). PM2.5 exposure was assessed based on satellite sensing data, meteorological factors, a multi-resolution emission inventory, and measurements from ground-based surface monitors. HR was measured using a calibrated electronic sphygmomanometer, and tachycardia was defined as resting heart rate (RHR) equal to or greater than 80 beats per minute. Linear regression and logistic regression models were employed to evaluate the associations of PM2.5 levels with RHR levels and tachycardia prevalence, respectively. Stratified analyses by sex, age, ethnicity, smoking status, alcohol use, and physical activity were also performed. The mean (standard deviation) age of the study participants was 39.4 (15.5) years. In the adjusted models, an interquartile range (8.8 µg/m³) increase in PM2.5 levels was associated with a 0.515 (95% confidence interval: 0.503–0.526) bpm increase in RHR levels and with a 1.062-fold (95% confidence interval: 1.059–1.064) increase in the odds of tachycardia. The results were robust against several sensitivity analyses. In addition, we observed that the above associations were stronger in participants who were men, of Uyghur ethnicity, smokers, alcohol drinkers, or physically inactive, compared to their counterparts. 
In summary, our findings indicate that long-term exposure to ambient PM2.5 may be hazardously associated with HR, and women, Uyghur people, and those with unhealthy lifestyles may be more vulnerable to the hazardous effects
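The per-interquartile-range estimates quoted in the abstract above can be rescaled to other exposure increments with simple arithmetic. The sketch below is not the study's analysis code: the function names are hypothetical, and the rescaling assumes the reported associations are linear in PM2.5 (on the bpm scale for the linear model and on the log-odds scale for the logistic model). Only the constants 8.8, 0.515, 1.062, and the 80 bpm threshold come from the abstract.

```python
import math

# Values quoted in the abstract.
IQR_UG_M3 = 8.8        # interquartile range of PM2.5, in µg/m³
BPM_PER_IQR = 0.515    # RHR increase per IQR increase in PM2.5
OR_PER_IQR = 1.062     # odds ratio of tachycardia per IQR increase
TACHYCARDIA_BPM = 80   # tachycardia threshold used in the study


def per_unit_slope(effect_per_iqr, iqr):
    """Convert a per-IQR linear effect into a per-unit (1 µg/m³) effect."""
    return effect_per_iqr / iqr


def rescale_odds_ratio(or_per_iqr, iqr, new_increment):
    """Rescale an odds ratio to a different exposure increment.

    Assumes a log-linear exposure-response, as in a standard logistic model.
    """
    return math.exp(math.log(or_per_iqr) * new_increment / iqr)


def is_tachycardia(rhr_bpm, threshold=TACHYCARDIA_BPM):
    """Apply the abstract's definition: RHR equal to or greater than 80 bpm."""
    return rhr_bpm >= threshold


if __name__ == "__main__":
    slope = per_unit_slope(BPM_PER_IQR, IQR_UG_M3)
    or_per_10 = rescale_odds_ratio(OR_PER_IQR, IQR_UG_M3, 10.0)
    print(f"RHR change per 1 µg/m³: {slope:.4f} bpm")
    print(f"Odds ratio per 10 µg/m³: {or_per_10:.3f}")
```

Rescaling in this way only restates the published estimates on a different scale; it does not account for confounding, non-linearity, or the uncertainty in the original confidence intervals.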