Diversity is Strength: Mastering Football Full Game with Interactive Reinforcement Learning of Multiple AIs
Training AI with strong and rich strategies in multi-agent environments
remains an important research topic in Deep Reinforcement Learning (DRL). The
AI's strength is closely related to its diversity of strategies, and this
relationship can guide us to train AI with both strong and rich strategies. To
prove this point, we propose Diversity is Strength (DIS), a novel DRL training
framework that can simultaneously train multiple kinds of AIs. These AIs are
linked through an interconnected history model pool structure, which enhances
their capabilities and strategy diversity. We also design a model evaluation
and screening scheme to select the best models to enrich the model pool and
obtain the final AI. The proposed training method provides diverse,
generalizable, and strong AI strategies without using human data. We tested our
method in an AI competition based on Google Research Football (GRF) and won the
5v5 and 11v11 tracks. For the first time, the method enables a single GRF AI to
perform at a high level on both the 5v5 and 11v11 tracks, which are complex
multi-agent environments. The behavior analysis shows that the trained AI has
rich strategies, and ablation experiments show that the designed modules
benefit the training process.
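The pool-and-screen idea above can be sketched in a few lines; the `ModelPool` class, its evaluation routine, and the win-rate screening rule here are illustrative stand-ins under stated assumptions, not the paper's actual implementation:

```python
class ModelPool:
    """Illustrative stand-in for the interconnected history model pool:
    candidate AIs are evaluated against the pool and only the strongest
    are kept, enriching opponent diversity over time."""

    def __init__(self, capacity=5):
        self.capacity = capacity
        self.models = []  # (name, score) pairs, best first

    def evaluate(self, candidate, matches=100):
        # Placeholder evaluation: in the real framework this would be a
        # win rate from games played against sampled pool opponents.
        wins = sum(candidate() for _ in range(matches))
        return wins / matches

    def screen(self, name, candidate):
        """Model screening: keep the candidate only if it ranks among
        the top-`capacity` models; return whether it was kept."""
        score = self.evaluate(candidate)
        self.models.append((name, score))
        self.models.sort(key=lambda m: m[1], reverse=True)
        self.models = self.models[: self.capacity]
        return any(n == name for n, _ in self.models)
```

A fixed capacity with score-based eviction is one simple way to keep the pool both strong (weak models are dropped) and diverse (several distinct survivors remain as opponents).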
Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis
Histopathology image analysis is the gold standard of clinical diagnosis for
cancers. In doctors' daily routines and in computer-aided diagnosis, the Whole
Slide Image (WSI) of histopathology tissue is used for analysis. Because of the
extremely large resolution, previous methods generally divide the WSI into a
large number of patches and then aggregate all patches within a WSI by
Multi-Instance Learning (MIL) to make the slide-level prediction when
developing computer-aided diagnosis tools. However, most previous WSI-MIL
models, which use either global attention without pairwise interaction or
positional information, or self-attention with absolute position embeddings,
cannot handle shape-varying large WSIs well; e.g., testing WSIs after model
deployment may be larger than training WSIs, since the model development set is
always limited due to the difficulty of collecting histopathology WSIs. To
deal with this problem, in this paper we propose to amend the position
embedding for shape-varying, long-contextual WSIs by introducing Linear Bias
into Attention and adapting it from 1-D long sequences to 2-D long-contextual
WSIs, which helps the model extrapolate position embeddings to unseen or
under-fitted positions. We further utilize the Flash-Attention module to
tackle the computational complexity of the Transformer while keeping full
self-attention performance, in contrast to previous attention-approximation
work. Our method, Long-contextual MIL (Long-MIL), is evaluated in extensive
experiments on four datasets, covering WSI classification and survival
prediction tasks, to validate its superiority on shape-varying WSIs. The
source code will be open-sourced soon.
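As a rough illustration of adapting a linear attention bias from 1-D sequences to 2-D slides, one could penalize each attention logit by the spatial distance between patch grid coordinates; the function below is a minimal sketch in which the slope value and the Euclidean distance metric are assumptions, not the paper's exact formulation:

```python
import math

def alibi_bias_2d(coords, slope=0.5):
    """Sketch of a 2-D linear attention bias (ALiBi-style): the logit
    between patches i and j is penalized by slope * distance between
    their grid coordinates. Because the bias is a linear function of
    distance rather than a learned table, it extends naturally to
    positions never seen during training."""
    n = len(coords)
    bias = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(coords):
        for j, (xj, yj) in enumerate(coords):
            dist = math.hypot(xi - xj, yi - yj)  # Euclidean patch distance
            bias[i][j] = -slope * dist
    return bias  # added to attention logits before the softmax
```

The key property for shape-varying WSIs is that nothing here depends on a maximum sequence length: a test slide larger than any training slide simply produces larger distances and proportionally stronger penalties.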
Test-Time Training for Semantic Segmentation with Output Contrastive Loss
Although deep learning-based segmentation models have achieved impressive
performance on public benchmarks, generalizing well to unseen environments
remains a major challenge. To improve a model's ability to generalize to new
domains during evaluation, test-time training (TTT) adapts the
source-pretrained model in an online fashion. Early efforts on TTT mainly
focus on image classification; directly extending these methods to semantic
segmentation easily leads to unstable adaptation due to segmentation's
inherent characteristics, such as extreme class imbalance and complex decision
spaces. To stabilize the adaptation process, we
introduce contrastive loss (CL), known for its capability to learn robust and
generalized representations. Nevertheless, the traditional CL operates in the
representation space and cannot directly enhance predictions. In this paper, we
resolve this limitation by adapting the CL to the output space, employing a
high temperature, and simplifying the formulation, resulting in a
straightforward yet effective loss function called Output Contrastive Loss
(OCL). Our comprehensive experiments validate the efficacy of our approach
across diverse evaluation scenarios. Notably, our method excels even when
applied to models initially pre-trained using domain adaptation methods on test
domain data, showcasing its resilience and adaptability. Code and more
information can be found at https://github.com/dazhangyu123/OCL.
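A minimal sketch of a contrastive loss computed directly on output probability vectors, rather than on hidden representations, might look as follows; the dot-product similarity, the temperature value, and the simplifications are illustrative assumptions, not the paper's exact OCL formulation:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to a probability distribution."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def output_contrastive_loss(anchor, positive, negatives, temperature=5.0):
    """InfoNCE-style loss on output distributions (not features).
    A high temperature softens the similarity scores, which the
    abstract identifies as important for stable adaptation."""
    def sim(p, q):
        # Dot product of two output probability vectors (an assumption;
        # the paper's similarity measure may differ).
        return sum(a * b for a, b in zip(p, q))
    pos = math.exp(sim(anchor, positive) / temperature)
    neg = sum(math.exp(sim(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))
```

Operating on the output space means the gradient acts directly on predictions, which is the stated motivation for moving the contrastive objective out of the representation space.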
Exploring Unsupervised Cell Recognition with Prior Self-activation Maps
The success of supervised deep learning models on cell recognition tasks
relies on detailed annotations. Many previous works have managed to reduce the
dependency on labels. However, considering the large number of cells contained
in a patch, costly and inefficient labeling is still inevitable. To this end,
we explore label-free methods for cell recognition. We propose prior
self-activation maps (PSMs) to generate pseudo masks as training targets.
Specifically, an activation network is trained with self-supervised learning,
and the gradient information in its shallow layers is aggregated to generate
the prior self-activation maps. A semantic clustering module is then
introduced as a pipeline to transform PSMs into pixel-level semantic pseudo
masks for downstream tasks. We evaluated our method on two histological
datasets: MoNuSeg (cell segmentation) and BCData (multi-class cell detection).
Compared with fully-supervised and weakly-supervised methods, our method
achieves competitive performance without any manual annotations. Our simple
but effective framework also achieves multi-class cell detection, which
existing unsupervised methods cannot do. The results show the potential of
PSMs, which may inspire further research addressing the hunger for labels in
the medical field.
Comment: MICCAI 2023.
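The two-stage pipeline described above (aggregate shallow-layer gradient maps into a PSM, then cluster it into a pseudo mask) can be sketched roughly as follows; the simple averaging and mean-thresholding rules are illustrative placeholders for the paper's actual aggregation and semantic clustering modules:

```python
def prior_self_activation_map(shallow_grads):
    """Average per-layer gradient-magnitude maps from the shallow
    layers of the activation network into a single PSM.
    (Illustrative aggregation; the paper's rule may differ.)"""
    n = len(shallow_grads)
    h, w = len(shallow_grads[0]), len(shallow_grads[0][0])
    return [[sum(g[i][j] for g in shallow_grads) / n for j in range(w)]
            for i in range(h)]

def semantic_pseudo_mask(psm):
    """Split pixels into foreground/background by thresholding at the
    map mean, a two-cluster stand-in for the semantic clustering
    module that produces pixel-level pseudo masks."""
    flat = [v for row in psm for v in row]
    threshold = sum(flat) / len(flat)
    return [[1 if v > threshold else 0 for v in row] for v_row, row in
            zip(psm, psm)]
```

The resulting pseudo masks would then serve as training targets for the downstream segmentation or detection network, replacing manual annotations.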
PwoP: Intrusion-Tolerant and Privacy-Preserving Sensor Fusion
We design and implement PwoP, an efficient and scalable system for intrusion-tolerant and privacy-preserving multi-sensor fusion. PwoP develops and unifies techniques from dependable distributed systems and modern cryptography and, in contrast to prior works, can 1) provably defend against pollution attacks, where some malicious sensors lie about their values to sway the final result, and 2) perform within the computation and bandwidth limitations of cyber-physical systems.
PwoP is flexible and extensible, covering a variety of application scenarios. We demonstrate the practicality of our system using the Raspberry Pi Zero W, and we show that PwoP is efficient in both failure-free and failure scenarios.
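As a toy illustration of why pollution-tolerant fusion is possible at all, a trimmed-mean aggregator bounds the influence of up to f lying sensors; PwoP's actual fusion algorithms and its cryptographic privacy layer are substantially more involved than this sketch:

```python
def robust_fuse(readings, f):
    """Trimmed-mean fusion: discard the f largest and f smallest
    readings, then average the remainder. With more than 2f sensors,
    up to f malicious sensors cannot sway the result arbitrarily,
    because their extreme values are among those discarded."""
    if len(readings) <= 2 * f:
        raise ValueError("need more than 2f sensors to tolerate f faults")
    trimmed = sorted(readings)[f: len(readings) - f]
    return sum(trimmed) / len(trimmed)
```

For example, with five sensors and f = 1, a single sensor reporting an absurd value is simply trimmed away, leaving the honest majority to determine the fused result.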
PathAsst: Redefining Pathology through Generative Foundation AI Assistant for Pathology
As advances in large language models (LLMs) and multimodal techniques
continue to mature, the development of general-purpose multimodal large
language models (MLLMs) has surged, with significant applications in natural
image interpretation. However, the field of pathology has largely remained
untapped in this regard, despite the growing need for accurate, timely, and
personalized diagnostics. To bridge this gap in pathology, we present
PathAsst, a generative foundation AI assistant designed to revolutionize
diagnostic and predictive analytics in pathology. To develop
PathAsst, we collect over 142K high-quality pathology image-text pairs from a
variety of reliable sources, including PubMed, comprehensive pathology
textbooks, reputable pathology websites, and private data annotated by
pathologists. Leveraging the advanced capabilities of ChatGPT/GPT-4, we
generate over 180K instruction-following samples. Furthermore, we devise
additional instruction-following data, specifically tailored for the invocation
of the pathology-specific models, allowing the PathAsst to effectively interact
with these models based on the input image and user intent, consequently
enhancing the model's diagnostic capabilities. Subsequently, PathAsst is
trained on the Vicuna-13B language model in coordination with the CLIP vision
encoder. The results show the potential of harnessing AI-powered generative
foundation models to improve pathology diagnosis and treatment processes. We
are committed to open-sourcing our meticulously curated
dataset, as well as a comprehensive toolkit designed to aid researchers in the
extensive collection and preprocessing of their own datasets. Resources can be
obtained at
https://github.com/superjamessyx/Generative-Foundation-AI-Assistant-for-Pathology.
Comment: 13 pages, 5 figures.
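The model-invocation mechanism described above can be caricatured as a simple intent-based dispatcher; the keyword rule and the tool names below are hypothetical illustrations, not PathAsst's actual API or routing logic:

```python
def route_request(user_text, tools):
    """Dispatch a user request either to a registered
    pathology-specific model or to the language model itself.
    A keyword match stands in for the learned instruction-following
    behavior that decides when to invoke a specialized model."""
    for keyword, tool in tools.items():
        if keyword in user_text.lower():
            return tool(user_text)
    return "answer directly with the language model"

# Hypothetical tool registry; real invocation targets would be
# trained pathology-specific models, not string-returning stubs.
tools = {"segment": lambda q: "invoke segmentation model",
         "detect": lambda q: "invoke detection model"}
```

In the real system, the routing decision is learned from the tailored instruction-following data rather than hard-coded, but the control flow (inspect intent, optionally call a specialist model, then respond) is the same shape.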
Pairwise registration of TLS point clouds by deep multi-scale local features
Because of the mechanism of the TLS system, noise, outliers, various occlusions, and varying cloud densities inevitably exist in collections of TLS point clouds. To achieve automatic TLS point cloud registration, many methods based on hand-crafted keypoint features have been proposed. Despite significant progress, current methods still face great challenges in accomplishing TLS point cloud registration. In this paper, we propose a multi-scale neural network to learn local shape descriptors for establishing correspondences between pairwise TLS point clouds. To train our model, data augmentation based on pairwise semi-synthetic 3D local patches is used to make our network robust to rotation transformations. Then, based on varying local neighborhoods, multi-scale subnetworks are constructed and fused to learn robust local features. Experimental results demonstrate that our proposed method successfully registers two TLS point clouds and outperforms state-of-the-art methods. Besides, our learned descriptors are invariant to translation and tolerant to changes in rotation.
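Establishing correspondences from learned descriptors is commonly done with nearest-neighbor matching plus a ratio test; the sketch below assumes that standard setup and is not necessarily the paper's matching procedure:

```python
def match_descriptors(desc_a, desc_b, ratio=0.8):
    """For each descriptor in cloud A, find its nearest and
    second-nearest descriptors in cloud B (squared Euclidean distance)
    and keep the match only if it passes Lowe's ratio test, which
    rejects ambiguous correspondences. Returns (index_a, index_b) pairs."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    matches = []
    for i, da in enumerate(desc_a):
        dists = sorted((dist2(da, db), j) for j, db in enumerate(desc_b))
        # Compare squared distances, so the ratio must also be squared.
        if len(dists) > 1 and dists[0][0] < ratio ** 2 * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches
```

The resulting correspondence set would then feed a rigid-transform estimator (e.g., RANSAC over putative matches) to produce the final pairwise registration.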
Structural and abnormal electrical properties of excess PbO-doped lead lanthanum titanate thin films
Lead lanthanum titanate (PLT) thin films with excess PbO (from 0 to 20 mol%) were prepared by a metal-organic decomposition process. The ferroelectric properties and capacitance-voltage (C-V) characteristics of the PLT films were investigated as a function of the excess PbO. Abnormal ferroelectric and C-V properties were observed in PLT films with excess PbO. The polarization versus applied electric field (P-E) hysteresis loops were pinched before saturation of the polarization, and the C-V curves had four peaks instead of the two peaks found in normal C-V curves. The abnormality of the hysteresis loops and C-V curves worsens with increasing concentration of excess PbO in the films. Electron probe microanalysis revealed excess Pb in the PLT thin films, and Auger electron spectroscopy detected that the Pb accumulates at the interface between the thin film and the bottom electrode. Meanwhile, transmission electron microscopy found PbO nanocrystals at the interface between the PLT thin film and the bottom electrode, as well as clusters of vacancies and interstitials within the PLT grains. Therefore, part of the excess PbO may accumulate at the domain walls of the grains, the grain boundaries, and the interface between the bottom electrode and the film during thermal annealing, while the oxygen vacancies in the grains increase with increasing concentration of excess PbO. The excess PbO and oxygen vacancies act as pinning centres and exert a strong pinning effect on the domains. When the poling voltage is not large enough, only part of the domains can overcome the pinning force, and abnormal ferroelectric and C-V properties are observed.