64 research outputs found

    Diversity is Strength: Mastering Football Full Game with Interactive Reinforcement Learning of Multiple AIs

    Training AI with strong and rich strategies in multi-agent environments remains an important research topic in Deep Reinforcement Learning (DRL). The AI's strength is closely related to its diversity of strategies, and this relationship can guide us to train AI with both strong and rich strategies. To demonstrate this point, we propose Diversity is Strength (DIS), a novel DRL training framework that can simultaneously train multiple kinds of AIs. These AIs are linked through an interconnected history model pool structure, which enhances their capabilities and strategy diversity. We also design a model evaluation and screening scheme to select the best models to enrich the model pool and obtain the final AI. The proposed training method provides diverse, generalizable, and strong AI strategies without using human data. We tested our method in an AI competition based on Google Research Football (GRF) and won the 5v5 and 11v11 tracks. For the first time, the method enables a GRF AI to reach a high level on both the 5v5 and 11v11 tracks, which are complex multi-agent environments. The behavior analysis shows that the trained AI has rich strategies, and the ablation experiments show that the designed modules benefit the training process.

    Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis

    Histopathology image analysis is the gold standard of clinical diagnosis for cancers. In doctors' daily routine and in computer-aided diagnosis, the Whole Slide Image (WSI) of histopathology tissue is used for analysis. Because of the extremely high resolution, previous methods for computer-aided diagnosis generally divide the WSI into a large number of patches and then aggregate all patches within a WSI by Multi-Instance Learning (MIL) to make the slide-level prediction. However, most previous WSI-MIL models use either global attention without pairwise interaction or positional information, or self-attention with absolute position embedding, and neither handles large WSIs of varying shape well. For example, testing WSIs after model deployment may be larger than training WSIs, since the model development set is always limited due to the difficulty of collecting histopathology WSIs. To deal with this problem, we propose to amend the position embedding for shape-varying, long-contextual WSIs by introducing Linear Bias into Attention and adapting it from 1-d long sequences to 2-d long-contextual WSIs, which helps the model extrapolate the position embedding to unseen or under-fitted positions. We further utilize the Flash-Attention module to tackle the computational complexity of the Transformer, which also keeps full self-attention performance compared to previous attention approximation work. Our method, Long-contextual MIL (Long-MIL), is evaluated in extensive experiments on 4 datasets, covering WSI classification and survival prediction tasks, to validate its superiority on shape-varying WSIs. The source code will be open-accessed soon.
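
    The idea of a linear attention bias over 2-d patch positions can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the Long-MIL authors' implementation: the function names are hypothetical, the per-head slopes follow the geometric sequence commonly used for 1-d linear attention biases, Euclidean distance between patch grid coordinates is an assumed choice of 2-d distance, and plain dense attention stands in for Flash-Attention.

```python
import numpy as np

def alibi_slopes(num_heads):
    # Per-head slopes as a geometric sequence (assumed; standard for 1-d
    # linear biases when the number of heads is a power of two).
    return np.array([2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)])

def linear_bias_2d(coords, num_heads):
    # coords: (N, 2) grid positions (row, col) of the N patches in a slide.
    coords = np.asarray(coords, dtype=np.float64)
    diff = coords[:, None, :] - coords[None, :, :]      # (N, N, 2)
    dist = np.sqrt((diff ** 2).sum(axis=-1))            # (N, N) Euclidean distance
    slopes = alibi_slopes(num_heads)                    # (H,)
    # The penalty grows linearly with distance, so it is defined for any
    # slide size, including positions never seen during training.
    return -slopes[:, None, None] * dist[None, :, :]    # (H, N, N)

def attention_with_bias(q, k, v, bias):
    # q, k, v: (H, N, D); bias: (H, N, N), added to the scaled scores.
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d) + bias
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Usage: three patches on a grid, four attention heads.
coords = [(0, 0), (0, 1), (3, 4)]
bias = linear_bias_2d(coords, num_heads=4)
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 3, 8))
k = rng.normal(size=(4, 3, 8))
v = rng.normal(size=(4, 3, 8))
out, weights = attention_with_bias(q, k, v, bias)
```

    Because the bias depends only on relative distance, no learned position table has to be resized when a test slide is larger than the training slides.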

    Test-Time Training for Semantic Segmentation with Output Contrastive Loss

    Although deep learning-based segmentation models have achieved impressive performance on public benchmarks, generalizing well to unseen environments remains a major challenge. Test-time training (TTT) is a challenging paradigm that improves the model's generalization to a new domain during evaluation by adapting the source-pretrained model in an online fashion. Early efforts on TTT mainly focus on the image classification task. Directly extending these methods to semantic segmentation easily leads to unstable adaptation due to segmentation's inherent characteristics, such as extreme class imbalance and complex decision spaces. To stabilize the adaptation process, we introduce contrastive loss (CL), known for its capability to learn robust and generalized representations. Nevertheless, the traditional CL operates in the representation space and cannot directly enhance predictions. In this paper, we resolve this limitation by adapting the CL to the output space, employing a high temperature, and simplifying the formulation, resulting in a straightforward yet effective loss function called Output Contrastive Loss (OCL). Our comprehensive experiments validate the efficacy of our approach across diverse evaluation scenarios. Notably, our method excels even when applied to models initially pre-trained using domain adaptation methods on test domain data, showcasing its resilience and adaptability. Code and more information can be found at https://github.com/dazhangyu123/OCL

    Exploring Unsupervised Cell Recognition with Prior Self-activation Maps

    The success of supervised deep learning models on cell recognition tasks relies on detailed annotations. Many previous works have managed to reduce the dependency on labels. However, considering the large number of cells contained in a patch, costly and inefficient labeling is still inevitable. To this end, we explored label-free methods for cell recognition. Prior self-activation maps (PSMs) are proposed to generate pseudo masks as training targets. Specifically, an activation network is trained with self-supervised learning, and the gradient information in the shallow layers of the network is aggregated to generate prior self-activation maps. A semantic clustering module is then introduced to transform PSMs into pixel-level semantic pseudo masks for downstream tasks. We evaluated our method on two histological datasets: MoNuSeg (cell segmentation) and BCData (multi-class cell detection). Compared with other fully-supervised and weakly-supervised methods, our method achieves competitive performance without any manual annotations. Our simple but effective framework can also achieve multi-class cell detection, which cannot be done by existing unsupervised methods. The results show the potential of PSMs, which might inspire other research to deal with the hunger for labels in the medical area. Comment: MICCAI 2023. arXiv admin note: substantial text overlap with arXiv:2210.0786

    PwoP: Intrusion-Tolerant and Privacy-Preserving Sensor Fusion

    We design and implement PwoP, an efficient and scalable system for intrusion-tolerant and privacy-preserving multi-sensor fusion. PwoP develops and unifies techniques from dependable distributed systems and modern cryptography, and in contrast to prior works, can 1) provably defend against pollution attacks, where some malicious sensors lie about their values to sway the final result, and 2) perform within the computation and bandwidth limitations of cyber-physical systems. PwoP is flexible and extensible, covering a variety of application scenarios. We demonstrate the practicality of our system using Raspberry Pi Zero W, and we show that PwoP is efficient in both failure-free and failure scenarios.

    PathAsst: Redefining Pathology through Generative Foundation AI Assistant for Pathology

    As large language models (LLMs) and multimodal techniques continue to mature, the development of general-purpose multimodal large language models (MLLMs) has surged, with significant applications in natural image interpretation. However, the field of pathology has largely remained untapped in this regard, despite the growing need for accurate, timely, and personalized diagnostics. To bridge the gap in pathology MLLMs, we present PathAsst, a generative foundation AI assistant to revolutionize diagnostic and predictive analytics in pathology. To develop PathAsst, we collect over 142K high-quality pathology image-text pairs from a variety of reliable sources, including PubMed, comprehensive pathology textbooks, reputable pathology websites, and private data annotated by pathologists. Leveraging the advanced capabilities of ChatGPT/GPT-4, we generate over 180K instruction-following samples. Furthermore, we devise additional instruction-following data specifically tailored for invoking pathology-specific models, allowing PathAsst to interact effectively with these models based on the input image and user intent, consequently enhancing the model's diagnostic capabilities. PathAsst is then trained based on the Vicuna-13B language model in coordination with the CLIP vision encoder. The results of PathAsst show the potential of harnessing an AI-powered generative foundation model to improve pathology diagnosis and treatment processes. We are committed to open-sourcing our meticulously curated dataset, as well as a comprehensive toolkit designed to aid researchers in the extensive collection and preprocessing of their own datasets. Resources can be obtained at https://github.com/superjamessyx/Generative-Foundation-AI-Assistant-for-Pathology. Comment: 13 pages, 5 figures, conferenc

    Pairwise registration of TLS point clouds by deep multi-scale local features

    Because of the mechanism of the TLS system, noise, outliers, various occlusions, varying cloud densities, etc. inevitably exist in collected TLS point clouds. To achieve automatic TLS point cloud registration, many methods based on hand-crafted features of keypoints have been proposed. Despite significant progress, the current methods still face great challenges in accomplishing TLS point cloud registration. In this paper, we propose a multi-scale neural network to learn local shape descriptors for establishing correspondences between pairwise TLS point clouds. To train our model, data augmentation, developed on pairwise semi-synthetic 3D local patches, is used to make our network robust to rotation transformations. Then, based on varying local neighborhoods, multi-scale subnetworks are constructed and fused to learn robust local features. Experimental results demonstrate that our proposed method successfully registers two TLS point clouds and outperforms state-of-the-art methods. In addition, our learned descriptors are invariant to translation and tolerant to changes in rotation.

    Structural and abnormal electrical properties of excess PbO-doped lead lanthanum titanate thin films

    Lead lanthanum titanate (PLT) thin films with excess PbO (from 0 to 20 mol%) were prepared by a metal-organic decomposition process. The ferroelectric properties and current-voltage (C-V) characteristics of PLT films were investigated as a function of the excess PbO. Abnormal ferroelectric and C-V properties were observed in PLT films with excess PbO. The polarization against applied electric field (P-E) hysteresis loops were pinched before saturation of polarization of the films, and C-V curves had four peaks instead of the two peaks found in normal C-V curves. The abnormality of the hysteresis loops and C-V curves worsens with increasing concentrations of excess PbO in the films. Electron probe microanalysis has revealed that there is excess Pb in PLT thin films. Auger electron spectroscopy has detected that the Pb accumulates at the interfaces between the thin film and the bottom electrode. Meanwhile, transmission electron microscopy has found PbO nanocrystals at the interface between the PLT thin film and the bottom electrode, and has shown that clusters of vacancies and interstitials, in particular, exist in the PLT grains. Therefore, a part of the excess PbO may accumulate at the domain walls of the grains, at the grain boundaries, and at the interface between the bottom electrode and the film during the thermal annealing process of the films. Meanwhile, the oxygen vacancies of the grains will increase with the increasing concentration of the excess PbO in the films. The excess PbO and oxygen vacancies act as pinning centres and have a strong pinning effect on the domains. When the poling voltage is not large enough, only part of the domains can overcome the pinning force, and abnormal ferroelectric and C-V properties are observed. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/48906/2/d00703.pd