Assessment of IBM and NASA's geospatial foundation model in flood inundation mapping
Vision foundation models are a new frontier in GeoAI research because of
their potential to enable powerful image analysis by learning and extracting
important image features from vast amounts of geospatial data. This paper
evaluates the performance of the first-of-its-kind geospatial foundation model,
IBM-NASA's Prithvi, to support a crucial geospatial analysis task: flood
inundation mapping. This model is compared with popular convolutional neural
network and vision transformer-based architectures in terms of mapping accuracy
for flooded areas. A benchmark dataset, Sen1Floods11, is used in the
experiments, and the models' predictability, generalizability, and
transferability are evaluated based on both a test dataset and a dataset that
is completely unseen by the model. Results show the impressive transferability
of the Prithvi model, highlighting its performance advantages in segmenting
flooded areas in previously unseen regions. The findings also suggest areas for
improvement for the Prithvi model in terms of adopting multi-scale
representation learning, developing more end-to-end pipelines for high-level
image analysis tasks, and offering more flexibility in terms of input data
bands.
Comment: 11 pages, 4 figures
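The comparison above hinges on per-class segmentation accuracy for flooded pixels. As a minimal illustration of the kind of metric such benchmarks report (a sketch, not the paper's actual evaluation code), intersection-over-union for the flooded class can be computed from binary masks where 1 marks water:

```python
import numpy as np

def flood_iou(pred: np.ndarray, truth: np.ndarray) -> float:
    """Intersection-over-union for the flooded (positive) class."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return float(inter) / union if union else 1.0

# Toy 3x3 masks: the prediction misses one flooded pixel and adds one false alarm.
pred  = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 1]])
truth = np.array([[1, 1, 0], [0, 1, 1], [0, 0, 0]])
print(round(flood_iou(pred, truth), 3))  # 0.6: 3 shared pixels / 5 in the union
```

Transferability is then probed by comparing this score on the held-out test split against the score on a region never seen during training.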
P2RBox: A Single Point is All You Need for Oriented Object Detection
Oriented object detection, a specialized subfield in computer vision, finds
applications across diverse scenarios, excelling particularly when dealing with
objects of arbitrary orientations. Conversely, point annotation, which treats
objects as single points, offers a cost-effective alternative to rotated and
horizontal bounding boxes but sacrifices performance due to the loss of size
and orientation information. In this study, we introduce the P2RBox network,
which leverages point annotations and a mask generator to create mask
proposals, followed by filtration through our Inspector Module and Constrainer
Module. This process selects high-quality masks, which are subsequently
converted into rotated box annotations for training a fully supervised
detector. Specifically, we design an Inspector Module grounded in
multi-instance learning principles to evaluate the semantic score of each mask,
and propose a more robust mask quality assessment in conjunction with the
Constrainer Module. Furthermore, we introduce a Symmetry Axis Estimation (SAE)
Module, inspired by the spectral theorem for symmetric matrices, to transform
the top-performing mask proposal into a rotated bounding box.
P2RBox performs well with three fully supervised rotated object detectors:
RetinaNet, Rotated FCOS, and Oriented R-CNN. Combined with Oriented R-CNN,
P2RBox achieves 62.26% on the DOTA-v1.0 test set. To the best of our knowledge,
this is the first attempt at training an oriented object detector with point
supervision.
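The spectral-theorem connection behind the SAE Module can be illustrated with a standard construction (a sketch under my own assumptions, not the authors' implementation): the 2x2 second-moment matrix of a mask's pixel coordinates is symmetric, so by the spectral theorem it has real eigenvalues and orthogonal eigenvectors, and the leading eigenvector gives the mask's principal axis, from which a rotated-box angle follows:

```python
import numpy as np

def principal_angle(mask: np.ndarray) -> float:
    """Orientation (radians, folded into [0, pi)) of a binary mask's
    principal axis, via eigen-decomposition of the symmetric 2x2
    second-moment matrix of its pixel coordinates."""
    ys, xs = np.nonzero(mask)
    pts = np.stack([xs, ys], axis=1).astype(float)
    pts -= pts.mean(axis=0)              # center the coordinates
    cov = pts.T @ pts / len(pts)         # symmetric second-moment matrix
    vals, vecs = np.linalg.eigh(cov)     # real eigenvalues, orthogonal eigenvectors
    vx, vy = vecs[:, np.argmax(vals)]    # leading eigenvector = principal axis
    return float(np.arctan2(vy, vx) % np.pi)  # fold out the sign ambiguity

# A horizontal 2x6 strip: its principal axis should align with the x-axis.
mask = np.zeros((8, 8), dtype=np.uint8)
mask[3:5, 1:7] = 1
print(round(principal_angle(mask), 3))  # 0.0
```

Pairing this angle with the mask's extent along each eigenvector yields a rotated box; the abstract's SAE Module applies the same spectral reasoning to the top-scoring mask proposal.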
Turning a CLIP Model into a Scene Text Detector
The recent large-scale Contrastive Language-Image Pretraining (CLIP) model
has shown great potential in various downstream tasks via leveraging the
pretrained vision and language knowledge. Scene text, which contains rich
textual and visual information, has an inherent connection with a model like
CLIP. Recently, pretraining approaches based on vision-language models have
made effective progress in the field of text detection. In contrast to these
works, this paper proposes a new method, termed TCM, focused on Turning the
CLIP Model directly into a text detector without a pretraining process. We
demonstrate the advantages of the proposed TCM as follows: (1) The underlying
principle of our framework can be applied to improve existing scene text
detectors. (2) It facilitates the few-shot training capability of existing
methods, e.g., by using 10% of labeled data, we significantly improve the
performance of the baseline method with an average of 22% in terms of the
F-measure on 4 benchmarks. (3) By incorporating the CLIP model into existing
scene text detection methods, we further achieve promising domain adaptation
ability. The code will be publicly released at https://github.com/wenwenyu/TCM.
Comment: CVPR202
Turning a CLIP Model into a Scene Text Spotter
We exploit the potential of the large-scale Contrastive Language-Image
Pretraining (CLIP) model to enhance scene text detection and spotting tasks,
transforming it into a robust backbone, FastTCM-CR50. This backbone utilizes
visual prompt learning and cross-attention in CLIP to extract image and
text-based prior knowledge. Using predefined and learnable prompts,
FastTCM-CR50 introduces an instance-language matching process to enhance the
synergy between image and text embeddings, thereby refining text regions. Our
Bimodal Similarity Matching (BSM) module facilitates dynamic language prompt
generation, enabling offline computations and improving performance.
FastTCM-CR50 offers several advantages: 1) It can enhance existing text
detectors and spotters, improving performance by an average of 1.7% and 1.5%,
respectively. 2) It outperforms the previous TCM-CR50 backbone, yielding an
average improvement of 0.2% and 0.56% in text detection and spotting tasks,
along with a 48.5% increase in inference speed. 3) It showcases robust few-shot
training capabilities. Utilizing only 10% of the supervised data, FastTCM-CR50
improves performance by an average of 26.5% and 5.5% for text detection and
spotting tasks, respectively. 4) It consistently enhances performance on
out-of-distribution text detection and spotting datasets, particularly the
NightTime-ArT subset from ICDAR2019-ArT and the DOTA dataset for oriented
object detection. The code is available at https://github.com/wenwenyu/TCM.
Comment: arXiv admin note: text overlap with arXiv:2302.1433
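At its core, the instance-language matching step described above scores image embeddings against text-prompt embeddings. A toy numpy sketch of that matching (hypothetical 4-d embeddings; not the FastTCM-CR50 code, which operates on CLIP features) makes the mechanism concrete:

```python
import numpy as np

def match_scores(patch_emb: np.ndarray, prompt_emb: np.ndarray) -> np.ndarray:
    """Cosine similarity between each image-patch embedding and one
    text-prompt embedding (a toy stand-in for instance-language matching)."""
    p = patch_emb / np.linalg.norm(patch_emb, axis=-1, keepdims=True)
    t = prompt_emb / np.linalg.norm(prompt_emb)
    return p @ t  # one score per patch, in [-1, 1]

# Hypothetical embeddings: one patch resembles the prompt, one does not.
prompt = np.array([1.0, 0.0, 0.0, 0.0])
text_patch = np.array([0.9, 0.1, 0.0, 0.0])
bg_patch = np.array([0.0, 1.0, 0.0, 0.0])
scores = match_scores(np.stack([text_patch, bg_patch]), prompt)
print(scores[0] > scores[1])  # True: the prompt-like patch scores higher
```

Precomputing the prompt embedding once is what enables the offline computation the abstract credits for the inference-speed gain.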
Upregulated expression of indoleamine 2,3-dioxygenase in CHO cells induces apoptosis of competent T cells and increases proportion of Treg cells
Introduction: The inflammatory enzyme indoleamine 2,3-dioxygenase (IDO) participates in immune tolerance and promotes immune escape of IDO+ tumors. A recent hypothesis suggested that IDO may contribute to the differentiation of new regulatory T cells (Tregs) from naive CD4+ T cells. In this study we investigated the role of IDO in inducing immunosuppression in breast cancer by increasing the apoptosis of T cells and the proportion of Tregs.
Methods: An IDO expression plasmid was constructed, and Chinese hamster ovary (CHO) cells were stably transfected with human IDO. Purified CD3+ T cells were isolated from the peripheral blood mononuclear cells of breast cancer patients. After co-culturing IDO-expressing or untransfected (control) CHO cells with T cells, T cell apoptosis was determined by flow cytometry with annexin-V and PI staining. The proportion of the regulatory T cell (Treg; CD4+CD25+CD127−) subset was measured by flow cytometry. Total RNA and cellular protein samples were isolated from T cells to detect Foxp3 gene and protein expression.
Results: IDO-transgenic CHO cells yielded high levels of IDO enzymatic activity, resulting in complete depletion of tryptophan from the culture medium. Apoptosis occurred in 79.07 ± 8.13% of CD3+ T cells after co-culture with IDO+ CHO cells for 3 days, and the proportion of CD4+CD25+CD127− T cells increased from 3.43 ± 1.07% to 8.98 ± 1.88% (P < 0.05). The specific IDO inhibitor 1-MT efficiently reversed the enhancement of T cell apoptosis and the expansion of Tregs in vitro. Increased expression of Foxp3, a key molecular marker of Tregs, was confirmed by RT-PCR, real-time RT-PCR, and Western blot analysis.
Conclusions: These results suggest that IDO helps create a tolerogenic milieu in breast tumors by directly inducing T cell apoptosis and enhancing Treg-mediated immunosuppression.