Anomaly Detection in Aerial Videos with Transformers
Unmanned aerial vehicles (UAVs) are widely applied to inspection, search,
and rescue operations by virtue of their low-cost, large-coverage,
real-time, and high-resolution data acquisition capacities.
Massive volumes of aerial videos are produced in these processes, in which
normal events often account for an overwhelming proportion. It is extremely
difficult to localize and extract abnormal events containing potentially
valuable information from long video streams manually. Therefore, we are
dedicated to developing anomaly detection methods to solve this issue. In this
paper, we create a new dataset, named Drone-Anomaly, for anomaly detection in
aerial videos. This dataset provides 37 training video sequences and 22 testing
video sequences from 7 different realistic scenes with various anomalous
events. There are 87,488 color video frames (51,635 for training and 35,853 for
testing) with the size of at 30 frames per second. Based on
this dataset, we evaluate existing methods and offer a benchmark for this task.
Furthermore, we present a new baseline model, ANomaly Detection with
Transformers (ANDT), which treats consecutive video frames as a sequence of
tubelets, utilizes a Transformer encoder to learn feature representations from
the sequence, and leverages a decoder to predict the next frame. Our network
models normality in the training phase and identifies an event with
unpredictable temporal dynamics as an anomaly in the test phase. Moreover, to
comprehensively evaluate the performance of our proposed method, we use not
only our Drone-Anomaly dataset but also another dataset. We make our dataset
and code publicly available. A demo video is available at
https://youtu.be/ancczYryOBY.
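The test-phase idea described above (events with unpredictable temporal dynamics score high) can be sketched as prediction-error anomaly scoring. This is a minimal illustration, not the paper's model: the Transformer over tubelets is replaced by a toy "next frame equals current frame" predictor, and the frames are tiny placeholder vectors.

```python
# Hedged sketch of prediction-based anomaly scoring: a model trained only on
# normal clips predicts the next frame, and frames that are poorly predicted
# (high error) are flagged as anomalous. The predictor here is a stand-in;
# ANDT itself uses a Transformer encoder-decoder over tubelets.

def frame_mse(pred, actual):
    """Mean squared error between two flattened frames."""
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual)

def anomaly_scores(frames, predict_next):
    """Score frame t+1 by how badly predict_next(frames[:t+1]) matches it."""
    scores = []
    for t in range(len(frames) - 1):
        pred = predict_next(frames[: t + 1])
        scores.append(frame_mse(pred, frames[t + 1]))
    return scores

# Toy predictor assuming temporal smoothness: next frame ~ current frame.
predict_last = lambda history: history[-1]

# A static "scene" with one sudden change (the anomaly) before frame 3.
frames = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [1.0, 1.0]]
scores = anomaly_scores(frames, predict_last)
flags = [s > 0.5 for s in scores]  # scores: [0.0, 0.0, 1.0, 0.0]
```

Only the transition into the anomalous frame is flagged; a learned predictor generalizes the same thresholding idea to real video dynamics.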
Advancing diabetes treatment: the role of mesenchymal stem cells in islet transplantation
Diabetes mellitus, a prevalent global health challenge, significantly impacts societal and economic well-being. Islet transplantation is increasingly recognized as a viable treatment for type 1 diabetes that aims to restore endogenous insulin production and mitigate complications associated with exogenous insulin dependence. We review the role of mesenchymal stem cells (MSCs) in enhancing the efficacy of islet transplantation. MSCs, characterized by their immunomodulatory properties and differentiation potential, are increasingly seen as valuable in enhancing islet graft survival, reducing immune-mediated rejection, and supporting angiogenesis and tissue repair. The utilization of MSC-derived extracellular vesicles further exemplifies innovative approaches to improve transplantation outcomes. However, challenges such as MSC heterogeneity and the optimization of therapeutic applications persist. Advanced methodologies, including artificial intelligence (AI) and single-cell RNA sequencing (scRNA-seq), are highlighted as potential technologies for addressing these challenges, potentially steering MSC therapy toward more effective, personalized treatment modalities for diabetes. This review shows that MSCs are important for advancing diabetes treatment strategies, particularly through islet transplantation, and highlights their importance in regenerative medicine, acknowledging both their potential and the challenges that must be navigated to fully realize their therapeutic promise.
Identification and interaction analysis of key genes and microRNAs in hepatocellular carcinoma by bioinformatics analysis
Complete list of differentially expressed genes (DEGs) in GSE22058. (DOCX 183 kb)
Unconstrained Aerial Scene Recognition with Deep Neural Networks and a New Dataset
Aerial scene recognition is a fundamental research problem in interpreting high-resolution aerial imagery. Over the past few years, most studies have focused on classifying an image into a single scene category, while in real-world scenarios a single image often contains multiple scenes. Therefore, in this paper, we investigate a more practical yet underexplored task---multi-scene recognition in single images. To this end, we create a large-scale dataset, called the MultiScene dataset, composed of 100,000 unconstrained images, each with multiple labels from 36 different scenes. Among these images, 14,000 are manually interpreted and assigned ground-truth labels, while the remaining images are provided with crowdsourced labels generated from low-cost but noisy OpenStreetMap (OSM) data. By doing so, our dataset enables two branches of study: 1) developing novel CNNs for multi-scene recognition and 2) learning with noisy labels. We experiment with extensive baseline models on our dataset to offer a benchmark for multi-scene recognition in single images. To expedite further research, we will make our dataset and pre-trained models available.
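Benchmarking multi-scene recognition requires multi-label metrics rather than single-label accuracy. The sketch below shows one common choice, example-based F1 computed per image over label sets; the scene names and predictions are illustrative, not taken from the MultiScene dataset.

```python
# Hedged sketch of example-based (per-image) F1 for multi-label scene
# recognition: each image has a SET of ground-truth scenes, and a prediction
# is scored by the overlap between predicted and true label sets.

def example_f1(true_labels, pred_labels):
    """Per-image F1 between a ground-truth label set and a predicted set."""
    if not true_labels and not pred_labels:
        return 1.0
    inter = len(true_labels & pred_labels)
    if inter == 0:
        return 0.0
    precision = inter / len(pred_labels)
    recall = inter / len(true_labels)
    return 2 * precision * recall / (precision + recall)

# Toy images, each containing multiple co-occurring scenes (names hypothetical).
truth = [{"residential", "river"}, {"forest"}, {"airport", "parking"}]
preds = [{"residential", "river"}, {"forest", "river"}, {"parking"}]

f1s = [example_f1(t, p) for t, p in zip(truth, preds)]
mean_f1 = sum(f1s) / len(f1s)
```

Averaging the per-image scores yields a single benchmark number while still rewarding partially correct label sets, which matters when crowdsourced labels are noisy.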
Single-cell RNA analysis to identify five cytokines signaling in immune-related genes for melanoma survival prognosis
Melanoma is one of the deadliest skin cancers. Recently developed single-cell sequencing has revealed fresh insights into melanoma. Cytokine signaling in the immune system is crucial for tumor development in melanoma. To improve melanoma patient diagnosis and treatment, the prognostic value of cytokine signaling in immune-related genes (CSIRGs) needs to be evaluated. In this study, the machine learning method of least absolute shrinkage and selection operator (LASSO) regression was used to establish a CSIRG prognostic signature of melanoma at the single-cell level. We discovered a 5-CSIRG signature that was substantially related to the overall survival of melanoma patients. We also constructed a nomogram that combined CSIRGs and clinical features. Both the 5-CSIRG signature and the nomogram consistently predict the overall survival of melanoma patients with good performance and accuracy. We compared melanoma patients in the CSIRG high- and low-risk groups in terms of tumor mutation burden, immune infiltration, and gene enrichment. High CSIRG-risk patients had a lower tumor mutational burden than low CSIRG-risk patients, and a higher infiltration of monocytes. Signaling pathways including oxidative phosphorylation, DNA replication, and aminoacyl-tRNA biosynthesis were enriched in the high-risk group. For the first time, we constructed and validated a machine-learning model using single-cell RNA-sequencing datasets; the resulting signature has the potential to be a novel treatment target and might serve as a prognostic biomarker panel for melanoma. The 5-CSIRG signature may assist in predicting melanoma patient prognosis, biological characteristics, and appropriate therapy.
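A LASSO-derived gene signature is typically applied as a linear risk score: a weighted sum of the selected genes' expression, with patients split into high- and low-risk groups at the cohort median. The sketch below illustrates only that scoring step; the five gene names and coefficients are placeholders, not the paper's fitted values.

```python
# Hedged sketch of risk scoring with a LASSO-selected signature. The LASSO
# fit itself is omitted; GENE_A..GENE_E and their coefficients are
# hypothetical stand-ins for the paper's 5-CSIRG signature.

SIGNATURE = {"GENE_A": 0.42, "GENE_B": -0.31, "GENE_C": 0.18,
             "GENE_D": 0.27, "GENE_E": -0.12}

def risk_score(expression):
    """Linear risk score: sum of coefficient * expression over the signature."""
    return sum(coef * expression.get(gene, 0.0)
               for gene, coef in SIGNATURE.items())

def stratify(cohort):
    """Label each patient high/low risk relative to the cohort median score."""
    scores = [risk_score(p) for p in cohort]
    median = sorted(scores)[len(scores) // 2]  # upper median for even n
    return ["high" if s > median else "low" for s in scores]

# Toy cohort of three patients (expression values are illustrative).
cohort = [{"GENE_A": 2.0}, {"GENE_B": 2.0}, {"GENE_A": 1.0, "GENE_E": 1.0}]
labels = stratify(cohort)
```

Downstream comparisons (tumor mutational burden, immune infiltration, pathway enrichment) are then made between the resulting "high" and "low" groups.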
FuTH-Net: Fusing Temporal Relations and Holistic Features for Aerial Video Classification
Unmanned aerial vehicles (UAVs) are now widely applied to data acquisition due to their low cost and fast mobility. With the increasing volume of aerial videos, the demand for automatically parsing these videos is surging. To achieve this, current research mainly focuses on extracting a holistic feature with convolutions along both spatial and temporal dimensions. However, these methods are limited by small temporal receptive fields and cannot adequately capture long-term temporal dependencies that are important for describing complicated dynamics. In this article, we propose a novel deep neural network, termed Fusing Temporal relations and Holistic features for aerial video classification (FuTH-Net), to model not only holistic features but also temporal relations for aerial video classification. More specifically, FuTH-Net employs a two-pathway architecture: 1) a holistic representation pathway to learn a general feature of both frame appearances and short-term temporal variations and 2) a temporal relation pathway to capture multiscale temporal relations across arbitrary frames, providing long-term temporal dependencies. Afterward, a novel fusion module spatiotemporally integrates the two features learned from the two pathways, refining the holistic features with the multiscale temporal relations to yield more discriminative video representations. Our model is evaluated on two aerial video classification datasets, ERA and Drone-Action, and achieves state-of-the-art results. This demonstrates its effectiveness and good generalization capacity across different recognition tasks (event classification and human action recognition). To facilitate further research, we release the code at https://gitlab.lrz.de/ai4eo/reasoning/futh-net.
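The "refine the holistic feature with multiscale temporal relations" idea can be sketched as a simple gated fusion: relation features from several temporal scales are averaged and used as a per-dimension gate on the holistic feature. This is an illustrative stand-in under that assumption; FuTH-Net's actual fusion module is a learned network.

```python
# Hedged sketch of gated fusion between a holistic feature and multiscale
# temporal-relation features. All vectors here are tiny toy features; the
# real pathways produce learned deep representations.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fuse(holistic, relations):
    """Gate each holistic dimension by the averaged multiscale relations."""
    # Average the temporal-relation features across scales.
    avg = [sum(r[i] for r in relations) / len(relations)
           for i in range(len(holistic))]
    # Use the averaged relations as a per-dimension gate on the holistic feature.
    return [h * sigmoid(a) for h, a in zip(holistic, avg)]

holistic = [1.0, 2.0, -1.0]
relations = [[0.0, 4.0, 0.0], [0.0, 4.0, 0.0]]  # two temporal scales
fused = fuse(holistic, relations)
```

Dimensions supported by strong temporal relations pass through nearly unchanged, while unsupported dimensions are damped toward half strength, which is the intuition behind letting long-term relations sharpen the holistic representation.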
Instance Segmentation of Buildings using Keypoints
Building segmentation is of great importance in the task of remote sensing
imagery interpretation. However, the existing semantic segmentation and
instance segmentation methods often lead to segmentation masks with blurred
boundaries. In this paper, we propose a novel instance segmentation network for
building segmentation in high-resolution remote sensing images. More
specifically, we consider segmenting an individual building as detecting
several keypoints. The detected keypoints are subsequently reformulated as a
closed polygon, which is the semantic boundary of the building. By doing so,
the sharp boundary of the building can be preserved. Experiments are
conducted on the Aerial Imagery for Roof Segmentation (AIRS) dataset, and
our method achieves better quantitative and qualitative results than
state-of-the-art methods. Our network is a bottom-up instance segmentation
method that preserves geometric details well.
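The keypoint-to-polygon step described above can be sketched as follows: unordered corner keypoints are ordered around their centroid and closed into a polygon, which keeps boundaries sharp where a per-pixel mask would blur them. The keypoint detector itself is omitted; the corners below are a toy example.

```python
# Hedged sketch of reformulating detected building keypoints as a closed
# polygon: sort the (x, y) keypoints counter-clockwise about their centroid,
# then treat the ordered vertices as the building boundary.
import math

def to_polygon(keypoints):
    """Order unordered (x, y) keypoints counter-clockwise about the centroid."""
    cx = sum(x for x, _ in keypoints) / len(keypoints)
    cy = sum(y for _, y in keypoints) / len(keypoints)
    return sorted(keypoints, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))

def polygon_area(poly):
    """Shoelace area of the closed polygon given as ordered vertices."""
    area = 0.0
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

# Four detected corners of a 4x3 rectangular roof, in arbitrary order.
corners = [(0, 0), (4, 3), (4, 0), (0, 3)]
poly = to_polygon(corners)
roof_area = polygon_area(poly)  # 12.0
```

Angle-sorting works for convex or near-convex footprints; real buildings with concave outlines need the ordering the detector itself provides, but the closed-polygon representation is the same.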
Self-supervised Audiovisual Representation Learning for Remote Sensing Data
Many current deep learning approaches make extensive use of backbone networks pre-trained on large datasets like ImageNet, which are then fine-tuned to perform a certain task. In remote sensing, the lack of comparably large annotated datasets and the wide diversity of sensing platforms impede similar developments. To contribute towards the availability of pre-trained backbone networks in remote sensing, we devise a self-supervised approach for pre-training deep neural networks. By exploiting the correspondence between geo-tagged audio recordings and remote sensing imagery, this is done in a completely label-free manner, eliminating the need for laborious manual annotation. For this purpose, we introduce the SoundingEarth dataset, which consists of co-located aerial imagery and audio samples from all around the world. Using this dataset, we then pre-train ResNet models to map samples from both modalities into a common embedding space, which encourages the models to understand key properties of a scene that influence both visual and auditory appearance. To validate the usefulness of the proposed approach, we evaluate the transfer-learning performance of the resulting pre-trained weights against weights obtained through other means. By fine-tuning the models on a number of commonly used remote sensing datasets, we show that our approach outperforms existing pre-training strategies for remote sensing imagery. The dataset, code, and pre-trained model weights are available at https://github.com/khdlr/SoundingEarth
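Mapping both modalities into a common embedding space is usually driven by a contrastive objective: matched image-audio pairs should score higher than mismatched ones. The sketch below shows an InfoNCE-style loss on toy 2-D embeddings; it illustrates the objective only and makes no claim about the paper's exact loss or ResNet outputs.

```python
# Hedged sketch of a cross-modal contrastive (InfoNCE-style) objective:
# each image embedding should be most similar to its paired audio embedding
# among all audio embeddings in the batch. Embeddings here are toy vectors.
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(img_embs, aud_embs, temperature=0.1):
    """Average cross-entropy of matching each image to its paired audio."""
    loss = 0.0
    for i, img in enumerate(img_embs):
        logits = [cosine(img, aud) / temperature for aud in aud_embs]
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_z - logits[i]  # -log softmax prob of the true pair
    return loss / len(img_embs)

# Well-aligned toy embeddings: each image points along its paired audio.
imgs = [[1.0, 0.0], [0.0, 1.0]]
auds = [[0.9, 0.1], [0.1, 0.9]]
aligned = info_nce(imgs, auds)

# Shuffling the pairing breaks the correspondence and raises the loss.
shuffled = info_nce(imgs, auds[::-1])
```

Minimizing this loss over co-located imagery and audio pulls matched pairs together in the shared space, which is what makes the pre-trained weights transferable without any manual labels.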