114 research outputs found

    Romantic Narrative in the Film The Battle at Lake Changjin

    Chen Kaige, a noted Chinese “scholar-type director”, typically embeds rich cultural connotations in his films, and his works are suffused with a distinctive Chen Kaige-style romanticism. Although he is a skilled storyteller, making The Battle at Lake Changjin as a pure war film was a challenge for him. This article therefore takes the film as its research focus: by analyzing aesthetic elements such as scenes and scope, picture composition, and color, it expounds in detail the romantic narrative style Chen Kaige displays in the film.

    Self-guided Few-shot Semantic Segmentation for Remote Sensing Imagery Based on Large Vision Models

    The Segment Anything Model (SAM) exhibits remarkable versatility and zero-shot learning abilities, owing largely to its extensive training data (SA-1B). Recognizing SAM's dependency on manual guidance given its category-agnostic nature, we identified unexplored potential within few-shot semantic segmentation tasks for remote sensing imagery. This research introduces a structured framework designed to automate few-shot semantic segmentation. It utilizes the SAM model and facilitates more efficient generation of semantically discernible segmentation outcomes. Central to our methodology is a novel automatic prompt learning approach that leverages prior guided masks to produce coarse pixel-wise prompts for SAM. Extensive experiments on the DLRSD dataset underline the superiority of our approach, which outperforms other available few-shot methodologies.
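
    The prompt-generation step above can be pictured with a short sketch: a coarse prior mask is converted into foreground point prompts that are fed to SAM through the public segment-anything predictor API. The sampling heuristic, point count and checkpoint choice are illustrative assumptions, not the paper's settings.

        # Minimal sketch: coarse prior mask -> point prompts -> SAM prediction.
        # Assumes the public `segment_anything` package; thresholds and the
        # number of sampled points are illustrative, not the paper's values.
        import numpy as np
        from segment_anything import sam_model_registry, SamPredictor

        def mask_to_point_prompts(prior_mask: np.ndarray, n_points: int = 5) -> np.ndarray:
            """Sample foreground pixels of a coarse binary mask as (x, y) prompts."""
            ys, xs = np.nonzero(prior_mask > 0.5)
            idx = np.random.choice(len(xs), size=min(n_points, len(xs)), replace=False)
            return np.stack([xs[idx], ys[idx]], axis=1)  # SAM expects (x, y) order

        def segment_with_prior(image: np.ndarray, prior_mask: np.ndarray, checkpoint: str):
            sam = sam_model_registry["vit_b"](checkpoint=checkpoint)
            predictor = SamPredictor(sam)
            predictor.set_image(image)                     # HxWx3 uint8 RGB image
            points = mask_to_point_prompts(prior_mask)
            labels = np.ones(len(points), dtype=np.int64)  # 1 = foreground prompt
            masks, scores, _ = predictor.predict(
                point_coords=points, point_labels=labels, multimask_output=False)
            return masks[0], scores[0]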

    Chinese health funding in Africa: The untold story

    The motivations behind China’s allocation of health aid to Africa remain complex due to limited information on the details of health aid project activities. Insufficient knowledge about the purpose of China’s health aid hinders our understanding of China’s comprehensive role in supporting Africa’s healthcare system. To address this gap, our study aimed to gain better insight into China’s health aid priorities and the factors driving these priorities across Africa. To achieve this, we utilized AidData’s Chinese Official Finance Dataset and adhered to the Organisation for Economic Co-operation and Development (OECD) guidelines. We reclassified all 1,026 health projects in Africa, originally categorized under broad 3-digit OECD DAC sector codes, into more specific 5-digit CRS codes. By analyzing project counts and financial value, we assessed the shifting priorities over time. Our analysis revealed that China’s priorities in health aid evolved between 2000 and 2017. In the early 2000s, China primarily allocated aid to basic health personnel and lacked diversity in sub-sectors. After 2004, however, China shifted its focus toward basic infrastructure and reduced its emphasis on clinical-level staff. Furthermore, China’s interest in addressing malaria expanded in both scale and depth between 2006 and 2009. This trend continued in 2012 and 2014, when China responded to the Ebola outbreak by shifting its focus from basic infrastructure to infectious diseases. In summary, our findings demonstrate the changes in China’s health aid strategy, which started with addressing diseases already eliminated in China and gradually transitioned towards global health security, health system strengthening, and shaping governance mechanisms.
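
    To make the reclassification step concrete, the sketch below shows one way a 3-digit-to-5-digit recoding and year-by-year aggregation could be done in pandas. The column names and the keyword-to-CRS crosswalk are hypothetical placeholders, not AidData's actual schema or the coding rules used in the study.

        # Hypothetical sketch of recoding projects to 5-digit CRS purpose codes
        # and aggregating priorities by year; column names and the keyword
        # crosswalk are placeholders, not AidData's schema.
        import pandas as pd

        KEYWORD_TO_CRS = {
            "hospital": 12230,   # basic health infrastructure
            "malaria": 12262,    # malaria control
            "personnel": 12281,  # health personnel development
        }

        def assign_crs_code(title: str, default: int = 12220) -> int:
            title = title.lower()
            for keyword, code in KEYWORD_TO_CRS.items():
                if keyword in title:
                    return code
            return default  # 12220 = basic health care, used as a fallback

        def priorities_by_year(projects: pd.DataFrame) -> pd.DataFrame:
            """Count projects and sum commitments per CRS code and year."""
            projects = projects.assign(crs_code=projects["title"].map(assign_crs_code))
            return (projects
                    .groupby(["year", "crs_code"])
                    .agg(project_count=("title", "size"),
                         total_usd=("commitment_usd", "sum"))
                    .reset_index())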

    Not Just Learning from Others but Relying on Yourself: A New Perspective on Few-Shot Segmentation in Remote Sensing

    Few-shot segmentation (FSS) aims to segment targets of unknown classes with just a few annotated samples. Most current FSS methods follow the paradigm of mining semantics from the support images to guide segmentation of the query image. However, such a pattern of 'learning from others' struggles to handle extreme intra-class variation, preventing FSS from being directly generalized to remote sensing scenes. To bridge the gap of intra-class variance, we develop a Dual-Mining network named DMNet for cross-image mining and self-mining, meaning that it no longer focuses solely on support images but pays more attention to the query image itself. Specifically, we propose a Class-public Region Mining (CPRM) module that effectively suppresses irrelevant feature pollution by capturing the common semantics between the support-query image pair. A Class-specific Region Mining (CSRM) module is then proposed to continuously mine the class-specific semantics of the query image itself in a 'filtering' and 'purifying' manner. In addition, to prevent the co-existence of multiple classes in remote sensing scenes from exacerbating the collapse of FSS generalization, we also propose a new Known-class Meta Suppressor (KMS) module to suppress the activation of known-class objects in the sample. Extensive experiments on the iSAID and LoveDA remote sensing datasets demonstrate that our method sets a new state of the art with a minimal number of model parameters. Notably, our model with a ResNet-50 backbone achieves mIoU of 49.58% and 51.34% on iSAID under the 1-shot and 5-shot settings, outperforming the previous state-of-the-art method by 1.8% and 1.12%, respectively. The code is publicly available at https://github.com/HanboBizl/DMNet. Comment: accepted to IEEE TGR
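
    As a rough illustration of the cross-image mining idea, the sketch below pools a support prototype from the masked support features and uses its cosine similarity with query features to keep only class-public regions of the query. This is a simplified reading of the approach, not the actual CPRM/CSRM implementation from the repository.

        # Simplified sketch of cross-image mining: a support prototype highlights
        # class-public regions in the query feature map. Not the DMNet code.
        import torch
        import torch.nn.functional as F

        def masked_average_pool(feat: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
            """feat: (B, C, H, W), mask: (B, 1, H, W) -> class prototype (B, C)."""
            mask = F.interpolate(mask, size=feat.shape[-2:], mode="bilinear", align_corners=False)
            return (feat * mask).sum(dim=(2, 3)) / (mask.sum(dim=(2, 3)) + 1e-6)

        def class_public_regions(query_feat, support_feat, support_mask, thresh=0.5):
            """Keep query features whose similarity to the support prototype is high."""
            proto = masked_average_pool(support_feat, support_mask)                 # (B, C)
            sim = F.cosine_similarity(query_feat, proto[:, :, None, None], dim=1)  # (B, H, W)
            keep = (sim > thresh).float().unsqueeze(1)                             # (B, 1, H, W)
            return keep * query_feat                                                # suppress unrelated regions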

    Semantic Segmentation for Point Cloud Scenes via Dilated Graph Feature Aggregation and Pyramid Decoders

    Semantic segmentation of point clouds generates a comprehensive understanding of scenes by densely predicting the category of each point. Because a single, fixed receptive field struggles to express multi-receptive-field features, semantic segmentation of point clouds remains prone to misclassifying instances with similar spatial structures. In this paper, we propose a graph convolutional network, DGFA-Net, rooted in dilated graph feature aggregation (DGFA) and guided by a multi-basis aggregation loss (MALoss) calculated through Pyramid Decoders. To capture multi-receptive-field features, DGFA, which takes the proposed dilated graph convolution (DGConv) as its basic building block, is designed to aggregate multi-scale feature representations by capturing dilated graphs with various receptive regions. To diversify the receptive-field bases, we introduce Pyramid Decoders driven by MALoss, which penalize the receptive-field information using point sets of different resolutions as calculation bases. Combining these two aspects, DGFA-Net significantly improves the segmentation of instances with similar spatial structures. Experiments on S3DIS, ShapeNetPart and Toronto-3D show that DGFA-Net outperforms the baseline approach, achieving new state-of-the-art segmentation performance. Comment: accepted to AAAI Workshop 202
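
    The dilated-graph idea can be sketched in a few lines: take the k·d nearest neighbours of each point and keep every d-th one, so the neighbourhood covers a wider region without growing in size. This is a generic dilated k-NN illustration under that assumption, not the paper's DGConv operator.

        # Generic dilated k-NN sketch for point clouds; illustrative only,
        # not the DGConv / DGFA implementation described in the paper.
        import torch

        def dilated_knn(points: torch.Tensor, k: int = 16, dilation: int = 2) -> torch.Tensor:
            """points: (N, 3) -> neighbour indices (N, k), keeping every `dilation`-th neighbour."""
            dist = torch.cdist(points, points)                     # (N, N) pairwise distances
            idx = dist.topk(k * dilation, largest=False).indices   # k*d nearest neighbours
            return idx[:, ::dilation]                              # keep every d-th neighbour

        def aggregate_neighbours(features: torch.Tensor, idx: torch.Tensor) -> torch.Tensor:
            """Max-pool neighbour features: (N, C) gathered with (N, k) indices -> (N, C)."""
            return features[idx].max(dim=1).values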

    Breaking Immutable: Information-Coupled Prototype Elaboration for Few-Shot Object Detection

    Few-shot object detection, which expects detectors to detect novel classes from only a few instances, has made conspicuous progress. However, the prototypes extracted by existing meta-learning based methods still suffer from insufficient representative information and lack awareness of query images, so they cannot be adaptively tailored to different query images. First, only the support images are involved in extracting prototypes, resulting in scarce perceptual information about the query images. Second, all pixels of all support images are treated equally when aggregating features into prototype vectors, so salient objects are overwhelmed by cluttered backgrounds. In this paper, we propose an Information-Coupled Prototype Elaboration (ICPE) method to generate specific and representative prototypes for each query image. Concretely, a conditional information coupling module is introduced to couple information from the query branch into the support branch, strengthening the query-perceptual information in support features. In addition, we design a prototype dynamic aggregation module that dynamically adjusts intra-image and inter-image aggregation weights to highlight the salient information useful for detecting query images. Experimental results on both Pascal VOC and MS COCO demonstrate that our method achieves state-of-the-art performance in almost all settings. Comment: Accepted by AAAI202
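
    A loose sketch of the query-aware aggregation idea: support pixels are weighted by their similarity to a global query descriptor before being pooled into a prototype, so the prototype adapts to each query image. The attention form here is an assumption for illustration, not the ICPE modules themselves.

        # Illustrative sketch of query-conditioned prototype aggregation;
        # the real ICPE modules are more elaborate.
        import torch
        import torch.nn.functional as F

        def query_aware_prototype(support_feat: torch.Tensor, query_feat: torch.Tensor) -> torch.Tensor:
            """support_feat: (B, C, Hs, Ws), query_feat: (B, C, Hq, Wq) -> prototype (B, C)."""
            b, c, hs, ws = support_feat.shape
            query_desc = query_feat.mean(dim=(2, 3))                       # (B, C) global query descriptor
            support_flat = support_feat.view(b, c, hs * ws)                # (B, C, Hs*Ws)
            attn = torch.einsum("bc,bcn->bn", query_desc, support_flat)    # per-pixel relevance to the query
            attn = F.softmax(attn / c ** 0.5, dim=-1)                      # (B, Hs*Ws)
            return torch.einsum("bn,bcn->bc", attn, support_flat)          # query-adapted prototype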

    Parallel Prediction Method of Knowledge Proficiency Based on Bloom’s Cognitive Theory

    Knowledge proficiency refers to the extent to which students master knowledge and reflects their cognitive status. To assess knowledge proficiency accurately, various pedagogical theories have emerged. Bloom’s cognitive theory, proposed in 1956 as one of the classic theories, follows the cognitive progression from foundational to advanced levels, categorizing cognition into multiple tiers including “knowing”, “understanding”, and “application”, thereby constructing a hierarchical cognitive structure. This theory is predominantly employed to frame the design of teaching objectives and guide the implementation of teaching activities. Additionally, because of the large number of students in real-world online education systems, the time required to calculate knowledge proficiency is prohibitively high. To ensure the applicability of this method in large-scale systems, there is substantial demand for a parallel prediction model for assessing knowledge proficiency. The research in this paper is grounded in Bloom’s cognitive theory, and a Bloom Cognitive Diagnosis Parallel Model (BloomCDM) for calculating knowledge proficiency is designed on this basis. The model is founded on the concept of matrix decomposition. In the theoretical modeling phase, hierarchical and inter-hierarchical assumptions are first established, leading to the abstraction of the mathematical model. Subsequently, subject features are mapped onto the three-tier cognitive space of “knowing”, “understanding”, and “applying” to derive the posterior distribution of the target parameters. Once the objective function of the model is determined, both student and topic characteristic parameters are computed to ascertain students’ knowledge proficiency. During the modeling process, to formalize the mathematical expressions of “understanding” and “application”, the notions of “knowledge group” and “higher-order knowledge group” are introduced, along with a parallel method for identifying the structure of higher-order knowledge groups. Finally, the experiments in this paper validate that the model can accurately diagnose students’ knowledge proficiency, affirming that integrating Bloom’s cognitive hierarchy into knowledge proficiency assessment is scientific and meaningful.
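
    Since the model is built on matrix decomposition, a generic factorization baseline helps picture what is being estimated: a student-by-item score matrix is factored into student and item traits, and each student's trait vector is read as a proficiency profile. The update rule below is plain alternating gradient descent, not BloomCDM's hierarchical posterior inference; note that each student's row update depends only on that student's observed responses, which is the kind of independence a parallel implementation can exploit.

        # Generic matrix-factorization baseline for proficiency estimation;
        # a sketch only, not the BloomCDM model or its parallel algorithm.
        import numpy as np

        def factorize_responses(R, n_factors=3, lr=0.01, reg=0.1, epochs=200, seed=0):
            """R: (students, items) scores with np.nan for unanswered items -> (S, Q)."""
            rng = np.random.default_rng(seed)
            S = rng.normal(scale=0.1, size=(R.shape[0], n_factors))  # student latent traits
            Q = rng.normal(scale=0.1, size=(R.shape[1], n_factors))  # item latent traits
            observed = ~np.isnan(R)
            for _ in range(epochs):
                err = np.where(observed, np.nan_to_num(R) - S @ Q.T, 0.0)
                S += lr * (err @ Q - reg * S)    # each row uses only that student's errors
                Q += lr * (err.T @ S - reg * Q)
            return S, Q                          # rows of S ~ per-student proficiency profiles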

    Elevation Estimation-Driven Building 3D Reconstruction from Single-View Remote Sensing Imagery

    Building 3D reconstruction from remote sensing images has a wide range of applications in smart cities, photogrammetry and other fields. Methods for automatic 3D urban building modeling typically feed multi-view images to algorithms that recover point clouds and 3D models of buildings. However, such models rely heavily on multi-view images of buildings, which are time-intensive to acquire and limit the applicability and practicality of the models. To address these issues, we design an efficient DSM estimation-driven reconstruction framework (Building3D), which aims to reconstruct 3D building models from a single input remote sensing image. First, we propose a Semantic Flow Field-guided DSM Estimation (SFFDE) network, which utilizes the proposed concept of elevation semantic flow to achieve the registration of local and global features. Specifically, to make the network semantics globally aware, we propose an Elevation Semantic Globalization (ESG) module to realize the semantic globalization of instances. Further, to bridge the semantic span between global features and the original local features, we propose a Local-to-Global Elevation Semantic Registration (L2G-ESR) module based on elevation semantic flow. Our Building3D is rooted in the SFFDE network for building elevation prediction, synchronized with a building extraction network for building masks, and then sequentially performs point cloud reconstruction and surface reconstruction (or CityGML model reconstruction). On this basis, Building3D can optionally generate CityGML models or surface mesh models of the buildings. Extensive experiments on the ISPRS Vaihingen and DFC2019 datasets for the DSM estimation task show that our SFFDE significantly improves upon state-of-the-art methods. Furthermore, Building3D achieves impressive results in 3D point cloud and 3D model reconstruction.
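
    The point-cloud reconstruction step at the end of the pipeline can be pictured with a small sketch: pixels of the predicted DSM that fall inside the building mask are lifted to 3D using the image's ground sample distance. The function and its GSD default are illustrative assumptions, not part of the Building3D code.

        # Illustrative sketch: predicted DSM + building mask -> 3D point cloud.
        # The ground sample distance (gsd) and coordinate convention are assumptions.
        import numpy as np

        def dsm_to_point_cloud(dsm: np.ndarray, building_mask: np.ndarray, gsd: float = 0.5) -> np.ndarray:
            """dsm: (H, W) elevations in metres, building_mask: (H, W) binary -> (N, 3) points."""
            rows, cols = np.nonzero(building_mask > 0)
            x = cols * gsd            # easting from column index
            y = rows * gsd            # northing from row index
            z = dsm[rows, cols]       # predicted elevation at each building pixel
            return np.stack([x, y, z], axis=1)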