225 research outputs found
MDFL: Multi-domain Diffusion-driven Feature Learning
High-dimensional images, known for their rich semantic information, are
widely applied in remote sensing and other fields. The spatial information in
these images reflects the object's texture features, while the spectral
information reveals the potential spectral representations across different
bands. Currently, the understanding of high-dimensional images remains limited
to a single-domain perspective with performance degradation. Motivated by the
masking texture effect observed in the human visual system, we present a
multi-domain diffusion-driven feature learning network (MDFL), a scheme to
redefine the effective information domain that the model really focuses on.
This method employs diffusion-based posterior sampling to explicitly consider
joint information interactions between the high-dimensional manifold structures
in the spectral, spatial, and frequency domains, thereby eliminating the
influence of masking texture effects in visual models. Additionally, we
introduce a feature reuse mechanism to gather deep and raw features of
high-dimensional data. We demonstrate that MDFL significantly improves the
feature extraction performance of high-dimensional data, thereby providing a
powerful aid for revealing the intrinsic patterns and structures of such data.
The experimental results on three multi-modal remote sensing datasets show that
MDFL reaches an average overall accuracy of 98.25%, outperforming various
state-of-the-art baseline schemes. The code will be released, contributing to
the computer vision community
Guided Hybrid Quantization for Object detection in Multimodal Remote Sensing Imagery via One-to-one Self-teaching
Considering the computation complexity, we propose a Guided Hybrid
Quantization with One-to-one Self-Teaching (GHOST) framework. More concretely,
we first design a structure called guided quantization self-distillation
(GQSD), an innovative idea for realizing a lightweight design through the
synergy of quantization and distillation. The training process of the
quantization model is guided by its full-precision model, which saves time
and cost because no huge pre-trained model has to be prepared in advance. Second,
we put forward a hybrid quantization (HQ) module to obtain the optimal bit
width automatically under a constrained condition where a threshold for
distribution distance between the center and samples is applied in the weight
value search space. Third, in order to improve information transformation, we
propose a one-to-one self-teaching (OST) module to give the student network an
ability of self-judgment. A switch control machine (SCM) builds a bridge
between the student network and teacher network in the same location to help
the teacher to reduce wrong guidance and impart vital knowledge to the student.
This distillation method allows a model to learn from itself and gain
substantial improvement without any additional supervision. Extensive
experiments on a multimodal dataset (VEDAI) and single-modality datasets (DOTA,
NWPU, and DIOR) show that object detection based on GHOST outperforms the
existing detectors. The small parameter size (<9.7 MB) and Bit-Operations (BOPs)
(<2158 G), compared with remote sensing-based, lightweight, or
distillation-based algorithms, demonstrate its superiority in the lightweight
design domain. Our code and model will be released at
https://github.com/icey-zhang/GHOST.
Comment: This article has been delivered to TRGS and is under review
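The idea of choosing a bit width automatically under a threshold constraint can be illustrated with a minimal sketch. It is not the GHOST implementation: it assumes plain uniform symmetric quantization and a mean absolute error criterion in place of the abstract's distribution-distance threshold, and the names `quantize`, `mean_abs_error`, and `pick_bit_width` are hypothetical.

```python
def quantize(weights, bits):
    """Uniform symmetric quantization of a list of floats (assumed scheme)."""
    if bits < 2:
        raise ValueError("bits must be at least 2 for a signed grid")
    levels = 2 ** (bits - 1) - 1          # e.g. 127 positive levels for 8 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)              # nothing to quantize
    scale = max_abs / levels              # step size of the quantization grid
    return [round(w / scale) * scale for w in weights]


def mean_abs_error(a, b):
    """Average absolute difference between two equally long lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def pick_bit_width(weights, threshold, candidates=(2, 4, 8)):
    """Return the smallest candidate bit width whose error is within threshold.

    Falls back to the largest candidate if none satisfies the constraint.
    """
    for bits in sorted(candidates):
        if mean_abs_error(weights, quantize(weights, bits)) <= threshold:
            return bits
    return max(candidates)


if __name__ == "__main__":
    layer_weights = [0.5, -0.25, 0.1, 1.0]   # made-up values for illustration
    print(pick_bit_width(layer_weights, threshold=0.05))
```

A tighter threshold pushes the search toward wider bit widths, which is the accuracy versus size trade-off that a hybrid (per-layer) bit-width assignment is meant to balance.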
SAR-to-Optical Image Translation via Thermodynamics-inspired Network
Synthetic aperture radar (SAR) is prevalent in the remote sensing field but
is difficult to interpret in human visual perception. Recently, SAR-to-optical
(S2O) image conversion methods have provided a prospective solution for
interpretation. However, since there is a huge domain difference between
optical and SAR images, they suffer from low image quality and geometric
distortion in the produced optical images. Motivated by the analogy between
pixels during the S2O image translation and molecules in a heat field,
a Thermodynamics-inspired Network for SAR-to-Optical Image Translation (S2O-TDN)
is proposed in this paper. Specifically, we design a Third-order Finite
Difference (TFD) residual structure in light of the TFD equation of
thermodynamics, which allows us to efficiently extract inter-domain invariant
features and facilitate the learning of the nonlinear translation mapping. In
addition, we exploit the first law of thermodynamics (FLT) to devise an
FLT-guided branch that promotes the state transition of the feature values from
the unstable diffusion state to the stable one, aiming to regularize the
feature diffusion and preserve image structures during S2O image translation.
S2O-TDN follows an explicit design principle derived from thermodynamic theory
and enjoys the advantage of explainability. Experiments on the public SEN1-2
dataset show the advantages of the proposed S2O-TDN over the current methods
with more delicate textures and higher quantitative results
SuperYOLO: Super Resolution Assisted Object Detection in Multimodal Remote Sensing Imagery
Accurate and timely detection of multiscale small objects that contain tens of
pixels in remote sensing images (RSI) remains challenging. Most of the
existing solutions primarily design complex deep neural networks to learn
strong feature representations for objects separated from the background, which
often results in a heavy computation burden. In this article, we propose an
accurate yet fast object detection method for RSI, named SuperYOLO, which fuses
multimodal data and performs high-resolution (HR) object detection on
multiscale objects by utilizing the assisted super resolution (SR) learning and
considering both the detection accuracy and computation cost. First, we utilize
a symmetric compact multimodal fusion (MF) to extract supplementary information
from various data for improving small object detection in RSI. Furthermore, we
design a simple and flexible SR branch to learn HR feature representations that
can discriminate small objects from vast backgrounds with low-resolution (LR)
input, thus further improving the detection accuracy. Moreover, to avoid
introducing additional computation, the SR branch is discarded in the inference
stage, and the computation of the network model is reduced due to the LR input.
Experimental results show that, on the widely used VEDAI RS dataset, SuperYOLO
achieves an accuracy of 75.09% (in terms of mAP50), which is more than 10%
higher than the SOTA large models, such as YOLOv5l, YOLOv5x, and RS designed
YOLOrs. Meanwhile, the parameter size and GFLOPs of SuperYOLO are about 18
times and 3.8 times smaller than those of YOLOv5x. Our proposed model shows a favorable
accuracy and speed tradeoff compared to the state-of-the-art models. The code
will be open-sourced at https://github.com/icey-zhang/SuperYOLO.
Comment: The article is accepted by IEEE Transactions on Geoscience and Remote
Sensing
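For readers unfamiliar with the mAP50 metric quoted above: a detection counts as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5, and average precision is then averaged over classes. The sketch below shows only the IoU test, assuming axis-aligned boxes given as `(x1, y1, x2, y2)`; the function names are hypothetical and this is not the evaluation code of the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """A prediction matches a ground-truth box when IoU >= threshold."""
    return iou(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    # Two 10x10 boxes shifted by 2 pixels overlap enough to match at 0.5.
    print(is_true_positive((0, 0, 10, 10), (2, 0, 12, 10)))
```

For objects of only tens of pixels, a shift of a few pixels changes the IoU substantially, which is one reason small-object detection is hard to score well on this metric.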
Summertime Thermal Comfort and Adaptive Behaviours in Mixed-Mode Office Buildings in Harbin, China
A longitudinal study of summertime occupant behaviour and thermal comfort in office buildings in northern China
The adaptive behaviour and thermal responses of building occupants can be responsible for significant uncertainties when comparing monitored and modelled building energy performance. A better understanding of the interaction of occupants and their buildings is necessary for managing this uncertainty and reducing discrepancies between predicted and actual energy use (commonly known as ‘the performance gap’). This paper presents the results from a longitudinal study during a summer season of ten mixed-mode offices located in Harbin, a city in northern China, which experiences severe winters and warm summers. The study collected data from on-line daily surveys, field measurements of the local environment, occupants' experiences and adaptive control behaviours. Occupant-building interactions were analysed through observing adaptive behaviour, perceived thermal sensations in the physical environment, architectural geometric variables and personnel characteristics. The driving mechanisms for behaviours and feelings were also studied. The results showed a high probability of window opening for both day and night, and a high frequency of the use of a mix of cooling options, including fans and air conditioning, accompanied by natural ventilation in the summer season. The active interaction of the offices' internal environments with the outdoor environment motivated more connections of occupant thermal feelings with the outdoor physical variables. Relative humidity levels were potential key predictors for window opening, and the geometric parameters of offices, occupants' fan use and perceived thermal feelings also showed a level of predictive ability. Evaluating the nature of the interactions between occupant feelings and behaviours may inform and improve results from building performance-based design
Thermal comfort, occupant control behaviour and performance gap – a study of office buildings in north-east China using data mining
Simulation techniques have been increasingly applied to building performance evaluation and building environmental design. However, uncertain and random factors, such as occupant behaviour, can generate a performance gap between the results from computer simulations and real buildings. This study involved a longitudinal questionnaire survey conducted for one year, along with a continuous recording of environmental parameters and behaviour state changes, in ten offices located in the severe cold region of north-east China. The offices varied from private rooms to open-plan spaces. The thermal comfort experiences of the office workers and their environmental control behaviours were tracked and analysed during summer and winter seasons. The interaction between the thermal comfort experiences of the occupants and their behaviour changes was analysed, and window-opening behaviour patterns were defined by applying data mining techniques. The results also generated window-opening behaviour working profiles to link to building performance simulation software. The aim was to apply these profiles to further study the discrepancies between simulation and monitored results that arise from real-world occupant behaviour patterns