232 research outputs found
Efficient Multi-Grained Knowledge Reuse for Class Incremental Segmentation
Class Incremental Semantic Segmentation (CISS) has recently attracted growing
attention due to its significance in real-world applications. Although existing
CISS methods demonstrate remarkable performance, they either leverage the
high-level knowledge (feature) only while neglecting the rich and diverse
knowledge in the low-level features, leading to poor old knowledge preservation
and weak new knowledge exploration; or use multi-level features for knowledge
distillation by retraining a heavy backbone, which is computationally
intensive. In this paper, we propose, for the first time, to efficiently reuse
multi-grained knowledge for CISS by fusing multi-level features with the
frozen backbone, and show that a simple aggregation of varying-level features,
i.e., a naive feature pyramid, can boost performance significantly. We further
introduce a novel densely-interactive feature pyramid (DEFY) module that
enhances the fusion of high- and low-level features by enabling their dense
interaction. Specifically, DEFY establishes a per-pixel relationship between
pairs of feature maps, allowing for multi-pair outputs to be aggregated. This
results in improved semantic segmentation by leveraging the complementary
information from multi-level features. We show that DEFY can be effortlessly
integrated into three representative methods for performance enhancement. When
combined with the current SOTA method, our approach achieves new
state-of-the-art performance, with notable average mIoU gains on two widely
used benchmarks: 2.5% on PASCAL VOC 2012 and 2.3% on ADE20K.
Comment: Technical Report. This work has been submitted to the IEEE for
possible publication. Copyright may be transferred without notice, after
which this version may no longer be accessible.
Quantification of Linear and Non-Linear Flow Behaviours in a Rock Fracture with Complex Void Geometry
Understanding the process of fluid flow through fractured rock in subsurface engineering applications has been an active field of research for decades. Accurate modelling of the process is essential to providing guidance for the development of underground projects and the reduction of associated risks. This work focuses on the study of flow behaviours in a single rock fracture with complex void geometry, which is fundamental to larger-scale flow-related problems in fractured rocks. In this research, the effects of aperture variation, tortuosity and local roughness of fracture surfaces are quantified over segmented areas to develop a more accurate modified cubic law that improves flow prediction in rock fractures with rough walls. To account for the flow non-linearity when inertial effects become significant, new approximate analytical solutions of the two-dimensional (2D) Navier-Stokes equations are derived under both the pressure boundary condition (PBC) and flow rate boundary condition (FBC) using the perturbation method. Considering the slowly varying nature of fracture apertures, the ratio of aperture variation to fracture length, instead of the commonly used ratio of mean aperture to fracture length, is used as the perturbation parameter in our solutions. The derived solutions are applied to 2D symmetric wedges and sinusoidal fractures, and it is found that the FBC solution provides more accurate flow estimations, due to a more precise quantification of inertial effects. The derived FBC solution is then extended to asymmetric geometries for more realistic representations of fracture voids at the pore scale. A non-linear Reynolds equation is then developed based on the derived FBC solution for rough rock fractures, and the results show close agreement with both experiments and flow simulations in capturing the non-linear behaviour of flow through the fracture.
Thesis (Ph.D.) -- University of Adelaide, School of Civil, Environmental and Mining Engineering, 201
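For orientation, the abstract above builds on the cubic law for flow between parallel plates. The sketch below shows only the classical, unmodified cubic law and a textbook way of combining segments in series into one effective hydraulic aperture; it is not the modified cubic law or the perturbation solutions developed in the thesis. The function names, the SI units, and the series arrangement of segments are assumptions made for illustration.

```python
def cubic_law_flow(aperture, width, viscosity, pressure_drop, length):
    """Classical cubic law: volumetric flow rate between smooth parallel plates.

    All quantities are assumed to be in SI units (m, Pa*s, Pa), giving m^3/s.
    """
    return width * aperture ** 3 * pressure_drop / (12.0 * viscosity * length)


def series_hydraulic_aperture(apertures, lengths):
    """Effective aperture of segments arranged in series along the flow direction.

    Pressure drops add across segments, so b_h^3 = L / sum(l_i / b_i^3).
    This is a textbook idealisation, not the thesis' segment-wise correction.
    """
    total_length = sum(lengths)
    resistance = sum(l / b ** 3 for b, l in zip(apertures, lengths))
    return (total_length / resistance) ** (1.0 / 3.0)


# Hypothetical fracture: two equal-length segments with different apertures.
apertures = [1.0e-4, 2.0e-4]   # m
lengths = [0.5, 0.5]           # m
b_h = series_hydraulic_aperture(apertures, lengths)
q = cubic_law_flow(b_h, width=0.1, viscosity=1.0e-3,
                   pressure_drop=100.0, length=sum(lengths))
```

In this idealisation the narrow segment dominates the resistance, so the effective aperture falls below the arithmetic mean of the segment apertures.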
Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models
Fine-tuning pre-trained vision-language models (VLMs), e.g., CLIP, for
open-world generalization has gained increasing popularity due to its practical
value. However, performance advancements are limited when relying solely on
intricate algorithmic designs for a single model, even one exhibiting strong
performance, e.g., CLIP-ViT-B/16. This paper, for the first time, explores the
collaborative potential of leveraging much weaker VLMs to enhance the
generalization of a robust single model. The affirmative findings motivate us
to address the generalization problem from a novel perspective, i.e., ensemble
of pre-trained VLMs. We introduce three customized ensemble strategies, each
tailored to one specific scenario. Firstly, we introduce the zero-shot
ensemble, automatically adjusting the logits of different models based on their
confidence when only pre-trained VLMs are available. Furthermore, for scenarios
with extra few-shot samples, we propose training-free and tuning ensembles,
which offer flexibility depending on the available computing resources. The
proposed ensemble strategies are evaluated on zero-shot, base-to-new, and
cross-dataset generalization, achieving new state-of-the-art performance.
Notably, this work represents an initial stride toward enhancing the
generalization performance of VLMs via ensembling. The code is available at
https://github.com/zhiheLu/Ensemble_VLM.git.
Comment: Technical report
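The zero-shot ensemble described above adjusts the logits of different models according to their confidence. The abstract does not give the exact rule, so the following is only a minimal sketch under one plausible assumption: each model's confidence is taken as its maximum softmax probability, and the logits are averaged with weights proportional to that confidence. The function names are hypothetical and do not come from the linked repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_weighted_ensemble(per_model_logits):
    """Fuse per-model logits with weights proportional to each model's confidence.

    Confidence is assumed here to be the maximum softmax probability; the
    paper's actual weighting scheme may differ.
    """
    confidences = [max(softmax(logits)) for logits in per_model_logits]
    total = sum(confidences)
    weights = [c / total for c in confidences]
    num_classes = len(per_model_logits[0])
    return [
        sum(w * logits[k] for w, logits in zip(weights, per_model_logits))
        for k in range(num_classes)
    ]


# Hypothetical logits from a confident model and an uncertain one.
strong = [5.0, 0.0, 0.0]
weak = [0.0, 0.1, 0.0]
fused = confidence_weighted_ensemble([strong, weak])
predicted_class = fused.index(max(fused))
```

Because the weights sum to one, models that agree exactly leave the logits unchanged, while a confident model outweighs an uncertain one when they disagree.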
Task Residual for Tuning Vision-Language Models
Large-scale vision-language models (VLMs) pre-trained on billion-level data
have learned general visual representations and broad visual concepts. In
principle, the well-learned knowledge structure of the VLMs should be inherited
appropriately when being transferred to downstream tasks with limited data.
However, most existing efficient transfer learning (ETL) approaches for VLMs
either damage or are excessively biased towards the prior knowledge, e.g.,
prompt tuning (PT) discards the pre-trained text-based classifier and builds a
new one while adapter-style tuning (AT) fully relies on the pre-trained
features. To address this, we propose a new efficient tuning approach for VLMs
named Task Residual Tuning (TaskRes), which operates directly on the text-based
classifier and explicitly decouples the prior knowledge of the pre-trained
models and new knowledge regarding a target task. Specifically, TaskRes keeps
the original classifier weights from the VLMs frozen and obtains a new
classifier for the target task by tuning a set of prior-independent parameters
as a residual to the original one, which enables reliable prior knowledge
preservation and flexible task-specific knowledge exploration. The proposed
TaskRes is simple yet effective: it significantly outperforms previous ETL
methods (e.g., PT and AT) on 11 benchmark datasets while requiring minimal
implementation effort. Our code is available at
https://github.com/geekyutao/TaskRes.
Comment: Accepted to CVPR 202
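The core idea of TaskRes, as described above, is to keep the pre-trained text-based classifier weights frozen and learn a residual that is added to them. A minimal sketch of that composition follows, using plain Python lists rather than a deep learning framework. The scaling factor `alpha`, the function names, and the toy numbers are assumptions for illustration and are not taken from the abstract or the linked repository.

```python
def task_residual_classifier(base_weights, residual, alpha=0.5):
    """Combine frozen classifier weights with a learnable residual.

    base_weights: one weight vector per class (kept frozen during tuning).
    residual:     same shape as base_weights; the only tunable parameters.
    alpha:        assumed scaling factor applied to the residual.
    """
    return [
        [w + alpha * r for w, r in zip(w_row, r_row)]
        for w_row, r_row in zip(base_weights, residual)
    ]


def classify_logits(image_feature, classifier):
    """Logits as dot products between an image feature and each class vector."""
    return [sum(f * w for f, w in zip(image_feature, row)) for row in classifier]


# Toy example: two classes, three-dimensional features.
base = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
zero_residual = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
learned_residual = [[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
feature = [0.2, 0.1, 0.9]

prior_logits = classify_logits(feature, task_residual_classifier(base, zero_residual))
tuned_logits = classify_logits(feature, task_residual_classifier(base, learned_residual))
```

With a zero residual the classifier reproduces the pre-trained one exactly, which is how this composition preserves prior knowledge; only the residual carries task-specific changes.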
Changes in the milk metabolome of the Giant Panda (Ailuropoda melanoleuca) with time after birth: three phases in early lactation and progressive individual differences
Ursids (bears) in general, and giant pandas in particular, are highly altricial at birth. The components of bear milks and their changes with time may be uniquely adapted to nourish relatively immature neonates, protect them from pathogens, and support the maturation of neonatal digestive physiology. Serial milk samples collected from three giant pandas in early lactation were subjected to untargeted metabolite profiling and multivariate analysis. Changes in milk metabolites with time after birth were analysed by Principal Component Analysis and Hierarchical Cluster Analysis, and further supported by Orthogonal Partial Least Squares-Discriminant Analysis, revealing three phases of milk maturation: days 1–6 (Phase 1), days 7–20 (Phase 2), and beyond day 20 (Phase 3). While the compositions of Phase 1 milks were essentially indistinguishable among individuals, divergences emerged during the second week of lactation. OPLS regression analysis against the growth rate of one cub tentatively suggested a correlation with changes in the abundance of a trisaccharide, isoglobotriose, previously observed to be a major oligosaccharide in ursid milks. Three artificial milk formulae used to feed giant panda cubs were also analysed, and were found to differ markedly in component content from natural panda milk. These findings have implications for the dependence of the ontogeny of all species of bears, and potentially other members of the Carnivora and beyond, on the complexity and sequential changes in maternal provision of micrometabolites in the immediate period after birth.
Emerging information technology acceptance model for the development of smart construction system
Many researchers and practitioners in the construction industry, including those working on smart construction, have highlighted the potential of emerging information technology. Meanwhile, the acceptance and use of emerging information technology are among the major subjects of current smart construction study and practice. However, although there are many potential applications and benefits of emerging information technology in the development of smart construction systems, it remains unclear why this technology is adopted and which factors enhance its implementation. Therefore, an emerging information technology acceptance model (EITAM) was proposed, and our hypotheses were tested by structural equation modeling (SEM) based on an open-ended questionnaire survey. This study identified the factors that affect emerging information technology acceptance among engineering construction technology and innovation professionals. The EITAM evaluation results can be used to develop an emerging information technology acceptance strategy that is suitable for continual smart construction promotion. Finally, this study can provide guidance to smart construction developers to establish an effective technological integration plan.
GraphAdapter: Tuning Vision-Language Models With Dual Knowledge Graph
Adapter-style efficient transfer learning (ETL) has shown excellent
performance in the tuning of vision-language models (VLMs) under the low-data
regime, where only a few additional parameters are introduced to excavate the
task-specific knowledge based on the general and powerful representation of
VLMs. However, most adapter-style works face two limitations: (i) modeling
task-specific knowledge with a single modality only; and (ii) overlooking the
exploitation of the inter-class relationships in downstream tasks, thereby
leading to sub-optimal solutions. To mitigate these limitations, we propose an
effective adapter-style tuning strategy, dubbed GraphAdapter, which builds the
textual adapter by explicitly modeling the dual-modality structure knowledge (i.e., the
correlation of different semantics/classes in textual and visual modalities)
with a dual knowledge graph. In particular, the dual knowledge graph is
established with two sub-graphs, i.e., a textual knowledge sub-graph, and a
visual knowledge sub-graph, where the nodes and edges represent the
semantics/classes and their correlations in two modalities, respectively. This
enables the textual feature of each prompt to leverage the task-specific
structure knowledge from both textual and visual modalities, yielding a more
effective classifier for downstream tasks. Extensive experimental results on 11
benchmark datasets reveal that our GraphAdapter significantly outperforms
previous adapter-based methods. The code will be released at
https://github.com/lixinustc/GraphAdapter.
Comment: Accepted by NeurIPS 2023. The manuscript will be further revised
based on the review