94 research outputs found
Domain Generalization via Balancing Training Difficulty and Model Capability
Domain generalization (DG) aims to learn domain-generalizable models from one
or multiple source domains that can perform well in unseen target domains.
Despite its recent progress, most existing work suffers from the misalignment
between the difficulty level of training samples and the capability of
contemporarily trained models, leading to over-fitting or under-fitting in the
trained generalization model. We design MoDify, a Momentum Difficulty framework
that tackles the misalignment by balancing the seesaw between the model's
capability and the samples' difficulties along the training process. MoDify
consists of two novel designs that collaborate to fight against the
misalignment while learning domain-generalizable models. The first is
MoDify-based Data Augmentation which exploits an RGB Shuffle technique to
generate difficulty-aware training samples on the fly. The second is
MoDify-based Network Optimization which dynamically schedules the training
samples for balanced and smooth learning with appropriate difficulty. Without
bells and whistles, a simple implementation of MoDify achieves superior
performance across multiple benchmarks. In addition, MoDify can complement
existing methods as a plug-in, and it is generic and can work for different
visual recognition tasks.
Comment: 11 pages, 6 figures, Accepted by ICCV 202
Vision-Language Models for Vision Tasks: A Survey
Most visual recognition studies rely heavily on crowd-labelled data in deep
neural networks (DNNs) training, and they usually train a DNN for each single
visual recognition task, leading to a laborious and time-consuming visual
recognition paradigm. To address the two challenges, Vision-Language Models
(VLMs) have been intensively investigated recently. They learn rich
vision-language correlations from web-scale image-text pairs that are almost
infinitely available on the Internet and enable zero-shot predictions on
various visual recognition tasks with a single VLM. This paper provides a
systematic review of visual language models for various visual recognition
tasks, including: (1) the background that introduces the development of visual
recognition paradigms; (2) the foundations of VLM that summarize the
widely-adopted network architectures, pre-training objectives, and downstream
tasks; (3) the widely-adopted datasets in VLM pre-training and evaluations; (4)
the review and categorization of existing VLM pre-training methods, VLM
transfer learning methods, and VLM knowledge distillation methods; (5) the
benchmarking, analysis and discussion of the reviewed methods; (6) several
research challenges and potential research directions that could be pursued in
the future VLM studies for visual recognition. A project associated with this
survey has been created at https://github.com/jingyi0000/VLM_survey
Facile Preparation of Bimetallic MOF-derived Supported Tungstophosphoric Acid Composites for Biodiesel Production
In this work, the novel TPA@C-NiZr-MOF catalyst was synthesized by the impregnation of tungstophosphoric acid (TPA) on the NiZr-based metal-organic framework (NiZr-MOF) followed by calcination up to 300 °C. The as-prepared catalyst materials were structurally, morphologically, and texturally characterized by XRD, FTIR, temperature-programmed desorption of NH3 (TPD-NH3), N2 physisorption, SEM, TEM, and XPS. The prepared catalyst can be used as an efficient heterogeneous catalyst for biodiesel production from oleic acid (OA) with methanol. The results indicated that, in comparison to TPA@NiZr-MOF, the TPA@C-NiZr-MOF catalyst calcined at 300 °C exhibits excellent catalytic performance, probably owing to the synergistic effect between TPA and metal oxide skeletons, high acidity, as well as larger surface area and pore size. Additionally, the TPA@C-NiZr-MOF catalyst can be reused in up to six cycles with an acceptable conversion. This study showed that the bimetallic MOF-derived composite materials can be used as a potential alternative heterogeneous catalyst for biorefinery applications
LLMs Meet VLMs: Boost Open Vocabulary Object Detection with Fine-grained Descriptors
Inspired by the outstanding zero-shot capability of vision language models
(VLMs) in image classification tasks, open-vocabulary object detection has
attracted increasing interest by distilling the broad VLM knowledge into
detector training. However, most existing open-vocabulary detectors learn by
aligning region embeddings with categorical labels (e.g., bicycle) only,
disregarding the capability of VLMs in aligning visual embeddings with
fine-grained text descriptions of object parts (e.g., pedals and bells). This
paper presents DVDet, a Descriptor-Enhanced Open Vocabulary Detector that
introduces conditional context prompts and hierarchical textual descriptors
that enable precise region-text alignment as well as open-vocabulary detection
training in general. Specifically, the conditional context prompt transforms
regional embeddings into image-like representations that can be directly
integrated into general open vocabulary detection training. In addition, we
introduce large language models as an interactive and implicit knowledge
repository which enables iterative mining and refining visually oriented
textual descriptors for precise region-text alignment. Extensive experiments
over multiple large-scale benchmarks show that DVDet outperforms the
state-of-the-art consistently by large margins
Let's Discover More API Relations: A Large Language Model-based AI Chain for Unsupervised API Relation Inference
APIs have intricate relations that can be described in text and represented
as knowledge graphs to aid software engineering tasks. Existing relation
extraction methods have limitations, such as a limited API text corpus and
sensitivity to the characteristics of the input text. To address these limitations,
we propose utilizing large language models (LLMs) (e.g., GPT-3.5) as a neural
knowledge base for API relation inference. This approach leverages the entire
Web used to pre-train LLMs as a knowledge base and is insensitive to the
context and complexity of input texts. To ensure accurate inference, we design
our analytic flow as an AI Chain with three AI modules: API FQN Parser, API
Knowledge Extractor, and API Relation Decider. The accuracies of the API FQN
Parser and API Relation Decider modules are 0.81 and 0.83, respectively. Using
the generative capacity of the LLM and our approach's inference capability, we
achieve an average F1 value of 0.76 across the three datasets, significantly
higher than the state-of-the-art method's average F1 value of 0.40. Compared to
a CoT-based method, our AI Chain design improves the inference reliability by
67%, and the AI-crowd-intelligence strategy enhances the robustness of our
approach by 26%
First-principles calculation on the transport properties of molecular wires between Au clusters under equilibrium
Based on the matrix Green's function method combined with hybrid
tight-binding / density functional theory, we calculate the conductances of a
series of gold-dithiol molecule-gold junctions including benzenedithiol (BDT),
benzenedimethanethiol (BDMT), hexanedithiol (HDT), octanedithiol (ODT) and
decanedithiol (DDT). An atomically-contacted extended molecule model is used in
our calculation. As an important procedure, we determine the position of the
Fermi level by the energy reference according to the results from ultraviolet
photoelectron spectroscopy (UPS) experiments. After considering the
experimental uncertainty in UPS measurement, the calculated results of
molecular conductances near the Fermi level qualitatively agree with the
experimental values measured by Tao et al. [Science 301, 1221 (2003);
J. Am. Chem. Soc. 125, 16164 (2003); Nano Lett. 4, 267 (2004)].
Comment: 12 pages, 8 figure
Analysis of the Efficacy of Autologous Peripheral Blood Stem Cell Transplantation in High-Risk Neuroblastoma
Objective: This study aimed to analyze the efficacy of autologous peripheral blood stem cell transplantation for high-risk neuroblastoma in China.
Methods: The data of 90 high-risk neuroblastoma patients treated with the CCCG-NB 2015 regimen were reviewed. The baseline clinicopathological characteristics and prognosis were analyzed and compared. In addition, the prognoses of tandem autologous stem cell transplantation and single autologous stem cell transplantation groups were compared.
Results: The results of survival analysis showed that autologous peripheral blood stem cell transplantation based on this pretreatment regimen significantly improved the prognosis of children in the high-risk group. The 3-year event-free survival (EFS) and overall survival (OS) rates for the transplantation group and the nontransplantation group were 65.5% vs. 41.3% (p=0.023) and 77.1% vs. 57.9% (p=0.03), respectively. There was no difference in the distribution of baseline clinical case characteristics between the single transplantation group and the tandem transplantation group (p>0.05), and there was no significant difference in EFS and OS between the two groups (p>0.05).
Conclusion: Based on this pretreatment programme, autologous peripheral blood stem cell transplantation is safe and tolerable and significantly improves the prognosis of children in the high-risk group. The value of tandem autologous stem cell transplantation is worthy of further discussion, which should consider various aspects such as the transplantation medication regimen and the patient's state
The influence of H2O or/and O2 introduction during the low-temperature gas-phase sulfation of organic COS + CS2 on the conversion and deposition of sulfur-containing species in the sulfated CeO2-OS catalyst for NH3-SCR
Herein, the typical components of blast furnace gas, including H2O and O2, were introduced to improve the NH3-SCR activity of the sulfated CeO2-OS catalyst during the gas-phase sulfation of organic COS + CS2 at 50 °C. The characterization results demonstrate that the introduction of O2 or H2O during gas-phase sulfation enhances the conversion of organic COS + CS2 on a cubic fluorite CeO2 surface and reduces the formation of sulfur and sulfates in the catalyst, but decreases the BET surface area and pore volume of the sulfated CeO2-OS catalyst. However, the introduction of O2 or H2O during the gas-phase sulfation increases the molar ratios of Ce3+/(Ce3+ + Ce4+) and O /(O + O + O ) on the sulfated CeO2-OS catalyst surface, thus promoting the formation of surface oxygen vacancies and chemisorbed oxygen, and these properties of the catalyst are further enhanced by the co-existence of O2 and H2O. Furthermore, the reduction of sulfates formed under the action of O2 or H2O decreases the weak acid sites of the sulfated CeO2-OS catalyst, but the few and highly dispersive sulfates present stronger reducibility, and the proportion of medium-strong acid sites of the catalyst increases. These factors help to improve the NH3-SCR activity of the sulfated CeO2-OS catalyst. Thus, there exists a synergistic effect of H2O and O2 introduction during gas-phase sulfation on the physical-chemical properties and catalytic performance of the sulfated CeO2-OS catalyst by organic COS + CS2 at 50 °C