94 research outputs found

    Domain Generalization via Balancing Training Difficulty and Model Capability

    Domain generalization (DG) aims to learn domain-generalizable models from one or multiple source domains that can perform well in unseen target domains. Despite its recent progress, most existing work suffers from the misalignment between the difficulty level of training samples and the capability of contemporarily trained models, leading to over-fitting or under-fitting in the trained generalization model. We design MoDify, a Momentum Difficulty framework that tackles the misalignment by balancing the seesaw between the model's capability and the samples' difficulties along the training process. MoDify consists of two novel designs that collaborate to fight against the misalignment while learning domain-generalizable models. The first is MoDify-based Data Augmentation which exploits an RGB Shuffle technique to generate difficulty-aware training samples on the fly. The second is MoDify-based Network Optimization which dynamically schedules the training samples for balanced and smooth learning with appropriate difficulty. Without bells and whistles, a simple implementation of MoDify achieves superior performance across multiple benchmarks. In addition, MoDify can complement existing methods as a plug-in, and it is generic and can work for different visual recognition tasks. Comment: 11 pages, 6 figures, Accepted by ICCV 202

    Vision-Language Models for Vision Tasks: A Survey

    Most visual recognition studies rely heavily on crowd-labelled data for deep neural network (DNN) training, and they usually train a DNN for each single visual recognition task, leading to a laborious and time-consuming visual recognition paradigm. To address these two challenges, Vision-Language Models (VLMs) have been intensively investigated recently; they learn rich vision-language correlations from web-scale image-text pairs that are almost infinitely available on the Internet and enable zero-shot predictions on various visual recognition tasks with a single VLM. This paper provides a systematic review of vision-language models for various visual recognition tasks, including: (1) the background that introduces the development of visual recognition paradigms; (2) the foundations of VLM that summarize the widely-adopted network architectures, pre-training objectives, and downstream tasks; (3) the widely-adopted datasets in VLM pre-training and evaluations; (4) the review and categorization of existing VLM pre-training methods, VLM transfer learning methods, and VLM knowledge distillation methods; (5) the benchmarking, analysis and discussion of the reviewed methods; (6) several research challenges and potential research directions that could be pursued in future VLM studies for visual recognition. A project associated with this survey has been created at https://github.com/jingyi0000/VLM_survey

    Facile Preparation of Bimetallic MOF-derived Supported Tungstophosphoric Acid Composites for Biodiesel Production

    In this work, the novel TPA@C-NiZr-MOF catalyst is synthesized by the impregnation of tungstophosphoric acid (TPA) on the NiZr-based metal-organic framework (NiZr-MOF) followed by calcination up to 300 °C. The as-prepared catalyst materials were structurally, morphologically, and texturally characterized by XRD, FTIR, temperature-programmed desorption of NH3 (TPD-NH3), N2 physisorption, SEM, TEM, and XPS. The prepared catalyst can be used as an efficient heterogeneous catalyst for biodiesel production from oleic acid (OA) with methanol. The results indicated that, in comparison to TPA@NiZr-MOF, the TPA@C-NiZr-MOF catalyst calcined at 300 °C exhibits excellent catalytic performance, probably owing to the synergistic effect between TPA and metal oxide skeletons, high acidity, as well as larger surface area and pore size. Additionally, the TPA@C-NiZr-MOF catalyst can be reused for up to six cycles with an acceptable conversion. This study showed that bimetallic MOF-derived composite materials can be used as a potential alternative heterogeneous catalyst for biorefinery applications

    LLMs Meet VLMs: Boost Open Vocabulary Object Detection with Fine-grained Descriptors

    Inspired by the outstanding zero-shot capability of vision language models (VLMs) in image classification tasks, open-vocabulary object detection has attracted increasing interest by distilling the broad VLM knowledge into detector training. However, most existing open-vocabulary detectors learn by aligning region embeddings with categorical labels (e.g., bicycle) only, disregarding the capability of VLMs on aligning visual embeddings with fine-grained text description of object parts (e.g., pedals and bells). This paper presents DVDet, a Descriptor-Enhanced Open Vocabulary Detector that introduces conditional context prompts and hierarchical textual descriptors that enable precise region-text alignment as well as open-vocabulary detection training in general. Specifically, the conditional context prompt transforms regional embeddings into image-like representations that can be directly integrated into general open vocabulary detection training. In addition, we introduce large language models as an interactive and implicit knowledge repository which enables iterative mining and refining visually oriented textual descriptors for precise region-text alignment. Extensive experiments over multiple large-scale benchmarks show that DVDet outperforms the state-of-the-art consistently by large margins

    Let's Discover More API Relations: A Large Language Model-based AI Chain for Unsupervised API Relation Inference

    APIs have intricate relations that can be described in text and represented as knowledge graphs to aid software engineering tasks. Existing relation extraction methods have limitations, such as a limited API text corpus and sensitivity to the characteristics of the input text. To address these limitations, we propose utilizing large language models (LLMs) (e.g., GPT-3.5) as a neural knowledge base for API relation inference. This approach leverages the entire Web used to pre-train LLMs as a knowledge base and is insensitive to the context and complexity of input texts. To ensure accurate inference, we design our analytic flow as an AI Chain with three AI modules: API FQN Parser, API Knowledge Extractor, and API Relation Decider. The accuracies of the API FQN Parser and API Relation Decider modules are 0.81 and 0.83, respectively. Using the generative capacity of the LLM and our approach's inference capability, we achieve an average F1 value of 0.76 under the three datasets, significantly higher than the state-of-the-art method's average F1 value of 0.40. Compared to a CoT-based method, our AI Chain design improves the inference reliability by 67%, and the AI-crowd-intelligence strategy enhances the robustness of our approach by 26%

    First-principles calculation on the transport properties of molecular wires between Au clusters under equilibrium

    Based on the matrix Green's function method combined with hybrid tight-binding / density functional theory, we calculate the conductances of a series of gold-dithiol molecule-gold junctions including benzenedithiol (BDT), benzenedimethanethiol (BDMT), hexanedithiol (HDT), octanedithiol (ODT) and decanedithiol (DDT). An atomically-contacted extended molecule model is used in our calculation. As an important procedure, we determine the position of the Fermi level by the energy reference according to the results from ultraviolet photoelectron spectroscopy (UPS) experiments. After considering the experimental uncertainty in UPS measurement, the calculated results of molecular conductances near the Fermi level qualitatively agree with the experimental values measured by Tao et al. [Science 301, 1221 (2003); J. Am. Chem. Soc. 125, 16164 (2003); Nano Lett. 4, 267 (2004)]. Comment: 12 pages, 8 figure

    Analysis of the Efficacy of Autologous Peripheral Blood Stem Cell Transplantation in High-Risk Neuroblastoma

    Objective: This study aimed to analyze the efficacy of autologous peripheral blood stem cell transplantation for high-risk neuroblastoma in China. Methods: The data of 90 high-risk neuroblastoma patients treated with the CCCG-NB 2015 regimen were reviewed. The baseline clinicopathological characteristics and prognosis were analyzed and compared. In addition, the prognoses of tandem autologous stem cell transplantation and single autologous stem cell transplantation groups were compared. Results: The results of survival analysis showed that autologous peripheral blood stem cell transplantation based on this pretreatment regimen significantly improved the prognosis of children in the high-risk group. The 3-year event-free survival (EFS) and overall survival (OS) rates for the transplantation group and the nontransplantation group were 65.5% vs. 41.3% (p=0.023) and 77.1% vs. 57.9% (p=0.03), respectively. There was no difference in the distribution of baseline clinical case characteristics between the single transplantation group and the tandem transplantation group (p>0.05), and there was no significant difference in EFS and OS between the two groups (p>0.05). Conclusion: Based on this pretreatment programme, autologous peripheral blood stem cell transplantation is safe and tolerable and significantly improves the prognosis of children in the high-risk group. The value of tandem autologous stem cell transplantation is worthy of further discussion, which should consider various aspects such as the transplantation medication regimen and the patient's state

    The influence of H2O or/and O2 introduction during the low-temperature gas-phase sulfation of organic COS + CS2 on the conversion and deposition of sulfur-containing species in the sulfated CeO2-OS catalyst for NH3-SCR

    Herein, the typical components of blast furnace gas, including H2O and O2, were introduced to improve the NH3-SCR activity of the sulfated CeO2-OS catalyst during the gas-phase sulfation of organic COS + CS2 at 50 °C. The characterization results demonstrate that the introduction of O2 or H2O during gas-phase sulfation enhances the conversion of organic COS + CS2 on a cubic fluorite CeO2 surface and reduces the formation of sulfur and sulfates in the catalyst, but decreases the BET surface area and pore volume of the sulfated CeO2-OS catalyst. However, the introduction of O2 or H2O during the gas-phase sulfation increases the molar ratios of Ce3+/(Ce3+ + Ce4+) and O /(O + O + O ) on the sulfated CeO2-OS catalyst surface, thus promoting the formation of surface oxygen vacancies and chemisorbed oxygen, and these properties of the catalyst are further enhanced by the co-existence of O2 and H2O. Furthermore, the reduction of sulfates formed under the action of O2 or H2O decreases the weak acid sites of the sulfated CeO2-OS catalyst, but the few and highly dispersive sulfates present stronger reducibility, and the proportion of medium-strong acid sites of the catalyst increases. These factors help to improve the NH3-SCR activity of the sulfated CeO2-OS catalyst. Thus, there exists a synergistic effect of H2O and O2 introduction during gas-phase sulfation on the physical-chemical properties and catalytic performance of the sulfated CeO2-OS catalyst by organic COS + CS2 at 50 °C