37 research outputs found

    Distilling Large Vision-Language Model with Out-of-Distribution Generalizability

    Large vision-language models have achieved outstanding performance, but their size and computational requirements make their deployment on resource-constrained devices and time-sensitive tasks impractical. Model distillation, the process of creating smaller, faster models that maintain the performance of larger models, is a promising direction toward a solution. This paper investigates the distillation of visual representations in large teacher vision-language models into lightweight student models using a small- or mid-scale dataset. Notably, this study focuses on open-vocabulary out-of-distribution (OOD) generalization, a challenging problem that has been overlooked in previous model distillation literature. We propose two principles from vision and language modality perspectives to enhance the student's OOD generalization: (1) by better imitating the teacher's visual representation space, and carefully promoting better coherence in vision-language alignment with the teacher; (2) by enriching the teacher's language representations with informative and fine-grained semantic attributes to effectively distinguish between different labels. We propose several metrics and conduct extensive experiments to investigate these techniques. The results demonstrate significant improvements in zero-shot and few-shot student performance on open-vocabulary out-of-distribution classification, highlighting the effectiveness of our proposed approaches. Our code will be released at https://github.com/xuanlinli17/large_vlm_distillation_oo

    Deductive Verification of Chain-of-Thought Reasoning

    Large Language Models (LLMs) significantly benefit from Chain-of-Thought (CoT) prompting in performing various reasoning tasks. While CoT allows models to produce more comprehensive reasoning processes, its emphasis on intermediate reasoning steps can inadvertently introduce hallucinations and accumulated errors, thereby limiting models' ability to solve complex reasoning tasks. Inspired by how humans engage in careful and meticulous deductive logical reasoning processes to solve tasks, we seek to enable language models to perform explicit and rigorous deductive reasoning, and also ensure the trustworthiness of their reasoning process through self-verification. However, directly verifying the validity of an entire deductive reasoning process is challenging, even with advanced models like ChatGPT. In light of this, we propose to decompose a reasoning verification process into a series of step-by-step subprocesses, each receiving only its necessary context and premises. To facilitate this procedure, we propose Natural Program, a natural language-based deductive reasoning format. Our approach enables models to generate precise reasoning steps where subsequent steps are more rigorously grounded on prior steps. It also empowers language models to carry out reasoning self-verification in a step-by-step manner. By integrating this verification process into each deductive reasoning stage, we significantly enhance the rigor and trustworthiness of generated reasoning steps. In the process, we also improve the answer correctness on complex reasoning tasks. Code will be released at https://github.com/lz1oceani/verify_cot
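The decomposition described in this abstract, where each reasoning step is checked against only the premises it cites, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `verify_chain` and `toy_check`, the `"X>Y"` statement format, and the transitivity rule are hypothetical stand-ins and are not the paper's Natural Program format or its LLM-based verifier.

```python
from typing import Callable, List, Tuple

# A step is (statement, indices of the earlier premises/steps it cites).
Step = Tuple[str, List[int]]


def verify_chain(
    premises: List[str],
    steps: List[Step],
    check: Callable[[List[str], str], bool],
) -> List[bool]:
    """Verify each step in isolation, passing only the statements it cites."""
    known = list(premises)
    results = []
    for statement, cited in steps:
        context = [known[i] for i in cited]  # minimal context, not the whole chain
        results.append(check(context, statement))
        known.append(statement)  # later steps may cite this one by index
    return results


def toy_check(context: List[str], statement: str) -> bool:
    """Hypothetical verifier: accept "X>Z" only if the context holds "X>Y" and "Y>Z"."""
    left, right = statement.split(">")
    for a in context:
        a_left, a_right = a.split(">")
        for b in context:
            b_left, b_right = b.split(">")
            if a_left == left and a_right == b_left and b_right == right:
                return True
    return False


premises = ["A>B", "B>C", "C>D"]
steps = [
    ("A>C", [0, 1]),  # follows from premises 0 and 1
    ("A>D", [3, 2]),  # follows from step "A>C" (index 3) and premise 2
    ("B>A", [0, 1]),  # does not follow from what it cites
]
print(verify_chain(premises, steps, toy_check))
```

In a realistic setting the `check` callable would be a model query rather than a rule; the point of the sketch is only that each call sees the cited statements and nothing else.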

    Spatial planning for urban ventilation corridors by urban climatology

    Ventilation corridors in cities can decrease air pollution and alleviate heat island problems, but there remains a need to fully assess their effectiveness. Few urban managers have been able to take city-scale approaches to the construction of urban ventilation corridors. This study aimed to introduce the Ventilation Corridor Planning (VCP) model, which is a multi-criteria evaluation method combined with a geographical information system (GIS) to determine where the ventilated environment is most appropriate. Specifically, the VCP model took Bozhou, China as the study area and operated at two scales: the mesoscale and the local scale. At the mesoscale, we derived three outputs to build urban ventilation corridors: 1) background wind environment, 2) ventilation potential, and 3) heat island intensity. At the local scale, we used a traditional computational fluid dynamics (CFD) model to verify the impact of the VCP criteria. The results revealed that, compared with the traditional CFD model, the proposed VCP model has advantages in establishing a comprehensive evaluation standard. In addition, applying the VCP model at both the macro and micro scales also enhances the efficiency of ventilation corridor construction. Overall, this study introduced an effective modeling method for urban ventilation corridor planning, and provides a way to study the urban climate.

    Five-Direction Occlusion Filling with Five Layer Parallel Two-Stage Pipeline for Stereo Matching with Sub-Pixel Disparity Map Estimation

    Binocular stereoscopic matching is an essential method in computer vision, imitating human binocular vision to obtain distance information. Among plentiful stereo matching algorithms, Semi-Global Matching (SGM) is recognized as one of the most popular vision algorithms due to its relatively low power consumption and high accuracy, resulting in many excellent SGM-based hardware accelerators. However, vision algorithms, including SGM, are still somewhat inaccurate in actual long-range applications. Therefore, this paper proposes a disparity improvement strategy based on subpixel interpolation and disparity optimization post-processing using an area optimization strategy, a hardware-friendly divider, a split look-up table, clock-aligned multi-directional disparity occlusion filling, and depth acquisition based on floating-point operations. The hardware architecture based on the optimization algorithms is implemented on the Stratix-IV platform. It consumes about 5.6 K LUTs, 12.8 K registers, and 2.5 M bits of on-chip memory. Meanwhile, the non-occlusion error rate of only 4.61% is about 1% better than the state-of-the-art works on the KITTI2015 dataset. The maximum working frequency can reach up to 98.28 MHz for the 640 × 480 resolution video and 128 disparity range, with a power dissipation of 1.459 W and a processing speed of 320 frames per second.
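As a rough illustration of the quantities in this abstract, the sketch below shows one common form of sub-pixel disparity refinement (a parabola fitted through three neighbouring matching costs), the standard depth-from-disparity relation, and a throughput check. The parabola fit is a widely used technique and is assumed here; the abstract does not say which interpolation the paper uses. The function names, the cost values, the focal length, and the baseline are hypothetical, and the throughput check assumes one pixel per clock cycle with no blanking overhead.

```python
def subpixel_disparity(costs, d):
    """Refine integer disparity d by fitting a parabola through costs[d-1..d+1]."""
    c_prev, c_best, c_next = costs[d - 1], costs[d], costs[d + 1]
    curvature = c_prev - 2.0 * c_best + c_next
    if curvature <= 0:  # flat or non-convex neighbourhood: keep the integer value
        return float(d)
    return d + 0.5 * (c_prev - c_next) / curvature


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Pinhole stereo relation: depth = focal length * baseline / disparity."""
    return focal_px * baseline_m / disparity_px


def frames_per_second(clock_hz, width, height):
    """Frame rate if exactly one pixel is processed per clock cycle (assumption)."""
    return clock_hz / (width * height)


# Hypothetical matching costs with their minimum at integer disparity 2.
costs = [9.0, 4.0, 1.0, 2.0, 7.0]
refined = subpixel_disparity(costs, 2)

# Hypothetical camera: 700 px focal length, 0.5 m baseline.
depth_m = depth_from_disparity(refined, focal_px=700.0, baseline_m=0.5)

# Figures from the abstract: 98.28 MHz clock, 640 x 480 video.
fps = frames_per_second(98.28e6, 640, 480)
print(refined, depth_m, fps)
```

Under the one-pixel-per-clock assumption, the clock frequency and resolution quoted in the abstract work out to just under 320 frames per second, which agrees with the reported processing speed.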

    Electroacupuncture at Fengchi(GB20) and Yanglingquan(GB34) Ameliorates Paralgesia through Microglia-Mediated Neuroinflammation in a Rat Model of Migraine

    Background: Multiple studies have suggested that paralgesia (hyperalgesia and cutaneous allodynia) in migraine reflects the activation and sensitisation of the trigeminovascular system (TGVS). In particular, it reflects the second-order and higher nerve centre sensitisation, which is caused and maintained by neuroinflammation. Microglia activation leads to the release of proinflammatory cytokines involved in inflammatory responses. Accumulating evidence indicates that electroacupuncture (EA) is effective in ameliorating paralgesia, but the underlying mechanisms of EA in migraine attacks caused by microglia and microglia-mediated inflammatory responses are still unclear. The purpose of this study was to explore whether EA could ameliorate the dysregulation of pain sensation by suppressing microglial activation and the resulting neuroinflammatory response, and to evaluate whether this response was regulated by Toll-like receptor 4 (TLR4)/nuclear factor-kappa B (NF-κB) in the trigeminal nucleus caudalis (TNC) in a rat model of migraine. Methods: Repeated Inflammatory Soup (IS) was infused into the dura for seven sessions to establish a recurrent migraine-like rat model, and EA treatment was administered at Fengchi (GB20) and Yanglingquan (GB34) after daily IS infusion. Facial mechanical withdrawal thresholds were measured to evaluate the change in pain perception, and plasma samples and the TNC tissues of rats were collected to examine the changes in calcitonin gene-related peptide (CGRP), the Iba-1-labelled microglial activation, and the resulting inflammatory response, including interleukin-1β (IL-1β), tumour necrosis factor-α (TNF-α), interleukin-6 (IL-6), and their regulatory molecules TLR4/NF-κB, via enzyme-linked immunosorbent assay (ELISA), real-time polymerase chain reaction (RT-PCR), immunohistochemistry (IHC) and Western blot analysis. Results: Repeated IS injections into the dura induced facial mechanical paralgesia, which is the manifestation of migraine attacks, and increased the expression of CGRP, Iba-1, microglia-mediated inflammatory cytokines (IL-1β, TNF-α, IL-6), and regulatory molecules TLR4/NF-κB. EA at GB20/34 significantly attenuated repetitive IS-induced pain hypersensitivity. This effect was consistent with decreased levels of CGRP and inflammatory cytokines in the plasma and the TNC via the inhibition of microglia activation, and this response may be regulated by TLR4/NF-κB. Conclusions: EA ameliorated paralgesia in repetitive IS-induced migraine-like rats, which was mainly mediated by a reduction in microglial activation and microglia-mediated inflammatory responses that could be regulated by TLR4/NF-κB.