40 research outputs found

Distilling Large Vision-Language Model with Out-of-Distribution Generalizability

Author: Fang Yunhao
Li Xuanlin
Ling Zhan
Liu Minghua
Su Hao
Tu Zhuowen
Publication date: 06/07/2023

Large vision-language models have achieved outstanding performance, but their size and computational requirements make their deployment on resource-constrained devices and time-sensitive tasks impractical. Model distillation, the process of creating smaller, faster models that maintain the performance of larger models, is a promising direction towards the solution. This paper investigates the distillation of visual representations in large teacher vision-language models into lightweight student models using a small- or mid-scale dataset. Notably, this study focuses on open-vocabulary out-of-distribution (OOD) generalization, a challenging problem that has been overlooked in previous model distillation literature. We propose two principles from vision and language modality perspectives to enhance the student's OOD generalization: (1) by better imitating the teacher's visual representation space, and carefully promoting better coherence in vision-language alignment with the teacher; (2) by enriching the teacher's language representations with informative and fine-grained semantic attributes to effectively distinguish between different labels. We propose several metrics and conduct extensive experiments to investigate these techniques. The results demonstrate significant improvements in zero-shot and few-shot student performance on open-vocabulary out-of-distribution classification, highlighting the effectiveness of our proposed approaches. Our code will be released at https://github.com/xuanlinli17/large_vlm_distillation_oo

arXiv.org e-Print Archive

Deductive Verification of Chain-of-Thought Reasoning

Author: Fang Yunhao
Huang Zhiao
Lee Mingu
Li Xuanlin
Ling Zhan
Memisevic Roland
Su Hao
Publication date: 06/06/2023

Large Language Models (LLMs) significantly benefit from Chain-of-Thought (CoT) prompting in performing various reasoning tasks. While CoT allows models to produce more comprehensive reasoning processes, its emphasis on intermediate reasoning steps can inadvertently introduce hallucinations and accumulated errors, thereby limiting models' ability to solve complex reasoning tasks. Inspired by how humans engage in careful and meticulous deductive logical reasoning processes to solve tasks, we seek to enable language models to perform explicit and rigorous deductive reasoning, and also ensure the trustworthiness of their reasoning process through self-verification. However, directly verifying the validity of an entire deductive reasoning process is challenging, even with advanced models like ChatGPT. In light of this, we propose to decompose a reasoning verification process into a series of step-by-step subprocesses, each only receiving their necessary context and premises. To facilitate this procedure, we propose Natural Program, a natural language-based deductive reasoning format. Our approach enables models to generate precise reasoning steps where subsequent steps are more rigorously grounded on prior steps. It also empowers language models to carry out reasoning self-verification in a step-by-step manner. By integrating this verification process into each deductive reasoning stage, we significantly enhance the rigor and trustfulness of generated reasoning steps. Along this process, we also improve the answer correctness on complex reasoning tasks. Code will be released at https://github.com/lz1oceani/verify_cot
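The step-by-step verification idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation or the Natural Program format itself: the `verify_chain` function, the `(step_id, claim, cited_ids)` step layout, and the arithmetic `toy_check` verifier are all hypothetical stand-ins, with `toy_check` taking the place of a language-model call. The point it shows is that each step is checked against only the statements it cites, and a verified step becomes available as a premise for later steps.

```python
from typing import Callable, Dict, List, Tuple

# A checker receives only the cited context plus the claim to verify.
Checker = Callable[[List[str], str], bool]
Step = Tuple[int, str, List[int]]  # (step id, claim text, ids of cited statements)


def verify_chain(premises: Dict[int, str], steps: List[Step], check: Checker) -> bool:
    """Verify each step in isolation, passing only the statements it cites."""
    known = dict(premises)  # statements accepted so far, keyed by id
    for step_id, claim, cited in steps:
        if any(i not in known for i in cited):
            return False  # step cites something that was never established
        context = [known[i] for i in cited]
        if not check(context, claim):
            return False  # one failed step invalidates the whole chain
        known[step_id] = claim  # verified claims can ground later steps
    return True


def toy_check(context: List[str], claim: str) -> bool:
    """Hypothetical verifier for statements like 'a=2' and claims like 'c=a+b=5'."""
    env = {}
    for line in context:
        parts = line.split("=")
        env[parts[0].strip()] = int(parts[-1])  # variable name -> final value
    _name, expr, value = (p.strip() for p in claim.split("="))
    try:
        return sum(env[t.strip()] for t in expr.split("+")) == int(value)
    except KeyError:
        return False  # claim uses a variable missing from the cited context


premises = {1: "a=2", 2: "b=3"}
steps = [(3, "c=a+b=5", [1, 2]), (4, "d=c+a=7", [3, 1])]
print(verify_chain(premises, steps, toy_check))
```

In practice the checker would be a prompt to a language model; keeping it a plain callable makes the control flow testable without one.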

arXiv.org e-Print Archive

Anxiety mediates association between sex and jaw function limitation in temporomandibular disorder patients from China

Author: Li Chen
Shanbao Fang
Shuyuan Zhang
Xin Xiong
Yanyue Tan
Yating Yi
Yunhao Zheng
Publication venue: Frontiers Media S.A.
Publication date: 01/05/2024

Aim: The objective of this study is to explore the relationship between sex and jaw function and to test whether anxiety mediates the causal relationship between sex and jaw function in temporomandibular disorder (TMD) patients. Methods: A total of 488 participants with TMD were included in the analysis. Demographic data were collected. Generalized anxiety symptoms and anxiety severity were initially assessed using the GAD-7 questionnaire, and jaw function limitation was measured using the JFLS-8 scale. A directed acyclic graph (DAG) was used in this study to evaluate the hypotheses. Mediation analysis was conducted to explore causality and to calculate the total effect, natural direct effect (NDE) and natural indirect effect (NIE). Results: In TMD patients, there was a significant association between female sex and jaw function (r = 0.17, p < 0.001), female sex and anxiety (r = 0.15, p = 0.002), and anxiety and jaw function (r = 0.35, p < 0.001). In addition, sex can directly lead to differences in impaired jaw function (NDE: 3.719, 95% CI: 1.619–5.828, p < 0.001), and can also be causally related to jaw function through anxiety (NIE: 1.146, 95% CI: 0.267–2.024, p = 0.011). The total effect was 4.865 (95% CI: 2.709–7.029, p < 0.001). Conclusion: A causal mechanism was found in which anxiety acts as a mediator of sex effects on jaw function. Therefore, psychological factors need to be taken into account in the treatment of female TMD patients. Further clinical trials are needed to explore whether psychotherapy is more beneficial for improving jaw function in female TMD patients
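The reported effects can be cross-checked with a short calculation. The sketch below assumes the usual additive decomposition of mediation analysis, in which the total effect equals the natural direct effect plus the natural indirect effect. The proportion mediated is a derived quantity that the abstract does not report.

```python
# Point estimates as reported in the abstract.
nde = 3.719             # natural direct effect of sex on jaw function limitation
nie = 1.146             # natural indirect effect through anxiety
total_reported = 4.865  # total effect as reported

# Assumed additive decomposition: total effect = NDE + NIE.
total = nde + nie

# Share of the total effect that runs through anxiety (derived, not reported).
proportion_mediated = nie / total

print(f"NDE + NIE = {total:.3f} (reported total: {total_reported})")
print(f"proportion mediated = {proportion_mediated:.1%}")
```

The sum matches the reported total effect, and under this decomposition a little under a quarter of the effect is attributed to the anxiety pathway.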

Directory of Open Access Journals

Underground coal mine monitoring with wireless sensor networks

Author: Cheekiralla S.
Considine J.
Douglas S.
Fang Q.
Lazos L.
Melodia T.
Mo Li
Ray S.
Subramanian S.
Wieselthier J. E.
Yunhao Liu
Publication venue: Association for Computing Machinery (ACM)

Crossref

Towards Enhanced Language Model Reasoning and Efficient Knowledge Transfer

Author: Fang Yunhao
Publication venue: eScholarship, University of California
Publication date: 01/01/2024

Large language models (LLMs) and vision language models (VLMs) are changing the world and gradually presenting human-level intelligence in various real-world scenarios, including knowledge-based question answering, mathematics, and programming. During my master's studies, my research focused on understanding and improving the reasoning capacity of current large language models for general problem solving, and on efficient methods to enable knowledge transfer for vision-language models: distill knowledge from large vision-language models

eScholarship - University of California

Spatial planning for urban ventilation corridors by urban climatology

Author: Ai Wang
Kangkang Gu
Yunhao Fang
Zhao Qian
Zhen Sun
Publication venue: Informa UK Limited
Publication date: 01/12/2020

Ventilation corridors in cities can decrease air pollution and alleviate heat island problems, but there remains a need to fully assess their effectiveness. Few urban managers have been able to take city-scale approaches to the construction of urban ventilation corridors. This study aimed to introduce the Ventilation Corridor Planning (VCP) model, a multi-criteria evaluation method combined with a geographical information system (GIS) to determine where the ventilated environment is most appropriate. Specifically, the VCP model took Bozhou, China as the research object and covered two scales: the mesoscale and the local scale. At the mesoscale, we derived three outputs to build urban ventilation corridors: 1) background wind environment, 2) ventilation potential, 3) heat island intensity. At the local scale, we used a traditional computational fluid dynamics (CFD) model to verify the impact of the VCP criteria. The results revealed that, compared with the traditional CFD model, the proposed VCP model has advantages in establishing a comprehensive evaluation standard. In addition, the application of the VCP model at both macro and micro scales enhances the efficiency of ventilation corridor construction. Overall, this study introduced an effective modeling method for urban ventilation corridor planning and provides a way to study the urban climate

Directory of Open Access Journals

Electroacupuncture at Fengchi(GB20) and Yanglingquan(GB34) Ameliorates Paralgesia through Microglia-Mediated Neuroinflammation in a Rat Model of Migraine

Author: Chenglin Tang
Dongmei Liao
Fang Pang
Min Zhou
Xinlu He
Yunhao Yang
Publication venue: MDPI AG
Publication date: 01/03/2023

Background: Multiple studies have suggested that paralgesia (hyperalgesia and cutaneous allodynia) in migraine reflects the activation and sensitisation of the trigeminovascular system (TGVS). In particular, it reflects the second-order and higher nerve centre sensitisation, which is caused and maintained by neuroinflammation. Microglia activation leads to the release of proinflammatory cytokines involved in inflammatory responses. Accumulating evidence indicates that electroacupuncture (EA) is effective in ameliorating paralgesia, but the underlying mechanisms of EA in migraine attacks caused by microglia and microglia-mediated inflammatory responses are still unclear. The purpose of this study was to explore whether EA could ameliorate the dysregulation of pain sensation by suppressing microglial activation and the resulting neuroinflammatory response, and to evaluate whether this response was regulated by Toll-like receptor 4 (TLR4)/nuclear factor-kappa B(NF-κB) in the trigeminal nucleus caudalis (TNC) in a rat model of migraine. Methods: Repeated Inflammatory Soup (IS) was infused into the dura for seven sessions to establish a recurrent migraine-like rat model, and EA treatment was administered at Fengchi (GB20) and Yanglingquan (GB34) after daily IS infusion. Facial mechanical withdrawal thresholds were measured to evaluate the change in pain perception, and plasma samples and the TNC tissues of rats were collected to examine the changes in calcitonin gene-related peptide (CGRP), the Ibal-1-labelled microglial activation, and the resulting inflammatory response, including interleukin-1β (IL-1β), tumour necrosis factor-α (TNF-α), interleukin-6 (IL-6), and their regulatory molecules TLR4/NF-κB, via enzyme-linked immunosorbent assay (ELISA), real-time polymerase chain reaction (RT-PCR), immunohistochemistry (IHC) and Western blot analysis. 
Results: Repeated IS injections into the dura induced facial mechanical paralgesia, which is the manifestation of migraine attacks, and increased the expression of CGRP, Ibal-1, microglial mediated inflammatory cytokines (IL-1β, TNF-α, IL-6), and regulatory molecules TLR4/NF-κB. EA at GB20/34 significantly attenuated repetitive IS-induced pain hypersensitivity. This effect was consistent with decreased levels of CGRP and inflammatory cytokines in the plasma and the TNC via the inhibition of microglia activation, and this response may be regulated by TLR4/NF-κB. Conclusions: EA ameliorated paralgesia in repetitive IS-induced migraine-like rats, which was mainly mediated by a reduction in microglial activation and microglial-mediated inflammatory responses that could be regulated by TLR4/NF-κB

Directory of Open Access Journals

Five-Direction Occlusion Filling with Five Layer Parallel Two-Stage Pipeline for Stereo Matching with Sub-Pixel Disparity Map Estimation

Author: Fengwei An
Ke Li
Lei Chen
Xinyu Guan
Xiwei Fang
Yunhao Ma
Publication venue: MDPI AG
Publication date: 08/11/2022

Binocular stereoscopic matching is an essential method in computer vision that imitates human binocular vision to obtain distance information. Among the many stereo matching algorithms, Semi-Global Matching (SGM) is recognized as one of the most popular due to its relatively low power consumption and high accuracy, which has resulted in many excellent SGM-based hardware accelerators. However, vision algorithms, including SGM, are still somewhat inaccurate in actual long-range applications. Therefore, this paper proposes a disparity improvement strategy based on subpixel interpolation and disparity optimization post-processing, using an area optimization strategy, a hardware-friendly divider, a split look-up table, clock-aligned multi-directional disparity occlusion filling, and depth acquisition based on floating-point operations. The hardware architecture based on the optimized algorithms is implemented on the Stratix-IV platform. It consumes about 5.6 K LUTs, 12.8 K registers, and 2.5 M bits of on-chip memory. Meanwhile, the non-occlusion error rate of only 4.61% is about 1% better than state-of-the-art works on the KITTI2015 dataset. The maximum working frequency can reach up to 98.28 MHz for 640 × 480 resolution video and a 128 disparity range, with a power dissipation of 1.459 W and a processing speed of 320 frames per second
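To make the subpixel and depth steps concrete, here is a minimal software sketch. The abstract does not say which interpolation the hardware uses, so the parabola fit through three neighbouring matching costs is an assumption (it is a common choice), and the function names, focal length, and baseline below are hypothetical illustration values rather than figures from the paper.

```python
def subpixel_disparity(d: int, c_prev: float, c_min: float, c_next: float) -> float:
    """Refine integer disparity d by fitting a parabola through the matching
    costs at d-1, d and d+1 and returning the location of its minimum."""
    curvature = c_prev - 2.0 * c_min + c_next
    if curvature <= 0:
        # Flat or non-convex cost curve: keep the integer disparity.
        return float(d)
    return d + (c_prev - c_next) / (2.0 * curvature)


def depth_from_disparity(disparity: float, focal_px: float, baseline_m: float) -> float:
    """Pinhole stereo relation: depth = focal length * baseline / disparity."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity


# Hypothetical costs around the best integer disparity d = 5.
d_sub = subpixel_disparity(5, c_prev=4.0, c_min=1.0, c_next=2.0)

# Hypothetical camera parameters: 700 px focal length, 0.54 m baseline.
depth_m = depth_from_disparity(d_sub, focal_px=700.0, baseline_m=0.54)
print(d_sub, depth_m)
```

Because depth is inversely proportional to disparity, a fractional disparity correction matters most for distant objects, where disparities are small. That is consistent with the abstract's focus on long-range accuracy.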

Multidisciplinary Digital Publishing Institute

PubMed Central