50 research outputs found

Vision-Language Models for Vision Tasks: A Survey

Author: Huang Jiaxing
Jin Sheng
Lu Shijian
Zhang Jingyi
Publication date: 16/02/2024

Most visual recognition studies rely heavily on crowd-labelled data for deep neural network (DNN) training, and they usually train a DNN for each single visual recognition task, leading to a laborious and time-consuming visual recognition paradigm. To address these two challenges, Vision-Language Models (VLMs) have been intensively investigated recently; they learn rich vision-language correlations from web-scale image-text pairs that are almost infinitely available on the Internet and enable zero-shot predictions on various visual recognition tasks with a single VLM. This paper provides a systematic review of vision-language models for various visual recognition tasks, including: (1) the background that introduces the development of visual recognition paradigms; (2) the foundations of VLMs that summarize the widely adopted network architectures, pre-training objectives, and downstream tasks; (3) the widely adopted datasets in VLM pre-training and evaluation; (4) the review and categorization of existing VLM pre-training methods, VLM transfer learning methods, and VLM knowledge distillation methods; (5) the benchmarking, analysis and discussion of the reviewed methods; (6) several research challenges and potential research directions that could be pursued in future VLM studies for visual recognition. A project associated with this survey has been created at https://github.com/jingyi0000/VLM_survey

arXiv.org e-Print Archive

Domain Generalization via Balancing Training Difficulty and Model Capability

Author: Huang Jiaxing
Jiang Xueying
Jin Sheng
Lu Shijian
Publication date: 02/09/2023

Domain generalization (DG) aims to learn domain-generalizable models from one or multiple source domains that can perform well in unseen target domains. Despite its recent progress, most existing work suffers from the misalignment between the difficulty level of training samples and the capability of contemporarily trained models, leading to over-fitting or under-fitting in the trained generalization model. We design MoDify, a Momentum Difficulty framework that tackles the misalignment by balancing the seesaw between the model's capability and the samples' difficulties along the training process. MoDify consists of two novel designs that collaborate to fight against the misalignment while learning domain-generalizable models. The first is MoDify-based Data Augmentation, which exploits an RGB Shuffle technique to generate difficulty-aware training samples on the fly. The second is MoDify-based Network Optimization, which dynamically schedules the training samples for balanced and smooth learning with appropriate difficulty. Without bells and whistles, a simple implementation of MoDify achieves superior performance across multiple benchmarks. In addition, MoDify can complement existing methods as a plug-in, and it is generic and can work for different visual recognition tasks. Comment: 11 pages, 6 figures, Accepted by ICCV 202

arXiv.org e-Print Archive

SME creation facilitation process at Universities

Author: Jin Jun
Johnsson Charlotta
Luo Shijian
Nilsson Carl-Henric
Yang Qinmin
Publication venue: West Lake International Conference on Small and Medium Businesses
Publication date: 01/01/2012

Much research on SMEs studies them only after they have become SMEs. However, all SMEs, as well as larger companies, start as an idea in the head or heads of one or many persons: the prospective entrepreneurs. The purpose of this paper is to investigate how SMEs can be created by transforming ideas into real companies. More specifically, we will investigate if and how universities can facilitate this process by running international cross-functional courses. Our hypothesis is that in order to create an SME, three topics are of pivotal importance:
• Specialist competence in the business area
• General management competence
• Financial capital
During the fall of 2012 we will test the hypothesis by running a university course called international Market Driven Engineering (iMDE) in cooperation between Lund University and Zhejiang University. Technology faculties from both universities are involved, students as well as teachers. Their participation is crucial to cover specialist competence in the business area: technology-based enterprises. Management faculties from both universities are involved, students as well as teachers. Their participation is crucial to cover general management competence in setting up, funding and running an enterprise. When it comes to financial capital, our hypothesis is that for clever business ideas, financial capital can be raised in order to industrialize such a business idea. In the first trial run, 8 business ideas will be generated and tested in the Hangzhou area during the period 120910-121019. Each of the 8 teams will consist of 8 persons, blended to cross-fertilize engineering-business, Chinese-Swedish and male-female participants. With the support of university teachers with the same blend, the aim is to create embryos of SMEs.

Lund University Publications

LLMs Meet VLMs: Boost Open Vocabulary Object Detection with Fine-grained Descriptors

Author: Huang Jiaxing
Jiang Xueying
Jin Sheng
Lu Lewei
Lu Shijian
Publication date: 07/02/2024

Inspired by the outstanding zero-shot capability of vision language models (VLMs) in image classification tasks, open-vocabulary object detection has attracted increasing interest by distilling the broad VLM knowledge into detector training. However, most existing open-vocabulary detectors learn by aligning region embeddings with categorical labels (e.g., bicycle) only, disregarding the capability of VLMs in aligning visual embeddings with fine-grained text descriptions of object parts (e.g., pedals and bells). This paper presents DVDet, a Descriptor-Enhanced Open Vocabulary Detector that introduces conditional context prompts and hierarchical textual descriptors that enable precise region-text alignment as well as open-vocabulary detection training in general. Specifically, the conditional context prompt transforms regional embeddings into image-like representations that can be directly integrated into general open-vocabulary detection training. In addition, we introduce large language models as an interactive and implicit knowledge repository, which enables iterative mining and refining of visually oriented textual descriptors for precise region-text alignment. Extensive experiments over multiple large-scale benchmarks show that DVDet consistently outperforms the state of the art by large margins.

arXiv.org e-Print Archive

Film bulk acoustic resonators integrated on arbitrary substrates using a polymer support layer

Author: Chen Guohao
Dong Shurong
Flewitt A.
Jin Hao
Li Shijian
Luo J.
Milne W. I.
Wang Xiaozhi
Zhao Xinru
Publication venue: Nature Publishing Group
Publication date: 31/03/2015

The film bulk acoustic resonator (FBAR) is a widely used MEMS device which can be used as a filter, or as a gravimetric sensor for biochemical or physical sensing. Current device architectures require the use of an acoustic mirror or a freestanding membrane and are fabricated as discrete components. A new architecture is demonstrated which permits fabrication and integration of FBARs on arbitrary substrates. Wave confinement is achieved by fabricating the resonator on a polyimide support layer. Results show that when the polymer thickness is greater than a critical value, d, the FBARs have similar performance to devices using alternative architectures. For ZnO FBARs operating at 1.3–2.2 GHz, d is ~9 μm, and the devices have a Q-factor of 470, comparable to 493 for the membrane architecture devices. The polymer support makes the resonators insensitive to the underlying substrate. Yields over 95% have been achieved on roughened silicon, copper and glass.

Crossref

University of Bolton Institutional Repository (UBIR)

PubMed Central

Apollo (Cambridge)

A comprehensive review of carbon capture science and technologies

Author: Akinola Toluleke E
Ali Zakawat
Ba Zhichen
Baqain Mais Hanna Suleiman
Boetcher Sandra KS
Bork Alexander H
Chen Guoxing
Chen Siming
Chen Yongdong
Dai Zhongde
Daramola Michael
Deng Lihua
Deng Shuai
Donat Felix
Feng Dongdong
Foong Shin Ying
Gao Li
Gao Ningbo
Gao Xin
He Xuezhong
Hu Guoping
Hu Leiqing
Huang Jun
Huang Qi
Ji Haiyan
Ji Ying
Jiang Long
Jiang Xia
Jin Xin
Kang Guojun
Kawi Sibudjing
Khan Asim Laeeq
Konist Alar
Krödel Maximilian
Lam Su Shiung
Lawal Adekola
Li Qingfang
Liang Daxin
Lim Kang Hui
Lin Haiqing
Liu Lina
Liu Ling
Liu Wen
Liu Xinying
Liu Yongzhuo
Lu Shijian
Ma Lin
Magdziarz Aneta
Miskolczi Norbert
Mlonka-Mędrala Agata
Muller Christoph
Ng Hui Suan
Oko Eni
Otitoju Olajide S
Rekhtina Margarita
Saqline Syed
Sempuga Baraka C
Shang Jin
Shi Huancong
Singh Rasmeet
Sipra Ayesha Tariq
Soccol Carlos Ricardo
Song Chunfeng
Sun Hongman
Sun Mingzhe
Sun Shaozeng
Tomasek Szabina
Tsang Daniel CW
Tu Xin
Vandenberghe Luciana Porto de Souza
Vieira Sabrina
Wang Kaifang
Wang Meihong
Webley Paul A
Weidenkaff Anke
Wu Chunfei
Wu Xianyue
Xing Yupeng
Xu Yongqing
Xu Zhicheng
Yan Xinlong
Yang Haiping
Yang Qing
Yi Shouliang
Yu Zewei
Zhang Guojun
Zhang Xiong
Zhang Yu
Zhang Yu
Zhao Dongya
Zhao Ruikai
Zhao Yijun
Zhao Zhenyu
Zhou Hui
Zhu Jiamei
Publication venue: Elsevier BV
Publication date: 01/12/2023

University of Liverpool Repository

Feasibility study of carbon cloth for 3D integrated flexible cathode of lithium-ion battery

Author: GUO Jin
IMANISHI Nobuyuki
TONG Yang
YAN Shijian
Publication venue: Journal of Materials Engineering
Publication date: 01/03/2024

Carbon cloths made of carbon fiber were studied as 3D integrated cathodes for lithium-ion batteries. The graphitization degree of three types of carbon cloth after heat treatment was qualitatively analyzed and quantitatively calculated. Using lithium metal as the counter electrode, the graphitized carbon cloth electrodes show first discharge specific capacities of 83.6, 94.5 and 115.2 mAh∙g⁻¹ under 0.1-0.5 V, respectively. After 50 cycles, the specific capacities of the carbon cloth electrodes remain 55.0, 80.0 and 88.0 mAh∙g⁻¹. With LiFePO4-loaded graphitized carbon cloths as cathodes, the initial discharge specific capacities of the electrodes are 73.2, 109.5 and 130.2 mAh∙g⁻¹, respectively. The carbon cloth whose graphitization degree is 76.02% shows a stable specific capacity of about 90.0 mAh∙g⁻¹ after 50 cycles and better overall performance. This carbon cloth is more suitable for the integrated flexible cathode of lithium-ion batteries. By establishing a mechanical model of the interaction between LiFePO4 particles and carbon fiber, the relationship between the mechanical, electrical and electrochemical properties of the integrated cathode was discussed. Using carbon cloth as an integrated cathode for lithium-ion batteries can simplify and innovate on the conventional production process.

Directory of Open Access Journals