
172 research outputs found

Rethink Cross-Modal Fusion in Weakly-Supervised Audio-Visual Video Parsing

Authors: Hu Conghui; Lee Gim Hee; Xu Yating
Publication date: 14/11/2023

Existing works on weakly-supervised audio-visual video parsing adopt the hybrid attention network (HAN) as the multi-modal embedding to capture the cross-modal context. HAN embeds the audio and visual modalities with a shared network, where cross-attention is performed at the input. However, such an early fusion method highly entangles the two non-fully correlated modalities and leads to sub-optimal performance in detecting single-modality events. To deal with this problem, we propose the messenger-guided mid-fusion transformer to reduce the uncorrelated cross-modal context in the fusion. The messengers condense the full cross-modal context into a compact representation that preserves only useful cross-modal information. Furthermore, because microphones capture audio events from all directions while cameras only record visual events within a restricted field of view, unaligned cross-modal context from audio occurs more frequently for visual event predictions. We thus propose cross-audio prediction consistency to suppress the impact of irrelevant audio information on visual event prediction. Experiments consistently illustrate the superior performance of our framework compared to existing state-of-the-art methods.
Comment: WACV 202

arXiv.org e-Print Archive

Optimal control and bifurcation analysis of a delayed fractional-order SIRS model with general incidence rate and delayed control

Authors: Ren Guojian; Si Xinhui; Xu Conghui; Yu Yongguang
Publication venue: Vilnius University Press
Publication date: 01/07/2024

A fractional-order generalized SIRS model that accounts for the incubation period is established in this paper for the transmission of emerging pathogens. The corresponding Hopf bifurcation is discussed by selecting the time delay as the bifurcation parameter. In order to control the occurrence of Hopf bifurcation and achieve better dynamic behavior, a delayed feedback control is applied to the model. Further, the delayed fractional-order optimal control problem (DFOCP) is proposed and discussed. The parameters of the proposed model are identified from measured data for coronavirus disease 2019 (COVID-19). Based on the results of parameter identification, the corresponding DFOCP with delayed control is numerically solved.

Nonlinear Analysis: Modelling and Control

Generalized Few-Shot Point Cloud Segmentation Via Geometric Words

Authors: Hu Conghui; Lee Gim Hee; Xu Yating; Zhao Na
Publication date: 20/09/2023

Existing fully-supervised point cloud segmentation methods suffer in dynamic testing environments where new classes emerge. Few-shot point cloud segmentation algorithms address this problem by learning to adapt to new classes, but at the cost of segmentation accuracy on the base classes, which severely impedes their practicality. This motivates us to present the first attempt at a more practical paradigm, generalized few-shot point cloud segmentation, which requires the model to generalize to new categories with only a few support point clouds while retaining the capability to segment base classes. We propose geometric words to represent geometric components shared between the base and novel classes, and incorporate them into a novel geometric-aware semantic representation to facilitate better generalization to the new classes without forgetting the old ones. Moreover, we introduce geometric prototypes to guide the segmentation with geometric prior knowledge. Extensive experiments on S3DIS and ScanNet consistently illustrate the superior performance of our method over baseline methods. Our code is available at: https://github.com/Pixie8888/GFS-3DSeg_GWs.
Comment: Accepted by ICCV 202

arXiv.org e-Print Archive

Global dynamics for a class of reaction–diffusion multigroup SIR epidemic models with time fractional-order derivatives

Authors: Lu Zhenzhen; Meng Xiangyun; Ren Guojian; Xu Conghui; Yu Yongguang
Publication venue: Vilnius University Press
Publication date: 01/01/2022

This paper investigates the global dynamics of a class of multigroup SIR epidemic models with time fractional-order derivatives and reaction–diffusion. The fractional order considered in this paper lies in (0, 1], for which the process propagates more slowly than Brownian motion, leading to anomalous subdiffusion. Furthermore, a generalized incidence function is considered so that the data itself can flexibly determine the functional form of the incidence rates in practice. First, the existence, nonnegativity, and ultimate boundedness of the solution of the proposed system are studied. Moreover, the basic reproduction number R0 is calculated and shown to be a threshold: the disease-free equilibrium point of the proposed system is globally asymptotically stable when R0 ≤ 1, while when R0 > 1 the proposed system is uniformly persistent and the endemic equilibrium point is globally asymptotically stable. Finally, the theoretical results are verified by numerical simulation.

Nonlinear Analysis: Modelling and Control
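
The threshold role of R0 mentioned in the abstract above can be illustrated with a much simpler model. The sketch below is not the paper's fractional-order reaction–diffusion multigroup system: it assumes the classical single-group, integer-order SIR model on a normalized population, where R0 = beta / gamma, and integrates it with forward Euler. The function names, parameter values, and step size are illustrative assumptions only.

```python
def basic_reproduction_number(beta, gamma):
    """R0 for the classical SIR model (assumed here, not the paper's formula)."""
    return beta / gamma


def simulate_sir(beta, gamma, i0=0.01, dt=0.01, steps=20000):
    """Forward-Euler integration of the classical SIR model.

    The population is normalized so that s + i + r = 1.
    Returns the final (s, i, r) and the peak infected fraction.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    peak = i
    for _ in range(steps):
        new_infections = beta * s * i * dt   # mass-action incidence
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak = max(peak, i)
    return s, i, r, peak


# Hypothetical parameter choices on either side of the threshold R0 = 1.
below = simulate_sir(beta=0.1, gamma=0.2)   # R0 = 0.5
above = simulate_sir(beta=0.5, gamma=0.2)   # R0 = 2.5
```

In this simplified setting, the infected fraction never rises above its initial value when R0 < 1, whereas for R0 > 1 it grows to a clear peak before declining. The paper's result is stronger and concerns global stability of equilibria in a far more general model.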

MiChao-HuaFen 1.0: A Specialized Pre-trained Corpus Dataset for Domain-specific Large Models

Authors: He Conghui; Li Wei; Li Yao; Liu Yidong; Shang FuKai; Wang Fang; Wang Jun; Xu Rui
Publication date: 26/09/2023

With the advancement of deep learning technologies, general-purpose large models such as GPT-4 have demonstrated exceptional capabilities across various domains. Nevertheless, there remains a demand for high-quality, domain-specific outputs in areas like healthcare, law, and finance. This paper first evaluates the existing large models for specialized domains and discusses their limitations. To cater to the specific needs of certain domains, we introduce the "MiChao-HuaFen 1.0" pre-trained corpus dataset, tailored for the news and governmental sectors. The dataset, sourced from publicly available internet data from 2022, underwent multiple rounds of cleansing and processing to ensure high quality and reliable origins, with provisions for consistent and stable updates. This dataset not only supports the pre-training of large models for Chinese vertical domains but also helps advance deep learning research and applications in related fields.
Comment: 4 pages, 2 figure

arXiv.org e-Print Archive

Supramolecular Assembly and Stimuli-Responsive Behavior of Multielement Hybrid Copolymers

Authors: Chen Guorong; Dai Lizong; Liu Cheng; Luo Weiang; Mao Jie; Xu Yiting; Yuan Conghui; Zeng Birong
Publication venue: IntechOpen
Publication date: 10/05/2017

For organic polymers, hybrid elements can be defined as those beyond C, H, O, and N. Polymers comprising hybrid elements, such as Si, P, B, or metal ions, have attracted great attention in the design of high-performance or smart materials. Introducing hybrid elements into a polymeric network may also lead to the formation of new intermolecular interactions, thus promoting the self-organization of polymer chains into controllable structures and morphologies. In this chapter, we introduce some of the important recent developments in the design and self-assembly of hybrid amphiphilic copolymers. Particular attention is paid to hybrid amphiphilic copolymers containing POSS, boronic acid, or boronate functional moieties. We introduce the design, synthesis, self-assembly behavior, and properties of these hybrid amphiphilic copolymers in detail. The advantages and drawbacks of these polymers and their corresponding nanoassemblies are also discussed.

IntechOpen

ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training

Authors: Chi Zewen; He Conghui; Huang Heyan; Mao Xian-Ling; Xu Minghao; Zhang Wentao; Zheng Heqi; Zhuo Le
Publication date: 27/02/2024

We propose ProtLLM, a versatile cross-modal large language model (LLM) for both protein-centric and protein-language tasks. ProtLLM features a unique dynamic protein mounting mechanism, enabling it to handle complex inputs where the natural language text is interspersed with an arbitrary number of proteins. We also propose the protein-as-word language modeling approach to train ProtLLM. By developing a specialized protein vocabulary, we equip the model with the capability to predict not just natural language but also proteins from a vast pool of candidates. Additionally, we construct a large-scale interleaved protein-text dataset, named InterPT, for pre-training. This dataset comprehensively encompasses both (1) structured data sources like protein annotations and (2) unstructured data sources like biological research papers, thereby endowing ProtLLM with crucial knowledge for understanding proteins. We evaluate ProtLLM on classic supervised protein-centric tasks and explore its novel protein-language applications. Experimental results demonstrate that ProtLLM not only achieves superior performance against protein-specialized baselines on protein-centric tasks but also induces zero-shot and in-context learning capabilities on protein-language tasks.
Comment: https://protllm.github.io/project

arXiv.org e-Print Archive

OmniCity: Omnipotent City Understanding with Multi-level and Multi-view Images

Authors: He Conghui; Lai Yawen; Li Weijia; Lin Dahua; Xia Gui-Song; Xiangli Yuanbo; Xu Linning; Yu Jinhua
Publication date: 04/08/2022

This paper presents OmniCity, a new dataset for omnipotent city understanding from multi-level and multi-view images. More precisely, OmniCity contains multi-view satellite images as well as street-level panorama and mono-view images, constituting over 100K pixel-wise annotated images that are well-aligned and collected from 25K geo-locations in New York City. To alleviate the substantial pixel-wise annotation effort, we propose an efficient street-view image annotation pipeline that leverages the existing label maps of the satellite view and the transformation relations between different views (satellite, panorama, and mono-view). With the new OmniCity dataset, we provide benchmarks for a variety of tasks including building footprint extraction, height estimation, and building plane/instance/fine-grained segmentation. Compared with the existing multi-level and multi-view benchmarks, OmniCity contains a larger number of images with richer annotation types and more views, provides more benchmark results of state-of-the-art models, and introduces a novel task for fine-grained building instance segmentation on street-level panorama images. Moreover, OmniCity provides new problem settings for existing tasks, such as cross-view image matching, synthesis, segmentation, and detection, and facilitates the development of new methods for large-scale city understanding, reconstruction, and simulation. The OmniCity dataset as well as the benchmarks will be available at https://city-super.github.io/omnicity

arXiv.org e-Print Archive