71 research outputs found

    Measurement of permeability for ferrous metallic plates using a novel lift-off compensation technique on phase signature

    Sensor lift-off affects the prediction of electromagnetic properties for both ferrous and non-ferrous steel plates. In this paper, we develop a strategy to address this issue for ferrous plates. With increased lift-off, the phase of the measured impedance for steel plates decreases, as does the magnitude of the impedance signal. Based on these facts, a phase compensation algorithm is developed that corrects the lift-off-induced phase change using the magnitude of the impedance signal. Further, a new magnetic permeability prediction technique is presented and validated against analytical and measured results. With this new technique, the error in permeability prediction is less than 2% within the range of lift-offs tested.
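The compensation idea described in the abstract can be sketched as follows. This is an illustrative sketch only: the linear relation between the magnitude drop and the phase correction, and the calibration constant `k`, are assumptions for demonstration, not the paper's actual algorithm.

```python
import cmath

def compensate_phase(z_measured, z_reference, k=1.0):
    """Correct the lift-off-induced phase drop of a measured impedance.

    Hypothetical linear model: the phase reduction is assumed to be
    proportional to the relative decrease in impedance magnitude,
    with k a coil-specific calibration constant (not from the paper).
    """
    mag_drop = 1.0 - abs(z_measured) / abs(z_reference)
    return cmath.phase(z_measured) + k * mag_drop
```

Under this toy model, a measurement whose magnitude has dropped by 10% and whose phase has drifted accordingly is pulled back toward the zero-lift-off reference phase.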

    Learning to Initialize: Can Meta Learning Improve Cross-task Generalization in Prompt Tuning?

    Prompt tuning (PT), which only tunes the embeddings of an additional sequence of tokens per task while keeping the pre-trained language model (PLM) frozen, has shown remarkable performance in few-shot learning. Despite this, PT has been shown to rely heavily on good initialization of the prompt embeddings. In this work, we study meta prompt tuning (MPT) to systematically explore how (and whether) meta-learning can improve cross-task generalization in PT by learning to initialize the prompt embeddings from other relevant tasks. We empirically analyze a representative set of meta-learning algorithms in a wide range of adaptation settings with different source/target task configurations on a large set of few-shot tasks. Through extensive experiments and analysis, we demonstrate the effectiveness of MPT. The improvement is particularly significant on classification tasks; for other kinds of tasks such as question answering, MPT outperforms PT in most cases but does not always outperform multi-task learning. We further provide an in-depth analysis from the perspective of task similarity.
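The two ingredients in this abstract, soft-prompt prepending and learned initialization, can be sketched in NumPy. The shapes, the mean-of-source-prompts initializer, and the function names are illustrative assumptions; the paper studies full meta-learning algorithms, not shown here.

```python
import numpy as np

def init_prompt_from_tasks(task_prompts):
    """Meta-style initialization (illustrative): start the new task's soft
    prompt from the average of prompts learned on related source tasks,
    rather than from random embeddings."""
    return np.mean(task_prompts, axis=0)

def prepend_prompt(prompt, token_embeddings):
    """Prompt tuning: prepend trainable prompt vectors to the (frozen)
    input token embeddings; only `prompt` would receive gradient updates."""
    return np.concatenate([prompt, token_embeddings], axis=0)

rng = np.random.default_rng(0)
source_prompts = rng.normal(size=(3, 5, 16))     # 3 source tasks, 5 prompt tokens, dim 16
prompt = init_prompt_from_tasks(source_prompts)  # shape (5, 16)
inputs = rng.normal(size=(12, 16))               # 12 input tokens
x = prepend_prompt(prompt, inputs)               # shape (17, 16)
```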

    An equivalent-effect phenomenon in eddy current non-destructive testing of thin structures

    The inductance/impedance due to thin metallic structures in non-destructive testing (NDT) is difficult to evaluate. In particular, in Finite Element Method (FEM) eddy current simulation, an extremely fine mesh is required to accurately simulate skin effects, especially at high frequencies, and this can lead to an extremely large total mesh for the whole problem, i.e. including other surrounding structures and excitation sources such as coils. Consequently, intensive computation is required. In this paper, an equivalent-effect phenomenon is found, which reveals that alternative structures can produce the same effect on the sensor response, i.e. the mutual impedance/inductance of coupled coils, if a reciprocal relationship between the electrical conductivity and the thickness of the structure is observed. By using this relationship, the mutual inductance/impedance can be calculated from equivalent structures with far fewer mesh elements, which significantly reduces computation time. In eddy current NDT, coil inductance/impedance is normally used as a critical parameter for various industrial applications, such as flaw detection, coating and microstructure sensing. Theoretical derivation, measurements and simulations are presented to verify the proposed phenomenon.
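The reciprocal relationship described above can be read as keeping the conductivity-thickness product constant. A minimal sketch, assuming that interpretation (the function name and the example numbers are illustrative, not from the paper):

```python
def equivalent_conductivity(sigma, thickness, new_thickness):
    """Equivalent-effect relation for thin plates: the sensor response is
    (approximately) unchanged when the conductivity-thickness product
    sigma * d is preserved, so a thicker, less conductive plate can stand
    in for a thin, highly conductive one with a much coarser mesh."""
    return sigma * thickness / new_thickness

# Model a 0.1 mm copper-like plate (5.8e7 S/m) as a 1 mm plate:
sigma_eq = equivalent_conductivity(5.8e7, 0.1e-3, 1.0e-3)  # 5.8e6 S/m
```

The thicker equivalent plate needs far fewer elements to resolve the skin depth, which is where the claimed mesh savings come from.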

    Lifelong Event Detection with Embedding Space Separation and Compaction

    To mitigate forgetting, existing lifelong event detection methods typically maintain a memory module and replay the stored memory data during the learning of a new task. However, the simple combination of memory data and new-task samples can still result in substantial forgetting of previously acquired knowledge, which may occur due to the potential overlap between the feature distribution of new data and the previously learned embedding space. Moreover, the model tends to overfit the few memory samples rather than effectively remembering learned patterns. To address the challenges of forgetting and overfitting, we propose a novel method based on embedding space separation and compaction. Our method alleviates forgetting of previously learned tasks by forcing the feature distribution of new data away from the previous embedding space. It also mitigates overfitting via a memory calibration mechanism that encourages memory data to be close to its prototype, enhancing intra-class compactness. In addition, the learnable parameters of the new task are initialized by drawing upon acquired knowledge from the previously learned task to facilitate forward knowledge transfer. With extensive experiments, we demonstrate that our method significantly outperforms previous state-of-the-art approaches. Comment: NAACL 2024 main conference.
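The separation and compaction objectives can be sketched as two simple losses. These are hedged illustrations: a margin-based push-away term and a pull-to-prototype term are one plausible reading of the abstract, not the paper's exact formulations.

```python
import numpy as np

def separation_loss(new_feats, old_prototypes, margin=1.0):
    """Push new-task features away from previously learned class
    prototypes (illustrative margin formulation): penalize any
    new feature that lies within `margin` of an old prototype."""
    d = np.linalg.norm(new_feats[:, None, :] - old_prototypes[None, :, :], axis=-1)
    return np.maximum(margin - d, 0.0).mean()

def compaction_loss(mem_feats, prototype):
    """Memory calibration: pull stored memory samples toward their class
    prototype to tighten intra-class structure."""
    return np.linalg.norm(mem_feats - prototype, axis=-1).mean()
```

Minimizing the first term drives new features out of the old embedding region; minimizing the second keeps replayed memory samples compact around their prototype.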

    Can ChatGPT-like Generative Models Guarantee Factual Accuracy? On the Mistakes of New Generation Search Engines

    Although large conversational AI models such as OpenAI's ChatGPT have demonstrated great potential, we question whether such models can guarantee factual accuracy. Recently, technology companies such as Microsoft and Google have announced new services that aim to combine search engines with conversational AI. However, we have found numerous mistakes in the public demonstrations, which suggest that we should not readily trust the factual claims of these AI models. Rather than criticizing specific models or companies, we hope to call on researchers and developers to improve the transparency and factual correctness of AI models.

    Language Models can be Logical Solvers

    Logical reasoning is a fundamental aspect of human intelligence and a key component of tasks like problem-solving and decision-making. Recent advancements have enabled Large Language Models (LLMs) to potentially exhibit reasoning capabilities, but complex logical reasoning remains a challenge. The state-of-the-art solver-augmented language models use LLMs to parse natural language logical questions into symbolic representations first and then adopt external logical solvers to take in the symbolic representations and output the answers. Despite their impressive performance, any parsing error will inevitably cause the execution of the external logical solver to fail, leaving the logical question unanswered. In this paper, we introduce LoGiPT, a novel language model that directly emulates the reasoning processes of logical solvers and bypasses parsing errors by learning strict adherence to solver syntax and grammar. LoGiPT is fine-tuned on a newly constructed instruction-tuning dataset derived from revealing and refining the invisible reasoning process of deductive solvers. Experimental results on two public deductive reasoning datasets demonstrate that LoGiPT outperforms state-of-the-art solver-augmented LMs and few-shot prompting methods on competitive LLMs like ChatGPT and GPT-4. Comment: Preprint.

    Evaluation of a clinical pharmacist-led antimicrobial stewardship program in a neurosurgical intensive care unit: a pre-and post-intervention cohort study

    Background: Antimicrobial resistance poses a significant challenge in neurosurgical intensive care units (ICUs). The excessive use of broad-spectrum antibiotics is closely linked to the emergence and dissemination of drug-resistant bacteria within neurosurgical ICUs. This study assessed the effects of implementing a comprehensive Antimicrobial Stewardship (AMS) program in a neurosurgical ICU setting. Methods: From April 2022 to September 2022, an AMS program was implemented in the neurosurgical ICU. The program involved the regular presence of a pharmacist and an infectious disease physician who conducted prospective audits and provided feedback. To assess the impact of the AMS program, outcome measures were compared between the AMS period and the 6 months before AMS implementation (pre-AMS period). The primary outcome was the use of antibacterial agents, including anti-pseudomonal beta-lactams (APBLs), polymyxin, and tigecycline. Additionally, the study evaluated the appropriateness of antimicrobial de-escalation and the susceptibility of Gram-negative bacilli to antimicrobial agents. Results: A total of 526 patients were included during the AMS period, while 487 patients were included in the pre-AMS period. The two groups had no significant differences in disease severity and mortality rates. During the AMS period, there was a notable decrease in the use of APBLs as empiric treatment (43.92% vs. 60.99%, p < 0.001). Multi-drug resistant organism (MDRO) infections decreased significantly during the AMS period (11.03% vs. 18.48%, p < 0.001). The number of prescription adjustments increased significantly in all patients (0 items vs. 0 items, p < 0.001) and in MDRO-positive patients (3 items vs. 2 items, p < 0.001) during the AMS period. Additionally, appropriate antimicrobial de-escalation for patients with MDRO improved during the AMS period (39.66% vs. 20%, p = 0.001). Polymyxin utilization also decreased during the AMS period (15.52% vs. 31.11%, p = 0.034). Furthermore, the susceptibility of Gram-negative bacilli isolates to APBLs was significantly higher during the AMS period. Conclusion: Implementing a comprehensive pharmacist-led AMS program led to a decrease in the use of antibacterial agents. This reduction in usage is significant because it can potentially delay the emergence of bacterial resistance.

    Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources

    We present chain-of-knowledge (CoK), a novel framework that augments large language models (LLMs) by dynamically incorporating grounding information from heterogeneous sources, resulting in more factual rationales and reduced hallucination in generation. Specifically, CoK consists of three stages: reasoning preparation, dynamic knowledge adapting, and answer consolidation. Given a knowledge-intensive question, CoK first prepares several preliminary rationales and answers while identifying the relevant knowledge domains. If there is no majority consensus among the sampled answers, CoK corrects the rationales step by step by adapting knowledge from the identified domains. These corrected rationales can plausibly serve as a better foundation for the final answer consolidation. Unlike prior studies that primarily use unstructured data, CoK also leverages structured knowledge sources such as Wikidata and tables, which provide more reliable factual information. To access both unstructured and structured knowledge sources in the dynamic knowledge adapting stage, we propose an adaptive query generator that can produce queries in various query languages, including SPARQL, SQL, and natural sentences. Moreover, to minimize error propagation between rationales, CoK corrects each rationale progressively, using the preceding corrected rationales to generate and correct subsequent ones. Extensive experiments show that CoK consistently improves the performance of LLMs on knowledge-intensive tasks across different domains. Comment: Accepted by ICLR 2024.
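The majority-consensus gate that triggers the knowledge-adapting stage can be sketched in a few lines. This is a hedged illustration: the function name, the sampled-answer interface, and the 50% threshold are assumptions for demonstration, not CoK's exact mechanism.

```python
from collections import Counter

def needs_knowledge_adapting(sampled_answers, threshold=0.5):
    """Consensus check (sketch): sample several preliminary answers; if no
    single answer holds a strict majority, trigger the dynamic knowledge
    adapting stage to correct the rationales with retrieved knowledge."""
    _, count = Counter(sampled_answers).most_common(1)[0]
    return count / len(sampled_answers) <= threshold
```

For example, four samples splitting 2-1-1 have no majority and would be routed to knowledge adapting, while a 3-1 split would proceed directly to answer consolidation.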

    ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?

    Upon its release in late 2022, ChatGPT brought a seismic shift to the entire landscape of AI, in both research and commerce. Through instruction-tuning a large language model (LLM) with supervised fine-tuning and reinforcement learning from human feedback, it showed that a model could answer human questions and follow instructions across a broad panel of tasks. Following this success, interest in LLMs has intensified, with new LLMs appearing at frequent intervals across academia and industry, including many start-ups focused on LLMs. While closed-source LLMs (e.g., OpenAI's GPT, Anthropic's Claude) generally outperform their open-source counterparts, progress on the latter has been rapid, with claims of achieving parity or better on certain tasks. This has crucial implications not only for research but also for business. In this work, on the first anniversary of ChatGPT, we provide an exhaustive overview of this success, surveying all tasks where an open-source LLM has been claimed to be on par with or better than ChatGPT. Comment: version v4, includes the latest top-performing open-source LLMs.

    Retrieving Multimodal Information for Augmented Generation: A Survey

    As Large Language Models (LLMs) become popular, an important trend has emerged of using multimodality to augment the LLMs' generation ability, enabling LLMs to better interact with the world. However, there is no unified understanding of at which stage, and how, different modalities should be incorporated. In this survey, we review methods that assist and augment generative models by retrieving multimodal knowledge, in formats ranging from images, code, tables, and graphs to audio. Such methods offer a promising solution to important concerns such as factuality, reasoning, interpretability, and robustness. By providing an in-depth review, this survey is expected to give scholars a deeper understanding of these methods' applications and to encourage them to adapt existing techniques to the fast-growing field of LLMs.