50 research outputs found

    AnomalyGPT: Detecting Industrial Anomalies using Large Vision-Language Models

    Full text link
    Large Vision-Language Models (LVLMs) such as MiniGPT-4 and LLaVA have demonstrated the capability of understanding images and achieved remarkable performance on various visual tasks. Despite their strong ability to recognize common objects thanks to extensive training data, they lack specific domain knowledge and have a weaker understanding of localized details within objects, which hinders their effectiveness in the Industrial Anomaly Detection (IAD) task. On the other hand, most existing IAD methods only provide anomaly scores and require manually set thresholds to distinguish normal from abnormal samples, which restricts their practical deployment. In this paper, we explore the utilization of LVLMs to address the IAD problem and propose AnomalyGPT, a novel IAD approach based on LVLMs. We generate training data by simulating anomalous images and producing corresponding textual descriptions for each image. We also employ an image decoder to provide fine-grained semantics and design a prompt learner to fine-tune the LVLM using prompt embeddings. Our AnomalyGPT eliminates the need for manual threshold adjustment and directly assesses the presence and locations of anomalies. Additionally, AnomalyGPT supports multi-turn dialogue and exhibits impressive few-shot in-context learning capabilities. With only one normal shot, AnomalyGPT achieves state-of-the-art performance with an accuracy of 86.1%, an image-level AUC of 94.1%, and a pixel-level AUC of 95.3% on the MVTec-AD dataset. Code is available at https://github.com/CASIA-IVA-Lab/AnomalyGPT. Comment: Project page: https://anomalygpt.github.io
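
    The data-generation step above pairs simulated anomalies with textual descriptions. A minimal sketch of that idea, assuming a simple cut-paste-style corruption (the noise patch, the size ranges, and the description template are illustrative assumptions, not the paper's exact procedure):

        # Sketch: simulate an anomalous image plus a pixel mask and description.
        import numpy as np

        def simulate_anomaly(normal_img, rng):
            """Paste a random noise patch into a normal image; return the
            corrupted image, a pixel-level mask, and a textual description."""
            h, w = normal_img.shape[:2]
            ph = int(rng.integers(h // 8, h // 4))
            pw = int(rng.integers(w // 8, w // 4))
            y = int(rng.integers(0, h - ph))
            x = int(rng.integers(0, w - pw))
            img = normal_img.copy()
            img[y:y + ph, x:x + pw] = rng.integers(0, 256, (ph, pw, 3), dtype=np.uint8)
            mask = np.zeros((h, w), dtype=np.uint8)
            mask[y:y + ph, x:x + pw] = 1
            text = f"There is an anomaly in the region around ({x}, {y})."
            return img, mask, text

        rng = np.random.default_rng(0)
        normal = np.full((256, 256, 3), 200, dtype=np.uint8)  # stand-in for a real image
        anomalous, mask, description = simulate_anomaly(normal, rng)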

    Feature Distilled Tracking

    Get PDF
    Feature extraction and representation are among the most important components of fast, accurate, and robust visual tracking. Very deep convolutional neural networks (CNNs) provide effective tools for feature extraction with good generalization ability. However, extracting features with very deep CNN models requires high-performance hardware due to their large computational complexity, which prohibits their use in real-time applications. To alleviate this problem, we aim at obtaining small, fast-to-execute shallow models based on model compression for visual tracking. Specifically, we propose a small feature distilled network (FDN) for tracking that imitates the intermediate representations of a much deeper network. The FDN extracts rich visual features at higher speed than the original deeper network. To speed up tracking further, we introduce a shift-and-stitch method that reduces the arithmetic operations while keeping the spatial resolution of the distilled feature maps unchanged. Finally, a scale-adaptive discriminative correlation filter is learned on the distilled features to handle scale variation of the target. Comprehensive experimental results on object tracking benchmark datasets show that the proposed approach achieves a 5x speed-up with performance competitive with state-of-the-art deep trackers.
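
    The core of the distillation idea, a shallow student imitating a deep teacher's intermediate feature maps, can be sketched as follows. This is a hedged illustration: the MSE objective and the 1x1 channel adapter are common choices for feature distillation, not necessarily the paper's exact design.

        # Sketch: intermediate-feature distillation loss (PyTorch).
        import torch
        import torch.nn as nn

        class DistillLoss(nn.Module):
            def __init__(self, student_ch, teacher_ch):
                super().__init__()
                # A 1x1 conv maps student channels to the teacher's width so
                # the shallow network can imitate a richer representation.
                self.adapter = nn.Conv2d(student_ch, teacher_ch, kernel_size=1)
                self.mse = nn.MSELoss()

            def forward(self, student_feat, teacher_feat):
                # The teacher is treated as frozen; gradients reach only the student.
                return self.mse(self.adapter(student_feat), teacher_feat.detach())

        loss_fn = DistillLoss(student_ch=64, teacher_ch=256)
        s = torch.randn(1, 64, 32, 32)   # shallow student feature map
        t = torch.randn(1, 256, 32, 32)  # deep teacher feature map
        loss = loss_fn(s, t)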

    ChineseWebText: Large-scale High-quality Chinese Web Text Extracted with Effective Evaluation Model

    Full text link
    During the development of large language models (LLMs), the scale and quality of the pre-training data play a crucial role in shaping LLMs' capabilities. To accelerate LLM research, several large-scale datasets, such as C4 [1], Pile [2], RefinedWeb [3] and WanJuan [4], have been released to the public. However, most of the released corpora focus mainly on English, and there is still no complete tool chain for extracting clean texts from web data. Furthermore, fine-grained information about the corpus, e.g., the quality of each text, is missing. To address these challenges, we propose in this paper a new complete tool chain, EvalWeb, to extract clean Chinese texts from noisy web data. First, as in previous work, manually crafted rules are employed to discard explicitly noisy texts from the raw crawled web content. Second, a well-designed evaluation model assesses the remaining, relatively clean data and assigns each text a quality score. Finally, an appropriate threshold can easily be applied to select high-quality pre-training data for Chinese. Using our proposed approach, we release ChineseWebText, the largest and latest large-scale high-quality Chinese web text corpus: it comprises 1.42 TB of data, and each text is associated with a quality score, allowing LLM researchers to choose data according to their desired quality thresholds. We also release a much cleaner 600 GB subset of Chinese data with quality exceeding 90%.
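
    The final selection stage, filtering by per-text quality score, is straightforward to illustrate. A minimal sketch assuming a JSONL release with "text" and "score" fields (the field names and a [0, 1] score scale are assumptions for illustration, not the dataset's documented schema):

        # Sketch: keep only texts whose quality score clears a threshold.
        import json

        def select_high_quality(jsonl_path, threshold=0.9):
            """Yield texts whose quality score exceeds the chosen threshold."""
            with open(jsonl_path, encoding="utf-8") as f:
                for line in f:
                    record = json.loads(line)
                    if record["score"] > threshold:
                        yield record["text"]

        # e.g., a cleaner subset analogous to the released 600 GB split:
        # clean = list(select_high_quality("chinesewebtext.jsonl", 0.9))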

    The 3rd Anti-UAV Workshop & Challenge: Methods and Results

    Full text link
    The 3rd Anti-UAV Workshop & Challenge aims to encourage research into novel and accurate methods for multi-scale object tracking. The Anti-UAV dataset used for the Anti-UAV Challenge has been publicly released. There are two main differences between this year's competition and the previous two. First, we have expanded the existing dataset and, for the first time, released a training set so that participants can focus on improving their models. Second, we set up two tracks for the first time, i.e., Anti-UAV Tracking and Anti-UAV Detection & Tracking. Around 76 teams from around the globe participated in the 3rd Anti-UAV Challenge. In this paper, we provide a brief summary of the 3rd Anti-UAV Workshop & Challenge, including brief introductions to the top three methods in each track. The submission leaderboard will be reopened for researchers who are interested in the Anti-UAV challenge. The benchmark dataset and other information can be found at https://anti-uav.github.io/. Comment: Technical report for the 3rd Anti-UAV Workshop and Challenge. arXiv admin note: text overlap with arXiv:2108.0990

    Prospects for the Development of Fundamental Sciences: Proceedings of the XII International Conference of Students and Young Scientists, Tomsk, April 21-24, 2015

    Get PDF
    The volume contains the proceedings of the XII International Conference of Students and Young Scientists "Prospects for the Development of Fundamental Sciences". It includes papers by students and young scientists presented in the sections "Physics", "Chemistry", "Mathematics", "Biology and Medicine", "Nanomaterials and Nanotechnology", "Technology", "Architectural Works Competition", and "IT Technologies and Electronics". Intended for students, postgraduates, young scientists, and instructors in the natural sciences and higher mathematics.

    The Thermal Infrared Visual Object Tracking VOT-TIR2015 challenge results

    Get PDF
    The Thermal Infrared Visual Object Tracking challenge 2015, VOT-TIR2015, aims at comparing short-term single-object visual trackers that work on thermal infrared (TIR) sequences and do not apply pre-learned models of object appearance. VOT-TIR2015 is the first benchmark on short-term tracking in TIR sequences. Results of 24 trackers are presented. For each participating tracker, a short description is provided in the appendix. The VOT-TIR2015 challenge is based on the VOT2013 challenge, but introduces the following novelties: (i) the newly collected LTIR (Linköping TIR) dataset is used, (ii) the VOT2013 attributes are adapted to TIR data, (iii) the evaluation is performed using insights gained during VOT2013 and VOT2014 and is similar to VOT2015.

    Foreground Removal Approach for Hole Filling in 3D Video and FVV Synthesis

    No full text

    Part Context Learning for Visual Tracking

    No full text
    Context information is widely used in computer vision for tracking arbitrary objects. Most existing works focus on distinguishing the tracked object from the background, or rely on inter-frame object similarity or key-point supporters as auxiliary information to assist tracking. However, how to discover and represent both the intrinsic properties inside the object and its surrounding information remains an open problem. In this paper, we propose a unified context learning framework that captures the stable structural relations among in-object parts, context parts, and the object itself to enhance the tracker's performance. The proposed Part Context Tracker (PCT) consists of an appearance model, an internal relation model, and a context relation model. The appearance model represents the appearances of the object and its parts. The internal relation model uses the parts inside the object to describe its spatio-temporal structure directly, while the context relation model exploits the latent interaction between the object and background parts. The appearance, internal relation, and context relation models are then embedded in a max-margin structured learning framework. Furthermore, a simple, robust update strategy based on a median filter is employed, which handles appearance change effectively and alleviates the drift problem. Extensive experiments are conducted on various benchmark datasets, and comparisons with state-of-the-art trackers demonstrate the effectiveness of our work.
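
    The median-filter update mentioned above can be sketched as a per-pixel median over a sliding window of recent object appearances; the window length and the per-pixel formulation are assumptions for illustration, not necessarily the authors' exact scheme.

        # Sketch: median-filter template update for drift resistance.
        from collections import deque
        import numpy as np

        class MedianTemplate:
            def __init__(self, window=5):
                self.history = deque(maxlen=window)

            def update(self, patch):
                """Push the latest object patch and return the per-pixel
                median template, which suppresses outlier frames
                (occlusion, sudden appearance change)."""
                self.history.append(patch.astype(np.float32))
                return np.median(np.stack(self.history), axis=0)

        tmpl = MedianTemplate(window=5)
        for t in range(5):
            template = tmpl.update(np.random.rand(32, 32))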