Search CORE

66 research outputs found

DataElixir: Purifying Poisoned Dataset to Mitigate Backdoor Attacks via Diffusion Models

Author: Chen Kai
Lan Yibing
Lv Peizhuo
Ma Hualong
Meng Guozhu
Zhou Jiachen
Publication date: 19/12/2023

Dataset sanitization is a widely adopted proactive defense against poisoning-based backdoor attacks, aimed at filtering out and removing poisoned samples from training datasets. However, existing methods have shown limited efficacy in countering ever-evolving trigger functions and often lead to considerable degradation of benign accuracy. In this paper, we propose DataElixir, a novel sanitization approach tailored to purify poisoned datasets. We leverage diffusion models to eliminate trigger features and restore benign features, thereby turning the poisoned samples into benign ones. Specifically, with multiple iterations of the forward and reverse process, we extract intermediary images and their predicted labels for each sample in the original dataset. Then, we identify anomalous samples in terms of the presence of label transition of the intermediary images, detect the target label by quantifying distribution discrepancy, select their purified images considering pixel and feature distance, and determine their ground-truth labels by training a benign model. Experiments conducted on 9 popular attacks demonstrate that DataElixir effectively mitigates various complex attacks while exerting minimal impact on benign accuracy, surpassing the performance of baseline defense methods. Comment: Accepted by AAAI202

arXiv.org e-Print Archive

Event-driven Real-time Retrieval in Web Search

Author: Bai Xiaoling
Deng Hualong
Ma Jin
Yang Nan
Zhang Shusen
Zhang Yannan
Zhou Tianhua
Publication date: 04/12/2023

Information retrieval in real-time search presents unique challenges distinct from those encountered in classical web search. These challenges are particularly pronounced due to the rapid change of user search intent, which is influenced by the occurrence and evolution of breaking news events, such as earthquakes, elections, and wars. Previous dense retrieval methods, which primarily focused on static semantic representation, lack the capacity to capture immediate search intent, leading to inferior performance in retrieving the most recent event-related documents in time-sensitive scenarios. To address this issue, this paper expands the query with event information that represents real-time search intent. The event information is then integrated with the query through a cross-attention mechanism, resulting in a time-context query representation. We further enhance the model's capacity for event representation through multi-task training. Since publicly available datasets such as MS-MARCO do not contain any event information on the query side and have few time-sensitive queries, we design an automatic data collection and annotation pipeline to address this issue, which includes ModelZoo-based Coarse Annotation and LLM-driven Fine Annotation processes. In addition, we share training tricks such as two-stage training and hard negative sampling. Finally, we conduct a set of offline experiments on a million-scale production dataset to evaluate our approach and deploy A/B testing in a real online system to verify the performance. Extensive experimental results demonstrate that our proposed approach significantly outperforms existing state-of-the-art baseline methods.
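The cross-attention step mentioned in the abstract can be illustrated with a generic scaled dot-product sketch. This is not the paper's architecture: the learned projection matrices are omitted, the residual fusion is an assumption, and the names `cross_attention`, `query_tokens`, and `event_tokens` are hypothetical, as are the toy embeddings.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query_tokens, event_tokens):
    """Let every query token attend over all event tokens.

    Query vectors act as queries; event vectors act as both keys and values.
    Learned projections are left out to keep the sketch short (assumption).
    """
    dim = len(query_tokens[0])
    fused, all_weights = [], []
    for q in query_tokens:
        # Scaled dot-product score between this query token and each event token.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in event_tokens
        ]
        weights = softmax(scores)
        # Weighted sum of event vectors gives the event context for this token.
        context = [
            sum(w * v[j] for w, v in zip(weights, event_tokens))
            for j in range(dim)
        ]
        # Residual fusion: query token plus its event context (assumption).
        fused.append([qi + ci for qi, ci in zip(q, context)])
        all_weights.append(weights)
    return fused, all_weights


# Toy embeddings, purely illustrative.
query_tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
event_tokens = [
    [2.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 0.0],
]
fused, attn = cross_attention(query_tokens, event_tokens)
```

Each row of `attn` is a probability distribution over the event tokens, so a query token aligned with a particular event token places most of its weight there.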

arXiv.org e-Print Archive

Optimization of ship speed and fleet deployment under carbon emissions policies for container shipping

Author: Hualong Yang
Xuefei Ma
Yan Zhang
Yuwei Xing
Publication venue: Vilnius Gediminas Technical University
Publication date: 01/03/2019

This paper addresses the optimization of ship speed and fleet deployment for container shipping under two carbon emissions policies. A mixed-integer nonlinear programming model of ship speed and fleet deployment was established with the objective of minimising total weekly operating costs. A simulated annealing algorithm was proposed to solve the problem. An empirical analysis was conducted with data selected from the benchmark suite. The results verify the applicability and effectiveness of the established model and its algorithm. They also indicate that both policies, the cap-and-trade programme and the carbon tax, lead to better-optimized ship speed and fleet deployment decisions and thereby help reduce carbon emissions. The findings offer container shipping companies a basis for making optimized decisions under carbon emissions policies.

Directory of Open Access Journals

VGTU Journals (Vilnius Gediminas Technical University - Vilnius Tech)

MEA-Defender: A Robust Watermark against Model Extraction Attack

Author: Chen Kai
Li Pan
Liang Ruigang
Lv Peizhuo
Ma Hualong
Zhang Shengzhi
Zhang Yingjun
Zhou Jiachen
Zhu Shenchen
Publication date: 26/01/2024

Recently, numerous highly valuable Deep Neural Networks (DNNs) have been trained using deep learning algorithms. To protect the Intellectual Property (IP) of the original owners over such DNN models, backdoor-based watermarks have been extensively studied. However, most such watermarks fail under model extraction attacks, which use input samples to query the target model, obtain the corresponding outputs, and then train a substitute model on these input-output pairs. In this paper, we propose a novel watermark to protect the IP of DNN models against model extraction, named MEA-Defender. In particular, we obtain the watermark by combining two samples from two source classes in the input domain and design a watermark loss function that makes the output domain of the watermark within that of the main task samples. Since both the input domain and the output domain of our watermark are indispensable parts of those of the main task samples, the watermark will be extracted into the stolen model along with the main task during model extraction. We conduct extensive experiments on four model extraction attacks, using five datasets and six models trained based on supervised learning and self-supervised learning algorithms. The experimental results demonstrate that MEA-Defender is highly robust against different model extraction attacks and various watermark removal/detection approaches. Comment: To Appear in IEEE Symposium on Security and Privacy 2024 (IEEE S&P 2024), MAY 20-23, 2024, SAN FRANCISCO, CA, US

arXiv.org e-Print Archive

Relationship between the current status of research on geological storage of solid, liquid and gas wastes in coal mines and the coordinated development of the ecological environment in China

Author: Chao MA
Hualong HU
Jinglei NIE
Jun WU
Kaicheng ZHU
Kang ZHAO
Publication venue: Editorial Office of Journal of China Coal Society
Publication date: 01/06/2024

China is among the countries where environmental pollution from the coal mine “three wastes” (solid, liquid, gas) is serious. Extensive in-depth research and practice has been carried out on the utilization and treatment of the “three wastes”. However, many problems remain, such as imperfect standards and norms, small treatment scale, and immature technology. To address the synergistic development of low-cost, large-scale geological storage of the “three wastes” and the ecological environment in China’s coal mines, the connotation of geological storage in China is expanded on the basis of its definition in other countries. The progress and current status of research on the geological storage of the “three wastes” are analyzed. Literature and patents related to the geological storage of the “three wastes” in China and abroad are reviewed. The problems China faces in carrying out the geological storage of the “three wastes” are identified, and suggestions for further development are put forward. The main problem is the inadequacy of environmental standards and regulations, especially the widespread lack of standards for the deep-well injection of waste liquids. The systematic review shows that research institutions in China are paying increasing attention to the geological storage of the “three wastes” and that their results account for a high percentage of research worldwide. Awareness of, and a system for, the synergistic development of coal mine “three wastes” geological storage and the ecological environment are taking shape in China.
However, there is insufficient support for research on the large-scale geological storage of the “three wastes”, the cyclic system of geological storage of the “three wastes” over the whole cycle of coal mining, and the synergistic relationship between CO2 capture, utilization and storage (CCUS) technology and the ecological environment. This seriously restricts the large-scale implementation and application of the concepts, technologies and projects of geological storage. China should expeditiously strengthen scientific and technological research and development on coal mine “three wastes” geological storage technology and its synergistic development with the ecological environment. Establishing improved standards and norms, increasing technological research and development, and strengthening environmental supervision will promote green and sustainable development in China’s coal mines and help realize China’s “dual-carbon” goal.

Directory of Open Access Journals

SSL-WM: A Black-Box Watermarking Approach for Encoders Pre-trained by Self-supervised Learning

Author: Cai Yuling
Chen Kai
Li Pan
Liang Ruigang
Lv Peizhuo
Ma Hualong
Meng Guozhu
Xiang Fan
Yue Chang
Zhang Shengzhi
Zhang Yingjun
Zhu Shenchen
Publication date: 08/09/2022

Recent years have witnessed significant success in Self-Supervised Learning (SSL), which facilitates various downstream tasks. However, attackers may steal such SSL models and commercialize them for profit, making it crucial to protect their Intellectual Property (IP). Most existing IP protection solutions are designed for supervised learning models and cannot be used directly since they require that the models' downstream tasks and target labels be known and available during watermark embedding, which is not always possible in the domain of SSL. To address such a problem especially when downstream tasks are diverse and unknown during watermark embedding, we propose a novel black-box watermarking solution, named SSL-WM, for protecting the ownership of SSL models. SSL-WM maps watermarked inputs by the watermarked encoders into an invariant representation space, which causes any downstream classifiers to produce expected behavior, thus allowing the detection of embedded watermarks. We evaluate SSL-WM on numerous tasks, such as Computer Vision (CV) and Natural Language Processing (NLP), using different SSL models, including contrastive-based and generative-based. Experimental results demonstrate that SSL-WM can effectively verify the ownership of stolen SSL models in various downstream tasks. Furthermore, SSL-WM is robust against model fine-tuning and pruning attacks. Lastly, SSL-WM can also evade detection from evaluated watermark detection approaches, demonstrating its promising application in protecting the IP of SSL models

arXiv.org e-Print Archive

Virus-Free and Live-Cell Visualizing SARS-CoV-2 Cell Entry for Studies of Neutralizing Antibodies and Compound Inhibitors

Author: Baorong Fu
Chenguang Shen
Chuanlai Yang
Hai Yu
Hualong Xiong
Jiali Cao
Jian Ma
Jianda Hu
Jianghui Ye
Jingjing Xu
Juan Wang
Jun Zhang
Lei Liu
Liang Zhang
Lunzhi Yuan
Min Wei
Ningshao Xia
Qingbing Zheng
Quan Yuan
Shaojuan Wang
Sheng Nian
Shengxiang Ge
Tianying Zhang
Ting Yang
Tong Cheng
Wangheng Hou
Yali Zhang
Yang Shi
Yangtao Wu
Yixin Chen
Zhiyong Li
Zonglin Li
侯汪衡
夏宁邵
巫洋涛
张雅丽
沈晨光
王邵娟
程通
袁伦志
袁权
Publication venue: Wiley
Publication date: 18/12/2020

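The novel coronavirus SARS-CoV-2 has spread worldwide, posing a serious threat to global public health. Rapidly developing vaccines, antibodies, and therapeutic drugs has become a major challenge for the scientific community. Because SARS-CoV-2 is highly contagious, efficacy evaluation of neutralizing antibodies and small-molecule inhibitors using virus infection models must be carried out in high-level biosafety laboratories and often takes several days to complete, which limits the efficiency of antibody and drug screening. Developing rapid, visual, live-virus-independent probes and cell models for detecting SARS-CoV-2 cell entry is therefore important for accelerating research on antibodies and drugs against the virus. Professor Ningshao Xia's team used a CHO eukaryotic expression system to efficiently express and prepare STG, a recombinant SARS-CoV-2 spike protein fused at its C-terminus to the acid-tolerant fluorescent protein Gamillus. Size-exclusion chromatography (SEC) and cryo-electron microscopy confirmed that STG adopts a trimeric structure highly similar to the native viral spike and binds ACE2 with high affinity (18.2 nM). STG has good cell compatibility and fluorescence properties, and the researchers further developed the CSBT assay, which quantitatively measures the level of neutralizing (entry-blocking) antibodies in convalescent sera and vaccine-immunized sera. Beyond antibody detection and evaluation, the probe and model developed in this study can also be used to screen and analyze small-molecule compounds that inhibit SARS-CoV-2 cell entry and intracellular trafficking. Postdoctoral fellow Yali Zhang, doctoral students Shaojuan Wang and Yangtao Wu, and postdoctoral fellows Wangheng Hou and Lunzhi Yuan of our university, together with Dr. Chenguang Shen of the Third People's Hospital of Shenzhen, are co-first authors. Professors Ningshao Xia, Quan Yuan, and Tong Cheng of Xiamen University are co-corresponding authors of the paper.
The ongoing coronavirus disease 2019 (COVID-19) pandemic, caused by SARS-CoV-2 infection, has resulted in hundreds of thousands of deaths. Cellular entry of SARS-CoV-2, which is mediated by the viral spike protein and ACE2 receptor, is an essential target for the development of vaccines, therapeutic antibodies, and drugs. Using a mammalian cell expression system, a genetically engineered sensor of fluorescent protein (Gamillus)-fused SARS-CoV-2 spike trimer (STG) to probe the viral entry process is developed. In ACE2-expressing cells, it is found that the STG probe has excellent performance in the live-cell visualization of receptor binding, cellular uptake, and intracellular trafficking of SARS-CoV-2 under virus-free conditions. The new system allows quantitative analyses of the inhibition potentials and detailed influence of COVID-19-convalescent human plasmas, neutralizing antibodies and compounds, providing a versatile tool for high-throughput screening and phenotypic characterization of SARS-CoV-2 entry inhibitors. This approach may also be adapted to develop a viral entry visualization system for other viruses.
This study was supported by National Natural Science Foundation of China (81993149041 for N.X.; 81902057 for Y.Z.; 81871316 and U1905205 for Q.Y.), the National Science and Technology Major Project of Infectious Diseases (No. 2017ZX10304402‐002‐003 for T.C. and No. 2017ZX10202203‐009 for Q.Y.), the National Science and Technology Major Projects for Major New Drugs Innovation and Development (No. 2018ZX09711003‐005‐003 for T.C.), the Science and Technology Major Project of Fujian (2020YZ014001), the Science and Technology Major Project of Xiamen (3502Z2020YJ01), and the Guangdong Basic and Applied Basic Research Foundation (2020A1515010368 for C.S.). The study was supported by the National Natural Science Foundation of China, the National Science and Technology Major Project for the Prevention and Treatment of Infectious Diseases, the Fujian Provincial Emergency Science and Technology Project, and the Xiamen Emergency Science and Technology Project.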

Xiamen University Institutional Repository

Uncovering CO2 Emissions Patterns from China-Oriented International Maritime Transport: Decomposition and Decoupling Analysis

Author: Hualong Yang
Xuefei Ma
Publication venue: MDPI AG
Publication date: 01/05/2019

Given that most commodity transportation depends on the maritime industry, the growing economy and increasing international trade volume are expected to accelerate the development of shipping activities and thus increase associated CO2 emissions. In order to identify the driving factors of CO2 emissions from China’s international shipping and find efficient mitigation strategies, this paper first estimates the CO2 emissions and presents the CO2 emissions features from 2000 to 2017. Second, the Logarithmic Mean Divisia Index (LMDI) method is applied to decompose the changes in CO2 emissions. Finally, the decoupling index is introduced to quantitatively examine the decoupling relationship between economic growth and CO2 emissions. The factors affecting the decoupling relationship are analyzed according to the LMDI results. The results indicate that CO2 emissions in maritime transport activities have experienced rapid growth during the study period. Economic growth appears to be the principal factor driving the CO2 emissions growth, whereas the overall effects of energy intensity and the commodity structure play a significant role in inhibiting CO2 emissions. The decoupling state over the study period has experienced four decoupling stages, with a distinct tendency towards weak decoupling. Economic activity has proven to be the most significant indicator influencing the decoupling relationship during the study period
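The two analysis steps named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's model: it uses the additive LMDI-I form on a hypothetical single-aggregate identity (emissions = activity × energy intensity × emission factor), it assumes a Tapio-style elasticity for the decoupling index because the abstract does not name the specific index, and every number is invented for illustration.

```python
import math


def log_mean(a: float, b: float) -> float:
    """Logarithmic mean L(a, b) = (a - b) / (ln a - ln b), with L(a, a) = a."""
    if math.isclose(a, b):
        return a
    return (a - b) / (math.log(a) - math.log(b))


def lmdi_additive(base: dict, final: dict) -> dict:
    """Additive LMDI-I effects for an identity C = product of all factors.

    Each factor's effect is L(C_final, C_base) * ln(x_final / x_base), so the
    effects sum exactly to C_final - C_base with no residual term.
    """
    c_base = math.prod(base.values())
    c_final = math.prod(final.values())
    weight = log_mean(c_final, c_base)
    return {name: weight * math.log(final[name] / base[name]) for name in base}


def decoupling_elasticity(c_base, c_final, g_base, g_final) -> float:
    """Tapio-style elasticity: relative emissions change over relative growth."""
    return ((c_final - c_base) / c_base) / ((g_final - g_base) / g_base)


# Hypothetical values, not data from the paper.
base = {"activity": 100.0, "energy_intensity": 0.50, "emission_factor": 3.0}
final = {"activity": 180.0, "energy_intensity": 0.40, "emission_factor": 3.0}

effects = lmdi_additive(base, final)
total_change = math.prod(final.values()) - math.prod(base.values())
elasticity = decoupling_elasticity(
    math.prod(base.values()),
    math.prod(final.values()),
    base["activity"],
    final["activity"],
)
```

In this toy case the activity effect is positive and the energy-intensity effect is negative, mirroring the direction of the findings reported in the abstract, and the effects add up to the total change in emissions.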

Directory of Open Access Journals

SIMULINK Simulation Implementation of DS/FH Hybrid Spread Spectrum Communication System

Author: He Ye
Ma Bing
Ma Hualong
Yu Changren
Publication venue: EDP Sciences
Publication date: 01/01/2018

The basic principle of DS/FH hybrid spread spectrum communication system is introduced. The Simulink simulation software is used to establish the system simulation model, and the dynamic simulation of the system is realized. The anti-jamming performance of the system under tracking interference and blocking interference is analyzed
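The basic DS/FH idea can be sketched as a baseband toy model: each bit is spread by a pseudo-noise (PN) chip sequence (direct sequence) and sent on a pseudo-randomly chosen channel (frequency hopping). This is an illustrative assumption-laden sketch, not the paper's Simulink model: no carrier modulation or jammer model is included, and the function names, seed, chip length, and channel count are all hypothetical.

```python
import random


def make_patterns(seed, n_bits, chips_per_bit, n_channels):
    """Derive the shared PN chip sequence and hop pattern from a common seed."""
    rng = random.Random(seed)
    pn = [rng.choice((-1, 1)) for _ in range(chips_per_bit)]
    hops = [rng.randrange(n_channels) for _ in range(n_bits)]
    return pn, hops


def transmit(bits, pn, hops):
    """Spread each bit with the PN sequence and tag it with its hop channel."""
    frames = []
    for bit, channel in zip(bits, hops):
        symbol = 1 if bit else -1
        frames.append((channel, [symbol * chip for chip in pn]))
    return frames


def receive(frames, pn, hops):
    """Despread by correlating with the PN sequence on the expected channel."""
    bits = []
    for (channel, chips), expected in zip(frames, hops):
        if channel != expected:
            bits.append(None)  # receiver was not tuned to this channel
            continue
        correlation = sum(r * c for r, c in zip(chips, pn))
        bits.append(1 if correlation > 0 else 0)
    return bits


bits = [1, 0, 1, 1, 0, 0, 1, 0]
pn, hops = make_patterns(seed=7, n_bits=len(bits), chips_per_bit=15, n_channels=8)
frames = transmit(bits, pn, hops)

# Invert the first 3 of 15 chips in every frame to mimic interference.
noisy = [
    (channel, [-chip if i < 3 else chip for i, chip in enumerate(chips)])
    for channel, chips in frames
]
recovered = receive(noisy, pn, hops)
```

With 3 of 15 chips inverted, the correlation still has magnitude 9 with the correct sign, so the bits are recovered; this processing gain is the source of the anti-jamming behaviour that the abstract analyzes.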

Directory of Open Access Journals

SIMULINK Simulation Implementation of DS/FH Hybrid Spread Spectrum Communication System

Author: Bing Ma
Changren Yu
Hualong Ma
Ye He
Publication venue: EDP Sciences
Publication date: 19/11/2018

EDP Sciences OAI-PMH repository (1.2.0)