12 research outputs found

    On the Adversarial Robustness of Vision Transformers

    Full text link
    Following their success in natural language processing and understanding, transformers are expected to bring revolutionary changes to computer vision. This work provides the first comprehensive study on the robustness of vision transformers (ViTs) against adversarial perturbations. Tested under various white-box and transfer attack settings, we find that ViTs possess better adversarial robustness than convolutional neural networks (CNNs). This observation also holds for certified robustness. We summarize the following main observations contributing to the improved robustness of ViTs: 1) Features learned by ViTs contain less low-level information and are more generalizable, which contributes to superior robustness against adversarial perturbations. 2) Introducing convolutional or tokens-to-token blocks for learning low-level features in ViTs can improve classification accuracy, but at the cost of adversarial robustness. 3) Increasing the proportion of transformer blocks in the model structure (when the model consists of both transformer and CNN blocks) leads to better robustness, but for a pure transformer model, simply increasing the size or adding layers does not guarantee a similar effect. 4) Pre-training on larger datasets does not significantly improve adversarial robustness, though it is critical for training ViTs. 5) Adversarial training is also applicable to ViTs for training robust models. Furthermore, feature visualization and frequency analysis are conducted to explain these findings. The results show that ViTs are less sensitive to high-frequency perturbations than CNNs, and there is a high correlation between how well a model learns low-level features and its robustness against different frequency-based perturbations.
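
    As a minimal sketch of the frequency analysis mentioned in this abstract (our own illustration, not the paper's code; the classifier, image tensors, and the L-infinity budget `eps` are assumptions), one can restrict random noise to a radial frequency band and compare how accuracy degrades under low- versus high-frequency perturbations:

```python
# Illustrative sketch: probe a classifier's sensitivity to frequency-banded noise.
import torch
import torch.fft

def band_limited_noise(shape, low_frac, high_frac, eps=8 / 255):
    """Random noise whose spectrum lies in the radial band [low_frac, high_frac)."""
    noise = torch.randn(shape)
    spec = torch.fft.fftshift(torch.fft.fft2(noise), dim=(-2, -1))
    h, w = shape[-2:]
    yy, xx = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij"
    )
    radius = torch.sqrt(yy**2 + xx**2) / torch.sqrt(torch.tensor(2.0))  # in [0, 1]
    mask = (radius >= low_frac) & (radius < high_frac)
    spec = torch.where(mask, spec, torch.zeros_like(spec))
    filtered = torch.fft.ifft2(torch.fft.ifftshift(spec, dim=(-2, -1))).real
    return eps * filtered / filtered.abs().amax()  # rescale to the L_inf budget

@torch.no_grad()
def accuracy_under_noise(model, images, labels, low_frac, high_frac):
    """Accuracy after adding noise restricted to one frequency band."""
    perturbed = (images + band_limited_noise(images.shape, low_frac, high_frac)).clamp(0, 1)
    return (model(perturbed).argmax(dim=-1) == labels).float().mean().item()

# Usage (hypothetical models): compare accuracy_under_noise(vit, x, y, 0.5, 1.0)
# against accuracy_under_noise(cnn, x, y, 0.5, 1.0) for the high-frequency band.
```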

    Cross-modal Attention Congruence Regularization for Vision-Language Relation Alignment

    Full text link
    Despite recent progress towards scaling up multimodal vision-language models, these models are still known to struggle on compositional generalization benchmarks such as Winoground. We find that a critical component lacking from current vision-language models is relation-level alignment: the ability to match directional semantic relations in text (e.g., "mug in grass") with spatial relationships in the image (e.g., the position of the mug relative to the grass). To tackle this problem, we show that relation alignment can be enforced by encouraging the directed language attention from 'mug' to 'grass' (capturing the semantic relation 'in') to match the directed visual attention from the mug to the grass. Tokens and their corresponding objects are softly identified using the cross-modal attention. We prove that this notion of soft relation alignment is equivalent to enforcing congruence between the vision and language attention matrices under a 'change of basis' provided by the cross-modal attention matrix. Intuitively, our approach projects visual attention into the language attention space to calculate its divergence from the actual language attention, and vice versa. We apply our Cross-modal Attention Congruence Regularization (CACR) loss to UNITER and improve on the state-of-the-art approach to Winoground.
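
    The 'change of basis' can be written compactly: with cross-modal attention C, visual self-attention V is carried into the language token space as C V Cᵀ and compared against the language self-attention L. The sketch below is our reading of that idea, not the official CACR implementation; the KL divergence and the row renormalization are our illustrative choices:

```python
# Minimal sketch of congruence between attention matrices under a
# cross-modal change of basis (illustrative, not the paper's exact loss).
import torch
import torch.nn.functional as F

def cacr_loss(lang_attn, vis_attn, cross_attn, eps=1e-8):
    """
    lang_attn:  (T, T) language-to-language attention, rows sum to 1
    vis_attn:   (R, R) region-to-region visual attention
    cross_attn: (T, R) language-to-vision attention, rows sum to 1
    """
    # 'Change of basis': project visual attention into the language token space.
    projected = cross_attn @ vis_attn @ cross_attn.transpose(-1, -2)  # (T, T)
    projected = projected / (projected.sum(-1, keepdim=True) + eps)   # renormalize rows
    # Divergence between the projected visual attention and the language attention.
    return F.kl_div(projected.clamp_min(eps).log(), lang_attn, reduction="batchmean")
```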

    MPCFormer: fast, performant and private Transformer inference with MPC

    Full text link
    Enabling private inference is crucial for many cloud inference services that are based on Transformer models. However, existing private inference solutions for Transformers can increase the inference latency by more than 60x or significantly compromise the quality of inference results. In this paper, we design the framework MPCFormer using secure multi-party computation (MPC) and knowledge distillation (KD). It can be used in tandem with many specifically designed MPC-friendly approximations and trained Transformer models. MPCFormer significantly speeds up Transformer model inference in MPC settings while achieving similar ML performance to the input model. We evaluate MPCFormer in various MPC settings. On the IMDb dataset, we achieve similar performance to BERT-Base while being 5.3x faster. On the GLUE benchmark, we achieve 97% of the performance of BERT-Base with a 2.2x speedup. We show that MPCFormer remains effective with different trained Transformer weights such as RoBERTa-Base and larger models including BERT-Large. In particular, we achieve similar performance to BERT-Large while being 5.93x faster on the IMDb dataset.
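
    To make the "MPC-friendly approximation" idea concrete: nonlinearities such as exp and GeLU are expensive to evaluate under MPC, so they are typically replaced with low-degree polynomials, and KD then recovers the lost accuracy. The snippet below sketches this pattern; the specific constants are assumptions and may differ from MPCFormer's actual approximations:

```python
# Hedged sketch of MPC-friendly polynomial stand-ins for Transformer
# nonlinearities (illustrative constants, not MPCFormer's exact choices).
import torch

def quad_softmax(scores, c=5.0):
    """Softmax with exp(x) replaced by the cheap quadratic (x + c)^2."""
    q = (scores + c) ** 2
    return q / q.sum(dim=-1, keepdim=True)

def quad_gelu(x):
    """A quadratic stand-in for GeLU (assumed coefficients)."""
    return 0.125 * x**2 + 0.25 * x + 0.5

# Usage in an attention layer: probs = quad_softmax(q @ k.transpose(-1, -2) / d**0.5)
# The distilled (KD) student is trained with these cheap ops in place.
```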

    VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use

    Full text link
    We introduce VisIT-Bench (Visual InsTruction Benchmark), a benchmark for evaluating instruction-following vision-language models for real-world use. Our starting point is curating 70 'instruction families' that we envision instruction-tuned vision-language models should be able to address. Extending beyond evaluations like VQAv2 and COCO, tasks range from basic recognition to game playing and creative generation. Following curation, our dataset comprises 592 test queries, each with a human-authored instruction-conditioned caption. These descriptions surface instruction-specific factors; e.g., for an instruction asking about the accessibility of a storefront for wheelchair users, the instruction-conditioned caption describes ramps and potential obstacles. These descriptions enable 1) collecting human-verified reference outputs for each instance; and 2) automatic evaluation of candidate multimodal generations using a text-only LLM, aligning with human judgment. We quantify quality gaps between models and references using both human and automatic evaluations; e.g., the top-performing instruction-following model wins against the GPT-4 reference in just 27% of comparisons. VisIT-Bench is dynamic: to participate, practitioners simply submit their model's responses on the project website. Data, code, and the leaderboard are available at visit-bench.github.io.
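
    The automatic evaluation works because the instruction-conditioned caption lets a text-only LLM stand in for a judge who can see the image. Below is a sketch of how such a judging prompt might be assembled; the wording and field names are our assumptions, not the benchmark's exact code:

```python
# Illustrative prompt builder for text-only LLM judging of visual
# instruction following (assumed wording, not VisIT-Bench's actual prompt).
def build_judge_prompt(instruction, conditioned_caption, candidate, reference):
    return (
        "You are judging responses to a visual instruction. You cannot see the "
        "image; rely on this description of it instead:\n"
        f"Image description: {conditioned_caption}\n\n"
        f"Instruction: {instruction}\n\n"
        f"Response A: {candidate}\n"
        f"Response B: {reference}\n\n"
        "Which response follows the instruction better? Answer 'A' or 'B'."
    )
```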

    Efficacy and safety of low-dose IL-2 in the treatment of systemic lupus erythematosus: A randomised, double-blind, placebo-controlled trial

    Get PDF
    Objectives: Open-label clinical trials suggested that low-dose IL-2 might be effective in the treatment of systemic lupus erythematosus (SLE). A double-blind, placebo-controlled trial is required to formally evaluate the safety and efficacy of low-dose IL-2 therapy. Methods: A randomised, double-blind, placebo-controlled clinical trial was designed to treat 60 patients with active SLE. These patients received either IL-2 (n=30) or placebo (n=30) with standard treatment for 12 weeks, and were followed up for an additional 12 weeks. IL-2 at a dose of 1 million IU or placebo was administered subcutaneously every other day for 2 weeks, followed by a 2-week break, as one treatment cycle. The primary endpoint was the SLE Responder Index-4 (SRI-4) at week 12. The secondary endpoints were other clinical responses, safety and dynamics of immune cell subsets. Results: At week 12, the SRI-4 response rates were 55.17% and 30.00% for IL-2 and placebo, respectively (p=0.052). At week 24, the SRI-4 response rate of the IL-2 group was 65.52%, compared with 36.67% in the placebo group (p=0.027). The primary endpoint was not met at week 12. Low-dose IL-2 treatment resulted in 53.85% (7/13) complete remission in patients with lupus nephritis, compared with 16.67% (2/12) in the placebo group (p=0.036). No serious infection was observed in the IL-2 group, but two occurred in the placebo group. Besides expansion of regulatory T cells, low-dose IL-2 may also sustain cellular immunity with enhanced natural killer cells. Conclusions: Low-dose IL-2 might be effective and well tolerated in the treatment of SLE. The work was supported by the National Natural Science Foundation of China (31530020, 31570880, 81471601, 81601417 and 81701598), the Peking-Tsinghua Center for Life Sciences (to ZG Li), Beijing Sci-Tech Committee (Z171100000417007), the Clinical Medicine Plus X-Young Scholars Project of Peking University (PKU2019LCXQ013) supported by the Fundamental Research Funds for the Central Universities, Beijing Nova Program (Z171100001117025), the National Key Research and Development Program of China (2017YFC0909003 to DY), a Bellberry-Viertel Senior Medical Research Fellowship (to DY) and Beijing SL PHARM.
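
    As a quick arithmetic check of the reported response rates (our reconstruction, not the trial's analysis code): the percentages imply 16/29 vs 9/30 responders at week 12 and 19/29 vs 11/30 at week 24, and a chi-squared test on those counts lands close to the reported p-values. Any small discrepancy may reflect the trial's exact statistical method, which the abstract does not state:

```python
# Reconstructed contingency tables from the reported SRI-4 percentages
# (55.17% of 29 = 16; 30.00% of 30 = 9; 65.52% of 29 = 19; 36.67% of 30 = 11).
from scipy.stats import chi2_contingency

week12 = [[16, 29 - 16], [9, 30 - 9]]    # IL-2 vs placebo: responders, non-responders
week24 = [[19, 29 - 19], [11, 30 - 11]]

for label, table in [("week 12", week12), ("week 24", week24)]:
    chi2, p, _, _ = chi2_contingency(table, correction=False)
    print(f"{label}: chi2={chi2:.2f}, p={p:.3f}")
# Prints p≈0.050 and p≈0.027, consistent with the reported 0.052 and 0.027.
```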

    Tubeless video-assisted thoracic surgery for pulmonary ground-glass nodules: expert consensus and protocol (Guangzhou)

    Get PDF

    Stochastic Channel-Based Federated Learning With Neural Network Pruning for Medical Data Privacy Preservation: Model Development and Experimental Validation

    No full text
    Background: Artificial neural networks have achieved unprecedented success in the medical domain. This success depends on the availability of massive and representative datasets. However, data collection is often prevented by privacy concerns, and people want to retain control over their sensitive information during both training and use. Objective: To address security and privacy issues, we propose a privacy-preserving method for the analysis of distributed medical data. The proposed method, termed stochastic channel-based federated learning (SCBFL), enables participants to train a high-performance model cooperatively and in a distributed manner without sharing their inputs. Methods: We designed, implemented, and evaluated a channel-based update algorithm for a central server in a distributed system. The update algorithm selects the channels corresponding to the most active features in a training loop and uploads them as the information learned from local datasets. A pruning process, which serves as a model accelerator, was further applied to the algorithm based on the validation set. Results: We constructed a distributed system consisting of 5 clients and 1 server. Our trials showed that the SCBFL method can achieve an area under the receiver operating characteristic curve (AUC-ROC) of 0.9776 and an area under the precision-recall curve (AUC-PR) of 0.9695 with only 10% of channels shared with the server. Compared with the federated averaging algorithm, the proposed SCBFL method achieved a 0.05388 higher AUC-ROC and a 0.09695 higher AUC-PR. In addition, our experiment showed that the pruning process saved 57% of training time, with only a 0.0047 reduction in AUC-ROC and a 0.0068 reduction in AUC-PR. Conclusions: In this experiment, our model demonstrated better performance and a higher saturating speed than the federated averaging method, which reveals all of the parameters of local models to the server. The saturation rate of performance can be increased by introducing a pruning process, and further improvement can be achieved by tuning the pruning rate.
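
    A minimal sketch of the channel-selective update as we read it from the description above (the activity score, `share_frac`, and the update format are our assumptions, not the authors' implementation): each client scores channels by how active they were during the training loop and uploads only the top fraction of them.

```python
# Illustrative sketch of a channel-selective federated update: only the
# most active channels' weights leave the client (assumed scoring rule).
import torch

def select_active_channels(activations, share_frac=0.10):
    """Pick the fraction of channels with the highest mean |activation|."""
    # activations: (batch, channels, ...) collected over one training loop
    reduce_dims = [d for d in range(activations.dim()) if d != 1]
    scores = activations.abs().mean(dim=reduce_dims)          # one score per channel
    k = max(1, int(share_frac * scores.numel()))
    return torch.topk(scores, k).indices                      # channel ids to upload

def build_update(layer_weight, channel_ids):
    """Sparse update containing only the selected output channels."""
    return {"channels": channel_ids, "weights": layer_weight[channel_ids].clone()}

# Usage: ids = select_active_channels(acts); update = build_update(conv.weight, ids)
# The server then merges only these rows into the global model.
```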