Search CORE

3 research outputs found

Vision Language Pre-training by Contrastive Learning with Cross-Modal Similarity Regulation

Author: Huang Fei
Jiang Chaoya
Xu Haiyang
Yan Miang
Ye Wei
Zhang Jie
Zhang Shikun
Publication date: 22/06/2023

Cross-modal contrastive learning in vision language pretraining (VLP) faces the challenge of (partial) false negatives. In this paper, we study this problem from the perspective of Mutual Information (MI) optimization. It is well known that the InfoNCE loss used in contrastive learning maximizes a lower bound on the MI between anchors and their positives; we theoretically prove that the MI involving negatives also matters when noise is common. Guided by a more general form of the lower bound, we propose a contrastive learning strategy regulated by progressively refined cross-modal similarity, which optimizes the MI between an image/text anchor and its negative texts/images more accurately instead of improperly minimizing it. Our method performs competitively on four downstream cross-modal tasks and systematically balances the beneficial and harmful effects of (partial) false negative samples under theoretical guidance. Comment: Accepted by ACL202

arXiv.org e-Print Archive
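
The abstract above refers to the InfoNCE loss and the mutual-information lower bound it optimizes. As a minimal sketch (not the paper's regulated method), the standard InfoNCE loss over a batch of N image/text pairs can be computed from a similarity matrix, and the commonly cited bound is I(X;Y) >= log N - L_InfoNCE. The function names, the temperature value, and the toy similarity matrices are assumptions made for illustration.

```python
import math

def info_nce(sim, temperature=0.07):
    """Standard InfoNCE loss for one direction (e.g. image -> text).

    sim[i][j] is the similarity between anchor i and candidate j.
    The positive for anchor i is assumed to sit on the diagonal (j == i);
    every other candidate in the row is treated as a negative.
    """
    total = 0.0
    for i, row in enumerate(sim):
        logits = [s / temperature for s in row]
        # Numerically stable log-sum-exp over all candidates in the row.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of picking the positive.
        total += log_z - logits[i]
    return total / len(sim)

def mi_lower_bound(sim, temperature=0.07):
    """InfoNCE-based lower bound on MI: log N minus the loss."""
    return math.log(len(sim)) - info_nce(sim, temperature)

# Toy batch: each anchor is most similar to its own positive.
aligned = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
# Toy batch where the most similar candidate is a "negative".
shifted = [[1.0 if j == (i + 1) % 4 else 0.0 for j in range(4)] for i in range(4)]

print(info_nce(aligned), mi_lower_bound(aligned))
print(info_nce(shifted), mi_lower_bound(shifted))
```

In the `shifted` batch the loss is large because every off-diagonal candidate is pushed away regardless of how similar it really is; this is the situation the abstract describes for (partial) false negatives.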

PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization

Author: Chen Hao
Jiang Chaoya
Wang Cunxiang
Wang Jindong
Wang Yidong
Xie Rui
Xie Xing
Yang Linyi
Ye Wei
Yu Zhuohao
Zeng Zhengran
Zhang Shikun
Zhang Yue
Publication date: 08/06/2023

Instruction tuning large language models (LLMs) remains a challenging task, owing to the complexity of hyperparameter selection and the difficulty involved in evaluating the tuned models. To determine the optimal hyperparameters, an automatic, robust, and reliable evaluation benchmark is essential. However, establishing such a benchmark is not a trivial task due to the challenges associated with evaluation accuracy and privacy protection. In response to these challenges, we introduce a judge large language model, named PandaLM, which is trained to distinguish the superior model given several LLMs. PandaLM's focus extends beyond the objective correctness of responses, which is the main focus of traditional evaluation datasets. It addresses vital subjective factors such as relative conciseness, clarity, adherence to instructions, comprehensiveness, and formality. To ensure the reliability of PandaLM, we collect a diverse human-annotated test dataset, where all contexts are generated by humans and labels are aligned with human preferences. Our results indicate that PandaLM-7B achieves 93.75% of GPT-3.5's evaluation ability and 88.28% of GPT-4's in terms of F1-score on our test dataset. PandaLM makes the evaluation of LLMs fairer and less costly, as evidenced by the significant improvements achieved by models tuned through PandaLM compared to their counterparts trained with Alpaca's default hyperparameters. In addition, PandaLM does not depend on API-based evaluations, thus avoiding potential data leakage. All resources of PandaLM are released at https://github.com/WeOpenML/PandaLM

arXiv.org e-Print Archive

Histomorphological Characteristics and Pathological Types of Hyperproliferation of Gastric Surface Epithelial Cells

Author: Baohui Li
Bo Jiang
Chaoya Zhu
Guang Zhao
Jianxue Bu
Lan Shen
Sunan Wang
Yangkun Wang
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2021

Objective. To investigate the histomorphological characteristics and pathological types of hyperproliferation of gastric surface epithelial cells. Methods. Hematoxylin and Eosin, Periodic acid–Schiff, and immunohistochemical staining were performed on biopsy specimens obtained from 723 patients with hyperproliferation of gastric surface epithelial cells and/or hyperplasia of gastric pits. Follow-up gastroscopic reexaminations were performed on 475 of the included patients. Improvement probability was analyzed using Kaplan-Meier as well as Cox proportional hazards models. Results. Seven different histomorphologies and clinicopathologies of hyperproliferation of gastric surface epithelial cells were identified: (1) common hyperplasia of gastric epithelial cells, which was characterized by focal glandular epithelial hyperplasia of gastric pits with chronic inflammation; (2) drug-induced hyperplasia of gastric epithelial cells, which was characterized by increased hyperplasia of gastric pits and cells arranged in a monolayer; (3) Helicobacter pylori (Hp) infection-induced hyperplasia of gastric epithelial cells, which was characterized by the disappearance of oval, spherical, and bounded membrane-enclosed mucus-containing granules in the cytoplasm and on the nucleus together with cytoplasmic swelling and vacuolation; (4) metaplastic hyperplasia of gastric epithelial cells, which was characterized by the coexistence of intestinal metaplastic cells with hyperplastic gastric epithelial cells; (5) atrophic hyperplasia of gastric epithelial cells, which was characterized by mucosal atrophy accompanied by hyperplasia of gastric pits; (6) low-grade neoplasia of epithelial cells, which was characterized by mild to moderate dysplasia of gastric epithelial cells; and (7) high-grade neoplasia of epithelial cells, which was characterized by evident dysplasia of hyperplastic epithelial cells and loss of cell polarity. The different pathological types are associated with different improvement probabilities. Conclusions. This study demonstrated the histomorphological characteristics and pathological types, which might guide clinicians to track malignant cell transformation, perform precise treatment, predict the clinical prognosis, and control the development of gastric cancer.

Directory of Open Access Journals
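
The last abstract reports that improvement probability was analyzed with the Kaplan-Meier method. As a rough illustration of what that estimator computes (not the study's analysis or data), the sketch below implements the product-limit estimate in plain Python. The function name and the follow-up times are hypothetical; an "event" is assumed here to mean observed improvement, so 1 - S(t) would be read as the estimated probability of improvement by time t.

```python
def kaplan_meier(durations, events):
    """Product-limit (Kaplan-Meier) estimate.

    durations: follow-up time for each subject.
    events:    1 if the event was observed at that time, 0 if censored.
    Returns a list of (time, S(time)) at each distinct event time.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events occurring exactly at time t
        leaving = 0   # subjects leaving the risk set at t (events + censored)
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up data (arbitrary time units), for illustration only.
durations = [1, 2, 2, 3, 4]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(durations, events)
for t, s in curve:
    print(t, round(s, 3), round(1 - s, 3))
```

Censored subjects (event flag 0) still count toward the risk set up to their last follow-up, which is what distinguishes this estimate from a simple proportion of improved patients.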