51 research outputs found

Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate

Author: Sun Huan
Wang Boshi
Yue Xiang
Publication date: 10/10/2023

Large language models (LLMs) such as ChatGPT and GPT-4 have shown impressive performance in complex reasoning tasks. However, it is difficult to know whether the models are reasoning based on deep understandings of truth and logic, or leveraging their memorized patterns in a relatively superficial way. In this work, we explore testing LLMs' reasoning by engaging with them in a debate-like conversation, where given a question, the LLM and the user need to discuss to make the correct decision starting from opposing arguments. Upon mitigating the Clever Hans effect, our task requires the LLM to not only achieve the correct answer on its own, but also be able to hold and defend its belief instead of blindly believing or getting misled by the user's (invalid) arguments and critiques, thus testing in greater depth whether the LLM grasps the essence of the reasoning required to solve the problem. Across a range of complex reasoning benchmarks spanning math, commonsense, logic and BIG-Bench tasks, we find that despite their impressive performance as reported in existing work on generating correct step-by-step solutions in the beginning, LLMs like ChatGPT cannot maintain their beliefs in truth for a significant portion of examples when challenged by oftentimes absurdly invalid arguments. Our work points to danger zones of model alignment, and also suggests more careful treatments and interpretations of the recent findings that LLMs can improve their responses based on feedback. Comment: EMNLP-23 (findings

arXiv.org e-Print Archive

Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters

Author: Deng Xiang
Min Sewon
Shen Jiaming
Sun Huan
Wang Boshi
Wu You
Zettlemoyer Luke
Publication date: 01/06/2023

Chain-of-Thought (CoT) prompting can dramatically improve the multi-step reasoning abilities of large language models (LLMs). CoT explicitly encourages the LLM to generate intermediate rationales for solving a problem, by providing a series of reasoning steps in the demonstrations. Despite its success, there is still little understanding of what makes CoT prompting effective and which aspects of the demonstrated reasoning steps contribute to its performance. In this paper, we show that CoT reasoning is possible even with invalid demonstrations - prompting with invalid reasoning steps can achieve over 80-90% of the performance obtained using CoT under various metrics, while still generating coherent lines of reasoning during inference. Further experiments show that other aspects of the rationales, such as being relevant to the query and correctly ordering the reasoning steps, are much more important for effective CoT reasoning. Overall, these findings both deepen our understanding of CoT prompting, and open up new questions regarding LLMs' capability to learn to reason in context. Comment: ACL-23 Camera Ready. Code and model input/output are available at https://github.com/sunlab-osu/Understanding-Co

arXiv.org e-Print Archive

Mind2Web: Towards a Generalist Agent for the Web

Author: Chen Shijie
Deng Xiang
Gu Yu
Stevens Samuel
Su Yu
Sun Huan
Wang Boshi
Zheng Boyuan
Publication date: 09/06/2023

We introduce Mind2Web, the first dataset for developing and evaluating generalist agents for the web that can follow language instructions to complete complex tasks on any website. Existing datasets for web agents either use simulated websites or only cover a limited set of websites and tasks, and are thus not suitable for generalist web agents. With over 2,000 open-ended tasks collected from 137 websites spanning 31 domains and crowdsourced action sequences for the tasks, Mind2Web provides three necessary ingredients for building generalist web agents: 1) diverse domains, websites, and tasks, 2) use of real-world websites instead of simulated and simplified ones, and 3) a broad spectrum of user interaction patterns. Based on Mind2Web, we conduct an initial exploration of using large language models (LLMs) for building generalist web agents. While the raw HTML of real-world websites is often too large to be fed to LLMs, we show that first filtering it with a small LM significantly improves the effectiveness and efficiency of LLMs. Our solution demonstrates a decent level of performance, even on websites or entire domains the model has never seen before, but there is still substantial room to improve towards truly generalizable agents. We open-source our dataset, model implementation, and trained models (https://osu-nlp-group.github.io/Mind2Web) to facilitate further research on building a generalist agent for the web. Comment: website: https://osu-nlp-group.github.io/Mind2We

arXiv.org e-Print Archive

Fatty acid metabolism is related to the immune microenvironment changes of gastric cancer and RGS2 is a new tumor biomarker

Author: Boshi Sun
Hao Yang
Nana Li
Shifeng Yang
Wenjing Li
Xinyu Zhang
Publication venue: 'Frontiers Media SA'
Publication date: 01/12/2022

Background: Alterations in lipid metabolism promote tumor progression. However, the role of lipid metabolism in the occurrence and development of gastric cancer has not been fully clarified. Method: Here, genes that are related to fatty acid metabolism and differentially expressed between normal and gastric cancer tissues were identified in the TCGA-STAD cohort. The intersection of the identified differentially expressed genes with Geneset was determined to obtain 78 fatty acid metabolism-related genes. The ConsensusClusterPlus R package was used to cluster samples on these differentially expressed genes, which yielded two gastric cancer subtypes termed cluster 1 and cluster 2. Results: Patients in cluster 2 were found to display poorer prognosis than patients in cluster 1. A machine learning method was used to select 8 genes differentially expressed among subtypes and construct a fatty acid prognostic risk score model (FARS), which was found to display good prognostic efficacy. We also identified that certain anticancer drugs, such as bortezomib, elesclomol, GW843682X, and nilotinib, showed significant sensitivity in the high FARS score group. RGS2 was selected as the core gene upon single-cell analysis of gastric cancer, and Western blotting and immunofluorescence staining results revealed a high level of expression of this gene in gastric cancer cells. The results of immunohistochemical staining showed that a large amount of RGS2 was deposited in the stroma in gastric cancer. A pan-cancer analysis also revealed a significant association of RGS2 with TMB, TIDE, and CD8+ T-cell infiltration in other cancer types. RGS2 may thus be studied further as a new target for immunotherapy in future studies on gastric cancer. Conclusion: In summary, the FARS model developed here enhances our understanding of lipid metabolism in the TME in gastric cancer, and provides a theoretical basis for predicting tumor prognosis and clinical treatment.

Directory of Open Access Journals

A comparative analysis of aerosol microphysical, optical and radiative properties during the Spring Festival holiday over Beijing and surrounding regions

Author: An Linchang
Che Huizheng
Estellés Leal Víctor
Gui Ke
Kang Boshi
Sun Tianze
Wang Hong
Wang Yaqiang
Xia Xiangao
Zhao Hujia
Zheng Yu
Publication venue: 'Taiwan Association for Aerosol Research'
Publication date: 01/01/2018

Using ground-based data, meteorological observations, and atmospheric environmental monitoring data, a comparative analysis of the microphysical and optical properties, and radiative forcing of aerosols was conducted between three stations in different developed environments during a severe air pollution episode during the Spring Festival over Beijing. During the most polluted period, the daily peak values of the aerosol optical depth were ~1.62, ~1.73, and ~0.74, which were about 2.6, 2.9, and 2.1 times higher than the background levels at the CAMS, Xianghe, and Shangdianzi sites, respectively. The daily peak values of the single scattering albedo were ~0.95, ~0.96, and ~0.87. The volume of fine-mode particles varied from 0.04 to 0.21 µm³ µm⁻², 0.06 to 0.17 µm³ µm⁻², and 0.01 to 0.10 µm³ µm⁻², which were about 0.3 to 5.8, 1.1 to 4.7, and 1.2 to 8.9 times greater than the background values, respectively. The daily absorption aerosol optical depth was ~0.01 to ~0.13 at CAMS, ~0.03 to ~0.14 at Xianghe, and ~0.01 to ~0.09 at Shangdianzi, and the absorption Ångström exponents reflected a significant increase in organic aerosols over CAMS and Xianghe and in black carbon over Shangdianzi. Aerosol radiative forcing at the bottom of the atmosphere varied from -20 to -130, -40 to -150, and -10 to -110 W m⁻² for the whole holiday period, indicating the cooling effect. The potential source contribution function and concentration-weighted trajectory analysis showed that Beijing, the southern parts of Hebei and Shanxi, and the central northern part of Shandong contributed greatly to the pollution.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositori d'Objectes Digitals per a l'Ensenyament la Recerca i la Cultura

Nutlin-3 overcomes arsenic trioxide resistance and tumor metastasis mediated by mutant p53 in Hepatocellular Carcinoma

Author: Chen Xi
Jiang Hongchi
Li Yuejin
Liang Yingjian
Liu Jiaren
Liu Lianxin
Lu Zhaoyang
Meng Xianzhi
Pan Shangha
Qi Shuyi
Song Xuan
Sun Boshi
Wang Jiabei
Xie Changming
Yin Dalong
Zheng Tongsen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Background: Arsenic trioxide has been demonstrated as an effective anti-cancer drug against leukemia and solid tumors both in vitro and in vivo. However, recent phase II trials demonstrated that single agent arsenic trioxide was poorly effective against hepatocellular carcinoma (HCC), which might be due to drug resistance. Methods: Mutation detection of the p53 gene in arsenic trioxide resistant HCC cell lines was performed. The therapeutic effects of arsenic trioxide and Nutlin-3 on HCC were evaluated both in vitro and in vivo. A series of experiments including MTT, apoptosis assays, co-immunoprecipitation, siRNA transfection, lentiviral infection, cell migration, invasion, and epithelial-mesenchymal transition (EMT) assays were performed to investigate the underlying mechanisms. Results: The acquisition of p53 mutation contributed to arsenic trioxide resistance and enhanced metastatic potential of HCC cells. Silencing mutant p53 (mutp53) could re-sensitize resistant HCC cells to arsenic trioxide and inhibit their metastatic activities, while mutp53 overexpression showed the opposite effects. Neither arsenic trioxide nor Nutlin-3 alone exhibited obvious effects against arsenic trioxide resistant HCC cells, while their combination showed significant effects. Nutlin-3 can not only increase the intracellular arsenicals through inhibition of p-gp but also promote the p73 activation and mutp53 degradation mediated by arsenic trioxide. In vivo experiments indicated that Nutlin-3 can potentiate the antitumor activities of arsenic trioxide in an orthotopic hepatic tumor model and inhibit metastasis to the lung. Conclusions: Acquisition of p53 mutations contributed to the resistance of HCC to arsenic trioxide. Nutlin-3 could overcome arsenic trioxide resistance and inhibit tumor metastasis through p73 activation and promoting mutant p53 degradation mediated by arsenic trioxide.

Crossref

Harvard University - DASH

Springer - Publisher Connector

Whole-genome sequencing of the snub-nosed monkey provides insights into folivory and evolutionary history

Author: Bruford Michael William
Cao Zhisheng
Chang Jiang
Cheng Chen
Garber Paul A.
Hui Yuanyuan
Jiang Wenkai
Jiang Zhi
Kumar Sudhir
Li Baoguo
Li Ming
Li Mingzhou
Li Ruiqiang
Lin Yu
Liu Guangjian
Liu Zhijin
Ma Xingyong
Pan Huijuan
Pan Qi
Ren Baoping
Roos Christian
Ruan Hang
Shi Fanglei
Sun Xiaoqing
Tao Yujing
Wang Boshi
Wang Dawei
Wang Hailong
Xiang Zuofu
Yang Guang
Zhan Wei
Zhang Chenglin
Zhang Jinbo
Zhou Xuming
Zhu Pingfen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 02/11/2014

Colobines are a unique group of Old World monkeys that principally eat leaves and seeds rather than fruits and insects. We report the sequencing at 146× coverage, de novo assembly and analyses of the genome of a male golden snub-nosed monkey (Rhinopithecus roxellana) and resequencing at 30× coverage of three related species (Rhinopithecus bieti, Rhinopithecus brelichi and Rhinopithecus strykeri). Comparative analyses showed that Asian colobines have an enhanced ability to derive energy from fatty acids and to degrade xenobiotics. We found evidence for functional evolution in the colobine RNASE1 gene, encoding a key secretory RNase that digests the high concentrations of bacterial RNA derived from symbiotic microflora. Demographic reconstructions indicated that the profile of ancient effective population sizes for R. roxellana more closely resembles that of the giant panda than those of its congeners. These findings offer new insights into the dietary adaptations and evolutionary history of colobine primates.

Online Research @ Cardiff

Research on Optimal Charging of Power Lithium-Ion Batteries in Wide Temperature Range Based on Variable Weighting Factors

Author: Boshi Wang
Haitao Min
Weiyi Sun
Yuanbin Yu
Publication venue: 'MDPI AG'
Publication date: 23/03/2021

With the popularity of electric vehicles (EV), the charging technology has become one of the bottleneck problems that limit the large-scale deployment of EVs. In this paper, a charging method using multi-stage constant current based on SOC (MCCS) is proposed, and then the charging time, charging capacity and temperature increase of the battery are optimized by multi-objective particle swarm optimization (MOPSO) algorithm. The influence of the number of charging stages, the cut-off voltage, the combination of different target weight factors and the ambient temperature on the charging strategy is further compared and discussed. Finally, according to the ambient temperature and users’ requirements of charging time, a charging strategy suitable for the specific situation is obtained by adjusting the weight factors, and the results are analyzed and justified on the basis of the experiments. The results show that the proposed strategy can intelligently make more reasonable adjustments according to the ambient temperature on the basis of meeting the charging demands of users

Multidisciplinary Digital Publishing Institute

Research on the Combined Control Strategy of Low Temperature Charging and Heating of Lithium-Ion Power Battery Based on Adaptive Fuzzy Control

Author: Boshi Wang
Haitao Min
Weiyi Sun
Yanzhou Zhang
Yuanbin Yu
Zhaopu Zhang
Publication venue: 'MDPI AG'
Publication date: 01/04/2020

A low-temperature environment leads to a decrease in the chemical reaction rate and an increase in the internal resistance of a lithium battery. In addition, excessive charging current causes lithium to separate out and can even cause permanent attenuation of battery capacity. To solve these problems, this paper proposes a combined low-temperature charging and heating control strategy, which takes the temperature acceptable charging current of the battery at low temperature as the charging current constraint and the maximum output power of the system as the power constraint. Firstly, a scheme for a combined charging and heating control system is put forward. Secondly, the low-temperature charging control strategy based on adaptive fuzzy control is established, and the model is then simulated and analyzed in MATLAB. Lastly, a Chroma 72,001 charge and discharge tester is used to conduct a low-temperature test on 18650 lithium iron phosphate battery cells. The results show that the low-temperature charging control strategy proposed in this paper has a more stable temperature control effect on the battery, the constant current charging time of the battery is reduced by 14% compared with the traditional threshold control method, and the overall charging energy consumption is reduced by 5.6%.

Multidisciplinary Digital Publishing Institute

Substrate Cleaning Threshold for Various Coated Al Alloys Using a Continuous-Wave Laser

Author: Boshi Yuan
Guangyong Jin
Jixing Cai
Qiansong Yu
Xiaoyu Bai
Xudong Sun
Publication venue: 'MDPI AG'
Publication date: 01/09/2021

In this study, different coatings (gray epoxy primer, white epoxy varnish and red alkyd paint) on 7075 aluminum alloy are cleaned with a 500 W continuous-wave (CW) fiber laser. We analyzed the influence of the laser power density on the temperature evolution and target surface morphology. Under continuous laser irradiation for 1 s, the experimental results indicated that the suitable cleaning thresholds of epoxy primer; epoxy primer and epoxy varnish; and epoxy primer, epoxy varnish and alkyd paint were 177.74, 192.89 and 147.44 W/mm², respectively. The results show that the cleaning threshold of the thicker three-layer paint target was lower than that of the single-layer paint target, and we analyze the mechanism of this phenomenon.

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals