4,744 research outputs found

ET-AL: Entropy-Targeted Active Learning for Bias Mitigation in Materials Data

Author: Chen Wei
Chen Wei Wayne
Rondinelli James M.
Zhang Hengrui
Publication venue: 'AIP Publishing'
Publication date: 19/02/2023

Growing materials data and data-driven informatics drastically promote the discovery and design of materials. While there are significant advancements in data-driven models, the quality of data resources is less studied despite its huge impact on model performance. In this work, we focus on data bias arising from uneven coverage of materials families in existing knowledge. Observing different diversities among crystal systems in common materials databases, we propose an information entropy-based metric for measuring this bias. To mitigate the bias, we develop an entropy-targeted active learning (ET-AL) framework, which guides the acquisition of new data to improve the diversity of underrepresented crystal systems. We demonstrate the capability of ET-AL for bias mitigation and the resulting improvement in downstream machine learning models. This approach is broadly applicable to data-driven materials discovery, including autonomous data acquisition and dataset trimming to reduce bias, as well as data-driven informatics in other scientific domains. Comment: 35 pages, 13 figures, under review

arXiv.org e-Print Archive
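
The ET-AL abstract above mentions an information entropy-based metric for diversity within crystal systems. The listing does not give the paper's actual definition, so the following is only a minimal sketch of the general idea, assuming Shannon entropy over fixed-width bins of a property value; the function name, the bin width, and the sample numbers are all hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(values, bin_width=0.5):
    """Shannon entropy (in nats) of values grouped into fixed-width bins.

    Assumption: binning a scalar property is one simple way to estimate
    diversity; it is not taken from the ET-AL paper.
    """
    if not values:
        return 0.0
    bins = Counter(math.floor(v / bin_width) for v in values)
    total = len(values)
    return -sum((n / total) * math.log(n / total) for n in bins.values())

# Hypothetical property values (e.g. formation energies) per crystal system.
data = {
    "cubic": [-1.9, -1.2, -0.4, 0.3, 0.9],
    "triclinic": [-0.52, -0.51, -0.55, -0.53, -0.54],
}

entropies = {system: shannon_entropy(vals) for system, vals in data.items()}

# The system with the lowest entropy is the least diverse one, which an
# entropy-targeted acquisition loop would presumably prioritise.
least_diverse = min(entropies, key=entropies.get)
print(entropies, least_diverse)
```

A low entropy here means the values for that system are concentrated in few bins; the result depends on the chosen bin width, which is a tuning assumption of this sketch.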

Uncertainty-Aware Mixed-Variable Machine Learning for Materials Design

Author: Apley Daniel W.
Chen Wei
Chen Wei Wayne
Iyer Akshay
Zhang Hengrui
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/10/2022

Data-driven design shows the promise of accelerating materials discovery but is challenging due to the prohibitive cost of searching the vast design space of chemistry, structure, and synthesis methods. Bayesian Optimization (BO) employs uncertainty-aware machine learning models to select promising designs to evaluate, hence reducing the cost. However, BO with mixed numerical and categorical variables, which is of particular interest in materials design, has not been well studied. In this work, we survey frequentist and Bayesian approaches to uncertainty quantification of machine learning with mixed variables. We then conduct a systematic comparative study of their performances in BO using a popular representative model from each group, the random forest-based Lolo model (frequentist) and the latent variable Gaussian process model (Bayesian). We examine the efficacy of the two models in the optimization of mathematical functions, as well as properties of structural and functional materials, where we observe performance differences as related to problem dimensionality and complexity. By investigating the machine learning models' predictive and uncertainty estimation capabilities, we provide interpretations of the observed performance differences. Our results provide practical guidance on choosing between frequentist and Bayesian uncertainty-aware machine learning models for mixed-variable BO in materials design.

arXiv.org e-Print Archive

On the theory of the CO+OH reaction, including H and C kinetic isotope effects

Author: Nolte J.
R. A. Marcus
Wayne R. P.
Wei-Chen Chen
Publication venue: 'AIP Publishing'

Crossref

SynBody: Synthetic Dataset with Layered Human Models for 3D Human Perception and Modeling

Author: Cai Zhongang
Chen Zhaoxi
Dai Bo
Lin Dahua
Liu Shuai
Liu Ziwei
Mei Haiyi
Qian Chen
Qing Zhongfei
Wei Chen
Wei Yukun
Wu Wayne
Xiao Weiye
Yang Lei
Yang Zhitao
Publication date: 11/09/2023

Synthetic data has emerged as a promising source for 3D human research as it offers low-cost access to large-scale human datasets. To advance the diversity and annotation quality of human models, we introduce a new synthetic dataset, SynBody, with three appealing features: 1) a clothed parametric human model that can generate a diverse range of subjects; 2) a layered human representation that naturally offers high-quality 3D annotations to support multiple tasks; 3) a scalable system for producing realistic data to facilitate real-world tasks. The dataset comprises 1.2M images with corresponding accurate 3D annotations, covering 10,000 human body models, 1,187 actions, and various viewpoints. The dataset includes two subsets for human pose and shape estimation as well as human neural rendering. Extensive experiments on SynBody indicate that it substantially enhances both SMPL and SMPL-X estimation. Furthermore, the incorporation of layered annotations offers a valuable training resource for investigating human Neural Radiance Fields (NeRF). Comment: Accepted by ICCV 2023. Project webpage: https://synbody.github.io

arXiv.org e-Print Archive

A Survey on Large Language Model based Autonomous Agents

Author: Chen Xu
Chen Zhiyuan
Feng Xueyang
Lin Yankai
Ma Chen
Tang Jiakai
Wang Lei
Wei Zhewei
Wen Ji-Rong
Yang Hao
Zhang Jingsen
Zhang Zeyu
Zhao Wayne Xin
Publication date: 22/08/2023

Autonomous agents have long been a prominent research topic in the academic community. Previous research in this field often focuses on training agents with limited knowledge within isolated environments, which diverges significantly from human learning processes and thus makes it hard for the agents to achieve human-like decisions. Recently, through the acquisition of vast amounts of web knowledge, large language models (LLMs) have demonstrated remarkable potential in achieving human-level intelligence. This has sparked an upsurge in studies investigating autonomous agents based on LLMs. To harness the full potential of LLMs, researchers have devised diverse agent architectures tailored to different applications. In this paper, we present a comprehensive survey of these studies, delivering a systematic review of the field of autonomous agents from a holistic perspective. More specifically, our focus lies in the construction of LLM-based agents, for which we propose a unified framework that encompasses a majority of the previous work. Additionally, we provide a summary of the various applications of LLM-based AI agents in the domains of social science, natural science, and engineering. Lastly, we discuss the commonly employed evaluation strategies for LLM-based AI agents. Based on the previous studies, we also present several challenges and future directions in this field. To keep track of this field and continuously update our survey, we maintain a repository for the related references at https://github.com/Paitesanshi/LLM-Agent-Survey. Comment: 32 pages, 3 figures

arXiv.org e-Print Archive

An animal model of SARS produced by infection of Macaca mulatta with SARS coronavirus.

Author: Chen Liangbiao
Cong Zhe
Duan Shumin
Gao Hong
Guo Lan
Guo Li
He Wei
Hong Tao
Huang Lan
Jiang Hong
Liu Depei
Liu Peimao
Liu Qian
Liu Yali
Marasco Wayne A
Qin Chuan
Qu Jianguo
Ren Lili
Ruan Li
She Mingpeng
Sun Yili
Tong Wei
Tu Xinming
Wang Jianwei
Wang Yanbin
Wei Qiang
Yang Renquan
Zhang Hua
Zhang Huiyuan
Zhang Jianmin
Zhu Hua
Publication venue: eScholarship, University of California
Publication date: 01/07/2005

A new SARS animal model was established by inoculating SARS coronavirus (SARS-CoV) into rhesus macaques (Macaca mulatta) through the nasal cavity. Pathological pulmonary changes were successively detected on days 5-60 after virus inoculation. All eight animals showed a transient fever 2-3 days after inoculation. Immunological, molecular biological, and pathological studies support the establishment of this SARS animal model. Firstly, SARS-CoV-specific IgGs were detected in the sera of macaques from 11 to 60 days after inoculation. Secondly, SARS-CoV RNA could be detected in pharyngeal swab samples using nested RT-PCR in all infected animals from 5 days after virus inoculation. Finally, histopathological changes of interstitial pneumonia were found in the lungs during the 60 days after viral inoculation: these changes were less marked at later time points, indicating that an active healing process together with resolution of an acute inflammatory response was taking place in these animals. This animal model should provide insight into the mechanisms of SARS-CoV-related pulmonary disease and greatly facilitate the development of vaccines and therapeutics against SARS.

eScholarship - University of California

Role of cardiac ryanodine receptor calmodulin-binding domains in mediating the action of arrhythmogenic calmodulin N-domain mutation N54I

Author: Brohus Malene
Chen S R Wayne
Guo Wenting
Liu Yingjie
Overgaard Michael T
Søndergaard Mads T
Wang Ruiwu
Wei Jinhong
Publication venue: 'Wiley'
Publication date: 01/06/2020

VBN

LncRNAs: the bridge linking RNA and colorectal cancer.

Author: Chen Yi
Deng Xiangbing
Feng Min
Lau Bonnie
Lau Wayne Bond
Le Xiaobing
Lei Lingzi
Luo Zhongyue
Wang Chenlu
Wei Yuquan
Xu Lian
Xuan Yu
Yang Huiliang
Yang Qilian
Yang Yanfei
Yi Tao
Zhao Linjie
Zhao Xia
Zhou Shengtao
Publication venue: Jefferson Digital Commons
Publication date: 24/11/2016

Long noncoding RNAs (lncRNAs) are transcribed by genomic regions (exceeding 200 nucleotides in length) that do not encode proteins. While the exquisite regulation of lncRNA transcription can provide signals of malignant transformation, lncRNAs control pleiotropic cancer phenotypes through interactions with other cellular molecules including DNA, protein, and RNA. Recent studies have demonstrated that dysregulation of lncRNAs is influential in proliferation, angiogenesis, metastasis, invasion, apoptosis, stemness, and genome instability in colorectal cancer (CRC), with consequent clinical implications. In this review, we explicate the roles of different lncRNAs in CRC, and the potential implications for their clinical application.

PubMed Central

Jefferson Digital Commons

RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars

Author: Cheng Wei
Dai Bo
Fan Siming
Lin Dahua
Lin Kwan-Yee
Liu Shengqi
Liu Ziwei
Loy Chen Change
Luo Huiwen
Pan Dongwei
Piao Jingtan
Qian Chen
Wang Yuxin
Wu Wayne
Yang Lei
Zhuo Long
Publication date: 22/05/2023

Synthesizing high-fidelity head avatars is a central problem for computer vision and graphics. While head avatar synthesis algorithms have advanced rapidly, the best ones still face great obstacles in real-world scenarios. One of the vital causes is inadequate datasets: 1) current public datasets only support exploring high-fidelity head avatars in one or two task directions; 2) these datasets usually contain digital head assets with limited data volume and a narrow distribution over different attributes. In this paper, we present RenderMe-360, a comprehensive 4D human head dataset to drive advances in head avatar research. It contains massive data assets, with 243+ million complete head frames and over 800k video sequences from 500 different identities captured by synchronized multi-view cameras at 30 FPS. It is a large-scale digital library for head avatars with three key attributes: 1) High Fidelity: all subjects are captured by 60 synchronized, high-resolution 2K cameras in 360 degrees. 2) High Diversity: the collected subjects span different ages, eras, ethnicities, and cultures, providing abundant materials with distinctive styles in appearance and geometry. Moreover, each subject is asked to perform various motions, such as expressions and head rotations, which further extend the richness of assets. 3) Rich Annotations: we provide annotations with different granularities: cameras' parameters, matting, scan, 2D/3D facial landmarks, FLAME fitting, and text description. Based on the dataset, we build a comprehensive benchmark for head avatar research, with 16 state-of-the-art methods performed on five main tasks: novel view synthesis, novel expression synthesis, hair rendering, hair editing, and talking head generation. Our experiments uncover the strengths and weaknesses of current methods. RenderMe-360 opens the door for future exploration in head avatars. Comment: Technical Report; Project Page: 36; Github Link: https://github.com/RenderMe-360/RenderMe-36

arXiv.org e-Print Archive

Genome sequence of the Ornithopus/Lupinus-nodulating Bradyrhizobium sp. strain WSM471

Author: Ardley Julie
Bruce David
Chen I-Min
De Meyer Sofie
Detter Chris
Goodwin Lynne
Han Cliff
Han James
Howieson John
Huntemann Marcel
Ivanova Natalia
Kyrpides Nikos C
Lu Megan
Markowitz Victor
Mavromatis Konstantinos
Melino Vanessa
Ninawi Mohamed
O'Hara Graham
Pagani Ioanna
Pati Amrita
Reeve Wayne Gerald
Tapia Roxanne
Terpolilli Jason
Tian Rui
Tiwari Ravi
Wei Chia-Lin
Woyke Tanja
Yates Ronald
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Bradyrhizobium sp. strain WSM471 is an aerobic, motile, Gram-negative, non-spore-forming rod that was isolated from an effective nitrogen (N2) fixing root nodule formed on the annual legume Ornithopus pinnatus (Miller) Druce growing at Oyster Harbour, Albany district, Western Australia in 1982. This strain is in commercial production as an inoculant for Lupinus and Ornithopus. Here we describe the features of Bradyrhizobium sp. strain WSM471, together with genome sequence information and annotation. The 7,784,016 bp high-quality-draft genome is arranged in 1 scaffold of 2 contigs, contains 7,372 protein-coding genes and 58 RNA-only encoding genes, and is one of 20 rhizobial genomes sequenced as part of the DOE Joint Genome Institute 2010 Community Sequencing Program.

Ghent University Academic Bibliography