66 research outputs found
An Efficient Method of Estimating Downward Solar Radiation Based on the MODIS Observations for the Use of Land Surface Modeling
Solar radiation is a critical variable in global change sciences. While most current global datasets provide only the total downward solar radiation, we aim to estimate the downward global land surface solar radiation together with its partitioned direct and diffuse components, which are key meteorological inputs for most land surface models. We developed a simple satellite-based computing scheme that enables fast and reliable estimation of these variables. The global Moderate Resolution Imaging Spectroradiometer (MODIS) products at 1° spatial resolution for the period 2003–2011 were used as the forcing data. Evaluations at Baseline Surface Radiation Network (BSRN) sites show good agreement between the estimated radiation and ground-based observations. Across all 48 BSRN sites, the RMSEs between observations and estimates are 34.59, 41.98 and 28.06 W·m−2 for total, direct and diffuse solar radiation, respectively. Our estimates tend to slightly overestimate the total and diffuse but underestimate the direct solar radiation. The errors may be related to the simple model structure and to errors in the input data. Our estimates are also comparable to the Clouds and the Earth's Radiant Energy System (CERES) data, while showing notable improvement over the widely used National Centers for Environmental Prediction and National Center for Atmospheric Research (NCEP/NCAR) Reanalysis data. Using our MODIS-based datasets of total solar radiation and its partitioned components to drive land surface models should improve simulations of the global dynamics of water, carbon and climate.
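The per-component scores reported above are standard root-mean-square errors. As a minimal sketch (the input values below are illustrative, not the study's data):

```python
import numpy as np

def rmse(obs, est):
    """Root-mean-square error between observed and estimated values."""
    obs, est = np.asarray(obs, float), np.asarray(est, float)
    return np.sqrt(np.mean((est - obs) ** 2))

# toy radiation values in W/m^2, each estimate off by 10
print(rmse([100.0, 200.0, 300.0], [110.0, 190.0, 310.0]))  # 10.0
```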
Incomplete Wood-Ljungdahl pathway facilitates one-carbon metabolism in organohalide-respiring Dehalococcoides mccartyi.
The acetyl-CoA "Wood-Ljungdahl" pathway couples folate-mediated one-carbon (C1) metabolism to either CO2 reduction or acetate oxidation via acetyl-CoA. This pathway is distributed among diverse anaerobes and is used for both energy conservation and assimilation of C1 compounds. Genome annotations for all sequenced strains of Dehalococcoides mccartyi, an important bacterium involved in the bioremediation of chlorinated solvents, reveal homologous genes encoding an incomplete Wood-Ljungdahl pathway. Because this pathway lacks key enzymes for both C1 metabolism and CO2 reduction, its cellular functions remain elusive. Here we used D. mccartyi strain 195 as a model organism to investigate the metabolic function of this pathway and its impact on the growth of strain 195. Surprisingly, this pathway cleaves acetyl-CoA to donate a methyl group for production of methyl-tetrahydrofolate (CH3-THF) for methionine biosynthesis, representing an unconventional strategy for generating CH3-THF in organisms without methylene-tetrahydrofolate reductase. Carbon monoxide (CO) was found to accumulate as an obligate by-product of the acetyl-CoA cleavage because of the lack of a CO dehydrogenase in strain 195. CO accumulation inhibits the sustainable growth and dechlorination of strain 195 maintained in pure culture, but can be prevented by CO-metabolizing anaerobes that coexist with D. mccartyi, resulting in an unusual syntrophic association. We also found that this pathway incorporates exogenous formate to support serine biosynthesis. This study of the incomplete Wood-Ljungdahl pathway in D. mccartyi indicates a unique bacterial C1 metabolism that is critical for D. mccartyi growth and interactions in dechlorinating communities, and may play a role in other anaerobic communities.
PDT: Pretrained Dual Transformers for Time-aware Bipartite Graphs
Pre-training large models is increasingly prevalent across many machine learning application categories as user-generated content continues to grow. It is well recognized that learning contextual knowledge from datasets depicting user-content interaction plays a vital role in downstream tasks. Although several studies have attempted to learn contextual knowledge via pre-training methods, finding an optimal training objective and strategy for this type of task remains a challenging problem. In this work, we contend that there are two distinct aspects of contextual knowledge, namely the user side and the content side, for datasets where user-content interaction can be represented as a bipartite graph. To learn contextual knowledge, we propose a pre-training method that learns a bi-directional mapping between the spaces of the user side and the content side. We formulate the training goal as a contrastive learning task and propose a dual-Transformer architecture to encode the contextual knowledge. We evaluate the proposed method on the recommendation task. Empirical studies demonstrate that the proposed method outperforms all baselines with significant gains.
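The contrastive objective behind such bi-directional mappings is commonly an InfoNCE loss applied in both directions. A minimal numpy sketch, with toy embedding matrices standing in for the outputs of the paper's two Transformer encoders (the batch size, dimension, and temperature below are illustrative assumptions):

```python
import numpy as np

def info_nce(user_emb, content_emb, temperature=0.1):
    """Bidirectional contrastive (InfoNCE) loss between paired user-side
    and content-side embeddings. Row i of each matrix is assumed to be a
    positive pair; all other rows in the batch serve as negatives."""
    u = user_emb / np.linalg.norm(user_emb, axis=1, keepdims=True)
    c = content_emb / np.linalg.norm(content_emb, axis=1, keepdims=True)
    logits = u @ c.T / temperature            # pairwise cosine similarities
    labels = np.arange(len(u))

    def xent(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_p = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_p[labels, labels].mean()  # cross-entropy on the diagonal

    # average over the user->content and content->user directions
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))
# perfectly aligned sides give a much lower loss than unrelated sides
aligned = info_nce(emb, emb)
random_ = info_nce(emb, rng.normal(size=(8, 16)))
print(aligned < random_)  # True
```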
Simultaneous evolutionary expansion and constraint of genomic heterogeneity in multifocal lung cancer.
Recent genomic analyses have revealed substantial tumor heterogeneity across various cancers. However, it remains unclear whether and how genomic heterogeneity is constrained during tumor evolution. Here, we sequence a unique cohort of multiple synchronous lung cancers (MSLCs) to determine the relative diversity and uniformity of genetic drivers against an identical germline and environmental background. We find that each multicentric primary tumor harbors distinct oncogenic alterations, including novel mutations that are experimentally demonstrated to be functional and therapeutically targetable. However, functional studies show a strikingly constrained tumorigenic pathway underlying heterogeneous genetic variants. These results suggest that although the mutation-specific routes that cells take during oncogenesis are stochastic, genetic trajectories may be constrained by selection for functional convergence on key signaling pathways. Our findings highlight the robust evolutionary pressures that simultaneously shape the expansion and constraint of genomic diversity, a principle that holds important implications for understanding tumor evolution and optimizing therapeutic strategies. Across cancer types, tumor heterogeneity has been observed, but how this relates to tumor evolution is unclear. Here, the authors sequence multiple synchronous lung cancers, highlighting the evolutionary pressures that simultaneously shape the expansion and constraint of genomic heterogeneity.
FATA-Trans: Field And Time-Aware Transformer for Sequential Tabular Data
Sequential tabular data is one of the most commonly used data types in real-world applications. Unlike conventional tabular data, where the rows in a table are independent, sequential tabular data contains rich contextual and sequential information, where some fields change dynamically over time and others are static. Existing transformer-based approaches to sequential tabular data overlook the differences between dynamic and static fields by replicating static fields into every transformer input, and they ignore the temporal information between rows. This leads to three major disadvantages: (1) computational overhead; (2) artificially simplified data for the masked language modeling pre-training task, which may yield less meaningful representations; and (3) disregard of the temporal behavioral patterns implied by time intervals. In this work, we propose FATA-Trans, a model with two field transformers for modeling sequential tabular data, in which static and dynamic field information are processed separately. FATA-Trans is field- and time-aware for sequential tabular data. The field-type embedding enables FATA-Trans to capture the differences between static and dynamic fields, while the time-aware position embedding exploits both the order and the time-interval information between rows, helping the model detect underlying temporal behavior in a sequence. Our experiments on three benchmark datasets demonstrate that the learned representations from FATA-Trans consistently outperform state-of-the-art solutions on the downstream tasks. We also present visualization studies to highlight the insights captured by the learned representations, enhancing our understanding of the underlying data. Our code is available at https://github.com/zdy93/FATA-Trans.
Comment: This work is accepted by the ACM International Conference on Information and Knowledge Management (CIKM) 202
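The time-aware position embedding can be pictured as combining an order-index lookup with a lookup keyed by the (bucketized) gap to the previous row, so two rows at the same position but with different time gaps receive different encodings. A toy numpy sketch, assuming log-scale gap buckets and random lookup tables (none of these choices are FATA-Trans's actual parameters):

```python
import numpy as np

def time_aware_positions(timestamps, d_model=8, n_buckets=16):
    """Toy time-aware position encoding: the encoding of row t combines
    (a) its order index and (b) the log-bucketized gap to the previous
    row. Bucket scheme and lookup tables are illustrative only."""
    rng = np.random.default_rng(0)
    order_table = rng.normal(size=(len(timestamps), d_model))
    gap_table = rng.normal(size=(n_buckets, d_model))
    gaps = np.diff(timestamps, prepend=timestamps[0])   # seconds since previous row
    buckets = np.minimum(np.log1p(gaps).astype(int), n_buckets - 1)
    return order_table[np.arange(len(timestamps))] + gap_table[buckets]

# two sequences with identical ordering but different time gaps
dense  = time_aware_positions(np.array([0, 60, 120, 180]))
sparse = time_aware_positions(np.array([0, 60, 86400, 86460]))
print(np.allclose(dense, sparse))  # False: row 2's day-long gap changes its encoding
```

A plain order-only position embedding would make the two sequences indistinguishable; here only the row whose gap differs changes.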
Multitask Learning for Time Series Data with 2D Convolution
Multitask learning (MTL) aims to develop a unified model that can handle a set of closely related tasks simultaneously. By optimizing the model across multiple tasks, MTL generally surpasses its non-MTL counterparts in generalizability. Although MTL has been extensively researched in domains such as computer vision, natural language processing, and recommendation systems, its application to time series data has received limited attention. In this paper, we investigate the application of MTL to the time series classification (TSC) problem. However, when we integrate the state-of-the-art 1D convolution-based TSC model with MTL, the performance of the TSC model actually deteriorates. Comparing the 1D convolution-based models against the Dynamic Time Warping (DTW) distance function suggests that the underwhelming results stem from the limited expressive power of the 1D convolutional layers. To overcome this challenge, we propose a novel design for a 2D convolution-based model that enhances the model's expressiveness. Leveraging this advantage, our proposed method outperforms competing approaches on both the UCR Archive and an industrial transaction TSC dataset.
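The standard MTL layout implied here is hard parameter sharing: one shared trunk feeding a lightweight head per task. A minimal numpy forward-pass sketch, with a single linear layer standing in for the convolutional feature extractor (shapes and the two task names are illustrative, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hard parameter sharing: one shared trunk, one head per task.
series_len, hidden, n_classes = 32, 16, 3
trunk = rng.normal(size=(series_len, hidden)) * 0.1
heads = {task: rng.normal(size=(hidden, n_classes)) * 0.1
         for task in ("task_a", "task_b")}

def forward(x, task):
    features = np.maximum(x @ trunk, 0.0)   # shared representation (ReLU)
    return features @ heads[task]            # task-specific logits

x = rng.normal(size=(4, series_len))         # mini-batch of 4 series
print(forward(x, "task_a").shape)            # (4, 3)
```

Because the trunk is optimized against every task's loss, its features must serve all tasks at once, which is where the expressiveness of the shared layers becomes the bottleneck the abstract describes.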
Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach
The emergence of pre-trained models has significantly impacted fields ranging from Natural Language Processing (NLP) and Computer Vision to relational datasets. Traditionally, these models are assessed through fine-tuned downstream tasks. However, this raises the question of how to evaluate them more efficiently and more effectively. In this study, we explore a novel approach in which we leverage the meta-features associated with each entity as a source of worldly knowledge and employ entity representations from the models. We propose using the consistency between these representations and the meta-features as a metric for evaluating pre-trained models. Our method's effectiveness is demonstrated across various domains, including models for relational datasets, large language models, and image models.
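One simple proxy for such representation/meta-feature consistency is to check whether entities sharing a meta-feature sit closer together in embedding space than unrelated entities. The sketch below is this simplified proxy, not the paper's multi-head posterior metric; the data is synthetic:

```python
import numpy as np

def consistency_score(embeddings, labels):
    """Mean within-group cosine similarity minus mean cross-group
    similarity, where groups are defined by a meta-feature label.
    Higher means the representations agree more with the meta-feature."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = e @ e.T
    same = labels[:, None] == labels[None, :]
    np.fill_diagonal(same, False)             # exclude self-similarity
    off = ~np.eye(len(labels), dtype=bool)
    return sims[same].mean() - sims[off & ~same].mean()

rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 10)
# embeddings clustered by meta-feature score higher than shuffled ones
clustered = rng.normal(size=(20, 8)) + labels[:, None] * 5.0
score_clustered = consistency_score(clustered, labels)
score_shuffled = consistency_score(rng.permutation(clustered), labels)
print(score_clustered > score_shuffled)  # True
```

A model whose entity embeddings track the meta-features yields a high score without any fine-tuning run, which is the efficiency argument the abstract makes.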
Toward a Foundation Model for Time Series Data
A foundation model is a machine learning model trained on a large and diverse set of data, typically using self-supervised learning-based pre-training techniques, that can be adapted to various downstream tasks. However, current research on time series pre-training has predominantly focused on models trained exclusively on data from a single domain. As a result, these models possess domain-specific knowledge that may not be easily transferable to time series from other domains. In this paper, we aim to develop an effective time series foundation model by leveraging unlabeled samples from multiple domains. To achieve this, we repurposed the publicly available UCR Archive and evaluated four existing self-supervised learning-based pre-training methods, along with a novel method, on its datasets. We tested these methods using four popular neural network architectures for time series to understand how the pre-training methods interact with different network designs. Our experimental results show that pre-training improves downstream classification tasks by enhancing the convergence of the fine-tuning process. Furthermore, we found that the proposed pre-training method, when combined with the Transformer model, outperforms the alternatives.
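A common self-supervised pre-training objective for unlabeled time series is masked reconstruction: hide a fraction of timesteps and score the model only on the hidden positions. A minimal numpy sketch of the objective (the mask ratio and the identity "model" are placeholders, not any of the evaluated methods):

```python
import numpy as np

def masked_reconstruction_loss(batch, model, mask_ratio=0.15, seed=0):
    """Mask a fraction of timesteps, reconstruct them with `model`,
    and compute MSE only at the masked positions. `model` is any
    callable mapping the corrupted batch back to the original shape."""
    rng = np.random.default_rng(seed)
    mask = rng.random(batch.shape) < mask_ratio
    corrupted = np.where(mask, 0.0, batch)    # zero out masked timesteps
    recon = model(corrupted)
    return np.mean((recon[mask] - batch[mask]) ** 2)

rng = np.random.default_rng(1)
batch = rng.normal(size=(8, 64))              # 8 unlabeled series, length 64
identity = lambda x: x                        # trivial "model": returns its input
loss = masked_reconstruction_loss(batch, identity)
# the identity model predicts 0 at masked spots, so the loss is roughly
# the variance of the masked values (about 1 for unit-normal data)
print(loss)
```

Because no labels are required, the same objective can be applied across every domain in the UCR Archive, which is what enables the multi-domain pre-training the abstract describes.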