12 research outputs found

Network-Adjusted Covariates for Community Detection

Author: Hu Yaofang
Wang Wanjie
Publication date: 27/06/2023

Community detection is a crucial task in network analysis, and it can be significantly improved by incorporating subject-level information, i.e. covariates. However, current methods often struggle with selecting tuning parameters and analyzing low-degree nodes. In this paper, we introduce a novel method that addresses these challenges by constructing network-adjusted covariates, which combine the network connections and the covariates using a weight for each node that depends on the node's degree. Spectral clustering on the network-adjusted covariates is tuning-free and computationally efficient, and it yields exact recovery of the community labels under certain conditions. We present novel theoretical results on the strong consistency of our method under degree-corrected stochastic blockmodels with covariates, even in the presence of mis-specification and sparse communities with bounded degrees. Additionally, we establish a general lower bound for the community detection problem when both network and covariates are present, which shows that our method is optimal up to a constant factor. Our method outperforms existing approaches in simulations and on a LastFM app user network, and provides interpretable community structures in a statistics publication citation network where 30% of nodes are isolated.

Comment: 48 pages
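As a rough illustration of the construction described in this abstract, the sketch below builds a matrix of the form Y = A X + diag(alpha) X, where A is the adjacency matrix and X holds the covariates. The abstract only says that each node's weight depends on its degree; the specific formula for `alpha`, the function name `network_adjusted_covariates`, and the toy data are assumptions made here for illustration and should not be read as the authors' exact definition. The spectral clustering step is omitted.

```python
import math


def network_adjusted_covariates(adj, X):
    """Return (Y, alpha) with Y = A X + diag(alpha) X.

    adj: symmetric 0/1 adjacency matrix as a list of lists (n >= 2 nodes).
    X:   n rows of covariates, each a list of p numbers.
    """
    n = len(adj)
    degrees = [sum(row) for row in adj]
    mean_degree = sum(degrees) / n
    log_n = math.log(n)
    # ASSUMPTION: an illustrative degree-dependent weight that shrinks as the
    # degree grows, so low-degree nodes rely more on their own covariates.
    alpha = [(mean_degree / 2) / (d / log_n + 1) for d in degrees]

    p = len(X[0])
    Y = []
    for i in range(n):
        # Start from the node's own covariates, scaled by its weight.
        row = [alpha[i] * X[i][k] for k in range(p)]
        # Add the covariates of every neighbor.
        for j in range(n):
            if adj[i][j]:
                for k in range(p):
                    row[k] += X[j][k]
        Y.append(row)
    return Y, alpha


# Toy network: a path 0-1-2 plus an isolated node 3.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
X = [[1, 0], [1, 0], [0, 1], [0, 1]]
Y, alpha = network_adjusted_covariates(adj, X)
```

In this toy example the isolated node keeps a nonzero row in Y that comes entirely from its own covariates, which is the behavior the abstract points to for low-degree and isolated nodes.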

arXiv.org e-Print Archive

Covariate-Assisted Community Detection on Sparse Networks

Author: Hu Yaofang
Wang Wanjie
Publication date: 31/10/2022

Community detection is an important problem when processing network data. Traditionally, this is done by exploiting the connections between nodes, but in many real datasets the connections are too sparse to detect communities. Node covariates can be used to assist community detection; see Binkiewicz et al. (2017); Weng and Feng (2022); Yan and Sarkar (2021); Yang et al. (2013). However, combining covariates with network connections is challenging, because covariates may be high-dimensional and inconsistent with community labels. To study the relationship between covariates and communities, we propose the degree-corrected stochastic block model with node covariates (DCSBM-NC). It allows degree heterogeneity among communities and inconsistent labels between communities and covariates. Based on DCSBM-NC, we design the adjusted neighbor-covariate (ANC) data matrix, which leverages covariate information to assist community detection. We then propose the covariate-assisted spectral clustering on ratios of singular vectors (CA-SCORE) method on the ANC matrix. We prove that CA-SCORE successfully recovers community labels when 1) the network is relatively dense; 2) the covariate class labels match the community labels; 3) the data is a mixture of 1) and 2). CA-SCORE performs well on synthetic and real datasets. The algorithm is implemented in the R (R Core Team, 2021) package CASCORE.

arXiv.org e-Print Archive

Graph matching beyond perfectly-overlapping Erdős–Rényi random graphs

Author: Hu Yaofang
Wang Wanjie
Yu Yi
Publication venue: Springer Science and Business Media LLC
Publication date: 11/02/2022

Graph matching is a fruitful area in terms of both algorithms and theory. Given two graphs G1=(V1,E1) and G2=(V2,E2), where V1 and V2 are the same or largely overlap up to an unknown permutation π∗, graph matching seeks the correct mapping π∗. In this paper, we exploit degree information, which was previously used only for matching noiseless graphs and perfectly-overlapping Erdős–Rényi random graphs. We are concerned with graph matching of partially-overlapping graphs and stochastic block models, which are more useful in tackling real-life problems. We propose the edge exploited degree profile graph matching method and two refined variations. We conduct a thorough analysis of our proposed methods' performance in a range of challenging scenarios, including a coauthorship data set and a zebrafish neuron activity data set. Our methods are shown to be numerically superior to state-of-the-art methods. The algorithms are implemented in the R (A language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, 2020) package GMPro (GMPro: graph matching with degree profiles, 2020).
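To make the idea of matching by degree information concrete, here is a minimal sketch of a basic degree-profile matcher for the simplest setting only: two noiseless graphs on the same number of vertices. It is not the edge exploited method proposed in the paper and does not reproduce the GMPro package interface. The function names, the distance (degree gap plus a 1-Wasserstein distance between neighbor-degree distributions), and the greedy assignment are all assumptions chosen for illustration.

```python
from itertools import product


def degree_profiles(n, edges):
    """Return (degree, sorted neighbor degrees) for each vertex 0..n-1."""
    nbrs = [set() for _ in range(n)]
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    deg = [len(s) for s in nbrs]
    return [(deg[v], sorted(deg[u] for u in nbrs[v])) for v in range(n)]


def w1(a, b):
    """1-Wasserstein distance between the empirical distributions of a and b."""
    if not a or not b:
        # Two isolated vertices look identical; otherwise treat as unmatched.
        return 0.0 if not a and not b else float("inf")
    points = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(points, points[1:]):
        cdf_a = sum(x <= left for x in a) / len(a)
        cdf_b = sum(x <= left for x in b) / len(b)
        total += abs(cdf_a - cdf_b) * (right - left)
    return total


def match_by_degree_profile(n, edges1, edges2):
    """Greedily pair vertices of two n-vertex graphs with similar profiles."""
    p1 = degree_profiles(n, edges1)
    p2 = degree_profiles(n, edges2)
    # ASSUMPTION: illustrative distance = degree gap + W1 of neighbor degrees.
    scored = sorted(
        (abs(p1[i][0] - p2[j][0]) + w1(p1[i][1], p2[j][1]), i, j)
        for i, j in product(range(n), repeat=2)
    )
    mapping, used = {}, set()
    for _, i, j in scored:
        if i not in mapping and j not in used:
            mapping[i] = j
            used.add(j)
    return mapping


# Toy example: a 6-vertex path with one chord, and a relabeled copy of it.
edges1 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (1, 3)]
pi = [3, 0, 5, 1, 4, 2]  # hypothetical true permutation: vertex i -> pi[i]
edges2 = [(pi[u], pi[v]) for u, v in edges1]
mapping = match_by_degree_profile(6, edges1, edges2)
```

The toy graph is chosen so that every vertex has a distinct profile, which is why the greedy step recovers the permutation. With noise, partial overlap, or symmetric vertices, a one-hop profile like this is no longer enough, which is the regime the abstract says the paper targets.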

Warwick Research Archives Portal Repository

Development of a Personalized Pharmacologic Treatment Repository for Bronchial Asthma Based on the 2018 Guideline for the Diagnosis and Management of Bronchial Asthma in Primary Care (Practice Edition)

Author: HUO Hongmin, YANG Yaofang, DONG Jiatian, MAO Zixian, HU Ping, SONG Fengchun
Publication venue: Chinese General Practice Publishing House Co., Ltd
Publication date: 01/04/2022

Bronchial asthma is a chronic inflammatory disease with high heterogeneity, polygenic inheritance, complex etiology and many complications. The effectiveness of prevention and treatment for a patient with bronchial asthma often depends on whether the patient has received personalized health management. To bring practice in line with international standards of bronchial asthma management, the updates in the 2018 Guideline for the Diagnosis and Management of Bronchial Asthma in Primary Care (Practice Edition, hereinafter referred to as the 2018 Guideline) incorporate early intervention, optimized medication regimens, and an emphasis on a standardized management approach. To promote personalized pharmacologic management of bronchial asthma in primary care, to provide online pre-, mid- and post-diagnosis pharmaceutical services for physicians, and to provide personalized pharmacologic monitoring and management services for bronchial asthma patients in the community, pharmacists have developed a search engine with integrated functions of "pre-judgment, early warning and prediction". The search engine uses information technology to collect medication information related to bronchial asthma, following the pharmacologic treatment path "initial treatment, long-term treatment, degradation principle" put forward in the 2018 Guideline and taking the "one factory, one drug, one specification" individualized instruction as its basis.

Directory of Open Access Journals

Characterization and fine mapping of a new early leaf senescence mutant es3(t) in rice

Author: A Herrera-Vásquez
A Robert-Seilaniantz
AK Biswas
BB Jiao
Bin Zhang
CH Foyer
DI Arnon
E Gentinetta
E Graaff van der
G Miller
GH Zhu
GK Agrawal
H Thomas
H Zhang
HB Wu
J Li
J Song
JC Yang
JJ Tan
JT Bai
KM Gothandam
KT Hung
LF Zhu
LM Weaver
Longbiao Guo
LW Wu
P Saleethong
PO Lim
Qian Qian
QN Huang
QY Zhou
R Khanna-Chopra
RH Lee
RR Finkelstein
S Balazadeh
S Gepstein
S Ray
S Temnykh
Shikai Hu
SJ Wi
SR McCouch
T Hirayama
V Buchanan-Wollaston
V Pandey
Weijun Ye
WY Yan
X Wang
X Yang
Y Guo
Yan Su
Yaofang Niu
YC Rao
Z Li
ZH Wang
ZS Kong
ZW Li
Publication venue: Springer Science and Business Media LLC

Crossref

Bayesian Variational Inference in Keyword Identification and Multiple Instance Classification

Author: Hu Yaofang
Publication venue: SMU Scholar
Publication date: 06/08/2024

This dissertation investigates (1) Variational Bayesian Semi-supervised Keyword Extraction and (2) Variational Bayesian Multimodal Multiple Instance Classification. The expansion of textual data, stemming from various sources such as online product reviews and scholarly publications on scientific discoveries, has created a demand for the extraction of succinct yet comprehensive information. As a result, in recent years, efforts have been spent in developing novel methodologies for keyword extraction. Although many methods have been proposed to automatically extract keywords in the contexts of both unsupervised and fully supervised learning, how to effectively use partially observed keywords, such as author-specified keywords, remains an under-explored area. In Chapter 1, we propose a novel variational Bayesian semi-supervised (VBSS) keyword extraction approach, built on a recent Bayesian semi-supervised (BSS) technique that uses the information from a small set of known keywords to identify previously undetected ones. Our proposed VBSS method greatly enhances the computational efficiency of BSS via mean-field variational inference, coupled with data augmentation, which brings closed-form solutions at each step of the optimization process. Further, our numerical results show that VBSS offers enhanced accuracy for long texts and improved control over false discovery rates when compared with a list of state-of-the-art keyword extraction methods. In Chapter 2, we apply mean-field variational inference on multiple instance learning (MIL). In MIL, objects are represented by bags of instances. Each instance shares the same feature set but has unique feature values. MIL aims to train models that predict bag-level outcomes based on these instances, making it a weakly supervised approach due to the lack of instance-level labels. 
While MIL methods focusing on binary classification are abundant, they often cannot identify which specific instances drive bag labels and offer little interpretability. Xiong et al. (2024) introduced MICProB, a Bayesian multiple instance classification (MIC) algorithm that addresses these issues. However, MICProB is computationally intensive and best suited for unimodal instances. To overcome these limitations, we propose a novel variational Bayesian multimodal MIC (vMMIC) algorithm. vMMIC handles diverse instance types and significantly improves computational efficiency through Bayesian variational inference, combined with data augmentation. We benchmark vMMIC against MICProB and many other MIC approaches on both simulated and real-world data. Results demonstrate vMMIC's superior performance, computational efficiency, and interpretability.

SMU Digital Repository

TurboID screening of ApxI toxin interactants identifies host proteins involved in Actinobacillus pleuropneumoniae-induced apoptosis of immortalized porcine alveolar macrophages

Author: Changsheng Jiang
Hua Cao
Jingping Ren
Mengjia Zhang
Qigai He
Wei Zeng
Wentao Li
Yaofang Hu
Yongtao Li
Yueqiao Zhao
Publication venue: BMC
Publication date: 01/07/2023

Actinobacillus pleuropneumoniae (APP) is a gram-negative pathogenic bacterium responsible for porcine contagious pleuropneumonia (PCP), which can cause porcine necrotizing and hemorrhagic pleuropneumonia. Actinobacillus pleuropneumoniae-RTX-toxin (Apx) is an APP virulence factor. APP secretes a total of four Apx toxins, among which ApxI demonstrates strong hemolytic activity and cytotoxicity, causing lysis of porcine erythrocytes and apoptosis of porcine alveolar macrophages. However, the protein interaction network between this toxin and host cells is still poorly understood. TurboID mediates the biotinylation of endogenous proteins, thereby targeting specific proteins and local proteomes through gene fusion. We applied the TurboID enzyme-catalyzed proximity tagging method to identify and study host proteins in immortalized porcine alveolar macrophage (iPAM) cells that interact with the exotoxin ApxI of APP. His-tagged TurboID-ApxIA and TurboID recombinant proteins were expressed and purified. By mass spectrometry, 318 unique interacting proteins were identified in the TurboID ApxIA-treated group. Among them, only one membrane protein, caveolin-1 (CAV1), was identified. A co-immunoprecipitation assay confirmed that CAV1 can interact with ApxIA. In addition, overexpression and RNA interference experiments revealed that CAV1 was involved in ApxI toxin-induced apoptosis of iPAM cells. This study provided first-hand information about the proteome of iPAM cells interacting with the ApxI toxin of APP through the TurboID proximity labeling system, and identified a new host membrane protein involved in this interaction. These results lay a theoretical foundation for the clinical treatment of PCP.

Directory of Open Access Journals

A Novel Splicing Mutation Leading to Wiskott-Aldrich Syndrome from a Family

Author: Gang Wang
Jie Zhang
Juan Ren
Lidong Zhao
Lingyu Wang
Linhua Yang
Linna Lu
Shuai Fang
Wukang Shen
Xiaomei Lu
Xucheng Hu
Yaofang Zhang
Publication venue: Hindawi Limited
Publication date: 01/01/2024

Wiskott-Aldrich syndrome (WAS) is a rare X-linked recessive genetic disease characterized by clinical symptoms such as eczema, thrombocytopenia with small platelets, immune deficiency, and a propensity for autoimmune diseases and malignant tumors. This disease is caused by mutations of the WAS gene encoding WAS protein (WASP). The locus and type of mutations of the WAS gene and the expression level of WASP are strongly correlated with the clinical manifestations of patients. We found a novel mutation in the WAS gene (c.931+5G>C), which affected splicing to produce three abnormal mRNAs, resulting in an abnormally truncated WASP. This mutation led to a reduction but not the elimination of the normal WASP population, resulting in X-linked thrombocytopenia (XLT) with mild clinical manifestations. Our findings revealed the pathogenic mechanism of this mutation.

Directory of Open Access Journals