
14 research outputs found

Multi-Source Multi-View Clustering via Discrepancy Penalty

Author: He Lifang
Shao Weixiang
Yu Philip S.
Zhang Jiawei
Publication venue
Publication date: 18/04/2016

With the advance of technology, entities can be observed in multiple views. Multiple views containing different types of features can be used for clustering. Although multi-view clustering has been successfully applied in many applications, the previous methods usually assume the complete instance mapping between different views. In many real-world applications, information can be gathered from multiple sources, while each source can contain multiple views, which are more cohesive for learning. The views under the same source are usually fully mapped, but they can be very heterogeneous. Moreover, the mappings between different sources are usually incomplete and partially observed, which makes it more difficult to integrate all the views across different sources. In this paper, we propose MMC (Multi-source Multi-view Clustering), which is a framework based on collective spectral clustering with a discrepancy penalty across sources, to tackle these challenges. MMC has several advantages compared with other existing methods. First, MMC can deal with incomplete mapping between sources. Second, it considers the disagreements between sources while treating views in the same source as a cohesive set. Third, MMC also tries to infer the instance similarities across sources to enhance the clustering performance. Extensive experiments conducted on real-world data demonstrate the effectiveness of the proposed approach.

arXiv.org e-Print Archive

Crossref
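
Illustrative note on the abstract above: a minimal sketch of what a discrepancy penalty across two partially mapped sources could look like. This is one plausible form, not the formulation from the paper. The function names, the inner-product similarity, and the squared-difference penalty are all assumptions made for illustration.

```python
def similarity(embedding):
    """Inner-product similarity between every pair of instance embeddings."""
    return [[sum(a * b for a, b in zip(u, v)) for v in embedding]
            for u in embedding]


def discrepancy_penalty(emb_a, emb_b, mapping):
    """Hypothetical penalty: squared disagreement between the two sources'
    similarity matrices, summed over mapped instance pairs only.

    mapping is a list of (i, j) pairs meaning instance i of source A is the
    same entity as instance j of source B. Unmapped instances are ignored,
    which is how an incomplete mapping is tolerated in this sketch.
    """
    sim_a = similarity(emb_a)
    sim_b = similarity(emb_b)
    penalty = 0.0
    for i1, j1 in mapping:
        for i2, j2 in mapping:
            diff = sim_a[i1][i2] - sim_b[j1][j2]
            penalty += diff * diff
    return penalty


# Source A has three instances, source B has two; only two are mapped.
emb_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
agreeing_b = [[1.0, 0.0], [0.0, 1.0]]
disagreeing_b = [[1.0, 0.0], [1.0, 0.0]]
mapping = [(0, 0), (1, 1)]

print(discrepancy_penalty(emb_a, agreeing_b, mapping))
print(discrepancy_penalty(emb_a, disagreeing_b, mapping))
```

In this sketch the penalty is zero when the mapped instances have the same similarity structure in both sources and grows as the sources disagree.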

Online Unsupervised Multi-view Feature Selection

Author: He Lifang
Lu Chun-Ta
Shao Weixiang
Wei Xiaokai
Yu Philip S.
Publication venue
Publication date: 27/09/2016

In the era of big data, it is becoming common to have data with multiple modalities or coming from multiple sources, known as "multi-view data". Because multi-view data are usually unlabeled and come from high-dimensional spaces (such as language vocabularies), unsupervised multi-view feature selection is crucial to many applications. However, it is nontrivial due to the following challenges. First, there may be too many instances, or the feature dimensionality may be too large, so the data may not fit in memory. How can useful features be selected with limited memory space? Second, how can features be selected from streaming data while handling concept drift? Third, how can the consistent and complementary information from different views be leveraged to improve feature selection when the data are too big or arrive as streams? To the best of our knowledge, none of the previous works can solve all the challenges simultaneously. In this paper, we propose an Online unsupervised Multi-View Feature Selection method, OMVFS, which deals with large-scale/streaming multi-view data in an online fashion. OMVFS embeds unsupervised feature selection into a clustering algorithm via NMF with sparse learning. It further incorporates graph regularization to preserve the local structure information and help select discriminative features. Instead of storing all the historical data, OMVFS processes the multi-view data chunk by chunk and aggregates all the necessary information into several small matrices. By using the buffering technique, the proposed OMVFS can reduce the computational and storage cost while taking advantage of the structure information. Furthermore, OMVFS can capture the concept drifts in the data streams. Extensive experiments on four real-world datasets show the effectiveness and efficiency of the proposed OMVFS method. More importantly, OMVFS is about 100 times faster than the off-line methods.

arXiv.org e-Print Archive

Crossref
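
Illustrative note on the abstract above: the chunk-by-chunk idea can be sketched as keeping small running summaries instead of the full history. The abstract does not say which matrices OMVFS maintains, so the Gram matrix and column sums below are stand-ins chosen for illustration, and all function names are hypothetical.

```python
def new_stats(num_features):
    """Empty summaries whose size depends only on the feature count."""
    return {
        "n": 0,
        "col_sum": [0.0] * num_features,
        "gram": [[0.0] * num_features for _ in range(num_features)],
    }


def update(stats, chunk):
    """Fold one chunk of rows into the summaries, then the chunk can be dropped."""
    for row in chunk:
        stats["n"] += 1
        for p, x_p in enumerate(row):
            stats["col_sum"][p] += x_p
            for q, x_q in enumerate(row):
                stats["gram"][p][q] += x_p * x_q  # accumulates X^T X
    return stats


def stream_stats(chunks, num_features):
    """Process a stream of chunks without storing earlier chunks."""
    stats = new_stats(num_features)
    for chunk in chunks:
        update(stats, chunk)
    return stats


streamed = stream_stats([[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0]]], 2)
print(streamed["n"], streamed["col_sum"], streamed["gram"])
```

The summaries have a fixed size of d and d-by-d entries for d features, so memory use does not grow with the number of instances seen.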

Aggregator: a machine learning approach to identifying MEDLINE articles that derive from the same underlying clinical trial

Author: Aaron M. Cohen
Clive E. Adams
John M. Davis
Marian S. McDonagh
Neil R. Smalheiser
Philip S. Yu
Sujata Thakurta
Weixiang Shao
Publication venue: Elsevier BV
Publication date: 01/03/2015

OBJECTIVE: It is important to identify separate publications that report outcomes from the same underlying clinical trial, in order to avoid over-counting these as independent pieces of evidence. METHODS: We created positive and negative training sets (comprised of pairs of articles reporting on the same condition and intervention) that were, or were not, linked to the same clinicaltrials.gov trial registry number. Features were extracted from MEDLINE and PubMed metadata; pairwise similarity scores were modeled using logistic regression. RESULTS: Article pairs from the same trial were identified with high accuracy (F1 score = 0.843). We also created a clustering tool, Aggregator, that takes as input a PubMed user query for RCTs on a given topic, and returns article clusters predicted to arise from the same clinical trial. DISCUSSION: Although painstaking examination of full-text may be needed to be conclusive, metadata are surprisingly accurate in predicting when two articles derive from the same underlying clinical trial.

Nottingham ePrints

Nottingham eTheses

Crossref

Repository@Nottingham

PubMed Central
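
Illustrative note on the abstract above: a minimal sketch of scoring an article pair with a logistic model over similarity features, and of computing the F1 score used to report accuracy. The feature values, weights, and bias are invented for illustration and are not taken from the paper; the function names are hypothetical.

```python
import math


def pair_probability(similarities, weights, bias):
    """Logistic model: probability that two articles report the same trial."""
    z = bias + sum(w * s for w, s in zip(weights, similarities))
    return 1.0 / (1.0 + math.exp(-z))


def f1_score(true_labels, predicted_labels):
    """Harmonic mean of precision and recall for binary labels."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical similarity features for one article pair, for example
# author-name overlap and title similarity, each scaled to [0, 1].
probability = pair_probability([0.9, 0.8], weights=[2.0, 1.5], bias=-2.0)
print(probability > 0.5)

print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

A pair is predicted to share a trial when its probability exceeds a chosen threshold. The threshold of 0.5 here is also an assumption.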

Unsupervised Learning from Multi-view Data

Author: Weixiang Shao
Publication venue
Publication date: 18/10/2016

With the advance of technology, data often have multiple modalities or come from multiple sources. Such data are called multi-view data. Usually, multiple views provide complementary information for the semantically same data. Learning from multi-view data can obtain better performance than relying on just one single view. Also, as data volumes explode, most multi-view data are unlabeled, and labeling them is expensive. Thus, unsupervised learning from multi-view data is very important in many real-world applications. However, in real-world applications, multi-view data are usually heterogeneous (different feature spaces for different views), incomplete, large-scale and high-dimensional. These challenges prevent us from applying existing unsupervised learning methods to real-world multi-view data. This dissertation presents my Ph.D. research work on unsupervised learning from multi-view data. First, we present the first algorithm to solve the multiple incomplete views clustering problem by collectively learning the kernel matrices for different views. Second, we propose a more general tensor-based multi-incomplete-view clustering method, which uses a tensor to model the multiple incomplete views and learns the latent features by sparse tensor factorization. Third, we present a faster multi-incomplete-view clustering algorithm based on weighted nonnegative matrix factorization. Lastly, we propose an online multi-view unsupervised feature selection algorithm to solve the scalability and high-dimensionality challenges.

University of Illinois at Chicago: UIC INDIGO (INtellectual property in DIGital form available online in an Open environment)

Nuggets: findings shared in multiple clinical case reports

Author: Neil R. Smalheiser
Philip S. Yu
Weixiang Shao
Publication venue: Medical Library Association
Publication date: 01/10/2015

OBJECTIVE: The researchers assessed prevalence in the clinical case report literature of multiple reports independently reporting the same (or nearly the same) main finding. METHODS: Results from forty-five PubMed queries were examined for incidence and features of main findings (“nuggets”) shared in at least four case reports. RESULTS: The authors found that nuggets are surprisingly prevalent and large in the case report literature; the largest found so far was reported in seventeen articles. In most cases, the main findings of case reports were evident from examining titles alone. CONCLUSIONS: Our curated examples should serve as gold standards for developing specific automated methods for finding nuggets. Nuggets potentially enable finding-based (instead of topic-based) information retrieval.

Crossref

PubMed Central

University of Illinois at Chicago: UIC INDIGO (INtellectual property in DIGital form available online in an Open environment)

Improving Soil Enzyme Activities and Related Quality Properties of Reclaimed Soil by Applying Weathered Coal in Opencast-Mining Areas of the Chinese Loess Plateau

Author: Bai Zhongke
Bi Rutian
Li Hua
Li Weixiang
Shao Hongbo
Publication venue: Wiley
Publication date: 01/03/2012

Reclaimed soil in opencast-mining areas of the Loess Plateau of China suffers from many problems, such as poor soil structure and extreme poverty in soil nutrients. To find a better way to improve soil quality, the current study applied weathered coal to repair soil media and investigated the physicochemical properties of the reclaimed soil and the changes in enzyme activities after planting Robinia pseudoacacia. The results showed that the application of weathered coal significantly improved the quality of soil aggregates, increased the content of water-stable aggregates, and significantly improved the organic matter, humus, and cation exchange capacity of topsoil, but it did not have a significant effect on soil pH. Planting R. pseudoacacia significantly enhanced the activities of soil catalase, urease, and invertase, but the application of weathered coal inhibited the activity of catalase. Although the application of an appropriate amount of weathered coal was able to significantly increase urease activity, the activities of catalase, urease, and invertase were closely linked to soil profile level and time. This study suggests that applying weathered coal could improve the physicochemical properties and soil enzyme activities of the reclaimed soil in opencast-mining areas of the Loess Plateau of China, and that the optimum amount of weathered coal for reclaimed soil remediation is about 27 000 kg hm⁻².

Institutional Repository of Yantai Institute of Coastal Zone Research, CAS

Strongly secure certificateless key-insulated signature secure in the standard model

Author: Hu Xiong
Weixiang Xu
Yanan Chen
Publication venue: Springer Science and Business Media LLC
Publication date

Crossref


lncRNA BREA2 promotes metastasis by disrupting the WWP2-mediated ubiquitination of Notch1.

Author: Bian Weixiang
Chen Shi-Yi
Fu Peifen
Li Jun
Li Xu
Lin Aifu
Liu Fangzhou
Lu Yun-Xin
Qu Lei
Sang Lingjie
Shao Jianzhong
Shi Chengyu
Wang Wenqi
Xie Shaofang
Yan Qingfeng
Yang Jie-Cheng
Yang Lu
Yang Zuozhen
Zhang Zhen
Publication venue: eScholarship, University of California
Publication date: 21/02/2023

Notch has been implicated in human cancers and is a putative therapeutic target. However, the regulation of Notch activation in the nucleus remains largely uncharacterized. Therefore, characterizing the detailed mechanisms governing Notch degradation will identify attractive strategies for treating Notch-activated cancers. Here, we report that the long noncoding RNA (lncRNA) BREA2 drives breast cancer metastasis by stabilizing the Notch1 intracellular domain (NICD1). Moreover, we reveal WW domain containing E3 ubiquitin protein ligase 2 (WWP2) as an E3 ligase for NICD1 at K1821 and a suppressor of breast cancer metastasis. Mechanistically, BREA2 impairs WWP2-NICD1 complex formation and in turn stabilizes NICD1, leading to Notch signaling activation and lung metastasis. BREA2 loss sensitizes breast cancer cells to inhibition of Notch signaling and suppresses the growth of breast cancer patient-derived xenograft tumors, highlighting its therapeutic potential in breast cancer. Taken together, these results reveal the lncRNA BREA2 as a putative regulator of Notch signaling and an oncogenic player driving breast cancer metastasis.

eScholarship - University of California