Multilabel Consensus Classification

Authors: Fan Wei, Gao Jing, Kong Xiangnan, Xie Sihong, Yu Philip S.
Publication date: 01/01/2013

In the era of big data, a large amount of noisy and incomplete data can be collected from multiple sources for prediction tasks. Combining multiple models or data sources helps to counteract the effects of low data quality and the bias of any single model or data source, and thus can improve the robustness and the performance of predictive models. Out of privacy, storage and bandwidth considerations, in certain circumstances one has to combine the predictions from multiple models or data sources to obtain the final predictions without accessing the raw data. Consensus-based prediction combination algorithms are effective for such situations. However, current research on prediction combination focuses on the single-label setting, where an instance can have one and only one label. Yet data nowadays are usually multilabeled, such that more than one label has to be predicted at the same time. Direct applications of existing prediction combination methods to multilabel settings can lead to degraded performance. In this paper, we address the challenges of combining predictions from multiple multilabel classifiers and propose two novel algorithms, MLCM-r (MultiLabel Consensus Maximization for ranking) and MLCM-a (MLCM for microAUC). These algorithms can capture label correlations that are common in multilabel classifications, and optimize corresponding performance metrics. Experimental results on popular multilabel classification tasks verify the theoretical analysis and effectiveness of the proposed methods.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Dual Attention Network for Product Compatibility and Function Satisfiability Analysis

Authors: Shu Lei, Xie Sihong, Xu Hu, Yu Philip S.
Publication date: 05/12/2017

Product compatibility and functionality are of utmost importance to customers when they purchase products, and to sellers and manufacturers when they sell products. Due to the huge number of products available online, it is infeasible to enumerate and test the compatibility and functionality of every product. In this paper, we address two closely related problems: product compatibility analysis and function satisfiability analysis, where the second problem is a generalization of the first problem (e.g., whether a product works with another product can be considered as a special function). We first identify a novel question-answering (QA) corpus that is up-to-date regarding product compatibility and functionality information. To allow automatic discovery of product compatibility and functionality, we then propose a deep learning model called Dual Attention Network (DAN). Given a QA pair for a to-be-purchased product, DAN learns to 1) discover complementary products (or functions), and 2) accurately predict the actual compatibility (or satisfiability) of the discovered products (or functions). The challenges addressed by the model include the briefness of QAs, linguistic patterns indicating compatibility, and the appropriate fusion of questions and answers. We conduct experiments to quantitatively and qualitatively show that the identified products and functions have both high coverage and accuracy, compared with a wide spectrum of baselines.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Product Function Need Recognition via Semi-supervised Attention Network

Authors: Shu Lei, Xie Sihong, Xu Hu, Yu Philip S.
Publication date: 01/12/2017

Functionality is of utmost importance to customers when they purchase products. However, it is unclear to customers whether a product can really satisfy their functional needs. Further, missing functions may be intentionally hidden by the manufacturers or the sellers. As a result, a customer needs to spend a fair amount of time before purchasing or just purchase the product at his/her own risk. In this paper, we first identify a novel QA corpus that is dense in product functionality information (the annotated corpus can be found at https://www.cs.uic.edu/~hxu/). We then design a neural network called Semi-supervised Attention Network (SAN) to discover product functions from questions. This model leverages unlabeled data as contextual information to perform semi-supervised sequence labeling. We conduct experiments to show that the extracted functions have both high coverage and accuracy, compared with a wide spectrum of baselines.

arXiv.org e-Print Archive

Crossref

SEVEN: Deep Semi-supervised Verification Networks

Authors: Bahaadini Sara, Noroozi Vahid, Xie Sihong, Yu Philip S., Zheng Lei
Publication date: 14/06/2017

Verification determines whether two samples belong to the same class or not, and has important applications such as face and fingerprint verification, where thousands or millions of categories are present but each category has scarce labeled examples, presenting two major challenges for existing deep learning models. We propose a deep semi-supervised model named SEmi-supervised VErification Network (SEVEN) to address these challenges. The model consists of two complementary components. The generative component addresses the lack of supervision within each category by learning general salient structures from a large amount of data across categories. The discriminative component exploits the learned general features to mitigate the lack of supervision within categories, and also directs the generative component to find more informative structures of the whole data manifold. The two components are tied together in SEVEN to allow end-to-end training. Extensive experiments on four verification tasks demonstrate that SEVEN significantly outperforms other state-of-the-art deep semi-supervised techniques when labeled data are in short supply. Furthermore, SEVEN is competitive with fully supervised baselines trained with a larger amount of labeled data. This indicates the importance of the generative component in SEVEN. Comment: 7 pages, 2 figures, accepted to the 2017 International Joint Conference on Artificial Intelligence (IJCAI-17)

arXiv.org e-Print Archive

Crossref

Essays On Estimation of A Regression Jump: A Generalized Reflection Approach

Author: Xie Sihong
Publication venue: University of Colorado Boulder
Publication date: 28/07/2019

Regression Discontinuity (RD) designs are popular models in economics used by researchers to evaluate the effects of policy interventions. In the past two decades, a great number of papers on RD applications and methodology have been published in leading economic journals. However, research on RD estimators, which are fundamental to RD models, has been scarce. The main estimation approach is to apply local linear (LL) or local polynomial estimators on both sides of the known discontinuity point and then to estimate the jump. Most developments have focused on amendments to and improvements of LL, but there are almost no competitive alternatives to LL estimators. This dissertation provides a completely new class of RD estimators based on a generalized reflection approach that uses the extension of Hestenes (1941). My estimators have simple analytical representations and desirable asymptotic properties, and are computationally easy to implement. Having boundary properties that are as good as those of LL estimators and performing better than LL estimators in finite samples, my estimators offer a competitive alternative to LL estimators in RD models. In Chapter 1, I review major theoretical developments in RD design in the econometrics literature, focusing on estimators for regression discontinuity. In Chapter 2, I introduce my Hestenes-based RD estimators. Focusing on properties at boundary points, I provide results on the bias, variance and asymptotic distribution of my estimators. I compare the finite sample properties of my estimators with popular regression estimators – the Nadaraya-Watson and LL estimators – using Monte Carlo studies, empirical examples, and empirically motivated simulations. Chapter 3 extends the estimation of univariate regression with a discontinuity to multivariate regression settings.
I consider an additive model and propose four two-stage estimators: at the first stage, I use a marginal integration, instrumental variable, backfitting, or B-splines estimator for the continuous components of the regression; at the second stage, I use the Hestenes estimator developed in Chapter 2 to estimate the jump discontinuity. Monte Carlo studies show that, in an additive linear model, my estimators outperform the local linear RD estimators that are commonly used in empirical research.

CU Scholar Institutional Repository
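
The abstract above describes the standard baseline as fitting local linear (LL) estimators on each side of a known discontinuity point and then estimating the jump. A minimal sketch of that baseline follows, assuming a triangular kernel and a user-chosen bandwidth. The function names `local_linear_limit` and `rd_jump` are hypothetical and introduced only for illustration; this is not the dissertation's Hestenes-based estimator.

```python
import numpy as np


def local_linear_limit(x, y, cutoff, bandwidth, side):
    """Estimate the one-sided limit of E[y | x] at the cutoff.

    Fits a kernel-weighted least-squares line using only observations
    on the requested side ("left" or "right") within one bandwidth.
    """
    u = (x - cutoff) / bandwidth          # rescaled distance to the cutoff
    mask = (u >= 0) if side == "right" else (u < 0)
    mask &= np.abs(u) < 1                 # keep points inside the window
    u_side, y_side = u[mask], y[mask]

    # Triangular kernel weights (an assumption; other kernels are possible).
    w = 1.0 - np.abs(u_side)

    # Weighted least squares via rescaling rows by sqrt(weight).
    design = np.column_stack([np.ones_like(u_side), u_side])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(design * sw[:, None], y_side * sw, rcond=None)

    # The intercept is the fitted value at the cutoff (u = 0).
    return beta[0]


def rd_jump(x, y, cutoff, bandwidth):
    """Local linear estimate of the jump: right limit minus left limit."""
    right = local_linear_limit(x, y, cutoff, bandwidth, "right")
    left = local_linear_limit(x, y, cutoff, bandwidth, "left")
    return right - left


# Noise-free illustration: a line with a jump of 3 at x = 0.
x = np.linspace(-1.0, 1.0, 401)
y = 1.0 + 2.0 * x + 3.0 * (x >= 0)
print(rd_jump(x, y, cutoff=0.0, bandwidth=0.5))
```

On this noise-free example the printed estimate should be close to 3, because a local linear fit reproduces a linear function on each side. With noisy data the result depends on the bandwidth, which is left here as a free parameter.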

DetectGPT-SC: Improving Detection of Text Generated by Large Language Models through Self-Consistency with Masked Predictions

Authors: Li Qi, Wang Rongsheng, Xie Sihong
Publication date: 22/10/2023

General large language models (LLMs) such as ChatGPT have shown remarkable success, but they have also raised concerns about the misuse of AI-generated texts. Therefore, an important question is how to detect whether the texts are generated by ChatGPT or by humans. Existing detectors are built on the assumption that there is a distribution gap between human-generated and AI-generated texts. These gaps are typically identified using statistical information or classifiers. In contrast to prior research methods, we find that large language models such as ChatGPT exhibit strong self-consistency in text generation and continuation. Self-consistency capitalizes on the intuition that AI-generated texts can still be reasoned with by large language models using the same logical reasoning when portions of the texts are masked, which differs from human-generated texts. Building on this observation, we propose a new method for AI-generated text detection, which we call DetectGPT-SC, based on self-consistency with masked predictions to determine whether a text is generated by LLMs. We conducted a series of experiments to evaluate the performance of DetectGPT-SC. In these experiments, we employed various mask schemes, zero-shot, and simple prompts for completing masked texts and self-consistency predictions. The results indicate that DetectGPT-SC outperforms the current state-of-the-art across different tasks. Comment: 7 pages, 3 figures

arXiv.org e-Print Archive

A Differential Geometric View and Explainability of GNN on Evolving Graphs

Authors: Liu Yazheng, Xie Sihong, Zhang Xi
Publication date: 11/03/2024

Graphs are ubiquitous in social networks and biochemistry, where Graph Neural Networks (GNN) are the state-of-the-art models for prediction. Graphs can be evolving and it is vital to formally model and understand how a trained GNN responds to graph evolution. We propose a smooth parameterization of the GNN predicted distributions using axiomatic attribution, where the distributions are on a low-dimensional manifold within a high-dimensional embedding space. We exploit the differential geometric viewpoint to model distributional evolution as smooth curves on the manifold. We reparameterize families of curves on the manifold and design a convex optimization problem to find a unique curve that concisely approximates the distributional evolution for human interpretation. Extensive experiments on node classification, link prediction, and graph classification tasks with evolving graphs demonstrate the better sparsity, faithfulness, and intuitiveness of the proposed method over the state-of-the-art methods. Comment: Accepted into ICLR 202

arXiv.org e-Print Archive