When do Words Matter? Understanding the Impact of Lexical Choice on Audience Perception using Individual Treatment Effect Estimation
Studies across many disciplines have shown that lexical choice can affect
audience perception. For example, how users describe themselves in a social
media profile can affect their perceived socio-economic status. However, we
lack general methods for estimating the causal effect of lexical choice on the
perception of a specific sentence. While randomized controlled trials may
provide good estimates, they do not scale to the potentially millions of
comparisons necessary to consider all lexical choices. Instead, in this paper,
we first offer two classes of methods to estimate the effect on perception of
changing one word to another in a given sentence. The first class of algorithms
builds upon quasi-experimental designs to estimate individual treatment effects
from observational data. The second class treats treatment effect estimation as
a classification problem. We conduct experiments with three data sources (Yelp,
Twitter, and Airbnb), finding that the algorithmic estimates align well with
those produced by randomized controlled trials. Additionally, we find that it is
possible to transfer treatment effect classifiers across domains and still
maintain high accuracy.
Comment: AAAI_201
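To make the idea of estimating a word-substitution effect from observational data concrete, here is a minimal sketch of an exact-matching estimator. It is an illustration only and not the paper's algorithm: the function name `matched_effect`, the `(sentence, outcome)` input format, and the choice to match on identical surrounding words are all assumptions made for this example.

```python
from collections import defaultdict


def matched_effect(samples, source_word, target_word):
    """Estimate the effect of replacing source_word with target_word.

    samples: iterable of (sentence, outcome) pairs, where outcome is a
    numeric perception score (hypothetical input format).
    Returns a dict mapping each matched context to the difference
    mean(outcome with target_word) - mean(outcome with source_word).
    """
    # Group outcomes by the words surrounding the word of interest.
    by_context = defaultdict(lambda: {source_word: [], target_word: []})
    for sentence, outcome in samples:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok in (source_word, target_word):
                context = (tuple(tokens[:i]), tuple(tokens[i + 1:]))
                by_context[context][tok].append(outcome)

    # Keep only contexts observed with both words (the matched pairs).
    effects = {}
    for context, groups in by_context.items():
        src, tgt = groups[source_word], groups[target_word]
        if src and tgt:
            effects[context] = sum(tgt) / len(tgt) - sum(src) / len(src)
    return effects
```

Exact matching rarely finds enough pairs in real text, so a practical estimator would presumably need a softer notion of context similarity; the sketch only shows the shape of the comparison.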
Learning and reasoning with physical, structural, and symbolic priors
General-purpose neural network models such as transformers have achieved remarkable success in solving complicated tasks across multiple domains and modalities. However, in many applications, the need for domain-specific models remains evident. Moreover, large-scale general-purpose models often demand an excessive amount of data to achieve satisfactory generalization. This PhD research aims to address these challenges by focusing on several application domain settings and improving model generalization through the integration of domain-specific priors into the learning and reasoning process. To accomplish this goal, we explore four distinct computational domains: traffic light control, graph cold start recommendation, program synthesis, and optimizer learning.
For traffic light control, we tackle the task by incorporating the delayed propagation physical prior into the model architecture. We introduce the Delayed Propagation Transformer (DePT), a transformer-based model that leverages a cone-shaped spatial-temporal attention prior. DePT enables global modeling of cyber-physical systems (CPS) by considering the immutable constraints from the physical world, resulting in improved generalization performance compared to state-of-the-art expert methods.
In the context of graph cold start recommendation, we address the limitations of Graph Neural Networks (GNNs) under cold start scenarios. Specifically, we introduce Cold Brew, a teacher-student distillation approach that incorporates neighborhood message passing, and quantify the behavior of inductive GNNs through the feature contribution ratio.
For program synthesis, we focus on the challenge of generating high-quality code solutions by integrating structured thought processes. We propose ChainCoder, a program synthesis language model that progressively generates Python code in multiple passes, reflecting the “outline-then-detail” paradigm. By decomposing source code into layout frame components and accessory components, ChainCoder incorporates hierarchical generation, syntactic structure priors, and a tailored transformer architecture.
In the domain of optimizer learning, we leverage symbolic regression to overcome scalability and interpretability challenges in Learning to Optimize (L2O) models. By introducing a holistic symbolic representation and analysis framework for L2O, we gain insights into learnable optimizers and explicitly develop optimizers in symbolic form. This approach eliminates the scalability limitations associated with numerical rule representation in L2O models and provides interpretability and comparability among different L2O models.
This PhD research has explored and developed methods to integrate domain-specific priors into the learning process, both by incorporating them into neural network architecture and by explicitly leveraging symbolic representations. By incorporating these physical, structural, and symbolic priors, we have improved generalization with lower data requirements, and demonstrated the potential for more efficient and generalizable learning and reasoning systems.
Electrical and Computer Engineerin
Mitigating Semantic Confusion from Hostile Neighborhood for Graph Active Learning
Graph Active Learning (GAL), which aims to find the most informative nodes in
graphs for annotation to maximize the performance of Graph Neural Networks (GNNs),
has attracted many research efforts but remains a non-trivial challenge. One
major challenge is that existing GAL strategies may introduce semantic
confusion to the selected training set, particularly when graphs are noisy.
Specifically, most existing methods assume all aggregated features to be
helpful, ignoring the semantically negative effect of inter-class edges
under the message-passing mechanism. In this work, we present Semantic-aware
Active learning framework for Graphs (SAG) to mitigate the semantic confusion
problem. Pairwise similarities and dissimilarities of nodes with semantic
features are introduced to jointly evaluate the node influence. A new
prototype-based criterion and query policy are also designed to maintain
diversity and class balance of the selected nodes, respectively. Extensive
experiments on public benchmark graphs and a real-world financial dataset
demonstrate that SAG significantly improves node classification performance
and consistently outperforms previous methods. Moreover, a comprehensive analysis
and ablation study also verify the effectiveness of the proposed framework.
Comment: Accepted by CIKM 202
Local selection of features and its applications to image search and annotation
In multimedia applications, direct representations of data objects typically involve hundreds or thousands of features. Given a query object, the similarity between the query object and a database object can be computed as the distance between their feature vectors. The neighborhood of the query object consists of those database objects that are close to the query object. The semantic quality of the neighborhood, which can be measured as the proportion of neighboring objects that share the same class label as the query object, is crucial for many applications, such as content-based image retrieval and automated image annotation. However, due to the existence of noisy or irrelevant features, errors introduced into similarity measurements are detrimental to the neighborhood quality of data objects.
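The neighborhood quality measure described above can be sketched in a few lines. This is a minimal illustration assuming Euclidean distance and a fixed neighborhood size `k`; the function name `neighborhood_quality` and its parameters are hypothetical and not taken from the dissertation.

```python
import numpy as np


def neighborhood_quality(features, labels, query_index, k):
    """Fraction of the k nearest neighbors that share the query's label.

    features: (n, d) array of feature vectors.
    labels: length-n array of class labels.
    Assumes Euclidean distance; the query itself is excluded.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    # Distance from the query object to every database object.
    dists = np.linalg.norm(features - features[query_index], axis=1)
    dists[query_index] = np.inf  # never count the query as its own neighbor
    neighbors = np.argsort(dists)[:k]
    return float(np.mean(labels[neighbors] == labels[query_index]))
```

Calling the function once on all features and once on a subset of columns gives a simple way to see how a single noisy feature can pull objects of other classes into the neighborhood.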
One way to alleviate the negative impact of noisy features is to use feature selection techniques in data preprocessing. From the original vector space, feature selection techniques select a subset of features, which can be used subsequently in supervised or unsupervised learning algorithms for better performance. However, their performance on improving the quality of data neighborhoods is rarely evaluated in the literature. In addition, most traditional feature selection techniques are global, in the sense that they compute a single set of features across the entire database. As a consequence, the possibility that the feature importance may vary across different data objects or classes of objects is neglected.
To compute a better neighborhood structure for objects in high-dimensional feature spaces, this dissertation proposes several techniques for selecting features that are important to the local neighborhood of individual objects. These techniques are then applied to image applications such as content-based image retrieval and image label propagation. Firstly, an iterative K-NN graph construction method for image databases is proposed. A local variant of the Laplacian Score is designed for the selection of features for individual images. Noisy features are detected and sparsified iteratively from the original standardized feature vectors. This technique is incorporated into an approximate K-NN graph construction method so as to improve the semantic quality of the graph. Secondly, in a content-based image retrieval system, a generalized version of the Laplacian Score is used to compute different feature subspaces for images in the database. For online search, a query image is ranked in the feature spaces of database images. Those database images for which the query image is ranked highly are selected as the query results. Finally, a supervised method for the local selection of image features is proposed, for refining the similarity graph used in an image label propagation framework. By using only the selected features to compute the edges leading from labeled image nodes to unlabeled image nodes, better annotation accuracy can be achieved.
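The abstract does not specify the local or generalized variants of the Laplacian Score, so the sketch below shows only the standard global Laplacian Score as a point of reference. The function name `laplacian_score`, the heat-kernel width `t`, and the symmetrized k-NN graph are choices made for this example, not details from the dissertation. Lower scores indicate features that better preserve the neighborhood structure.

```python
import numpy as np


def laplacian_score(X, k=5, t=1.0):
    """Standard (global) Laplacian Score for each feature column of X.

    X: (n, d) array. k: neighbors per node. t: heat-kernel width.
    Returns a length-d array; lower means the feature varies less
    across edges of the k-NN graph relative to its overall variance.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)

    # Symmetrized k-NN adjacency, excluding self-loops.
    nearest = sq.copy()
    np.fill_diagonal(nearest, np.inf)
    order = np.argsort(nearest, axis=1)[:, :k]
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), k), order.ravel()] = True
    A = A | A.T

    # Heat-kernel weights S, degree vector d, graph Laplacian L = D - S.
    S = np.where(A, np.exp(-sq / t), 0.0)
    d = S.sum(axis=1)
    L = np.diag(d) - S

    scores = np.empty(X.shape[1])
    for r in range(X.shape[1]):
        f = X[:, r]
        f_t = f - (f @ d) / d.sum()  # remove the degree-weighted mean
        denom = f_t @ (d * f_t)      # f~^T D f~
        scores[r] = (f_t @ L @ f_t) / denom if denom > 0 else np.inf
    return scores
```

A local variant would presumably compute such a score over the neighborhood of a single object instead of the whole database, but that step is left out here because the abstract gives no formula for it.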
Experimental results on several datasets are provided in this dissertation to demonstrate the effectiveness of the proposed techniques for the local selection of features, and for the image applications under consideration.