Construction and analysis of political networks over time via government and me
In this work we present a tool that generates real-world political networks from user-provided lists of politicians and news sites. As input we use a dataset of current Texas politicians and 6 news sites to illustrate the graphs, tools, and maps the tool creates to give users political insight.
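The core of such a tool is linking politicians to the news sites that mention them. A minimal sketch of that step is below; the input format, function name, and toy data are illustrative assumptions, not the tool's actual API.

```python
from collections import defaultdict

def build_mention_graph(articles, politicians):
    """Build a weighted news-site <-> politician edge list: the weight of
    edge (site, politician) counts how many of that site's articles
    mention the politician by name (case-insensitive substring match)."""
    weights = defaultdict(int)
    for site, text in articles:
        for politician in politicians:
            if politician.lower() in text.lower():
                weights[(site, politician)] += 1
    return dict(weights)

# hypothetical articles: (site, article text)
articles = [
    ("siteA", "Senator Alice Rivera spoke in Austin."),
    ("siteA", "Alice Rivera and Bob Tran debated."),
    ("siteB", "Bob Tran announced a bill."),
]
graph = build_mention_graph(articles, ["Alice Rivera", "Bob Tran"])
```

The resulting edge weights can be handed to any graph library for layout and analysis; a real system would use entity linking rather than substring matching to handle name variants.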
Error Discovery by Clustering Influence Embeddings
We present a method for identifying groups of test examples -- slices -- on which a model under-performs, a task now known as slice discovery. We formalize coherence -- a requirement that erroneous predictions within a slice should be wrong for the same reason -- as a key property that any slice discovery method should satisfy. We then use influence functions to derive a new slice discovery method, InfEmbed, which satisfies coherence by returning slices whose examples are influenced similarly by the training data. InfEmbed is simple, and consists of applying K-Means clustering to a novel representation we deem influence embeddings. We show InfEmbed outperforms current state-of-the-art methods on 2 benchmarks, and is effective for model debugging across several case studies.
Comment: NeurIPS 2023 conference paper
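The clustering step the abstract describes can be sketched as follows, assuming the influence embeddings have already been computed; the k-means implementation and toy data are illustrative, not the paper's code.

```python
def squared_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=10):
    """Plain k-means with farthest-point initialization: each resulting
    cluster of test examples is a candidate error slice, since examples
    with similar influence embeddings are influenced similarly by the
    training data."""
    centers = [list(points[0])]
    while len(centers) < k:
        # next center: the point farthest from every center chosen so far
        centers.append(list(max(
            points, key=lambda p: min(squared_dist(p, c) for c in centers))))
    assign = [0] * len(points)
    for _ in range(iters):
        # assignment step: each point joins its nearest center
        for i, p in enumerate(points):
            assign[i] = min(range(k), key=lambda c: squared_dist(p, centers[c]))
        # update step: each center moves to the mean of its members
        for c in range(k):
            members = [points[i] for i in range(len(points)) if assign[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

# toy 2-D "influence embeddings" for five test examples: two clear groups
embeddings = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05], [5.0, 5.1], [5.1, 5.0]]
slices = kmeans(embeddings, k=2)
```

In practice one would use a library implementation (e.g. scikit-learn's KMeans) on the full embedding matrix and then inspect each slice's shared failure mode.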
In-process diagnostic methods for entity representation learning on sequential data at scale
The performance gains and expanded utilization of deep learning models in the fields of machine learning and natural language processing have been followed by a need for the internal mechanisms guiding them to be explainable and accompanied by methods allowing humans to diagnose and correct such models at inference time if needed. In contrast to post-hoc methods for explainability that train a secondary model to infer the decision reasoning of a primary model by using only its inputs and outputs, in-process methods offer faithful explanations of a model’s decisions by explicitly training the model to include such capabilities as an additional objective rather than trying to infer them in a post-hoc manner. Such methods should scale without sacrificing model performance and be broad enough to incorporate diverse tasks and data types including sequential language, time series, and multi-modal data. Of particular interest is the analysis of such techniques for the learning of rich dense or sparse interpretable entity representations tied to knowledge bases. In this thesis we address these aims by developing efficient frameworks that handle different data types and provide diverse, in-process explainable techniques for transparent and trustworthy models. First, we show that it is feasible to learn dense entity representations from text via a dual encoder framework that encodes mentions and entities in the same dense vector space. Such representations can then be used for extremely fast entity linking where candidate entities are retrieved by approximate nearest neighbor search and generalize well to new datasets. During training the model leverages a novel negative mining algorithm which guides learning by iteratively constructing training batches to contain top candidates that were previously incorrectly ranked above the true entity.
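The negative mining step described above can be sketched minimally: given the current model's scores for a mention against candidate entities, the next batch keeps the wrong entities the model ranks highest. The function name and toy scores are illustrative assumptions, not the thesis's actual implementation.

```python
def mine_hard_negatives(scores, gold, n_neg):
    """For one mention, return the n_neg wrong candidate entities the
    current model scores highest; these are the hardest negatives and
    become training examples for the next round."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [entity for entity in ranked if entity != gold][:n_neg]

# hypothetical dual-encoder scores: entity id -> dot product with the mention
scores = {"Q1": 0.9, "Q2": 0.8, "Q_gold": 0.7, "Q3": 0.1}
negatives = mine_hard_negatives(scores, gold="Q_gold", n_neg=2)
# Q1 and Q2 were incorrectly ranked above the true entity, so they are kept
```

Iterating this selection as the encoder improves is what makes the final batches a record of the examples hardest for the model to learn.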
The technique dramatically improves model accuracy over iterations and the final batches can be viewed as the samples most difficult for the model to learn. We then introduce a framework for learning in-process prototypes from an autoencoder that provides both instance-level and global explanations for time series classification. We explicitly optimize for increased prototype diversity, which improves model accuracy and produces prototypes generated by learning regions of the latent space that highlight features the model uses for distinguishing amongst classes. We show that the prototypes are capable of learning real-world features: in our case study, ECG morphology related to bradycardia. Next we derive Biomedical Interpretable Entity Representations (BIER) in which dimensions correspond to fine-grained entity types, and values are predicted probabilities that a given entity is of the corresponding type. We propose a diagnostic method that exploits BIER’s final sparse and intermediate dense representations to facilitate model and entity type debugging, and show BIERs achieve strong performance in biomedical tasks including named entity disambiguation and entity linking. We next propose a method for entity-based knowledge injection for the multimodal Knowledge-Based Visual Question Answering (KBVQA) task, which contains questions whose answers explicitly require external knowledge about named entities within an image, and study how it affects both task accuracy and an existing in-process, bi-modal explainability technique. Our results show substantially improved performance on the KBVQA task without the need for additional costly pre-training, and we provide insights for when entity knowledge injection helps improve a model’s understanding.
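The interpretable-representation idea behind BIER can be illustrated in a few lines: each dimension is a fine-grained entity type, and its value is an independent predicted probability. The type inventory, function names, and logits below are toy assumptions for illustration only.

```python
import math

TYPES = ["disease", "drug", "gene", "anatomy"]  # illustrative type inventory

def interpretable_representation(logits):
    """Map per-type logits to independent probabilities via a sigmoid,
    so each dimension reads directly as P(entity has this type)."""
    return {t: 1.0 / (1.0 + math.exp(-z)) for t, z in zip(TYPES, logits)}

def top_types(rep, k=2):
    """Debugging view: the k types the model is most confident about."""
    return sorted(rep, key=rep.get, reverse=True)[:k]

# hypothetical per-type logits for one biomedical entity
rep = interpretable_representation([3.0, -2.0, 0.5, -4.0])
```

Inspecting `top_types(rep)` is the kind of per-dimension reading a sparse interpretable representation affords, and it is what makes the entity-type debugging described above possible.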
Finally, we introduce Intermediate enTity-based Sparse Interpretable Representation Learning (ItsIRL), an architecture that allows for fine-tuning of sparse, interpretable entity representations (IERs) on downstream tasks while preserving the semantics of the dimensions learned during pretraining. This approach surpasses prior IERs work and realizes competitive performance with dense models on biomedical tasks. We propose and study ‘counterfactual’ entity type manipulation techniques, made possible by our architecture, that allow correcting ItsIRL errors and can surpass the performance of dense non-interpretable models. Additionally, we propose a method to construct entity type based class prototypes for showing global semantic properties learned by our model, both for positive and negative instances.
Electrical and Computer Engineering
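Because each dimension of a sparse interpretable representation is a named entity type, a counterfactual manipulation can be sketched as simply overriding type activations and re-predicting; the function names, threshold, and toy values below are illustrative assumptions, not the thesis's implementation.

```python
def counterfactual_edit(rep, corrections):
    """Return a copy of a sparse type representation with selected type
    activations overridden -- e.g. zeroing a type that fired spuriously."""
    edited = dict(rep)
    edited.update(corrections)
    return edited

def predict(rep, threshold=0.5):
    """Toy downstream prediction: the set of types above threshold."""
    return {t for t, p in rep.items() if p >= threshold}

# hypothetical type activations for one entity
rep = {"drug": 0.8, "disease": 0.6, "gene": 0.1}
# suppose 'disease' fired spuriously: zero it and re-predict
fixed = counterfactual_edit(rep, {"disease": 0.0})
```

The edit is legible precisely because the dimension being changed has a human-readable meaning, which is what distinguishes this kind of error fixing from manipulating an opaque dense vector.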