Construction and analysis of political networks over time via government and me
In this work we present a tool that generates real-world political networks from user-provided lists of politicians and news sites. As input we use a dataset of current Texas politicians and 6 news sites to illustrate the graphs, tools, and maps the tool creates to give users political insight.
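The core of such a tool is linking politicians to the news sites that mention them. A minimal sketch of that step is below; the input format, function name, and toy data are illustrative assumptions, not the tool's actual API.

```python
from collections import defaultdict

def build_mention_graph(articles, politicians):
    """Build a weighted news-site <-> politician edge list: the weight of
    edge (site, politician) counts how many of that site's articles
    mention the politician by name (case-insensitive substring match)."""
    weights = defaultdict(int)
    for site, text in articles:
        for politician in politicians:
            if politician.lower() in text.lower():
                weights[(site, politician)] += 1
    return dict(weights)

# hypothetical articles: (site, article text)
articles = [
    ("siteA", "Senator Alice Rivera spoke in Austin."),
    ("siteA", "Alice Rivera and Bob Tran debated."),
    ("siteB", "Bob Tran announced a bill."),
]
graph = build_mention_graph(articles, ["Alice Rivera", "Bob Tran"])
```

The resulting edge weights can be handed to any graph library for layout and analysis; a real system would use entity linking rather than substring matching to handle name variants.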
Error Discovery by Clustering Influence Embeddings
We present a method for identifying groups of test examples -- slices -- on which a model under-performs, a task now known as slice discovery. We formalize coherence -- a requirement that erroneous predictions within a slice should be wrong for the same reason -- as a key property that any slice discovery method should satisfy. We then use influence functions to derive a new slice discovery method, InfEmbed, which satisfies coherence by returning slices whose examples are influenced similarly by the training data. InfEmbed is simple, and consists of applying K-Means clustering to a novel representation we deem influence embeddings. We show InfEmbed outperforms current state-of-the-art methods on 2 benchmarks, and is effective for model debugging across several case studies.
Comment: NeurIPS 2023 conference paper
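The clustering step the abstract describes can be sketched as follows, assuming the influence embeddings have already been computed; the k-means implementation and toy data are illustrative, not the paper's code.

```python
def squared_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=10):
    """Plain k-means with farthest-point initialization: each resulting
    cluster of test examples is a candidate error slice, since examples
    with similar influence embeddings are influenced similarly by the
    training data."""
    centers = [list(points[0])]
    while len(centers) < k:
        # next center: the point farthest from every center chosen so far
        centers.append(list(max(
            points, key=lambda p: min(squared_dist(p, c) for c in centers))))
    assign = [0] * len(points)
    for _ in range(iters):
        # assignment step: each point joins its nearest center
        for i, p in enumerate(points):
            assign[i] = min(range(k), key=lambda c: squared_dist(p, centers[c]))
        # update step: each center moves to the mean of its members
        for c in range(k):
            members = [points[i] for i in range(len(points)) if assign[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

# toy 2-D "influence embeddings" for five test examples: two clear groups
embeddings = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05], [5.0, 5.1], [5.1, 5.0]]
slices = kmeans(embeddings, k=2)
```

In practice one would use a library implementation (e.g. scikit-learn's KMeans) on the full embedding matrix and then inspect each slice's shared failure mode.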
In-process diagnostic methods for entity representation learning on sequential data at scale
The performance gains and expanded utilization of deep learning models in the fields of machine learning and natural language processing have been followed by a need for the internal mechanisms guiding them to be explainable and accompanied by methods allowing humans to diagnose and correct such models at inference time if needed. In contrast to post-hoc methods for explainability that train a secondary model to infer the decision reasoning of a primary model by using only its inputs and outputs, in-process methods offer faithful explanations of a model’s decisions by explicitly training the model to include such capabilities as an additional objective rather than trying to infer them in a post-hoc manner. Such methods should scale without sacrificing model performance and be broad enough to incorporate diverse tasks and data types including sequential language, time series, and multi-modal data. Of particular interest is the analysis of such techniques for the learning of rich dense or sparse interpretable entity representations tied to knowledge bases. In this thesis we address these aims by developing efficient frameworks that handle different data types and provide diverse, in-process explainable techniques for transparent and trustworthy models. First, we show that it is feasible to learn dense entity representations from text via a dual encoder framework that encodes mentions and entities in the same dense vector space. Such representations can then be used for extremely fast entity linking where candidate entities are retrieved by approximate nearest neighbor search and generalize well to new datasets. During training the model leverages a novel negative mining algorithm which guides learning by iteratively constructing training batches to contain top candidates that were previously incorrectly ranked above the true entity.
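The negative mining step described above can be sketched minimally: given the current model's scores for a mention against candidate entities, the next batch keeps the wrong entities the model ranks highest. The function name and toy scores are illustrative assumptions, not the thesis's actual implementation.

```python
def mine_hard_negatives(scores, gold, n_neg):
    """For one mention, return the n_neg wrong candidate entities the
    current model scores highest; these are the hardest negatives and
    become training examples for the next round."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [entity for entity in ranked if entity != gold][:n_neg]

# hypothetical dual-encoder scores: entity id -> dot product with the mention
scores = {"Q1": 0.9, "Q2": 0.8, "Q_gold": 0.7, "Q3": 0.1}
negatives = mine_hard_negatives(scores, gold="Q_gold", n_neg=2)
# Q1 and Q2 were incorrectly ranked above the true entity, so they are kept
```

Iterating this selection as the encoder improves is what makes the final batches a record of the examples hardest for the model to learn.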
The technique dramatically improves model accuracy over iterations and the final batches can be viewed as the samples most difficult for the model to learn. We then introduce a framework for learning in-process prototypes from an autoencoder that provides both instance-level and global explanations for time series classification. We explicitly optimize for increased prototype diversity, which improves model accuracy and produces prototypes generated by learning regions of the latent space that highlight features the model uses for distinguishing amongst classes. We show that the prototypes are capable of learning real-world features: in our case study, ECG morphology related to bradycardia. Next we derive Biomedical Interpretable Entity Representations (BIER) in which dimensions correspond to fine-grained entity types, and values are predicted probabilities that a given entity is of the corresponding type. We propose a diagnostic method that exploits BIER’s final sparse and intermediate dense representations to facilitate model and entity type debugging, and show BIERs achieve strong performance in biomedical tasks including named entity disambiguation and entity linking. We next propose a method for entity-based knowledge injection for the multimodal Knowledge-Based Visual Question Answering (KBVQA) task, which contains questions whose answers explicitly require external knowledge about named entities within an image, and study how it affects both task accuracy and an existing in-process, bi-modal explainability technique. Our results show substantially improved performance on the KBVQA task without the need for additional costly pre-training, and we provide insights for when entity knowledge injection helps improve a model’s understanding.
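The interpretable-representation idea behind BIER can be illustrated in a few lines: each dimension is a fine-grained entity type, and its value is an independent predicted probability. The type inventory, function names, and logits below are toy assumptions for illustration only.

```python
import math

TYPES = ["disease", "drug", "gene", "anatomy"]  # illustrative type inventory

def interpretable_representation(logits):
    """Map per-type logits to independent probabilities via a sigmoid,
    so each dimension reads directly as P(entity has this type)."""
    return {t: 1.0 / (1.0 + math.exp(-z)) for t, z in zip(TYPES, logits)}

def top_types(rep, k=2):
    """Debugging view: the k types the model is most confident about."""
    return sorted(rep, key=rep.get, reverse=True)[:k]

# hypothetical per-type logits for one biomedical entity
rep = interpretable_representation([3.0, -2.0, 0.5, -4.0])
```

Inspecting `top_types(rep)` is the kind of per-dimension reading a sparse interpretable representation affords, and it is what makes the entity-type debugging described above possible.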
Finally, we introduce Intermediate enTity-based Sparse Interpretable Representation Learning (ItsIRL), an architecture that allows for fine-tuning of sparse, interpretable entity representations (IERs) on downstream tasks while preserving the semantics of the dimensions learned during pretraining. This approach surpasses prior IERs work and realizes competitive performance with dense models on biomedical tasks. We propose and study ‘counterfactual’ entity type manipulation techniques, made possible by our architecture, that allow correcting ItsIRL errors and can surpass the performance of dense non-interpretable models. Additionally, we propose a method to construct entity type based class prototypes for showing global semantic properties learned by our model, both for positive and negative instances.
Electrical and Computer Engineering
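Because each dimension of a sparse interpretable representation is a named entity type, a counterfactual manipulation can be sketched as simply overriding type activations and re-predicting; the function names, threshold, and toy values below are illustrative assumptions, not the thesis's implementation.

```python
def counterfactual_edit(rep, corrections):
    """Return a copy of a sparse type representation with selected type
    activations overridden -- e.g. zeroing a type that fired spuriously."""
    edited = dict(rep)
    edited.update(corrections)
    return edited

def predict(rep, threshold=0.5):
    """Toy downstream prediction: the set of types above threshold."""
    return {t for t, p in rep.items() if p >= threshold}

# hypothetical type activations for one entity
rep = {"drug": 0.8, "disease": 0.6, "gene": 0.1}
# suppose 'disease' fired spuriously: zero it and re-predict
fixed = counterfactual_edit(rep, {"disease": 0.0})
```

The edit is legible precisely because the dimension being changed has a human-readable meaning, which is what distinguishes this kind of error fixing from manipulating an opaque dense vector.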