Dependency Parsing with Dilated Iterated Graph CNNs

Authors: Andrew McCallum, Emma Strubell
Publication date: 01/01/2017

Dependency parses are an effective way to inject linguistic knowledge into many downstream tasks, and many practitioners wish to efficiently parse sentences at scale. Although recent advances in GPU hardware have enabled neural networks to achieve significant gains over the previous best models, these models still fail to leverage GPUs' capability for massive parallelism because they require sequential processing of the sentence. In response, we propose Dilated Iterated Graph Convolutional Neural Networks (DIG-CNNs) for graph-based dependency parsing, a graph convolutional architecture that allows for efficient end-to-end GPU parsing. In experiments on the English Penn TreeBank benchmark, we show that DIG-CNNs perform on par with some of the best neural network parsers.

Comment: 2nd Workshop on Structured Prediction for Natural Language Processing (at EMNLP '17

arXiv.org e-Print Archive

Crossref

Topic Models Conditioned on Arbitrary Features with Dirichlet-multinomial Regression

Authors: Andrew McCallum, David Mimno
Publication date: 13/06/2012

Although fully generative models have been successfully used to model the contents of text documents, they are often awkward to apply to combinations of text data and document metadata. In this paper we propose a Dirichlet-multinomial regression (DMR) topic model that includes a log-linear prior on document-topic distributions that is a function of observed features of the document, such as author, publication venue, references, and dates. We show that by selecting appropriate features, DMR topic models can meet or exceed the performance of several previously published topic models designed for specific data.

Comment: Appears in Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI2008
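The log-linear prior described in this abstract can be illustrated with a small sketch. It assumes one common way of writing such a prior, in which each topic's Dirichlet parameter is the exponential of a dot product between the document's feature vector and a per-topic weight vector. The function name, feature layout, and weight values below are hypothetical and are not taken from the paper.

```python
import math

def dmr_dirichlet_prior(features, topic_weights):
    """Return one Dirichlet parameter per topic: alpha_k = exp(x . lambda_k).

    Assumed log-linear form; `features` is the document's observed feature
    vector x, and `topic_weights` holds one weight vector lambda_k per topic.
    """
    return [
        math.exp(sum(x * w for x, w in zip(features, weights)))
        for weights in topic_weights
    ]

# Hypothetical features: [bias, written_by_author_A, published_after_2005]
doc_features = [1.0, 1.0, 0.0]

# Hypothetical weights, one row per topic (two topics here).
topic_weights = [
    [0.0, 0.0, 0.0],   # topic 0 ignores all features
    [-1.0, 2.0, 0.5],  # topic 1 is favoured for documents by author A
]

alpha = dmr_dirichlet_prior(doc_features, topic_weights)

# Prior mean of the document-topic distribution under Dirichlet(alpha).
prior_mean = [a / sum(alpha) for a in alpha]
```

In this sketch the metadata shifts the prior toward topic 1 before any words are observed, which is the sense in which the prior is "a function of observed features of the document".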

arXiv.org e-Print Archive

ScholarWorks@UMass Amherst

Distantly Labeling Data for Large Scale Cross-Document Coreference

Authors: Andrew McCallum, Sameer Singh, Michael Wick
Publication date: 24/05/2010

Cross-document coreference, the problem of resolving entity mentions across multi-document collections, is crucial to automated knowledge base construction and data mining tasks. However, the scarcity of large labeled data sets has hindered supervised machine learning research for this task. In this paper we develop and demonstrate an approach based on "distantly-labeling" a data set from which we can train a discriminative cross-document coreference model. In particular we build a dataset of more than a million people mentions extracted from 3.5 years of New York Times articles, leverage Wikipedia for distant labeling with a generative model (and measure the reliability of such labeling); then we train and evaluate a conditional random field coreference model that has factors on cross-document entities as well as mention-pairs. This coreference model obtains high accuracy in resolving mentions and entities that are not present in the training data, indicating applicability to non-Wikipedia data. Given the large amount of data, our work is also an exercise demonstrating the scalability of our approach.

Comment: 16 pages, submitted to ECML 201

arXiv.org e-Print Archive

ScholarWorks@UMass Amherst