Volume 35, Number 3, September 2015 OLAC Newsletter
Digitized September 2015 issue of the OLAC Newsletter
Learning with Aggregate Data
Many real-world applications deal directly with aggregate data. In this work, we study Learning with Aggregate Data from several perspectives and address the combinatorial challenges it raises.
First, we study the problem of learning in Collective Graphical Models (CGMs), where only noisy aggregate observations are available. Inference in CGMs is NP-hard, and we propose an approximate inference algorithm. Solving these inference problems enables us to build large-scale bird migration models, as well as models of human mobility under differential privacy.
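To make the setting concrete, the following is a minimal sketch (not taken from the thesis) of the kind of data a chain-structured CGM describes: a population of individuals moves among locations according to a shared Markov chain, but only noisy per-location counts are observed at each time step. The population size, transition matrix, and Poisson noise model below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy chain-structured population (illustrative values, not from the thesis):
# M individuals move among L locations according to a shared Markov chain.
M, L, T = 1000, 4, 6                 # population size, locations, time steps
pi0 = np.full(L, 1.0 / L)            # initial location distribution
P = np.array([[0.7, 0.2, 0.1, 0.0],  # P[i, j] = Pr(next location j | current i)
              [0.1, 0.7, 0.2, 0.0],
              [0.0, 0.1, 0.7, 0.2],
              [0.2, 0.0, 0.1, 0.7]])

# Simulate individual trajectories (latent in a CGM -- never observed directly).
states = rng.choice(L, size=M, p=pi0)
true_counts = np.zeros((T, L), dtype=int)
for t in range(T):
    true_counts[t] = np.bincount(states, minlength=L)
    states = np.array([rng.choice(L, p=P[s]) for s in states])

# The learner only sees noisy aggregates, e.g. Poisson-corrupted counts;
# CGM inference recovers the unobserved count tables that explain them.
noisy_counts = rng.poisson(true_counts)
print(noisy_counts)
```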
Second, we consider problems in which we are given bags of instances with bag-level aggregate supervision. Specifically, we study the US presidential election and build a model of the voting preferences of individuals and demographic groups. The data consist of characteristic individuals from the US Census together with voting tallies for each voting precinct. We propose a fully probabilistic Learning with Label Proportions (LLP) model with exact inference to build an instance-level model.
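As a rough illustration of the Learning with Label Proportions setting (a generic sketch, not the fully probabilistic model proposed in the thesis), one can fit instance-level logistic weights by matching each bag's predicted positive-label proportion to its observed proportion. The function name, squared-error surrogate, and plain gradient descent below are illustrative assumptions.

```python
import numpy as np

def fit_llp_logistic(bags, proportions, n_features, lr=0.1, epochs=500):
    """Fit instance-level logistic weights from bag-level label proportions.

    bags        : list of (n_i, n_features) arrays of instance features
    proportions : observed fraction of positive labels in each bag
    Matches predicted and observed bag proportions by squared error.
    """
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.01, size=n_features)
    for _ in range(epochs):
        grad = np.zeros_like(w)
        for X, p in zip(bags, proportions):
            s = 1.0 / (1.0 + np.exp(-X @ w))   # instance-level P(y=1 | x)
            p_hat = s.mean()                    # predicted bag proportion
            # d/dw of (p_hat - p)^2, using d sigmoid/dz = s * (1 - s)
            grad += 2.0 * (p_hat - p) * (X * (s * (1 - s))[:, None]).mean(axis=0)
        w -= lr * grad / len(bags)
    return w

# Tiny synthetic example: bags with known positive fractions.
rng = np.random.default_rng(1)
true_w = np.array([2.0, -1.0])
bags, props = [], []
for _ in range(20):
    X = rng.normal(size=(50, 2))
    y = (1.0 / (1.0 + np.exp(-X @ true_w)) > rng.random(50)).astype(float)
    bags.append(X)
    props.append(y.mean())
print(fit_llp_logistic(bags, props, n_features=2))
```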
Third, we study distribution regression, which has a problem setting similar to LLP but builds bag-level models. We experimentally evaluate several algorithms on three tasks and identify key factors in the problem setting that affect the choice of algorithm.
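A common baseline for distribution regression (again a generic sketch, not one of the specific algorithms evaluated in the thesis) summarizes each bag by the empirical mean of its instance features and regresses the bag-level target on that mean embedding; the function name and ridge penalty below are illustrative assumptions.

```python
import numpy as np

def mean_embedding_regression(bags, targets, alpha=1.0):
    """Ridge regression on bag mean embeddings.

    bags    : list of (n_i, d) arrays of instance features
    targets : bag-level real-valued responses
    Each bag is summarized by the empirical mean of its features,
    a simple stand-in for a kernel mean embedding.
    """
    Z = np.stack([X.mean(axis=0) for X in bags])   # (n_bags, d) mean embeddings
    d = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + alpha * np.eye(d), Z.T @ np.asarray(targets))

# Usage: predict a bag-level quantity (e.g. a vote share) for a new bag.
rng = np.random.default_rng(2)
bags = [rng.normal(loc=m, size=(100, 3)) for m in rng.normal(size=(30, 3))]
targets = [X.mean(axis=0) @ np.array([1.0, -2.0, 0.5]) for X in bags]
w = mean_embedding_regression(bags, targets)
new_bag = rng.normal(loc=[0.3, -0.1, 0.2], size=(100, 3))
print(new_bag.mean(axis=0) @ w)
```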
A Voted Regularized Dual Averaging Method for Large-Scale Discriminative Training in Natural Language Processing
Abstract We propose a new algorithm based on the dual averaging method for large-scale discriminative training in natural language processing (NLP), as an alternative to perceptron algorithms or stochastic gradient descent (SGD). The new algorithm estimates the parameters of linear models by minimizing ℓ1-regularized objectives and is effective at obtaining sparse solutions, which is particularly desirable for large-scale NLP tasks. We then give the mistake bound of the algorithm and show how the bound is affected by the additional ℓ1-regularization term. Evaluations on the tasks of parse reranking and statistical machine translation attest to the success of the new algorithm.
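For orientation, the closed-form ℓ1-regularized dual averaging update that such methods build on (following Xiao's RDA; this sketch shows only the base update with a hinge loss, not the voted variant or the structured NLP losses used in the paper) truncates to exactly zero any coordinate whose running average gradient falls below the regularization strength.

```python
import numpy as np

def l1_rda_update(g_bar, t, lam, gamma):
    """Closed-form l1-regularized dual averaging step (after Xiao's RDA).

    g_bar : running average of (sub)gradients after t steps
    t     : step count
    lam   : l1-regularization strength (drives sparsity)
    gamma : scale of the auxiliary strongly convex term
    Coordinates with |average gradient| <= lam are set exactly to zero.
    """
    w = np.zeros_like(g_bar)
    mask = np.abs(g_bar) > lam
    w[mask] = -(np.sqrt(t) / gamma) * (g_bar[mask] - lam * np.sign(g_bar[mask]))
    return w

def train_rda(X, y, lam=0.05, gamma=1.0, epochs=5):
    """Illustrative training loop: linear classifier with hinge subgradients."""
    n, d = X.shape
    w, g_bar, t = np.zeros(d), np.zeros(d), 0
    for _ in range(epochs):
        for i in np.random.default_rng(0).permutation(n):
            t += 1
            margin = y[i] * (X[i] @ w)
            g = -y[i] * X[i] if margin < 1 else np.zeros(d)  # hinge subgradient
            g_bar += (g - g_bar) / t                          # running average
            w = l1_rda_update(g_bar, t, lam, gamma)
    return w
```

In this form, larger values of lam zero out more coordinates and yield sparser models, which is the property the abstract highlights for large-scale NLP tasks.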
Volume 43, Number 1, March 2023 OLAC Newsletter
Digitized March 2023 issue of the OLAC Newsletter
Volume 36, Number 1, March 2016 OLAC Newsletter
Digitized March 2016 issue of the OLAC Newsletter
Volume 23, Number 3, September 2003 OLAC Newsletter
Digitized September 2003 issue of the OLAC Newsletter
Volume 26, Number 3, September 2006 OLAC Newsletter
Digitized September 2006 issue of the OLAC Newsletter
- …