4,591 research outputs found
Cross-lingual Distillation for Text Classification
Cross-lingual text classification(CLTC) is the task of classifying documents
written in different languages into the same taxonomy of categories. This paper
presents a novel approach to CLTC that builds on model distillation, which
adapts and extends a framework originally proposed for model compression. Using
soft probabilistic predictions for the documents in a label-rich language as
the (induced) supervisory labels in a parallel corpus of documents, we train
classifiers successfully for new languages in which labeled training data are
not available. An adversarial feature adaptation technique is also applied
during the model training to reduce distribution mismatch. We conducted
experiments on two benchmark CLTC datasets, treating English as the source
language and German, French, Japan and Chinese as the unlabeled target
languages. The proposed approach had the advantageous or comparable performance
of the other state-of-art methods.Comment: Accepted at ACL 2017; Code available at
https://github.com/xrc10/cross-distil
Recommended from our members
Gaussian process regression for virtual metrology of microchip quality and the resulting strategic sampling scheme
Manufacturing of integrated circuits involves many sequential processes, often ex- ecuted to nanoscale tolerances, and the yield depends on the often unmeasured quality of intermediate steps. In the high-throughput industry of fabricating microelectronics on semi-conducting wafers, scheduling measurements of product quality before the electrical test of the complete IC can be expensive. We therefore seek to predict metrics of product quality based on sensor readings describing the environment within the relevant tool during the processing of each wafer, or to apply the concept of virtual metrology (VM) to monitor these intermediate steps. We model the data using Gaussian process regression (GPR), adapted to simultaneously learn the nonlinear dynamics that govern the quality characteristic, as well as their operating space, expressed by a linear embedding of the sensor traces’ features. Such Bayesian models predict a distribution for the target metric, such as a critical dimension, so one may assess the model’s credibility through its predictive uncertainty. Assuming measurements of the quality characteristic of interest are budgeted, we seek to hasten convergence of the GPR model to a credible form through an active sampling scheme, whereby the predictive uncertainty informs which wafer’s quality to measure next. We evaluate this convergence when predicting and updating online, as if in a factory, using a large dataset for plasma-enhanced chemical vapor deposition (PECVD), with measured thicknesses for ~32,000 wafers. By approximately optimizing the information extracted from this seemingly repetitive data describing a tightly controlled process, GPR achieves ~10% greater accuracy on average than a baseline linear model based on partial least squares (PLS). In a derivative study, we seek to discern the degree of drift in the process over the several months the data spans. We express this drift by how unusual the relevant features, as embedded by the GPR model, appear as the in- puts compensate for degrading conditions. This method detects the onset of consistently unusual behavior that extends to a bimodal thickness fault, anticipating its flagging by as much as two days.Mechanical Engineerin
A Diagram Is Worth A Dozen Images
Diagrams are common tools for representing complex concepts, relationships
and events, often when it would be difficult to portray the same information
with natural images. Understanding natural images has been extensively studied
in computer vision, while diagram understanding has received little attention.
In this paper, we study the problem of diagram interpretation and reasoning,
the challenging task of identifying the structure of a diagram and the
semantics of its constituents and their relationships. We introduce Diagram
Parse Graphs (DPG) as our representation to model the structure of diagrams. We
define syntactic parsing of diagrams as learning to infer DPGs for diagrams and
study semantic interpretation and reasoning of diagrams in the context of
diagram question answering. We devise an LSTM-based method for syntactic
parsing of diagrams and introduce a DPG-based attention model for diagram
question answering. We compile a new dataset of diagrams with exhaustive
annotations of constituents and relationships for over 5,000 diagrams and
15,000 questions and answers. Our results show the significance of our models
for syntactic parsing and question answering in diagrams using DPGs
- …