A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena
Word reordering is one of the most difficult aspects of statistical machine
translation (SMT), and an important factor of its quality and efficiency.
Despite the vast amount of research published to date, the interest of the
community in this problem has not decreased, and no single method appears to be
strongly dominant across language pairs. Instead, the choice of the optimal
approach for a new translation task still seems to be mostly driven by
empirical trials. To orientate the reader in this vast and complex research
area, we present a comprehensive survey of word reordering viewed as a
statistical modeling challenge and as a natural language phenomenon. The survey
describes in detail how word reordering is modeled within different
string-based and tree-based SMT frameworks and as a stand-alone task, including
systematic overviews of the literature in advanced reordering modeling. We then
question why some approaches are more successful than others in different
language pairs. We argue that, besides measuring the amount of reordering, it
is important to understand which kinds of reordering occur in a given language
pair. To this end, we conduct a qualitative analysis of word reordering
phenomena in a diverse sample of language pairs, based on a large collection of
linguistic knowledge. Empirical results in the SMT literature are shown to
support the hypothesis that a few linguistic facts can be very useful to
anticipate the reordering characteristics of a language pair and to select the
SMT framework that best suits them. Comment: 44 pages, to appear in Computational Linguistics
FO2(<,+1,~) on data trees, data tree automata and branching vector addition systems
A data tree is an unranked ordered tree where each node carries a label from
a finite alphabet and a datum from some infinite domain. We consider the two
variable first order logic FO2(<,+1,~) over data trees. Here +1 refers to the
child and the next sibling relations while < refers to the descendant and
following sibling relations. Moreover, ~ is a binary predicate testing data
equality. We exhibit an automata model, denoted DAD#, that is more expressive
than FO2(<,+1,~) but such that emptiness of DAD# and satisfiability of
FO2(<,+1,~) are inter-reducible. This is proved via a model of counter tree
automata, denoted EBVASS, that extends Branching Vector Addition Systems with
States (BVASS) with extra features for merging counters. We show that, as
decision problems, reachability for EBVASS, satisfiability of FO2(<,+1,~) and
emptiness of DAD# are equivalent.
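The data tree definition and the three kinds of predicates above can be made concrete with a small sketch. Everything here is illustrative and not taken from the paper: the names `Node`, `nodes`, `is_child`, `is_descendant`, `next_sibling_pairs`, `same_datum` and `has_descendant_with_same_datum` are hypothetical, and only the descendant half of < is exercised in the example property.

```python
from dataclasses import dataclass, field
from typing import Any, Iterator, List, Tuple


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    label: str  # letter from a finite alphabet
    datum: Any  # value from an (in principle) infinite domain
    children: List["Node"] = field(default_factory=list)  # ordered, unranked


def nodes(root: Node) -> Iterator[Node]:
    """Yield every node of the tree in pre-order."""
    yield root
    for child in root.children:
        yield from nodes(child)


def is_child(x: Node, y: Node) -> bool:
    """Vertical part of +1: y is a child of x."""
    return any(c is y for c in x.children)


def is_descendant(x: Node, y: Node) -> bool:
    """Vertical part of <: y is a strict descendant of x."""
    return any(c is y or is_descendant(c, y) for c in x.children)


def next_sibling_pairs(root: Node) -> Iterator[Tuple[Node, Node]]:
    """Horizontal part of +1: all pairs (a, b) where b is the next sibling of a."""
    for n in nodes(root):
        yield from zip(n.children, n.children[1:])


def same_datum(x: Node, y: Node) -> bool:
    """The ~ predicate: x and y carry the same datum."""
    return x.datum == y.datum


def has_descendant_with_same_datum(root: Node) -> bool:
    """Brute-force check of the two-variable property: exists x, y with x < y and x ~ y."""
    return any(
        is_descendant(x, y) and same_datum(x, y)
        for x in nodes(root)
        for y in nodes(root)
    )
```

This only evaluates one fixed property on one given tree by enumeration; it says nothing about satisfiability, which is the decision problem the abstract addresses.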
Causal Effect Random Forest Of Interaction Trees For Learning Individualized Treatment Regimes In Observational Studies: With Applications To Education Study Data
Learning individualized treatment regimes (ITR) using observational data holds great interest in various fields, as treatment recommendations based on individual characteristics may improve individual treatment benefits at a reduced cost. It has long been observed that different individuals may respond to a certain treatment with significant heterogeneity. An ITR can be defined as a mapping from individual characteristics to a treatment assignment. The optimal ITR is the treatment assignment that maximizes expected individual treatment effects. Rooted in personalized medicine, many studies and applications of ITR are in medical fields and clinical practice. Heterogeneous responses are also well documented in educational interventions. However, unlike efficacy studies in medicine, educational interventions are often not randomized. Study results often suffer greatly from self-selection bias. Besides the intervention itself, the efficacy and effectiveness of interventions usually interact with a wide range of confounders.
In this study, we propose a novel algorithm to extend random forest of interaction trees to Causal Effect Random Forest of Interaction Trees (CERFIT) for learning individualized treatment effects and regimes. We first consider the study under a binary treatment setting. Each interaction tree recursively partitions the data into two subgroups with the greatest heterogeneity of treatment effect. By integrating the propensity score into the tree growing process, subgroups from the proposed CERFIT not only have maximized treatment effect differences, but also similar baseline covariates. Thus it allows for the estimation of individualized treatment effects using observational data. In addition, we also propose to use residuals from linear models instead of the original responses in the algorithm. By doing so, the numerical stability of the algorithm is greatly improved, which leads to improved prediction accuracy. We then consider the learning problem under non-binary treatment settings. For multiple treatments, by recursively partitioning the data into two subgroups with the greatest treatment effect heterogeneity with respect to two randomly selected treatment groups, the algorithm transforms the multiple-treatment ITR learning problem into a binary task. Similarly, continuous treatment can be handled by recursively partitioning the data into subgroups with the greatest homogeneity in terms of the association between the response and the treatment within a child node. For all treatment settings, CERFIT provides a variable importance ranking in terms of treatment effects. Extensive simulation studies for assessing estimation accuracy and variable importance ranking are presented. CERFIT demonstrates competitive performance among all competing methods in simulation studies.
The methods are also illustrated through an assessment of a voluntary education intervention for the binary treatment setting and through learning the optimal ITR among multiple interventions for non-binary treatments, using data from a large public university.
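The idea of splitting the data into two subgroups with the greatest treatment effect heterogeneity can be sketched for a single numeric covariate and a binary treatment. This is a deliberately simplified illustration, not the CERFIT algorithm: it uses a plain difference-in-means effect estimate and a squared-difference score (both assumptions), and it omits the propensity score adjustment and the residualization described above. The names `effect` and `best_split` are hypothetical.

```python
from statistics import mean
from typing import List, Optional, Tuple


def effect(y: List[float], t: List[int]) -> Optional[float]:
    """Difference in mean response between treated (t == 1) and control (t == 0)."""
    treated = [yi for yi, ti in zip(y, t) if ti == 1]
    control = [yi for yi, ti in zip(y, t) if ti == 0]
    if not treated or not control:
        return None  # effect is not estimable without both groups
    return mean(treated) - mean(control)


def best_split(
    x: List[float], y: List[float], t: List[int]
) -> Optional[Tuple[float, float]]:
    """Return (threshold, score) maximizing the squared difference in
    estimated treatment effect between the children x <= c and x > c."""
    best = None
    for c in sorted(set(x))[:-1]:  # the largest value would leave the right child empty
        left = [i for i, xi in enumerate(x) if xi <= c]
        right = [i for i, xi in enumerate(x) if xi > c]
        e_left = effect([y[i] for i in left], [t[i] for i in left])
        e_right = effect([y[i] for i in right], [t[i] for i in right])
        if e_left is None or e_right is None:
            continue
        score = (e_left - e_right) ** 2
        if best is None or score > best[1]:
            best = (c, score)
    return best
```

Applying such a search recursively to each child would grow one tree; with observational data the naive difference in means is confounded, which is the gap the propensity score integration in the abstract is meant to address.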
Random intersection trees
Finding interactions between variables in large and high-dimensional datasets
is often a serious computational challenge. Most approaches build up
interaction sets incrementally, adding variables in a greedy fashion. The
drawback is that potentially informative high-order interactions may be
overlooked. Here, we propose an alternative approach for classification
problems with binary predictor variables, called Random Intersection Trees. It
works by starting with a maximal interaction that includes all variables, and
then gradually removing variables if they fail to appear in randomly chosen
observations of a class of interest. We show that informative interactions are
retained with high probability, and the computational complexity of our
procedure is of order $p^\kappa$ for a value of $\kappa$ that can reach values
as low as 1 for very sparse data; in many more general settings, it will still
beat the exponent $s$ obtained when using a brute force search constrained to
order $s$ interactions. In addition, by using some new ideas based on min-wise
hash schemes, we are able to further reduce the computational cost.
Interactions found by our algorithm can be used for predictive modelling in
various forms, but they are also often of interest in their own right as useful
characterisations of what distinguishes a certain class from others. This is the author's accepted manuscript. The final version of the manuscript can be found in the Journal of Machine Learning Research here: jmlr.csail.mit.edu/papers/volume15/shah14a/shah14a.pdf
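The procedure described above (start from the set of all variables, then shrink it by intersecting with randomly chosen observations of the class of interest) can be sketched as follows. This is a minimal illustration under stated assumptions: each observation is represented as the set of variables equal to 1, the tree has a fixed `depth` and branching factor `branch` (both hypothetical parameters), and the min-wise hashing speed-up and any comparison against other classes are left out. The function name `random_intersection_tree` is also hypothetical.

```python
import random
from typing import FrozenSet, Iterable, List, Optional, Set


def random_intersection_tree(
    observations: List[FrozenSet[int]],
    all_vars: Iterable[int],
    depth: int = 3,
    branch: int = 2,
    rng: Optional[random.Random] = None,
) -> Set[FrozenSet[int]]:
    """Grow one tree of intersections and return the candidate
    interaction sets found at its deepest level."""
    rng = rng or random.Random(0)
    level = [frozenset(all_vars)]  # root: the maximal interaction
    for _ in range(depth):
        next_level = []
        for candidate in level:
            for _ in range(branch):
                obs = rng.choice(observations)  # random class-of-interest observation
                child = candidate & obs  # drop variables absent from this observation
                if child:  # empty intersections carry no information
                    next_level.append(child)
        level = next_level
    return set(level)
```

A variable set that is active in every sampled observation can never be removed by an intersection, which is the intuition behind informative interactions being retained; in practice one would grow many such trees and pool the surviving sets.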
Sign Spotting using Hierarchical Sequential Patterns with Temporal Intervals
This paper tackles the problem of spotting a set of signs occurring in videos with sequences of signs. To achieve this, we propose to model the spatio-temporal signatures of a sign using an extension of sequential patterns that contain temporal intervals called Sequential Interval Patterns (SIP). We then propose a novel multi-class classifier that organises different sequential interval patterns in a hierarchical tree structure called a Hierarchical SIP Tree (HSP-Tree). This allows one to exploit any subsequence sharing that exists between different SIPs of different classes. Multiple trees are then combined together into a forest of HSP-Trees resulting in a strong classifier that can be used to spot signs. We then show how the HSP-Forest can be used to spot sequences of signs that occur in an input video. We have evaluated the method on both concatenated sequences of isolated signs and continuous sign sequences. We also show that the proposed method is superior in robustness and accuracy to a state-of-the-art sign recogniser when applied to spotting a sequence of signs. This work was funded by the UK government.