157 research outputs found
Causal Inference by Stochastic Complexity
The algorithmic Markov condition states that the most likely causal direction
between two random variables X and Y can be identified as that direction with
the lowest Kolmogorov complexity. Due to the halting problem, however, this
notion is not computable.
We hence propose to do causal inference by stochastic complexity. That is, we
propose to approximate Kolmogorov complexity via the Minimum Description Length
(MDL) principle, using a score that is mini-max optimal with regard to the
model class under consideration. This means that even in an adversarial
setting, such as when the true distribution is not in this class, we still
obtain the optimal encoding for the data relative to the class.
We instantiate this framework, which we call CISC, for pairs of univariate
discrete variables, using the class of multinomial distributions. Experiments
show that CISC is highly accurate on synthetic, benchmark, as well as
real-world data, outperforming the state of the art by a margin, and scales
extremely well with regard to sample and domain sizes
Methods and algorithms for unsupervised learning of morphology
This is an accepted manuscript of a chapter published by Springer in Computational Linguistics and Intelligent Text Processing. CICLing 2014. Lecture Notes in Computer Science, vol 8403 in 2014 available online: https://doi.org/10.1007/978-3-642-54906-9_15
The accepted version of the publication may differ from the final published version.This paper is a survey of methods and algorithms for unsupervised learning of morphology. We provide a description of the methods and algorithms used for morphological segmentation from a computational linguistics point of view. We survey morphological segmentation methods covering methods based on MDL (minimum description length), MLE (maximum likelihood estimation), MAP (maximum a posteriori), parametric and non-parametric Bayesian approaches. A review of the evaluation schemes for unsupervised morphological segmentation is also provided along with a summary of evaluation results on the Morpho Challenge evaluations.Published versio
- …