
    Collocation Lattices and Maximum Entropy Models
    Andrei Mikheev

    The maximum entropy framework proved to be expressive and powerful for statistical language modelling, but it suffers from the computational expensiveness of model building. The iterative scaling algorithm used for parameter estimation is computationally expensive, while the feature selection process requires estimating the parameters of the model for many candidate features many times. In this paper we present a novel approach for building maximum entropy models. Our approach uses a feature collocation lattice and selects the atomic features without resorting to iterative scaling. After the atomic features have been selected, we use iterative scaling to compile a fully saturated model for the maximal constraint space, and then start to delete the most specific constraints. Since during constraint deselection we have a fully fit maximum entropy model at every point, we rank the constraints on the basis of their weights in the model. Therefore we do not have to use iterative scaling during constraint ranking, and apply it only for linear model regression. Another important improvement is that, since the simplified model deviates from the previous, larger model only in a small number of constraints, we use the parameters of the old model as the initial values of the parameters for the iterative scaling of the new one. This proved to decrease the number of required iterations by about tenfold. As practical results we discuss how our method has been applied to several tasks of language modelling, such as sentence boundary disambiguation, part-of-speech tagging and automatic document abstracting.
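    The iterative scaling step the abstract refers to can be illustrated with a toy Generalized Iterative Scaling loop. The event space, features and target expectations below are invented for illustration and are not from the paper; the optional `weights` argument sketches the warm-start idea the abstract describes (reusing the old model's parameters as initial values after constraint deselection).

```python
import math

def gis(events, features, targets, weights=None, iters=500):
    """Toy Generalized Iterative Scaling for a joint maximum entropy model.

    events   : list of outcomes x
    features : list of indicator functions f(x) -> {0, 1}
    targets  : desired model expectation for each feature (the constraints)
    weights  : optional warm-start parameters; per the abstract, reusing the
               previous model's parameters after deselecting a few constraints
               cut the number of required iterations by about tenfold
    """
    lam = list(weights) if weights is not None else [0.0] * len(features)
    # GIS slack constant: the largest number of active features on any event
    C = max(sum(f(x) for f in features) for x in events)
    for _ in range(iters):
        # model distribution: p(x) proportional to exp(sum_i lam_i * f_i(x))
        scores = [math.exp(sum(l * f(x) for l, f in zip(lam, features)))
                  for x in events]
        Z = sum(scores)
        p = [s / Z for s in scores]
        # current model expectation of each feature
        exps = [sum(pi * f(x) for pi, x in zip(p, events)) for f in features]
        # damped multiplicative GIS update, written in log space
        lam = [l + math.log(t / e) / C
               for l, t, e in zip(lam, targets, exps)]
    return lam, p

# Illustrative two-bit event space with one atomic feature per bit.
events = [0, 1, 2, 3]
features = [lambda x: x & 1, lambda x: (x >> 1) & 1]
lam, p = gis(events, features, targets=[0.7, 0.4])
```

    After fitting, the model expectations of the two features converge to the 0.7 and 0.4 constraints; calling `gis` again with `weights=lam` after adding or removing a constraint is the warm-start trick that makes repeated refitting during deselection cheap.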