
    Collocation Lattices and Maximum Entropy Models
    Andrei Mikheev

    The maximum entropy framework proved to be expressive and powerful for statistical language modelling, but it suffers from the computational expensiveness of model building. The iterative scaling algorithm used for parameter estimation is computationally expensive, while the feature selection process requires estimating the parameters of the model for many candidate features many times. In this paper we present a novel approach for building maximum entropy models. Our approach uses a feature collocation lattice and selects the atomic features without resorting to iterative scaling. After the atomic features have been selected, we use iterative scaling to compile a fully saturated model for the maximal constraint space, and then start to delete the most specific constraints. Since during constraint deselection we have a fully fit maximum entropy model at every point, we rank the constraints on the basis of their weights in the model. Therefore we do not have to use iterative scaling during constraint ranking, and apply it only for linear model regression. Another important improvement is that, since the simplified model deviates from the previous, larger model only in a small number of constraints, we use the parameters of the old model as the initial values of the parameters for the iterative scaling of the new one. This proved to decrease the number of required iterations by about tenfold. As practical results we discuss how our method has been applied to several tasks of language modelling, such as sentence boundary disambiguation, part-of-speech tagging and automatic document abstracting.
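    The iterative scaling step the abstract refers to can be illustrated with a toy Generalized Iterative Scaling loop. The event space, features and target expectations below are invented for illustration and are not from the paper; the optional `weights` argument sketches the warm-start idea the abstract describes (reusing the old model's parameters as initial values after constraint deselection).

```python
import math

def gis(events, features, targets, weights=None, iters=500):
    """Toy Generalized Iterative Scaling for a joint maximum entropy model.

    events   : list of outcomes x
    features : list of indicator functions f(x) -> {0, 1}
    targets  : desired model expectation for each feature (the constraints)
    weights  : optional warm-start parameters; per the abstract, reusing the
               previous model's parameters after deselecting a few constraints
               cut the number of required iterations by about tenfold
    """
    lam = list(weights) if weights is not None else [0.0] * len(features)
    # GIS slack constant: the largest number of active features on any event
    C = max(sum(f(x) for f in features) for x in events)
    for _ in range(iters):
        # model distribution: p(x) proportional to exp(sum_i lam_i * f_i(x))
        scores = [math.exp(sum(l * f(x) for l, f in zip(lam, features)))
                  for x in events]
        Z = sum(scores)
        p = [s / Z for s in scores]
        # current model expectation of each feature
        exps = [sum(pi * f(x) for pi, x in zip(p, events)) for f in features]
        # damped multiplicative GIS update, written in log space
        lam = [l + math.log(t / e) / C
               for l, t, e in zip(lam, targets, exps)]
    return lam, p

# Illustrative two-bit event space with one atomic feature per bit.
events = [0, 1, 2, 3]
features = [lambda x: x & 1, lambda x: (x >> 1) & 1]
lam, p = gis(events, features, targets=[0.7, 0.4])
```

    After fitting, the model expectations of the two features converge to the 0.7 and 0.4 constraints; calling `gis` again with `weights=lam` after adding or removing a constraint is the warm-start trick that makes repeated refitting during deselection cheap.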