4 research outputs found

    Unsupervised Bilingual Segmentation using MDL for Machine Translation

    Can MDL Improve Unsupervised Chinese Word Segmentation?

    It is often assumed that Minimum Description Length (MDL) is a good criterion for unsupervised word segmentation. In this paper, we introduce a new approach to unsupervised word segmentation of Mandarin Chinese that leads to segmentations whose Description Length is lower than what can be obtained using other algorithms previously proposed in the literature. Surprisingly, we show that this lower Description Length does not necessarily correspond to better segmentation results. Finally, we show that we can use very basic linguistic knowledge to coerce the MDL towards a linguistically plausible hypothesis and obtain better results than any previously proposed method for unsupervised Chinese word segmentation, with minimal human effort.

    A usage-based model for the acquisition of syntactic constructions and its application in spoken language understanding

    Gaspers J. A usage-based model for the acquisition of syntactic constructions and its application in spoken language understanding. Bielefeld: Universitätsbibliothek Bielefeld; 2014