Text-based language identification (LID) is the task of determining the language a piece of text
is written in. Although modern LID tools achieve high accuracy using the widely accepted
n-gram method, there are several areas of LID that remain more difficult, particularly the task
of distinguishing between closely related languages. Langscape, a project of the University
of Maryland's Language Science Center, has an LID tool that uses a variation on the n-gram
method. In this thesis, I propose and test a modification to Langscape's LID tool to improve its
ability to distinguish between closely related languages.