Determining the Unithood of Word Sequences using Mutual Information and Independence Measure

Abstract

Most work on unithood has been conducted as part of a larger effort to determine termhood. Consequently, very few independent studies examine the notion of unithood and produce dedicated techniques for measuring it. We propose a new approach, independent of any influence of termhood, that provides dedicated measures to gather linguistic evidence from parsed text and statistical evidence from the Google search engine for the measurement of unithood. Our evaluations revealed a precision of 98.68% and a recall of 91.82%, with an accuracy of 95.42%, in measuring the unithood of 1005 test cases.

Comment: More information is available at http://explorer.csse.uwa.edu.au/reference
