
Evaluating effectiveness of linguistic technologies of knowledge identification in text collections

Abstract

This paper analyzes whether integral coefficients of recall and precision can be used to evaluate the effectiveness of linguistic technologies for identifying knowledge in texts. The approach is based on the method of test collections, which is used to validate the obtained effectiveness coefficients experimentally, and on methods of mathematical statistics. The paper studies the problem of maximizing the reliability of sample results when they are extrapolated to the general population of the tested text collection. It considers a method for determining the confidence interval for an attribute proportion, based on Wilson's formula, and a method for determining the required size of the relevant sample under a specified relative error and confidence probability.
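The two statistical steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact formulas, so this assumes the standard Wilson score interval for a proportion and the common normal-approximation sample size for a given relative error, n = z^2 (1 - p) / (e^2 p). The function names `wilson_interval` and `sample_size_relative_error` and their parameters are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a proportion (standard textbook form).

    Assumption: the paper's "Wilson's formula" is this interval; the
    abstract does not spell it out.
    """
    if n <= 0:
        raise ValueError("sample size n must be positive")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = successes / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


def sample_size_relative_error(p_expected, rel_error, confidence=0.95):
    """Sample size so that the estimate of p has the given relative error.

    Assumption: normal approximation, n = z^2 * (1 - p) / (rel_error^2 * p).
    The paper may use a different (for example Wilson-based) derivation.
    """
    if not 0 < p_expected < 1:
        raise ValueError("p_expected must lie strictly between 0 and 1")
    if rel_error <= 0:
        raise ValueError("rel_error must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z * z * (1 - p_expected) / (rel_error ** 2 * p_expected))


if __name__ == "__main__":
    # Hypothetical sample: 80 of 100 checked items judged correct.
    low, high = wilson_interval(80, 100)
    print(f"95% Wilson interval: [{low:.4f}, {high:.4f}]")
    print("Required sample size:", sample_size_relative_error(0.5, 0.1))
```

In this sketch the proportion could stand for precision measured on a sample of extracted items. The Wilson interval is generally preferred to the simple normal interval when the proportion is close to 0 or 1 or when the sample is small, which is a typical situation for precision and recall estimates on test collections.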
