Abstract

Despite the many initiatives in recent years aimed at creating Language Engineering standards, different projects still tend to adopt different approaches and frequently define their own standards. Even within a single project, different tools often require different representations of their linguistic data. In a recently started EU project focusing on the integration of Information Extraction and Data Mining techniques, we aim to avoid incompatibility among tools by defining a Common Annotation Scheme internal to the project. However, when the project started (September 2002) we were unaware of the standardization effort of ISO TC37/SC4, and so we set out, once again, to define our own schema. Fortunately, since this work is still at an early stage (the project will run until 2005), it can still be redirected so that it is compatible with the ISO standardization work. In this paper we describe the status of the work in the project and explore possible synergies with the work of ISO TC37/SC4.

    Last time updated on 09/07/2013