3 research outputs found

    Spelling Correction for Estonian Learner Language


    Soft ontologies, spatial representations and multi-perspective exploration

    Abstract: Analysing many contemporary media applications in terms of conventional rigid ontologies, which rely on expertise-based fixed categories and hierarchical structure, runs counter to their dynamically evolving nature. Many of these applications rely on shared 'folksonomies', personal descriptions of information and objects for one's own retrieval. Such applications involve many feedback mechanisms via the community and have been shown to have the emergent properties of complex dynamic systems. We propose that such dynamically evolving information domains can be more usefully described by means of a soft ontology, a dynamically flexible and inherently spatial metadata approach for ill-defined domains. Our contribution is (1) the elaboration of the so far intuitive concept of soft ontology in a way that supports conceptualizing dynamically evolving domains. Further, our approach proposes (2) a new mode of interaction with information domains: recurring exploration of an information domain from multiple perspectives in search of a more comprehensive understanding of it, i.e. multi-perspective exploration. We demonstrate this concept with an example of collaborative tagging in an educational context.

    Eesti keele sõnajärje vealeidja prototüübi arendamine [The development of the prototype for an automatic word order error detector for the Estonian language]

    The article presents the possibilities for recognizing word order errors in Estonian, the methods used, and the current results. It concentrates on the prototype of an automatic word order error detector for Estonian developed at Tallinn University. The statistics-based program uses a method similar to n-grams, and its rules are patterns formed from 9 compulsory parts of a sentence. The set of correct word order patterns was extracted from the fiction sub-corpus of Tartu University's Corpus of Written Estonian. To obtain statistically reliable results and to maximize the efficiency and speed of the program, the rules were placed in a tree structure. The prototype starts its search by finding a proper initial tag and then looks for a correct compatible pattern with the highest frequency. At the current stage, the work focuses on detecting the right or wrong position of the finite or non-finite verb and the predicative (since Estonian is most commonly known as a verb-second language). The prototype's efficiency was tested on texts from an Estonian learner language corpus. In the test described in the article, 5880 sentences were analyzed with the error analyzer and 300 sentences of the output were assessed. The prototype judged the correctness of the word order properly in 87.82% of the cases. Although a number of problems still need to be solved, including misspelled or unknown words (e.g. proper nouns) and erroneously unmarked clause boundaries, the method and algorithm of the prototype could also be used in word order studies of other languages. The article concludes with a survey of the problems that occurred in word order error detection and possible ways to make the detector more efficient.
    Keywords: morphosyntax, automatic error detection, word order errors
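
The tree lookup described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the tag names (`S`, `V`, `O`, `A`), the pattern frequencies, the class `PatternTree` and the function `check_order` are all hypothetical, and "compatible" is assumed here to mean a stored pattern that starts with the same initial tag and uses exactly the same tags as the sentence.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of the pattern tree; count > 0 marks the end of a stored pattern."""
    children: dict = field(default_factory=dict)
    count: int = 0


class PatternTree:
    """Hypothetical prefix tree of word order patterns with corpus frequencies."""

    def __init__(self):
        self.root = Node()

    def add(self, pattern, freq=1):
        node = self.root
        for tag in pattern:
            node = node.children.setdefault(tag, Node())
        node.count += freq

    def frequency(self, pattern):
        """Return the stored frequency of an exact pattern, or 0 if unseen."""
        node = self.root
        for tag in pattern:
            node = node.children.get(tag)
            if node is None:
                return 0
        return node.count

    def best_compatible(self, tags):
        """Most frequent stored pattern with the same initial tag and the same tags.

        Assumption: 'compatible' means a reordering of the sentence's own tags.
        """
        if not tags:
            return None
        start = self.root.children.get(tags[0])
        if start is None:
            return None
        remaining = Counter(tags)
        remaining[tags[0]] -= 1
        best_freq, best_path = 0, None
        stack = [(start, (tags[0],), remaining)]
        while stack:
            node, path, rem = stack.pop()
            if len(path) == len(tags):
                if node.count > best_freq:
                    best_freq, best_path = node.count, path
                continue
            for tag, child in node.children.items():
                if rem[tag] > 0:  # only follow tags the sentence still has
                    nxt = rem.copy()
                    nxt[tag] -= 1
                    stack.append((child, path + (tag,), nxt))
        return best_path


def check_order(tree, tags):
    """Return (attested, suggestion) for a sequence of constituent tags."""
    tags = tuple(tags)
    return tree.frequency(tags) > 0, tree.best_compatible(tags)


# Invented toy frequencies, for illustration only.
tree = PatternTree()
tree.add(("S", "V", "O"), 50)
tree.add(("S", "V", "O", "A"), 40)
tree.add(("S", "V", "A", "O"), 20)
tree.add(("A", "V", "S", "O"), 30)

# An adverbial-initial sentence with the verb in third position is unattested;
# the verb-second pattern is offered as the suggestion.
print(check_order(tree, ("A", "S", "V", "O")))
```

Storing the patterns in a prefix tree means the search only follows branches that share the sentence's initial tag, which is one plausible reading of why the abstract mentions a tree structure for speed.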