1,099 research outputs found

    Text Categorization and Machine Learning Methods: Current State Of The Art

    In this information age, many documents are available in digital form and need to be classified. To address this problem, researchers have focused on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of pre-classified documents, the characteristics of the categories. The main benefits of this approach over the manual definition of a classifier by domain experts are effectiveness, reduced expert effort and straightforward portability to different domains. The paper examines the main approaches to text categorization, comparing the machine learning paradigm and the present state of the art. Various issues pertaining to three different text similarity problems, namely semantic, conceptual and contextual, are also discussed.
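
The inductive process described in this abstract can be illustrated with a small sketch. The abstract does not name a specific learner, so the nearest-centroid classifier with cosine similarity used here is an assumption chosen for brevity, and the function names and toy documents are hypothetical.

```python
from collections import Counter
import math


def bag_of_words(text):
    """Lower-case, whitespace-tokenised term counts (a deliberately naive tokenizer)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(labelled_docs):
    """Learn one summed term-count vector (centroid) per category."""
    centroids = {}
    for text, label in labelled_docs:
        centroids.setdefault(label, Counter()).update(bag_of_words(text))
    return centroids


def classify(centroids, text):
    """Assign the category whose centroid is most similar to the document."""
    vec = bag_of_words(text)
    return max(centroids, key=lambda label: cosine(centroids[label], vec))


# Hypothetical pre-classified documents.
training_set = [
    ("the match ended with a late goal", "sport"),
    ("the striker scored a goal", "sport"),
    ("the bank raised interest rates", "finance"),
    ("markets fell as rates rose", "finance"),
]
model = train(training_set)
```

Calling `classify(model, "interest rates and markets")` assigns the new document to whichever learned category it resembles most. No expert-written rules are involved, which is the portability argument the abstract makes.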

    An Evidential Fractal Analytic Hierarchy Process Target Recognition Method

    Target recognition in uncertain environments is a hot issue, especially in extremely uncertain situations where both the target attribution and the sensor report are not clearly represented. To address this issue, a model is proposed that combines fractal theory, Dempster-Shafer evidence theory and the analytic hierarchy process (AHP) to classify objects with incomplete information. The basic probability assignment (BPA), or belief function, can be modelled by a conductivity function. The weight of each BPA is determined by AHP. Finally, the collected data are discounted with the weights. The feasibility and validity of the proposed model are verified by an evidential classifier case in which sensory data are incomplete and collected from multiple levels of granularity. The proposed fusion algorithm takes advantage of not only efficient modelling of uncertain information, but also efficient combination of uncertain information.
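
The discount-then-combine step mentioned in this abstract can be sketched as follows. The abstract does not state which discounting or combination rules the authors use, so this sketch assumes Shafer's classical discounting and Dempster's rule of combination. The sensor BPAs and weights are hypothetical numbers, not values derived from AHP or from the paper.

```python
from itertools import product


def discount(bpa, weight, frame):
    """Shafer discounting: scale each mass by `weight`, move the rest to the whole frame."""
    out = {s: weight * m for s, m in bpa.items() if s != frame}
    out[frame] = 1.0 - weight + weight * bpa.get(frame, 0.0)
    return out


def combine(m1, m2):
    """Dempster's rule: multiply masses, keep non-empty intersections, renormalise."""
    fused = {}
    conflict = 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            fused[inter] = fused.get(inter, 0.0) + x * y
        else:
            conflict += x * y  # mass that falls on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    return {s: m / (1.0 - conflict) for s, m in fused.items()}


# Hypothetical frame of discernment and sensor reports.
A, B = frozenset({"A"}), frozenset({"B"})
FRAME = A | B
sensor_1 = {A: 0.8, FRAME: 0.2}
sensor_2 = {B: 0.6, FRAME: 0.4}

# Hypothetical reliability weights (in the paper these would come from AHP).
d1 = discount(sensor_1, 1.0, FRAME)
d2 = discount(sensor_2, 0.5, FRAME)
fused = combine(d1, d2)
```

With these assumed inputs, halving the weight of the second sensor shifts part of its belief in `B` onto the whole frame, so the fused result favours `A` more strongly than an undiscounted combination would.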

    From n-grams to n-sets: A Fuzzy-Logic-Based Approach to Shakespearian Authorship Attribution.

    This thesis surveys the principles of Fuzzy Logic as they have been applied in the last three decades in the micro-electronic field and, in the context of resolving problems of authorship verification and attribution, shows how these principles can assist with the detection of stylistic similarities or dissimilarities of an anonymous, disputed play to an author’s general or patterns-based known style. The main stylistic markers are the counts of semantic sets of 100 individual word tokens and an index of counts of these words’ frequencies (a cosine index), as found in the first extract of approximately 10,000 words of each of 27 well-attributed Shakespearian plays. Based on these markers, their geometrical representation and fuzzy modelling, and on the grounds of Set Theory and Boolean Algebra, in the core part of this thesis three Mamdani (Type-1) genre-based Fuzzy Expert Systems were built for the detection of degrees (measured on a scale from 0 to 1) of Shakespearianness of disputed and, probably, co-authored plays of the early modern English period. Each of these three expert systems is composed of seven input and two output variables that are associated through a set of approximately 30 to 40 rules. There is a detailed description of the properties of the three expert systems’ inference mechanisms and the various experimentation phases. There is also an indicative graphical analysis of the phases of the experimentation and a thorough explanation of terms, such as partial truths membership, approximate reasoning and output centroids on an X-axis of a two-dimensional space.
Throughout the thesis there is an extensive demonstration of various Fuzzy Logic techniques, including Sugeno-ANFIS (adaptive neuro-fuzzy inference system), with which the style of Shakespeare can be modelled in order to compare it with well-attributed plays of other authors or plays that are not included in the strict Shakespearian canon of the selected 27 well-attributed, sole-authored plays. In addition, other relevant issues of stylometric concern are discussed, such as the investigation and classification of known ‘problem’ and disputed plays through holistic classifiers (irrespective of genre). The results of the experimentation advocate the use of this novel, automated and computer simulation-based method of classification in the stylometric field for various purposes. In fact, the three models have succeeded in detecting the low Shakespearianness of non-Shakespearian plays, and the results they provided for anonymous, disputed plays are in conformance with the general evidence of historical scholarship. Therefore, the original contribution of this thesis is to define fully functional automated fuzzy classifiers of Shakespearianness. The result of this discovery is that we now know that the principles of fuzzy modelling can be applied for the creation of Fuzzy Expert Stylistic Classifiers and the concomitant detection of degrees of similarity of a play under scrutiny with the general or patterns-based known style of a specific author (in our case, Shakespeare). Furthermore, this thesis shows that, given certain premises, counts of words’ frequencies and counts of semantic sets of words can be employed satisfactorily for stylistic discrimination.
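
To make the terms membership, rule firing and output centroid more concrete, here is a minimal Mamdani-style sketch. It is not the thesis's system: the thesis describes seven inputs, two outputs and roughly 30 to 40 rules, whereas this toy uses one hypothetical input (a stylistic index scaled to 0..1), one output, two rules, and assumed triangular membership functions.

```python
def tri(x, a, b, c):
    """Triangular membership with feet a and c and peak b (a <= b <= c)."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def low(v):
    """Assumed 'low' fuzzy set on [0, 1], peaking at 0."""
    return tri(v, 0.0, 0.0, 1.0)


def high(v):
    """Assumed 'high' fuzzy set on [0, 1], peaking at 1."""
    return tri(v, 0.0, 1.0, 1.0)


def infer(index, steps=100):
    """Two-rule Mamdani inference with centroid defuzzification.

    Rule 1: IF index is low  THEN score is low.
    Rule 2: IF index is high THEN score is high.
    """
    fire_low, fire_high = low(index), high(index)
    numerator = denominator = 0.0
    for i in range(steps + 1):
        y = i / steps
        # Clip each output set at its rule's firing strength, aggregate with max.
        mu = max(min(fire_low, low(y)), min(fire_high, high(y)))
        numerator += y * mu
        denominator += mu
    # Centroid of the aggregated output set along the output axis.
    return numerator / denominator if denominator else 0.5


score = infer(0.9)
```

The returned score is a degree between 0 and 1 and not a yes/no verdict, which is the kind of graded output the abstract calls a degree of Shakespearianness.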