4,664 research outputs found

    Probing the topological properties of complex networks modeling short written texts

    Get PDF
    In recent years, graph theory has been widely employed to probe several language properties. More specifically, the so-called word adjacency model has been proven useful for tackling several practical problems, especially those relying on textual stylistic analysis. The most common approach to treat texts as networks has simply considered either large pieces of texts or entire books. This approach has certainly worked well -- many informative discoveries have been made this way -- but it raises an uncomfortable question: could there be important topological patterns in small pieces of texts? To address this problem, the topological properties of subtexts sampled from entire books was probed. Statistical analyzes performed on a dataset comprising 50 novels revealed that most of the traditional topological measurements are stable for short subtexts. When the performance of the authorship recognition task was analyzed, it was found that a proper sampling yields a discriminability similar to the one found with full texts. Surprisingly, the support vector machine classification based on the characterization of short texts outperformed the one performed with entire books. These findings suggest that a local topological analysis of large documents might improve its global characterization. Most importantly, it was verified, as a proof of principle, that short texts can be analyzed with the methods and concepts of complex networks. As a consequence, the techniques described here can be extended in a straightforward fashion to analyze texts as time-varying complex networks

    Formally Specifying and Proving Operational Aspects of Forensic Lucid in Isabelle

    Get PDF
    A Forensic Lucid intensional programming language has been proposed for intensional cyberforensic analysis. In large part, the language is based on various predecessor and codecessor Lucid dialects bound by the higher-order intensional logic (HOIL) that is behind them. This work formally specifies the operational aspects of the Forensic Lucid language and compiles a theory of its constructs using Isabelle, a proof assistant system.Comment: 23 pages, 3 listings, 3 figures, 1 table, 1 Appendix with theorems, pp. 76--98. TPHOLs 2008 Emerging Trends Proceedings, August 18-21, Montreal, Canada. Editors: Otmane Ait Mohamed and Cesar Munoz and Sofiene Tahar. The individual paper's PDF is at http://users.encs.concordia.ca/~tphols08/TPHOLs2008/ET/76-98.pd

    A proposed translator writing system language - Computer project, volume 3, no. 1

    Get PDF
    Programming language for advanced translator writing syste

    Kolmogorov Complexity in perspective. Part II: Classification, Information Processing and Duality

    Get PDF
    We survey diverse approaches to the notion of information: from Shannon entropy to Kolmogorov complexity. Two of the main applications of Kolmogorov complexity are presented: randomness and classification. The survey is divided in two parts published in a same volume. Part II is dedicated to the relation between logic and information system, within the scope of Kolmogorov algorithmic information theory. We present a recent application of Kolmogorov complexity: classification using compression, an idea with provocative implementation by authors such as Bennett, Vitanyi and Cilibrasi. This stresses how Kolmogorov complexity, besides being a foundation to randomness, is also related to classification. Another approach to classification is also considered: the so-called "Google classification". It uses another original and attractive idea which is connected to the classification using compression and to Kolmogorov complexity from a conceptual point of view. We present and unify these different approaches to classification in terms of Bottom-Up versus Top-Down operational modes, of which we point the fundamental principles and the underlying duality. We look at the way these two dual modes are used in different approaches to information system, particularly the relational model for database introduced by Codd in the 70's. This allows to point out diverse forms of a fundamental duality. These operational modes are also reinterpreted in the context of the comprehension schema of axiomatic set theory ZF. This leads us to develop how Kolmogorov's complexity is linked to intensionality, abstraction, classification and information system.Comment: 43 page
    • …
    corecore