4,664 research outputs found
Probing the topological properties of complex networks modeling short written texts
In recent years, graph theory has been widely employed to probe several
language properties. More specifically, the so-called word adjacency model has
been proven useful for tackling several practical problems, especially those
relying on textual stylistic analysis. The most common approach to treat texts
as networks has simply considered either large pieces of texts or entire books.
This approach has certainly worked well -- many informative discoveries have
been made this way -- but it raises an uncomfortable question: could there be
important topological patterns in small pieces of texts? To address this
problem, the topological properties of subtexts sampled from entire books was
probed. Statistical analyzes performed on a dataset comprising 50 novels
revealed that most of the traditional topological measurements are stable for
short subtexts. When the performance of the authorship recognition task was
analyzed, it was found that a proper sampling yields a discriminability similar
to the one found with full texts. Surprisingly, the support vector machine
classification based on the characterization of short texts outperformed the
one performed with entire books. These findings suggest that a local
topological analysis of large documents might improve its global
characterization. Most importantly, it was verified, as a proof of principle,
that short texts can be analyzed with the methods and concepts of complex
networks. As a consequence, the techniques described here can be extended in a
straightforward fashion to analyze texts as time-varying complex networks
Formally Specifying and Proving Operational Aspects of Forensic Lucid in Isabelle
A Forensic Lucid intensional programming language has been proposed for
intensional cyberforensic analysis. In large part, the language is based on
various predecessor and codecessor Lucid dialects bound by the higher-order
intensional logic (HOIL) that is behind them. This work formally specifies the
operational aspects of the Forensic Lucid language and compiles a theory of its
constructs using Isabelle, a proof assistant system.Comment: 23 pages, 3 listings, 3 figures, 1 table, 1 Appendix with theorems,
pp. 76--98. TPHOLs 2008 Emerging Trends Proceedings, August 18-21, Montreal,
Canada. Editors: Otmane Ait Mohamed and Cesar Munoz and Sofiene Tahar. The
individual paper's PDF is at
http://users.encs.concordia.ca/~tphols08/TPHOLs2008/ET/76-98.pd
A proposed translator writing system language - Computer project, volume 3, no. 1
Programming language for advanced translator writing syste
Kolmogorov Complexity in perspective. Part II: Classification, Information Processing and Duality
We survey diverse approaches to the notion of information: from Shannon
entropy to Kolmogorov complexity. Two of the main applications of Kolmogorov
complexity are presented: randomness and classification. The survey is divided
in two parts published in a same volume. Part II is dedicated to the relation
between logic and information system, within the scope of Kolmogorov
algorithmic information theory. We present a recent application of Kolmogorov
complexity: classification using compression, an idea with provocative
implementation by authors such as Bennett, Vitanyi and Cilibrasi. This stresses
how Kolmogorov complexity, besides being a foundation to randomness, is also
related to classification. Another approach to classification is also
considered: the so-called "Google classification". It uses another original and
attractive idea which is connected to the classification using compression and
to Kolmogorov complexity from a conceptual point of view. We present and unify
these different approaches to classification in terms of Bottom-Up versus
Top-Down operational modes, of which we point the fundamental principles and
the underlying duality. We look at the way these two dual modes are used in
different approaches to information system, particularly the relational model
for database introduced by Codd in the 70's. This allows to point out diverse
forms of a fundamental duality. These operational modes are also reinterpreted
in the context of the comprehension schema of axiomatic set theory ZF. This
leads us to develop how Kolmogorov's complexity is linked to intensionality,
abstraction, classification and information system.Comment: 43 page
- …