Facticity as the amount of self-descriptive information in a data set
Using the theory of Kolmogorov complexity, the notion of facticity {\phi}(x)
of a string is defined as the amount of self-descriptive information it
contains. It is proved that (under reasonable assumptions: the existence of an
empty machine and the availability of a faithful index) facticity is definite,
i.e. random strings have facticity 0 and for compressible strings 0 < {\phi}(x)
< 1/2 |x| + O(1). Consequently facticity measures the tension in a data set
between structural and ad-hoc information objectively. For binary strings there
is a so-called facticity threshold that is dependent on their entropy. Strings
with facticity above this threshold have no optimal stochastic model and are
essentially computational. The shape of the facticity versus entropy plot
coincides with the well-known sawtooth curves observed in complex systems. The
notion of factic processes is discussed. This approach overcomes problems with
earlier proposals to use two-part code to define the meaningfulness or
usefulness of a data set. Comment: 10 pages, 2 figures
Usage-based and emergentist approaches to language acquisition
It was long considered to be impossible to learn grammar based on linguistic experience alone. In the past decade, however, advances in usage-based linguistic theory, computational linguistics, and developmental psychology have changed the view on this matter. So-called usage-based and emergentist approaches to language acquisition state that language can be learned from language use itself, by means of social skills like joint attention, and by means of powerful generalization mechanisms. This paper first summarizes the assumptions regarding the nature of linguistic representations and processing. Usage-based theories are nonmodular and nonreductionist, i.e., they emphasize form-function relationships and deal with all of language, not just selected levels of representation. Furthermore, storage and processing are considered to be analytic as well as holistic, such that there is a continuum between children's unanalyzed chunks and the abstract units found in adult language. In the second part, the empirical evidence is reviewed. Children's linguistic competence is shown to be limited initially, and it is demonstrated how children can generalize knowledge based on direct and indirect positive evidence. It is argued that with these general learning mechanisms, the usage-based paradigm can be extended to multilingual language situations and to language acquisition under special circumstances.
Collective Phenomena and Non-Finite State Computation in a Human Social System
We investigate the computational structure of a paradigmatic example of
distributed social interaction: that of the open-source Wikipedia community. We
examine the statistical properties of its cooperative behavior, and perform
model selection to determine whether this aspect of the system can be described
by a finite-state process, or whether reference to an effectively unbounded
resource allows for a more parsimonious description. We find strong evidence,
in a majority of the most-edited pages, in favor of a collective-state model,
where the probability of a "revert" action declines as the square root of the
number of non-revert actions seen since the last revert. We provide evidence
that the emergence of this social counter is driven by collective interaction
effects, rather than properties of individual users. Comment: 23 pages, 4 figures, 3 tables; to appear in PLoS ONE
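
The collective-state model described in this abstract can be illustrated with a minimal sketch. The abstract states only that the revert probability declines as the square root of the number of non-revert actions since the last revert. The exact functional form used here, p0 / sqrt(k + 1), the base rate p0, and the function names are assumptions made for illustration, not the authors' fitted model.

```python
import math
import random


def revert_probability(k, p0=0.5):
    """Assumed revert probability after k non-revert actions since the last revert.

    The form p0 / sqrt(k + 1) is a hypothetical choice: the +1 offset only
    keeps the value finite at k = 0, and p0 is an illustrative base rate.
    """
    return p0 / math.sqrt(k + 1)


def simulate_page(n_steps, p0=0.5, seed=0):
    """Simulate an edit history as a list of 'R' (revert) and 'C' (non-revert).

    The counter k is unbounded, so the process cannot be represented exactly
    by a fixed, finite set of states.
    """
    rng = random.Random(seed)
    k = 0  # non-revert actions seen since the last revert
    history = []
    for _ in range(n_steps):
        if rng.random() < revert_probability(k, p0):
            history.append("R")
            k = 0  # a revert resets the counter
        else:
            history.append("C")
            k += 1
    return history


if __name__ == "__main__":
    history = simulate_page(1000)
    print("fraction of reverts:", history.count("R") / len(history))
```

Because the counter can grow without limit, any finite-state approximation has to truncate it at some depth, which is the sense in which an unbounded resource can give a more parsimonious description.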