Cross-lingual Word Clusters for Direct Transfer of Linguistic Structure
It has been established that incorporating word cluster features derived from large unlabeled corpora can significantly improve prediction of linguistic structure. While previous work has focused primarily on English, we extend these results to other languages along two dimensions. First, we show that these results hold true for a number of languages across families. Second, and more interestingly, we provide an algorithm for inducing cross-lingual clusters and we show that features derived from these clusters significantly improve the accuracy of cross-lingual structure prediction. Specifically, we show that by augmenting direct-transfer systems with cross-lingual cluster features, the relative error of delexicalized dependency parsers, trained on English treebanks and transferred to foreign languages, can be reduced by up to 13%. When applying the same method to direct transfer of named-entity recognizers, we observe relative improvements of up to 26%.
Meta-Learning for Phonemic Annotation of Corpora
We apply rule induction, classifier combination and meta-learning (stacked
classifiers) to the problem of bootstrapping high accuracy automatic annotation
of corpora with pronunciation information. The task we address in this paper
consists of generating phonemic representations reflecting the Flemish and
Dutch pronunciations of a word on the basis of its orthographic representation
(which in turn is based on the actual speech recordings). We compare several
possible approaches to achieve the text-to-pronunciation mapping task:
memory-based learning, transformation-based learning, rule induction, maximum
entropy modeling, combination of classifiers in stacked learning, and stacking
of meta-learners. We are interested both in optimal accuracy and in obtaining
insight into the linguistic regularities involved. As far as accuracy is
concerned, an already high accuracy level (93% for Celex and 86% for Fonilex at
word level) for single classifiers is boosted significantly with additional
error reductions of 31% and 38% respectively using combination of classifiers,
and a further 5% using combination of meta-learners, bringing overall word
level accuracy to 96% for the Dutch variant and 92% for the Flemish variant. We
also show that the application of machine learning methods indeed leads to
increased insight into the linguistic regularities determining the variation
between the two pronunciation variants studied. Comment: 8 page
Induction, complexity, and economic methodology
This paper focuses on induction, because the supposed weaknesses of that process are the main reason for favouring falsificationism, which plays an important part in scientific methodology generally; the paper is part of a wider study of economic methodology. The standard objections to, and paradoxes of, induction are reviewed, and this leads to the conclusion that the supposed "problem" or "riddle" of induction is a false one. It is an artefact of two assumptions: that the classic two-valued logic (CL) is appropriate for the contexts in which induction is relevant; and that it is the touchstone of rational thought. The status accorded to CL is the result of historical and cultural factors. The material we need to reason about falls into four distinct domains; these are explored in turn, while progressively relaxing the restrictions that are essential to the valid application of CL. The restrictions include the requirement for a pre-existing, independently-guaranteed classification, into which we can fit all new cases with certainty; and non-ambiguous relationships between antecedents and consequents. Natural kinds, determined by the existence of complex entities whose characteristics cannot be unbundled and altered in a piecemeal, arbitrary fashion, play an important part in the review; so also does fuzzy logic (FL). These are used to resolve two famous paradoxes about induction (the grue and raven paradoxes); and the case for believing that conventional logic is a subset of fuzzy logic is outlined. The latter disposes of all questions of justifying induction deductively. The concept of problem structure is used as the basis for a structured concept of rationality that is appropriate to all four of the domains mentioned above. The rehabilitation of induction supports an alternative definition of science: that it is the business of developing networks of contrastive, constitutive explanations of reproducible, inter-subjective ("objective") data. Social and psychological obstacles ensure the progress of science is slow and convoluted; however, the relativist arguments against such a project are rejected.
Keywords: induction; economics; methodology; complexity
Apperceptive patterning: Artefaction, extensional beliefs and cognitive scaffolding
In "Psychopower and Ordinary Madness" my ambition, as it relates to Bernard Stiegler's recent literature, was twofold: 1) critiquing Stiegler's work on exosomatization and artefactual posthumanism (or, more specifically, nonhumanism) to problematize approaches to media archaeology that rely upon technical exteriorization; 2) challenging how Stiegler engages with Giuseppe Longo and Francis Bailly's conception of negative entropy. These efforts were directed by a prevalent techno-cultural qualifier: the rise of Synthetic Intelligence (including neural nets, deep learning, predictive processing and Bayesian models of cognition). This paper continues this project but first directs a critical analytic lens at the Derridean practice of the ontologization of grammatization from which Stiegler emerges while also distinguishing how metalanguages operate in relation to object-oriented environmental interaction by way of inferentialism. Stalking continental (Kapp, Simondon, Leroi-Gourhan, etc.) and analytic traditions (e.g., Carnap, Chalmers, Clark, Sutton, Novaes, etc.), we move from artefacts to AI and Predictive Processing so as to link theories related to technicity with philosophy of mind. Simultaneously drawing forth Robert Brandom's conceptualization of the roles that commitments play in retrospectively reconstructing the social experiences that lead to our endorsement(s) of norms, we complement this account with Reza Negarestani's deprivatized account of intelligence while analyzing the equipollent role between language and media (both digital and analog).