10 research outputs found

    Information theory : proceedings of the 1990 IEEE international workshop, Eindhoven, June 10-15, 1990


    Acta Cybernetica : Volume 14. Number 3.


    Reinforcing connectionism: learning the statistical way

    Connectionism's main contribution to cognitive science will prove to be the renewed impetus it has imparted to learning. Learning can be integrated into the existing theoretical foundations of the subject, and the combination, statistical computational theories, provides a framework within which many connectionist mathematical mechanisms naturally fit. Examples from supervised and reinforcement learning demonstrate this. Statistical computational theories already exist for certain associative matrix memories. This work is extended, allowing real-valued synapses and arbitrarily biased inputs. It shows that a covariance learning rule optimises the signal/noise ratio, a measure of the potential quality of the memory, and quantifies the performance penalty incurred by other rules. In particular, two that have been suggested as occurring naturally are shown to be asymptotically optimal in the limit of sparse coding. The mathematical model is justified in comparison with other treatments whose results differ. Reinforcement comparison is a way of hastening the learning of reinforcement learning systems in statistical environments. Previous theoretical analysis has not distinguished between different comparison terms, even though, empirically, a covariance rule has been shown to be better than just a constant one. The workings of reinforcement comparison are investigated by a second-order analysis of the expected statistical performance of learning, and an alternative rule is proposed and empirically justified. The existing proof that temporal difference prediction learning converges in the mean is extended from a special case involving adjacent time steps to the general case involving arbitrary ones. The interaction between the statistical mechanism of temporal difference and the linear representation is particularly stark. The performance of the method given a linearly dependent representation is also analysed.
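    The temporal-difference prediction setting discussed in this abstract can be sketched in a few lines. Below is a minimal TD(0) learner with a linear value representation v(s) = w · x(s); the episode format and all names are illustrative, not taken from the thesis itself.

```python
def td0_linear(episodes, n_features, alpha=0.1, gamma=1.0):
    """TD(0) prediction with a linear value function v(s) = w . x(s).

    episodes: each episode is a list of transitions (x, r, x_next),
    where x and x_next are feature lists and x_next is None at
    termination.  A sketch for illustration only.
    """
    w = [0.0] * n_features

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    for episode in episodes:
        for x, r, x_next in episode:
            v_next = 0.0 if x_next is None else dot(w, x_next)
            delta = r + gamma * v_next - dot(w, x)  # TD error
            for i in range(n_features):
                w[i] += alpha * delta * x[i]       # move w along x
    return w

# Two-state chain A -> B -> terminal, reward 1 on the final step:
# with one-hot features, both values should converge toward 1.
episode = [([1.0, 0.0], 0.0, [0.0, 1.0]), ([0.0, 1.0], 1.0, None)]
w = td0_linear([episode] * 500, n_features=2)
```

With one-hot (tabular) features the updates reduce to the classical tabular TD(0) rule; a linearly dependent feature set, as the abstract notes, is where the analysis becomes delicate.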

    Is the most likely model likely to be the correct model?

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 89-93). In this work, I test the hypothesis that the 2-dimensional dependencies of a deterministic model can be correctly recovered via hypothesis enumeration and Bayesian selection for a linear sequence, and what degree of 'ignorance' or 'uncertainty' Bayesian selection can tolerate concerning the properties of the model and data. The experiment tests the data created by a number of rules of size 3 and compares the implied dependency map to the (correct) dependencies of the various generating rules, then extends it to a composition of 2 rules of total size 5. I found that 'causal' belief networks do not map directly to the dependencies of actual causal structures. For deterministic rules satisfying the condition of multiple involvement (two tails), the correct model is not likely to be retrieved without augmenting the model selection with a prior high enough to suggest that the desired dependency model is already known; simply restricting the class of models to trees and placing other restrictions (such as ordering) is not sufficient. Second, the identified-model to correct-model map is not 1-to-1: in the rule cases where the correct model is identified, the identified model could just as easily have been produced by a different rule. Third, I discovered that uncertainty concerning identification of observations directly resulted in the loss of existing information and made model selection the product of pure chance (such as the last observation). How to read and identify observations had to be agreed upon a priori by both the rule and the learner to have any consistency in model identification. Finally, I discovered that it is not the rule observations that discriminate between models, but rather the noise, or uncaptured observations, that govern the identified model. In analysis, I found that in enumeration of hypotheses (as dependency graphs) the differentiating space is very small. With representations of conditional independence, the equivalent factorizations of the graphs make the differentiating space even smaller. As Bayesian model identification relies on convergence to the differentiating space, if those spaces are diminishing in size (if the model size is allowed to grow) relative to the observation sequence, then maximizing the likelihood of a particular hypothesis may fail to converge on the correct one. Overall, I found that if a learning mechanism does not know either how to read observations or, a priori, the dependencies it is looking for, then it is not likely to identify them probabilistically. I also confirmed existing results: that model selection always prefers increasingly connected models over independent models, and that several conditional-independence graphs have equivalent factorizations. Finally, Shannon's Asymptotic Equipartition Property was confirmed to apply both for novel observations and for an increasing model/parameter space size. These results are applicable to a number of domains: natural language processing and language induction by statistical means, bioinformatics and statistical identification and merging of ontologies, and induction of real-world causal dependencies. by Beracah Yankama. S.M.
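    The abstract's observation that Bayesian selection prefers connected models over independent ones can be illustrated with a minimal marginal-likelihood comparison for two binary variables. This is a sketch under standard Beta(1,1)-prior assumptions, not the thesis's actual experiment; all function names are illustrative.

```python
from math import lgamma

def log_bernoulli_ml(k, n, a=1.0, b=1.0):
    """Log marginal likelihood of one particular sequence with k ones in
    n Bernoulli trials, under a Beta(a, b) prior on the success rate."""
    return (lgamma(a + b) - lgamma(a) - lgamma(b)
            + lgamma(a + k) + lgamma(b + n - k) - lgamma(a + b + n))

def score_independent(pairs):
    """Model X _|_ Y: two independent Bernoulli marginals."""
    n = len(pairs)
    kx = sum(x for x, _ in pairs)
    ky = sum(y for _, y in pairs)
    return log_bernoulli_ml(kx, n) + log_bernoulli_ml(ky, n)

def score_dependent(pairs):
    """Model X -> Y: marginal on X, separate conditional of Y per X value."""
    n = len(pairs)
    kx = sum(x for x, _ in pairs)
    n0 = sum(1 for x, _ in pairs if x == 0)
    k0 = sum(y for x, y in pairs if x == 0)
    n1 = n - n0
    k1 = sum(y for x, y in pairs if x == 1)
    return (log_bernoulli_ml(kx, n)
            + log_bernoulli_ml(k0, n0)
            + log_bernoulli_ml(k1, n1))

# Data from the deterministic rule y = x: the connected model X -> Y
# scores far higher than the independent model.
data = [(i % 2, i % 2) for i in range(40)]
```

On deterministic data like this the connected structure wins decisively; the subtleties the thesis reports arise when several structures factorize equivalently and the differentiating observations are scarce.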

    Margins and combined classifiers


    Management: A continuing bibliography with indexes, March 1983

    This bibliography lists 960 reports, articles, and other documents introduced into the NASA scientific and technical information system in 1982.

    Universal semantic communication

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 325-334). Is meaningful communication possible between two intelligent parties who share no common language or background? We propose that this problem can be rigorously addressed by explicitly focusing on the goals of the communication. We propose a theoretical framework in which we can address when and to what extent such semantic communication is possible. Our starting point is a mathematical definition of a generic goal for communication, that is pursued by agents of bounded computational complexity. We then model a "lack of common language or background" by considering a class of potential partners for communication; in general, this formalism is rich enough to handle varying degrees of common language and backgrounds, but the complete lack of knowledge is modeled by simply considering the class of all partners with which some agent of similar power could achieve our goal. In this formalism, we will find that for many goals (but not all), communication without any common language or background is possible. We call the strategies for achieving goals without relying on such background universal protocols. The main intermediate notions introduced by our theory are formal notions of feedback that we call sensing. We show that sensing captures the essence of whether or not reliable universal protocols can be constructed in many natural settings of interest: we find that across settings, sensing is almost always sufficient, usually necessary, and generally a useful design principle for the construction of universal protocols. We support this last point by developing a number of examples of protocols for specific goals. 
Notably, we show that universal delegation of computation from a space-efficient client to a general-purpose server is possible, and we show how a variant of TCP can allow end-users on a packet network to automatically adapt to small changes in the packet format (e.g., changes in IP). The latter example above alludes to our main motivation for considering such problems, which is to develop techniques for modeling and constructing computer systems that do not require that their components strictly adhere to protocols: said differently, we hope to be able to design components that function properly with a sufficiently wide range of other components to permit a rich space of "backwards-compatible" designs for those components. We expect that in the long run, this paradigm will lead to simpler systems because "backwards compatibility" is no longer such a severe constraint, and we expect it to lead to more robust systems, partially because the components should be simpler, and partially because such components are inherently robust to deviations from any fixed protocol. Unfortunately, we find that the techniques for communication under the complete absence of any common background suffer from overhead that is too severe for such practical purposes, so we consider two natural approaches for introducing some assumed common background between components while retaining some nontrivial amount of flexibility. The first approach supposes that the designer of a component has some "belief" about what protocols would be "natural" to use to interact with other components; we show that, given sensing and some sufficient "agreement" between the beliefs of the designers of two components, the components can be made universal with some relatively modest overhead. 
    The second approach supposes that the protocols are taken from some restricted class of functions, and we will see that for certain classes of functions and simple goals, efficient universal protocols can again be constructed from sensing. Actually, we show more: the special case of our model described in the second approach above corresponds precisely to the well-known model of mistake-bounded on-line learning first studied by Barzdins and Freivalds, and later considered in more depth by Littlestone. This connection provides a reasonably complete picture of the conditions under which we can apply the second approach. Furthermore, it also seems that the first approach is closely related to the problem of designing good user interfaces in Human-Computer Interaction. We conclude by briefly sketching the connection, and suggest that further development of this connection may be a potentially fruitful direction for future work. by Brendan Juba. Ph.D.
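    The mistake-bounded on-line learning model mentioned in this abstract is neatly captured by the classic halving algorithm: predict by majority vote of the hypotheses still consistent with past examples, so every mistake at least halves the version space, giving at most log2(|H|) mistakes. The threshold-function class below is a stock illustration, not an example from the thesis.

```python
def halving_learner(hypothesis_class, stream):
    """Run the halving algorithm over a finite hypothesis class.

    stream yields (x, label) pairs with labels in {0, 1}; assuming some
    hypothesis in the class labels every example correctly, the number
    of mistakes is at most log2(len(hypothesis_class)).
    """
    version_space = list(hypothesis_class)
    mistakes = 0
    for x, label in stream:
        votes = sum(h(x) for h in version_space)
        prediction = 1 if 2 * votes >= len(version_space) else 0
        if prediction != label:
            mistakes += 1
        # Keep only hypotheses consistent with the revealed label.
        version_space = [h for h in version_space if h(x) == label]
    return mistakes

# Nine threshold hypotheses h_t(x) = [x >= t] on inputs 0..7; the
# hidden target is t = 5, so at most floor(log2(9)) = 3 mistakes.
hypotheses = [lambda x, t=t: int(x >= t) for t in range(9)]
stream = [(x, int(x >= 5)) for x in [0, 7, 4, 5, 3, 6, 2, 1]]
mistakes = halving_learner(hypotheses, stream)
```

In the thesis's terms, the consistency check plays the role of sensing: the feedback that lets the learner discard protocols (hypotheses) that have been observed to fail.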