209,408 research outputs found

    Receptive process theory

    The Computational Magic of the Ventral Stream: Towards a Theory

    I conjecture that the sample complexity of object recognition is mostly due to geometric image transformations and that a main goal of the ventral stream – V1, V2, V4 and IT – is to learn-and-discount image transformations. The most surprising implication of the theory emerging from these assumptions is that the computational goals and detailed properties of cells in the ventral stream follow from symmetry properties of the visual world through a process of unsupervised correlational learning.

From the assumption of a hierarchy of areas with receptive fields of increasing size, the theory predicts that the size of the receptive fields determines which transformations are learned during development and then factored out during normal processing; that the transformation represented in each area determines the tuning of the neurons in the area, independently of the statistics of natural images; and that class-specific transformations are learned and represented at the top of the ventral stream hierarchy.

Some of the main predictions of this theory-in-fieri are:
1. the types of transformations that are learned from visual experience depend on the aperture size (measured in terms of wavelength) and thus on the area (layer in the models) – assuming that the aperture size increases with layers;
2. the mix of transformations learned determines the properties of the receptive fields – oriented bars in V1+V2, radial and spiral patterns in V4, up to class-specific tuning in AIT (e.g. face-tuned cells);
3. invariance to small translations in V1 may underlie stability of visual perception;
4. class-specific modules – such as faces, places and possibly body areas – should exist in IT to process images of object classes.

    The computational magic of the ventral stream

    I argue that the sample complexity of (biological, feedforward) object recognition is mostly due to geometric image transformations and conjecture that a main goal of the ventral stream – V1, V2, V4 and IT – is to learn-and-discount image transformations.

In the first part of the paper I describe a class of simple and biologically plausible memory-based modules that learn transformations from unsupervised visual experience. The main theorems show that these modules provide (for every object) a signature which is invariant to local affine transformations and approximately invariant for other transformations. I also prove that, in a broad class of hierarchical architectures, signatures remain invariant from layer to layer. The identification of these memory-based modules with complex (and simple) cells in visual areas leads to a theory of invariant recognition for the ventral stream.
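
The idea of a memory-based module with an invariant signature can be illustrated with a toy example. The sketch below is not the construction from the paper: it assumes 1-D integer "images", uses circular shifts as a stand-in for translations, and pools with a max and a sum of squares. The names `circular_shifts`, `signature`, `TEMPLATES` and `STORED_ORBITS` are hypothetical. Each module stores every shifted copy of a template (as if seen during unsupervised experience), takes dot products of the input with the stored copies, and pools them; because shifting the input only permutes the dot products, the pooled values do not change.

```python
def circular_shifts(vec):
    """Return every circular shift of vec (the stored 'orbit' of a template)."""
    n = len(vec)
    return [vec[k:] + vec[:k] for k in range(n)]


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def signature(image, stored_orbits):
    """Pool the responses of image to each stored orbit.

    For each template, the dot products play the role of simple-cell
    responses; max and sum of squares play the role of complex-cell pooling.
    Both pooling choices are assumptions made for this illustration.
    """
    sig = []
    for orbit in stored_orbits:
        responses = [dot(image, t) for t in orbit]
        sig.append(max(responses))
        sig.append(sum(r * r for r in responses))
    return tuple(sig)


# Hypothetical templates; integers keep the arithmetic exact.
TEMPLATES = [[1, 0, 0, 0, 0, 0], [1, -1, 0, 0, 0, 0], [1, 2, 0, -1, 0, 0]]
STORED_ORBITS = [circular_shifts(t) for t in TEMPLATES]

image = [3, 1, 4, 1, 5, 9]
shifted = image[2:] + image[:2]   # the same "object", translated
other = [2, 7, 1, 8, 2, 8]        # a different "object"

print(signature(image, STORED_ORBITS) == signature(shifted, STORED_ORBITS))
print(signature(image, STORED_ORBITS) == signature(other, STORED_ORBITS))
```

The first comparison checks invariance under the stored transformation; the second checks that the signature still distinguishes two different inputs in this small example.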

In the second part, I outline a theory about hierarchical architectures that can learn invariance to transformations. I show that the memory complexity of learning affine transformations is drastically reduced in a hierarchical architecture that factorizes transformations in terms of the subgroup of translations and the subgroups of rotations and scalings. I then show how translations are automatically selected as the only learnable transformations during development by enforcing small apertures – eg small receptive fields – in the first layer.

In a third part I show that the transformations represented in each area can be optimized in terms of storage and robustness, as a consequence determining the tuning of the neurons in the area, rather independently (under normal conditions) of the statistics of natural images. I describe a model of learning that can be proved to have this property, linking in an elegant way the spectral properties of the signatures with the tuning of receptive fields in different areas. A surprising implication of these theoretical results is that the computational goals and some of the tuning properties of cells in the ventral stream may follow from symmetry properties (in the sense of physics) of the visual world through a process of unsupervised correlational learning, based on Hebbian synapses. In particular, simple and complex cells do not directly care about oriented bars: their tuning is a side effect of their role in translation invariance. Across the whole ventral stream the preferred features reported for neurons in different areas are only a symptom of the invariances computed and represented.

The results of each of the three parts stand on their own independently of each other. Together this theory-in-fieri makes several broad predictions, some of which are:

-invariance to small transformations in early areas (e.g. translations in V1) may underlie stability of visual perception (suggested by Stu Geman);

-each cell’s tuning properties are shaped by visual experience of image transformations during developmental and adult plasticity;

-simple cells are likely to be the same population as complex cells, arising from different convergence of the Hebbian learning rule. The inputs to complex “complex” cells are dendritic branches with simple cell properties;

-class-specific transformations are learned and represented at the top of the ventral stream hierarchy; thus class-specific modules such as faces, places and possibly body areas should exist in IT;

-the types of transformations that are learned from visual experience depend on the size of the receptive fields and thus on the area (layer in the models) – assuming that the size increases with layers;

-the mix of transformations learned in each area influences the tuning properties of the cells – oriented bars in V1+V2, radial and spiral patterns in V4, up to class-specific tuning in AIT (e.g. face-tuned cells);

-features must be discriminative and invariant: invariance to transformations is the primary determinant of the tuning of cortical neurons rather than statistics of natural images.

The theory is broadly consistent with the current version of HMAX. It explains it and extends it in terms of unsupervised learning, a broader class of transformation invariance and higher-level modules. The goal of this paper is to sketch a comprehensive theory with little regard for mathematical niceties. If the theory turns out to be useful there will be scope for deep mathematics, ranging from group representation tools to wavelet theory to dynamics of learning.

    Factors Affecting Early Detection and Stimulation by Mothers and their Impact on Receptive Language Skills of Children Age 4 to 6 Years

    Background: Language is a communication tool used by humans since birth. Receptive language can be interpreted as the ability to communicate symbolically, both visually and auditorily. Through early detection measures, parents can identify problems in child growth and development early, so that prevention, stimulation, healing, and recovery efforts can be given with clear indications at critical times in the child's growth and development. Stimulation of child growth and development is carried out by mothers and fathers, who are the people closest to children, and by other family members and community groups in their respective households and in everyday life. This study aims to analyze the relationship between early detection and early stimulation and the receptive language skills of preschool children using the Health Belief Model (HBM) theory.

Subjects and Method: This research used a cross-sectional design in Surakarta, from December 2019 to January 2020. A sample of 200 children was selected using a fixed disease sampling technique. The dependent variable is receptive language ability. The independent variables are perception of vulnerability, perception of seriousness, cues to action, and self-efficacy. The intermediate variables are early detection and early stimulation. Data were collected using questionnaires and the Receptive One-Word Picture Vocabulary Test (ROWPVT) and analyzed using path analysis with Stata 13.

Results: Receptive language skills improve when mothers carry out early detection (b= 0.83 units; 95% CI= 0.19 to 1.47; p= 0.011) and early stimulation (b= 0.87 units; 95% CI= 0.28 to 1.47; p= 0.004).

Conclusion: Children's receptive language skills increase when mothers carry out early detection and early stimulation. Children's receptive language skills are indirectly affected by perception of vulnerability, perception of seriousness, cues to action, and self-efficacy through early detection or early stimulation by the mother.

Correspondence: Anggi Resina Putri. Masters Program in Public Health, Universitas Sebelas Maret, Jl. Ir. Sutami 36A, Surakarta, Central Java, Indonesia, 57126. Email: anggiresinaputri@gmail.com. Mobile: 085727387689.

Journal of Maternal and Child Health (2020), 5(3): 235-242. https://doi.org/10.26911/thejmch.2020.05.03.02
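
The reported coefficients, confidence intervals and p-values can be checked against each other with a little arithmetic. The sketch below assumes the intervals are symmetric Wald intervals based on a normal approximation, which the abstract does not state; the helper name `wald_check` is hypothetical. It recovers the standard error from the interval width, forms the z statistic, and computes a two-sided p-value.

```python
import math


def wald_check(b, ci_low, ci_high, z_crit=1.959964):
    """Recover SE, z and a two-sided p-value from a coefficient and its 95% CI.

    Assumes a symmetric normal-approximation (Wald) interval.
    """
    se = (ci_high - ci_low) / (2 * z_crit)
    z = b / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail area of the standard normal
    return se, z, p


reported = [
    ("early detection", 0.83, 0.19, 1.47),
    ("early stimulation", 0.87, 0.28, 1.47),
]
for label, b, lo, hi in reported:
    se, z, p = wald_check(b, lo, hi)
    print(f"{label}: SE={se:.3f}, z={z:.2f}, p={p:.3f}")
```

Under that assumption the recomputed p-values round to the reported 0.011 and 0.004, so the three reported quantities are mutually consistent.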

    Fostering reflection in the training of speech-receptive action

    This article discusses the possibilities and problems of fostering communicative skills by supporting reflection on one's own speech-receptive action, and the use of computer-supported learning environments for that purpose. Communication training mostly addresses observable speech-productive action (speaking). The individual cognitive processes underlying speech-receptive action (hearing and understanding) are often neglected, on the grounds that speech-receptive action is hard to access in a communicative situation and that fostering its individual processes is very time-consuming. The central learning principle, reflection on one's own linguistic-communicative action, is discussed from different perspectives. Against the background of the reflection models, the computer-supported learning environment CaiMan© is presented and described. Seven success factors are then derived from empirical research on the CaiMan© learning environment. The article ends by presenting two empirical studies that examine possibilities for supporting reflection.

This article discusses the training of communicative skills by fostering the reflection of speech-receptive action and the opportunities for using software for this purpose. Most frameworks for the training of communicative behavior focus on fostering the observable speech-productive action (i.e. speaking); the individual cognitive processes underlying speech-receptive action (hearing and understanding utterances) are often neglected. Computer-supported learning environments employed as cognitive tools can help to foster speech-receptive action. Seven success factors for the integration of software into the training of soft skills have been derived from empirical research. The computer-supported learning environment CaiMan© based on these ideas is presented. One central learning principle in this learning environment, reflection on one's own action, will be discussed from different perspectives. The article concludes with two empirical studies examining opportunities to foster reflection.

    Computational role of eccentricity dependent cortical magnification

    We develop a sampling extension of M-theory focused on invariance to scale and translation. Quite surprisingly, the theory predicts an architecture of early vision with increasing receptive field sizes and a high resolution fovea -- in agreement with data about the cortical magnification factor, V1 and the retina. From the slope of the inverse of the magnification factor, M-theory predicts a cortical "fovea" in V1 in the order of 40 by 40 basic units at each receptive field size -- corresponding to a foveola of size around 26 minutes of arc at the highest resolution, ≈6 degrees at the lowest resolution. It also predicts uniform scale invariance over a fixed range of scales independently of eccentricity, while translation invariance should depend linearly on spatial frequency. Bouma's law of crowding follows in the theory as an effect of cortical area-by-cortical area pooling; the Bouma constant is the value expected if the signature responsible for recognition in the crowding experiments originates in V2. From a broader perspective, the emerging picture suggests that visual recognition under natural conditions takes place by composing information from a set of fixations, with each fixation providing recognition from a space-scale image fragment -- that is an image patch represented at a set of increasing sizes and decreasing resolutions.
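
The figures quoted in this abstract imply a sampling spacing at each resolution, which a few lines of arithmetic make explicit. This is only a back-of-the-envelope sketch: it assumes the 40 units per side tile the quoted field width evenly at both the finest and the coarsest scale, and the spacing and octave figures it derives are not stated in the abstract. The names below are hypothetical.

```python
import math

UNITS_PER_SIDE = 40               # "40 by 40 basic units" from the abstract
FINEST_FIELD_ARCMIN = 26.0        # foveola width at the highest resolution
COARSEST_FIELD_ARCMIN = 6.0 * 60  # about 6 degrees at the lowest resolution


def unit_spacing(field_arcmin, units=UNITS_PER_SIDE):
    """Spacing between units if they tile the field evenly (an assumption)."""
    return field_arcmin / units


fine = unit_spacing(FINEST_FIELD_ARCMIN)
coarse = unit_spacing(COARSEST_FIELD_ARCMIN)
scale_ratio = COARSEST_FIELD_ARCMIN / FINEST_FIELD_ARCMIN
octaves = math.log2(scale_ratio)

print(f"finest spacing:   {fine:.2f} arcmin per unit")
print(f"coarsest spacing: {coarse:.2f} arcmin per unit")
print(f"scale range:      x{scale_ratio:.1f} (about {octaves:.1f} octaves)")
```

Read this way, the same number of units covers a field roughly 14 times wider at the coarsest scale, which matches the abstract's picture of an image patch represented at increasing sizes and decreasing resolutions.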