39 research outputs found

    Reconstruction-as-Feedback Serves as an Effective Attention Mechanism for Object Recognition and Grouping

    Get PDF
    Our model uses these object reconstructions as a top-down attentional bias for efficiently routing relevant spatial and feature information about the object. This reconstruction-based attention operates on two levels. First, the model has a long-range projection that inhibits irrelevant spatial regions based on the mask generated from the most likely object reconstruction. Second, the model dynamically changes its feature-routing weights through local recurrence, where part-whole connections are modulated based on the reconstruction error for each hypothesized object (represented as a slot). This formulation loosely implements biased-competition theory, in which the reconstruction error biases a competition between object slots for the visual parts.
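The error-biased competition between slots can be illustrated with a minimal sketch (the function name, the squared-error measure, and the softmax temperature are illustrative assumptions, not the paper's actual formulation): each slot reconstructs a visual part, and routing weight flows to the slots with lower reconstruction error.

```python
import numpy as np

def slot_competition_weights(x, reconstructions, beta=5.0):
    """Bias a competition between object slots by reconstruction error.

    x               : (D,) feature vector of one visual part
    reconstructions : (K, D) per-slot reconstructions of that part
    Returns (K,) routing weights: slots that reconstruct the part
    better receive more of it (softmax over negative errors).
    """
    errors = np.sum((reconstructions - x) ** 2, axis=1)  # per-slot error
    logits = -beta * errors                              # lower error -> higher logit
    w = np.exp(logits - logits.max())
    return w / w.sum()

rng = np.random.default_rng(0)
x = rng.normal(size=8)
recs = np.stack([x + 0.1 * rng.normal(size=8),  # good reconstruction
                 rng.normal(size=8)])           # poor reconstruction
w = slot_competition_weights(x, recs)           # w[0] dominates
```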

    A comparative study of signal processing methods for structural health monitoring

    Get PDF
    In this paper four non-parametric and five parametric signal processing techniques are reviewed and their performances are compared through application to a sample exponentially damped synthetic signal with closely-spaced frequencies representing the ambient response of structures. The non-parametric methods are the Fourier transform, the periodogram estimate of power spectral density, the wavelet transform, and empirical mode decomposition with Hilbert spectral analysis (the Hilbert-Huang transform). The parametric methods are the pseudospectrum estimate using multiple signal classification (MUSIC), the empirical wavelet transform, the approximate Prony method, the matrix pencil method, and estimation of signal parameters via rotational invariance techniques (ESPRIT). The performances of the different methods are studied statistically using Monte Carlo simulation, and the results are presented in terms of the average errors of multiple sample analyses.
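The benchmark setup can be sketched minimally as follows; the signal parameters, damping rates, and noise level are illustrative assumptions, not the study's actual values. The sketch generates an exponentially damped signal with two closely-spaced frequencies and locates the dominant frequency with a non-parametric, periodogram-style FFT estimate.

```python
import numpy as np

# Exponentially damped synthetic signal with closely-spaced frequencies,
# mimicking the ambient response of a structure (values are illustrative).
fs, T = 100.0, 20.0
t = np.arange(0, T, 1 / fs)
f1, f2 = 2.00, 2.15  # closely-spaced modal frequencies (Hz)
y = (np.exp(-0.05 * t) * np.cos(2 * np.pi * f1 * t)
     + 0.8 * np.exp(-0.08 * t) * np.cos(2 * np.pi * f2 * t))
y += 0.05 * np.random.default_rng(1).normal(size=t.size)

# Non-parametric estimate: windowed periodogram via the FFT.
Y = np.fft.rfft(y * np.hanning(t.size))
freqs = np.fft.rfftfreq(t.size, 1 / fs)
psd = np.abs(Y) ** 2
f_peak = freqs[np.argmax(psd)]  # dominant frequency near 2 Hz
```

The FFT's resolution here is fs / N = 0.05 Hz, just enough to separate the two modes; the parametric methods in the paper are motivated precisely by cases where this resolution limit bites.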

    Affinity-based Attention in Self-supervised Transformers Predicts Dynamics of Object Grouping in Humans

    Full text link
    The spreading of attention has been proposed as a mechanism by which humans group features to segment objects. However, such a mechanism has not yet been implemented and tested on naturalistic images. Here, we leverage the feature maps from self-supervised vision Transformers and propose a model of human object-based attention spreading and segmentation. Attention spreads within an object through the feature-affinity signal between different patches of the image. We also collected behavioral data on people grouping objects in natural images by judging whether two dots are on the same object or on two different objects. We found that our models of affinity spread, built on feature maps from the self-supervised Transformers, showed significant improvement over baseline and CNN-based models in predicting the reaction-time patterns of humans, despite not being trained on the task or with any other object labels. Our work provides new benchmarks for evaluating models of visual representation learning, including Transformers.
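The affinity-spreading idea can be sketched as follows, using random stand-in features instead of actual ViT feature maps (the cosine-affinity threshold and the iteration scheme are illustrative assumptions, not the paper's model): attention starts at a seed patch and propagates along strong feature affinities, so it fills one "object" without leaking to the other.

```python
import numpy as np

def spread_attention(feats, seed, steps=10, tau=0.7):
    """Spread attention from a seed patch through feature affinity.

    feats : (N, D) patch feature vectors (random stand-ins here; no
            pretrained Transformer is assumed).
    seed  : index of the patch where attention starts.
    Affinity = cosine similarity, thresholded at tau; attention is
    iteratively propagated and renormalized.
    """
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    aff = f @ f.T
    aff = np.where(aff > tau, aff, 0.0)  # keep only strong affinities
    a = np.zeros(len(feats))
    a[seed] = 1.0
    for _ in range(steps):
        a = aff @ a
        a /= a.max()                     # renormalize
    return a

# Two toy "objects": patches 0-2 share one feature direction, 3-5 another.
rng = np.random.default_rng(2)
base1, base2 = rng.normal(size=32), rng.normal(size=32)
feats = np.stack([base1 + 0.1 * rng.normal(size=32) for _ in range(3)]
                 + [base2 + 0.1 * rng.normal(size=32) for _ in range(3)])
a = spread_attention(feats, seed=0)  # high on patches 0-2, ~0 on 3-5
```

The number of propagation steps needed to reach a probe patch is the kind of quantity one could compare against human same-object reaction times.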

    COVID-19 related stigma among the general population in Iran

    Get PDF
    Funding Information: GT is supported by the National Institute for Health Research (NIHR) Applied Research Collaboration South London at King’s College London NHS Foundation Trust, and by the NIHR Asset Global Health Unit award. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. GT is also supported by the Guy’s and St Thomas’ Charity for the On Trac project (EFT151101), and by the UK Medical Research Council (UKRI) in relation to the Emilia (MR/S001255/1) and Indigo Partnership (MR/R023697/1) awards. Publisher Copyright: © 2022, The Author(s). Peer reviewed. Publisher PDF.

    Modeling Salient Object-Object Interactions to Generate Textual Descriptions for Natural Images

    No full text
    In this thesis we consider the problem of automatically generating textual descriptions of images, which is useful in many applications. For example, searching and retrieving visual data among the overwhelming number of images and videos available on the Internet requires a better understanding of the multimedia content than user-annotated tags and meta-data provide. While this task remains very challenging for machines, humans can easily generate concise descriptions of images: they omit what is unnecessary or unrelated to the main point of the image, and talk about the objects, their actions and attributes, their interactions with each other, and the context in which it all happens.   Our method generates the image description in two main steps. Using saliency maps and object detectors, it determines the objects that are of interest to the observer and hence should appear in the description of the image. The pose (body-part configuration) of those objects/entities is then used to recognize their individual actions and the interactions between them. To generate sentences, we use a syntactic model that first orders the nouns (objects) and then builds sub-trees around the detected objects using the predicted actions. The model combines those sub-trees using the recognized interactions and, at the end, adds the context of the interactions, detected with a separate algorithm, to create a full sentence for the image. The results show the improved accuracy of the descriptions generated using our method.  M.S.
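A toy sketch of the sentence-assembly step (the function, its inputs, and the template are purely illustrative, not the thesis's actual syntactic model): order the detected objects, attach the predicted action to the first, join the objects via the recognized interaction, and append the separately detected context.

```python
def describe(objects, actions, interaction, context):
    """Assemble a caption from detections (toy illustration only):
    saliency-ordered nouns, the subject's predicted action, the
    recognized object-object interaction, and the detected context."""
    subject, obj = objects  # saliency-ordered nouns
    return (f"A {subject} is {actions[subject]} {interaction} "
            f"a {obj} {context}.")

sentence = describe(("person", "horse"),
                    {"person": "riding"},
                    "on", "in a field")
# -> "A person is riding on a horse in a field."
```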

    The socio-cultural and religious-political effects of re-Islamization in the Maghreb states of Algeria and Tunisia since the 1979 Islamic Revolution in Iran

    No full text
    The socio-cultural and religious-political effects of re-Islamization in the Maghreb states of Algeria and Tunisia since the 1979 Islamic Revolution in Iran. - Egelsbach et al.: Hänsel-Hohenhausen, 1994. - 206 pp. - (Deutsche Hochschulschriften; 547). - Also: Augsburg, Univ., Diss., 199

    Modeling attention and saccade programming in real-world contexts

    No full text
    International audience; no abstract.

    Readers move their eyes mindlessly using midbrain visuo-motor principles

    No full text
    Saccadic eye movements rapidly shift our gaze over 100,000 times daily, enabling countless tasks ranging from driving to reading. Long regarded as a window to the mind and human information processing, they are thought to be cortically/cognitively controlled movements aimed at objects/words of interest. Saccades however involve a complex cerebral network wherein the contribution of phylogenetically older sensory-motor pathways remains unclear. Here we show using a neuro-computational approach that mindless visuo-motor computations, akin to reflexive orienting responses in neonates and vertebrates with little neocortex, guide humans’ eye movements in a quintessentially cognitive task, reading. These computations occur in the superior colliculus, an ancestral midbrain structure that integrates retinal and (sub)cortical afferent signals over retinotopically organized, size-invariant neuronal populations. Simply considering retinal and primary-visual-cortex afferents, which convey the distribution of luminance contrast over sentences (a visual-saliency map), we find that collicular population-averaging principles capture readers’ prototypical word-based oculomotor behavior, leaving essentially only rereading behavior unexplained. These principles reveal that inter-word spacing is unnecessary, explaining metadata across languages and writing systems using only print size as a predictor. Our findings demonstrate that saccades, rather than being a window into cognitive/linguistic processes, primarily reflect rudimentary visuo-motor mechanisms in the midbrain that survived brain-evolution pressure.
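The population-averaging readout can be illustrated with a toy one-dimensional sketch. The center-of-gravity rule below is a standard simplification of collicular population averaging, and the flat saliency profile is an illustrative stand-in, not the paper's model: averaging the activity evoked by a word's luminance-contrast profile lands the saccade near the word's center, with no need to mark word boundaries explicitly.

```python
import numpy as np

def landing_position(saliency, x):
    """Population-averaging readout: the saccade lands at the centre
    of gravity of the activity distribution (here a 1-D saliency
    profile over horizontal position)."""
    return np.sum(x * saliency) / np.sum(saliency)

# Toy saliency profile of a short "word": uniform luminance contrast
# over letters spanning positions 2..6 (arbitrary units).
x = np.linspace(0, 10, 101)
sal = np.where((x >= 2) & (x <= 6), 1.0, 0.0)
target = landing_position(sal, x)  # centre of the word, position 4.0
```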