
    Developmental Robots - A New Paradigm

    It has proved extremely challenging for humans to program a robot to a sufficient degree that it acts properly in a typical unknown human environment. This is especially true for a humanoid robot, owing to the very large number of redundant degrees of freedom and the large number of sensors required for a humanoid to work safely and effectively in the human environment. How can we address this fundamental problem? Motivated by human mental development from infancy to adulthood, we present a theory, an architecture, and experimental results showing how to enable a robot to develop its mind automatically, through online, real-time interactions with its environment. Humans mentally “raise” the robot through “robot sitting” and “robot schools” instead of task-specific robot programming.

    Transductive Visual Verb Sense Disambiguation

    Verb Sense Disambiguation is a well-known task in NLP whose aim is to find the correct sense of a verb in a sentence. Recently, this problem has been extended to a multimodal scenario by exploiting both textual and visual features of ambiguous verbs, leading to a new problem, Visual Verb Sense Disambiguation (VVSD). Here, the sense of a verb is assigned by considering the content of an image paired with it rather than a sentence in which the verb appears. Annotating a dataset for this task is more complex than for textual disambiguation, because assigning the correct sense to a verb–image pair requires both non-trivial linguistic and visual skills. In this work, differently from the literature, the VVSD task is performed in a transductive semi-supervised learning (SSL) setting, in which only a small amount of labeled information is required, tremendously reducing the need for annotated data. The disambiguation process is based on a graph-based label propagation method which takes into account mono- or multimodal representations of verb–image pairs. Experiments have been carried out on the recently published VerSe dataset, the only dataset available for this task. The achieved results outperform the current state of the art by a large margin while using only a small fraction of labeled samples per sense.
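The transductive setting described above can be sketched with a generic graph-based label propagation routine. This is a minimal illustration of the technique, not the authors' exact model: scikit-learn's `LabelPropagation` stands in for their method, random vectors stand in for multimodal verb–image representations, and all names and data below are assumptions.

```python
# Sketch: transductive label propagation with very few labeled samples.
# Random feature vectors stand in for multimodal <verb, image> representations.
import numpy as np
from sklearn.semi_supervised import LabelPropagation

rng = np.random.default_rng(0)
# 40 mock pairs drawn from two "sense" clusters, 8-dimensional features
X = np.vstack([rng.normal(0, 1, (20, 8)), rng.normal(3, 1, (20, 8))])
y = np.full(40, -1)      # -1 marks unlabeled points (transductive setting)
y[0], y[20] = 0, 1       # only one labeled example per sense

model = LabelPropagation(kernel="rbf", gamma=0.1)
model.fit(X, y)
pred = model.transduction_  # inferred sense labels for every pair
```

The key property illustrated here is that labels diffuse over a similarity graph, so almost all pairs receive a sense label even though only two were annotated.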

    Disambiguating Visual Verbs


    Multi modal multi-semantic image retrieval

    The rapid growth in the volume of visual information, e.g. images and video, can overwhelm users’ ability to find and access the specific visual information of interest to them. In recent years, ontology knowledge-based (KB) image information retrieval techniques have been adopted in order to extract knowledge from these images and enhance retrieval performance. A KB framework is presented to promote semi-automatic annotation and semantic image retrieval using multimodal cues (visual features and text captions). In addition, a hierarchical structure for the KB allows metadata to be shared and supports multiple semantics (polysemy) for concepts. The framework builds up an effective knowledge base pertaining to a domain-specific image collection, e.g. sports, and is able to disambiguate and assign high-level semantics to ‘unannotated’ images. Local feature analysis of visual content, namely Scale Invariant Feature Transform (SIFT) descriptors, has been deployed in the ‘Bag of Visual Words’ (BVW) model as an effective method to represent visual content information and to enhance its classification and retrieval. Local features are more useful than global features, e.g. colour, shape or texture, as they are invariant to image scale, orientation and camera angle. An innovative approach is proposed for the representation, annotation and retrieval of visual content using a hybrid technique based upon the use of unstructured visual words and upon a (structured) hierarchical ontology KB model. The structural model facilitates the disambiguation of unstructured visual words and a more effective classification of visual content, compared to a vector space model, by exploiting local conceptual structures and their relationships.
The key contributions of this framework in using local features for image representation include: first, a method to generate visual words using the semantic local adaptive clustering (SLAC) algorithm, which takes the term weight and spatial locations of keypoints into account; consequently, the semantic information is preserved. Second, a technique to detect the domain-specific ‘non-informative visual words’ which are ineffective at representing the content of visual data and degrade its categorisation ability. Third, a method to combine an ontology model with a visual word model to resolve synonym (visual heterogeneity) and polysemy problems. The experimental results show that this approach can discover semantically meaningful visual content descriptions and recognise specific events, e.g. sports events, depicted in images efficiently. Since discovering the semantics of an image is an extremely challenging problem, one promising approach to enhance visual content interpretation is to use any associated textual information that accompanies an image as a cue to predict the meaning of the image, by transforming this textual information into a structured annotation, e.g. using XML, RDF, OWL or MPEG-7. Although text and images are distinct types of information representation and modality, there are some strong, invariant, implicit connections between images and any accompanying text information. Semantic analysis of image captions can be used by image retrieval systems to retrieve selected images more precisely. To do this, Natural Language Processing (NLP) is exploited first in order to extract concepts from image captions. Next, an ontology-based knowledge model is deployed in order to resolve natural language ambiguities. To deal with the accompanying text information, two methods to extract knowledge from textual information have been proposed.
First, metadata can be extracted automatically from text captions and restructured with respect to a semantic model. Second, the use of LSI in relation to a domain-specific ontology-based knowledge model enables the combined framework to tolerate ambiguities and variations (incompleteness) in metadata. The use of the ontology-based knowledge model allows the system to find indirectly relevant concepts in image captions and thus leverage these to represent the semantics of images at a higher level. Experimental results show that the proposed framework significantly enhances image retrieval and narrows the semantic gap between lower-level machine-derived and higher-level human-understandable conceptualisation.

    Bitter taste stimuli induce differential neural codes in mouse brain.

    A growing literature suggests that taste stimuli commonly classified as "bitter" induce heterogeneous neural and perceptual responses. Here, the central processing of bitter stimuli was studied in mice with genetically controlled bitter taste profiles. Using these mice removed genetic heterogeneity as a factor influencing gustatory neural codes for bitter stimuli. Electrophysiological activity (spikes) was recorded from single neurons in the nucleus tractus solitarius during oral delivery of taste solutions (26 in total), including concentration series of the bitter tastants quinine, denatonium benzoate, cycloheximide, and sucrose octaacetate (SOA), presented to the whole mouth for 5 s. Seventy-nine neurons were sampled; in many cases multiple cells (2 to 5) were recorded from one mouse. Results showed that bitter stimuli induced variable gustatory activity. For example, although some neurons responded robustly to quinine and cycloheximide, others displayed concentration-dependent activity (p<0.05) to quinine but not cycloheximide. Differential activity to bitter stimuli was observed across multiple neurons recorded from one animal in several mice. Across all cells, quinine and denatonium induced correlated spatial responses that differed (p<0.05) from those to cycloheximide and SOA. Modeling spatiotemporal neural ensemble activity revealed that responses to quinine/denatonium and cycloheximide/SOA diverged only during an early period of the taste response, at least 1 s wide. Our findings highlight how temporal features of sensory processing contribute to differences among bitter taste codes and build on data suggesting heterogeneity among "bitter" stimuli, data that challenge a strict monoguesia model for the bitter quality.
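The across-neuron ("spatial") response comparison reported above can be illustrated with a toy computation: correlate population response vectors between pairs of stimuli. The firing rates below are fabricated for illustration only, not the recorded data; the correlation of across-neuron profiles is the standard analysis the abstract alludes to.

```python
# Sketch: comparing across-neuron response profiles between stimuli.
# Stimuli with similar spatial codes yield highly correlated population
# vectors; stimuli with divergent codes yield low correlations.
import numpy as np

rng = np.random.default_rng(2)
base = rng.uniform(5, 30, size=79)            # mock rates for 79 neurons
quinine    = base + rng.normal(0, 2, 79)      # profile similar to denatonium
denatonium = base + rng.normal(0, 2, 79)
cyclohex   = rng.uniform(5, 30, size=79)      # independent profile

r_similar   = np.corrcoef(quinine, denatonium)[0, 1]  # high
r_different = np.corrcoef(quinine, cyclohex)[0, 1]    # near zero
```

In the study, such pairwise correlations (together with ensemble modeling over time bins) are what support grouping quinine with denatonium and cycloheximide with SOA.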

    Languages adapt to their contextual niche


    Perceptual Strategies and Neuronal Underpinnings underlying Pattern Recognition through Visual and Tactile Sensory Modalities in Rats

    The aim of my PhD project was to investigate multisensory perception and multimodal recognition abilities in the rat, to better understand the underlying perceptual strategies and neuronal mechanisms. I chose to carry out this project in the laboratory rat for two reasons. First, the rat is a flexible and highly accessible experimental model, in which it is possible to combine state-of-the-art neurophysiological approaches (such as multi-electrode neuronal recordings) with behavioral investigation of perception and, more generally, cognition. Second, extensive research concerning multimodal integration has already been conducted in this species, at both the neurophysiological and behavioral levels. My thesis work was organized into two projects: a psychophysical assessment of object categorization abilities in rats, and a neurophysiological study of neuronal tuning in the primary visual cortex of anaesthetized rats. In both experiments, unisensory (visual and tactile) and multisensory (visuo-tactile) stimulation was used for training and testing, depending on the task. The first project required the development of a new experimental rig for the study of object categorization in rats, using solid objects, so as to be able to assess their recognition abilities under different modalities: vision, touch, and both together. The second project involved an electrophysiological study of rat primary visual cortex during visual, tactile and visuo-tactile stimulation, with the aim of understanding whether any interaction between these modalities exists in an area that is mainly devoted to one of them. The results of both studies are still preliminary, but they already offer some interesting insights into the defining features of these abilities.