233 research outputs found

    Psychophysical investigations of visual density discrimination

    Work in spatial vision is reviewed and a new effect of spatial averaging is reported. This shows that dot separation discriminations are improved if the cue is represented in the intervals within a collection of dots arranged in a lattice, compared to simple two-dot separation discriminations. This phenomenon may be related to integrative processes that mediate texture density estimation. Four models for density discrimination are described. One involves measurements of spatial filter outputs. Computer simulations show that, in principle, density cues can be encoded by a system of four DOG filters with peak sensitivities spanning a range of 3 octaves. Alternative models involve operations performed over representations in which spatial features are made explicit. One of these involves estimations of numerosity or coverage of the texture elements. Another involves averaging of the interval values between adjacent elements. A neural model for measuring the relevant intervals is described. It is argued that, in principle, the input to a density processor does not require the full sequence of operations in the MIRAGE transformation (e.g. Watt and Morgan 1985). In particular, the regions of activity in the second derivative do not need to be interpreted in terms of edges, bars and blobs in order for density estimation to commence. This also implies that explicit coding of texture elements may be unnecessary. Data for density discrimination in regular and random dot patterns are reported. These do not support the coverage and counting models, and observed performance shows significant departures from predictions based on an analysis of the statistics of the interval distribution in the stimuli. But this result can be understood in relation to other factors in the interval averaging process, and there is empirical support for the hypothesized method for measuring the intervals.
Other experiments show that density is scaled according to stimulus size and possibly perceived depth. It is also shown that information from density analysis can be combined with size estimations to produce highly accurate discriminations of image expansion or object depth changes

    Evaluation of preprocessors for neural network speaker verification


    The Role of Knowledge in Visual Shape Representation

    This report shows how knowledge about the visual world can be built into a shape representation in the form of a descriptive vocabulary making explicit the important geometrical relationships comprising objects' shapes. Two computational tools are offered: (1) Shape tokens are placed on a Scale-Space Blackboard, (2) Dimensionality-reduction captures deformation classes in configurations of tokens. Knowledge lies in the token types and deformation classes tailored to the constraints and regularities of particular shape worlds. A hierarchical shape vocabulary has been implemented supporting several later visual tasks in the two-dimensional shape domain of the dorsal fins of fishes.

    Pre-processing, classification and semantic querying of large-scale Earth observation spaceborne/airborne/terrestrial image databases: Process and product innovations.

    As defined by Wikipedia, “big data is the term adopted for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The big data challenges typically include capture, curation, storage, search, sharing, transfer, analysis and visualization”. Proposed by the intergovernmental Group on Earth Observations (GEO), the visionary goal of the Global Earth Observation System of Systems (GEOSS) implementation plan for years 2005-2015 is systematic transformation of multisource Earth Observation (EO) “big data” into timely, comprehensive and operational EO value-adding products and services, submitted to the GEO Quality Assurance Framework for Earth Observation (QA4EO) calibration/validation (Cal/Val) requirements. To date the GEOSS mission cannot be considered fulfilled by the remote sensing (RS) community. This is tantamount to saying that past and existing EO image understanding systems (EO-IUSs) have been outpaced by the rate of collection of EO sensory big data, whose quality and quantity are ever-increasing. This fact is supported by several observations. For example, no European Space Agency (ESA) EO Level 2 product has ever been systematically generated at the ground segment. By definition, an ESA EO Level 2 product comprises a single-date multi-spectral (MS) image radiometrically calibrated into surface reflectance (SURF) values corrected for geometric, atmospheric, adjacency and topographic effects, stacked with its data-derived scene classification map (SCM), whose thematic legend is general-purpose, user- and application-independent and includes quality layers, such as cloud and cloud-shadow. Since no GEOSS exists to date, present EO content-based image retrieval (CBIR) systems lack EO image understanding capabilities.
Hence, no semantic CBIR (SCBIR) system exists to date either, where semantic querying is a synonym of semantics-enabled knowledge/information discovery in multi-source big image databases. In set theory, if set A is a strict superset of (or strictly includes) set B, then A ⊃ B. This doctoral project moved from the working hypothesis that SCBIR ⊃ computer vision (CV), where vision is a synonym of scene-from-image reconstruction and understanding, ⊃ EO image understanding (EO-IU) in operating mode, a synonym of GEOSS, ⊃ ESA EO Level 2 product ⊃ human vision. Meaning that a necessary, not sufficient, pre-condition for SCBIR is CV in operating mode, this working hypothesis has two corollaries. First, human visual perception, encompassing well-known visual illusions such as the Mach bands illusion, acts as a lower bound of CV within the multi-disciplinary domain of cognitive science, i.e., CV is conditioned to include a computational model of human vision. Second, a necessary, not sufficient, pre-condition for a yet-unfulfilled GEOSS development is systematic generation at the ground segment of the ESA EO Level 2 product. Starting from this working hypothesis, the overarching goal of this doctoral project was to contribute in research and technical development (R&D) toward filling an analytic and pragmatic information gap from EO big sensory data to EO value-adding information products and services. This R&D objective was conceived to be twofold. First, to develop an original EO-IUS in operating mode, a synonym of GEOSS, capable of systematic ESA EO Level 2 product generation from multi-source EO imagery.
EO imaging sources vary in terms of: (i) platform, either spaceborne, airborne or terrestrial, (ii) imaging sensor, either: (a) optical, encompassing radiometrically calibrated or uncalibrated images, panchromatic or color images, either true- or false color red-green-blue (RGB), multi-spectral (MS), super-spectral (SS) or hyper-spectral (HS) images, featuring spatial resolution from low (> 1km) to very high (< 1m), or (b) synthetic aperture radar (SAR), specifically, bi-temporal RGB SAR imagery. The second R&D objective was to design and develop a prototypical implementation of an integrated closed-loop EO-IU for semantic querying (EO-IU4SQ) system as a GEOSS proof-of-concept in support of SCBIR. The proposed closed-loop EO-IU4SQ system prototype consists of two subsystems for incremental learning. A primary (dominant, necessary not sufficient) hybrid (combined deductive/top-down/physical model-based and inductive/bottom-up/statistical model-based) feedback EO-IU subsystem in operating mode requires no human-machine interaction to automatically transform in linear time a single-date MS image into an ESA EO Level 2 product as initial condition. A secondary (dependent) hybrid feedback EO Semantic Querying (EO-SQ) subsystem is provided with a graphic user interface (GUI) to streamline human-machine interaction in support of spatiotemporal EO big data analytics and SCBIR operations. EO information products generated as output by the closed-loop EO-IU4SQ system monotonically increase their value-added with closed-loop iterations

    Sound change and social meaning: the perception and production of phonetic change in York, Northern England

    This thesis investigates the relationship between social meaning and linguistic change. An important observation regarding spoken languages is that they are constantly changing: the way we speak differs from generation to generation. A second important observation is that spoken utterances convey social as well as denotational meaning: the way we speak communicates something about who we are. How, if at all, are these two characteristics of spoken languages related? Many sociolinguistic studies have argued that the social meaning of linguistic features is central to explaining the spread of linguistic innovations. A novel form might be heard as more prestigious than the older form, or it may become associated with specific social stereotypes relevant to the community in which the change occurs. It is argued that this association between a linguistic variant and social meaning leads speakers to adopt or reject the innovation, inhibiting or facilitating the spread of the change. In contrast, a number of scholars have argued that social meaning is epiphenomenal to many linguistic changes, which are instead driven by an automatic process of convergence in face-to-face interaction. The issue that such arguments raise is that many studies proposing a role of social meaning in the spread of linguistic innovations rely on production data as their primary source of evidence. Observing the variable adoption of innovations across different groups of speakers (e.g. by gender, ethnicity, or socioeconomic status), a researcher might draw on their knowledge of the social history of the community under study to infer the role of social meaning in that change. In many cases, the observed patterns could equally be explained by the social structure of the community under study, which constrains who speaks to whom. Are linguistic changes facilitated and inhibited by social meaning?
Or is it rather the case that social meaning arises as a consequence of linguistic change, without necessarily influencing the change itself? This thesis explores these questions through a study of vocalic change in York, Northern England, focusing on the fronting and diphthongization of the tense back vowels /u/ and /o/. It presents a systematic comparison of the social meanings listeners assign to innovations (captured using perceptual methods), their social attitudes with regard to those meanings (captured through sociolinguistic interviews), and their use of those forms in production (captured through acoustic analysis). It is argued that evidence of a consistent relationship between these factors would support the proposal that social meaning plays a role in linguistic change. The results of this combined analysis of sociolinguistic perception, social attitudes and speech production provide clear evidence of diachronic /u/ and /o/ fronting in this community, and show that variation in these two vowels is associated with a range of social meanings in perception. These meanings are underpinned by the notion of ‘Broad Yorkshire’ speech, a socially-recognized speech register linked to notions of authentic local identity and social class. Monophthongal /o/, diphthongal /u/, and back variants of both vowels are shown to be associated with this register, implying that a speaker who adopts an innovative form will likely be heard as less ‘Broad’. However, there is no clear evidence that speakers’ attitudes toward regional identity or social class have any influence on their adoption of innovations, nor that their ability to recognise the social meaning of fronting in perception is related to their production behaviour. The fronting of /u/ is spreading in a socially-uniform manner in production, unaffected by any social factor tested except for age.
The fronting of /o/ is conditioned by social network structure — speakers with more diverse social networks are more likely to adopt the innovative form, while speakers with closer social ties to York are more likely to retain a back variant. These findings demonstrate that York speakers hear back forms of /u/ and /o/ as more ‘local’ and ‘working class’ than fronter realizations, and express strong attitudes toward the values and practices associated with regional identity and social class. However, these factors do not appear to influence their adoption of linguistic innovations in any straightforward manner, contrasting the predictions of an account of linguistic change where social meaning plays a central role in facilitating or inhibiting the propagation of linguistic innovations. Based on these results, the thesis argues that many linguistic changes may spread through the production patterns of a speech community without the direct influence of social meaning, and advocates for the combined analysis of sociolinguistic perception, social attitudes and speech production in future work

    A Multiple-Systems Approach in the Symbolic Modelling of Human Vision

    For most of the thirty years or so of machine vision research, activity has been concentrated mainly in the domain of metric-based approaches: there has been negligible attention to the psychological factors in human vision. With the recent resurgence of interest in neural systems, that is now changing. This thesis discusses relevant aspects of basic visual neuroanatomy, and psychological phenomena, in an attempt to relate the concepts to a model of human vision and the prospective goals of future machine vision systems. It is suggested that, while biological vision is complex, the underlying mechanisms of human vision are more tractable than is often believed. We also argue here that the controversial subject of direct vision plays a crucial role in natural vision, and we attempt to relate this to the model. The recognition of massive parallelism in natural vision has led to proposals for emulating aspects of neural networks in technology. The systems model developed in this work demonstrates software-simulated cellular automata (CAs) in the role of mainly low-level image processing. It is shown that CAs are able to efficiently provide both conventional and neurally-inspired vision functions. The thesis also discusses the use of Prolog as the means of realising higher level image understanding. The symbolic processing developed is basic, but is nevertheless sufficient for the purposes of the present demonstrations. Extensions to the concepts can be easily achieved. The modular systems approach adopted blends together several ideas and processes, and results in a more robust model of human vision that is able to translate a noisy real image into an accessible symbolic form for expert-domain interpretation.

    Acoustical measurements on stages of nine U.S. concert halls
