42 research outputs found

    “A creative exchange” for enterprise and employability -

    In the Artificial Bee Colony (ABC) algorithm, the employed bee and onlooker bee phases update candidate solutions by changing a value in one dimension, a process dubbed the one-dimension update. For problems with a very large number of dimensions, the one-dimension update can degrade both solution quality and convergence speed. This paper proposes a new algorithm, called R-ABC, that uses reinforcement learning for solution updating in the ABC algorithm. After an employed bee updates a solution, the new solution results in positive or negative reinforcement applied to the solution dimensions in the onlooker bee phase. Positive reinforcement is given when the candidate solution from the employed bee phase provides a better fitness value. The more often a dimension provides a better fitness value when changed, the higher the value of update becomes in the onlooker bee phase. Conversely, negative reinforcement is given when the candidate solution does not provide a better fitness value. The performance of the proposed algorithm is assessed on eight basic numerical benchmark functions in four categories with 100, 500, 700, and 900 dimensions, seven CEC2005 shifted functions with 100, 500, 700, and 900 dimensions, and six CEC2014 hybrid functions with 100 dimensions. The results show that the proposed algorithm provides solutions that are significantly better than those of all other algorithms for all tested dimensions on the basic benchmark functions. On the CEC2005 shifted functions, the number of solutions provided by the R-ABC algorithm that are significantly better than those of other algorithms increases as the number of dimensions increases. The R-ABC algorithm is at least comparable to the state-of-the-art ABC variants on the CEC2014 hybrid functions.
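
    The reinforcement idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' R-ABC: the abstract does not give the update rule, so the per-dimension weights, the 1.1 and 0.9 factors, the weight floor, and the choice to let the weights bias which dimension an onlooker changes are all assumptions made here for illustration. The scout phase is omitted, and the fitness transform assumes a non-negative cost function.

```python
import random

def sphere(x):
    """Basic benchmark: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)

def r_abc_sketch(f, dim=20, n_sources=10, iters=200, lo=-5.0, hi=5.0, seed=0):
    """Toy ABC loop with per-dimension reinforcement (hypothetical scheme)."""
    rng = random.Random(seed)
    sources = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_sources)]
    costs = [f(s) for s in sources]
    # Assumed reinforcement state: one weight per dimension, shared by all sources.
    weights = [1.0] * dim

    def neighbour(i, j):
        # Standard ABC one-dimension update against a random partner k != i.
        k = rng.choice([s for s in range(n_sources) if s != i])
        cand = sources[i][:]
        phi = rng.uniform(-1.0, 1.0)
        cand[j] = min(hi, max(lo, cand[j] + phi * (cand[j] - sources[k][j])))
        return cand

    def greedy(i, cand):
        # Keep the candidate only if it lowers the cost.
        c = f(cand)
        if c < costs[i]:
            sources[i], costs[i] = cand, c
            return True
        return False

    for _ in range(iters):
        # Employed bee phase: uniform dimension choice; the outcome
        # reinforces (or penalises) the weight of the changed dimension.
        for i in range(n_sources):
            j = rng.randrange(dim)
            if greedy(i, neighbour(i, j)):
                weights[j] *= 1.1                       # positive reinforcement
            else:
                weights[j] = max(0.1, weights[j] * 0.9)  # negative reinforcement
        # Onlooker bee phase: sources chosen by fitness, dimensions by weight.
        fitness = [1.0 / (1.0 + c) for c in costs]  # assumes costs >= 0
        for _ in range(n_sources):
            i = rng.choices(range(n_sources), weights=fitness)[0]
            j = rng.choices(range(dim), weights=weights)[0]
            greedy(i, neighbour(i, j))

    best = min(range(n_sources), key=costs.__getitem__)
    return sources[best], costs[best]

if __name__ == "__main__":
    solution, cost = r_abc_sketch(sphere)
    print(f"best cost after search: {cost:.6f}")
```

    Because replacement is greedy, the best cost can never get worse from one iteration to the next; whether the reinforcement weights actually help on high-dimensional problems is the question the paper evaluates, and this toy does not answer it.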

    Acoustic Cues to Perceived Prominence Levels: Evidence from German Spontaneous Speech

    The iambic-trochaic law (ITL) states that a louder sound signals the beginning of a group, while a longer sound signals its end. Although the ITL has been empirically supported in experiments with a variety of stimuli, it is not clear whether it is due to universal cognitive mechanisms or the outcome of language-specific prosodic properties. We tested the law with speakers of English, Greek and Korean who heard sequences of tones varied in duration and/or intensity. The results revealed neither significant differences among languages nor a strong bias shared by speakers of all languages. Significantly, listeners' grouping preferences were influenced by the duration of the inter-stimulus interval (ISI), with longer ISI resulting in stronger trochaic preferences, indicating that specific experimental conditions may be responsible for differences in listener responses across experiments testing the ITL.

    Modelling English diphthongs with dynamic articulatory targets

    The nature of English diphthongs has been much disputed. By now, the most influential account argues that diphthongs are phoneme entities rather than vowel combinations. However, mixed results have been reported regarding whether the rate of formant transition is the most reliable attribute in the perception and production of diphthongs. Here, we used computational modelling to explore the underlying forms of diphthongs. We tested the assumption that diphthongs have dynamic articulatory targets by training an articulatory synthesiser with a three-dimensional (3D) vocal tract model to learn English words. An automatic phoneme recogniser was constructed to guide the learning of the diphthongs. Listening experiments with native listeners indicated that the model succeeded in learning highly intelligible diphthongs, providing support for the dynamic target assumption. The modelling approach paves a new way for validating hypotheses of speech perception and production.

    Simulating vocal learning of spoken language: Beyond imitation

    Computational approaches have an important role to play in understanding the complex process of speech acquisition, in general, and have recently been popular in studies of vocal learning in particular. In this article we suggest that two significant problems associated with imitative vocal learning of spoken language, the speaker normalisation and phonological correspondence problems, can be addressed by linguistically grounded auditory perception. In particular, we show how the articulation of consonant-vowel syllables may be learnt from auditory percepts that can represent either individual utterances by speakers with different vocal tract characteristics or ideal phonetic realisations. The result is an optimisation-based implementation of vocal exploration – incorporating semantic, auditory, and articulatory signals – that can serve as a basis for simulating vocal learning beyond imitation

    Explaining the PENTA model: a reply to Arvaniti and Ladd

    This paper presents an overview of the Parallel Encoding and Target Approximation (PENTA) model of speech prosody, in response to an extensive critique by Arvaniti & Ladd (2009). PENTA is a framework for conceptually and computationally linking communicative meanings to fine-grained prosodic details, based on an articulatory-functional view of speech. Target Approximation simulates the articulatory realisation of underlying pitch targets – the prosodic primitives in the framework. Parallel Encoding provides an operational scheme that enables simultaneous encoding of multiple communicative functions. We also outline how PENTA can be computationally tested with a set of software tools. With the help of one of the tools, we offer a PENTA-based hypothetical account of the Greek intonational patterns reported by Arvaniti & Ladd, showing how it is possible to predict the prosodic shapes of an utterance based on the lexical and postlexical meanings it conveys

    DISCOVERING UNDERLYING TONAL REPRESENTATIONS BY COMPUTATIONAL MODELING: A CASE STUDY OF THAI

    In the present study we test a computational method for investigating underlying tonal representations. The representation explored is in the form of simple linear functions as ideal pitch targets, with which close-to-natural F0 contours can be computationally generated. The estimation of the pitch targets is done with PENTAtrainer2, a hypothesis-driven prosody-modeling tool that combines functional annotation, quantitative Target Approximation and global stochastic optimization. In this study we applied PENTAtrainer2 to a functionally annotated multi-speaker Thai corpus in an investigation of Thai tones. The pitch targets learned from the corpus showed clear separation between tonal categories, and the F0 contours synthesized with these targets showed close resemblance to those of natural speech of different speakers, whether or not a particular speaker’s data were used in the training. The results demonstrate that it is possible to establish highly economical tonal representations (three parameters per target per tone) that are both fully contrastive and capable of capturing fine phonetic details of Thai tones. The study also demonstrates the effectiveness of PENTAtrainer2 as a prosody research tool, and the potential of computational modeling in general as a new means of basic research in linguistic science.
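
    To make "simple linear functions as ideal pitch targets" and "three parameters per target" concrete, here is a minimal Python sketch of how an F0 contour might be generated from one target. It is not PENTAtrainer2 code. It assumes the closed form commonly given for the quantitative Target Approximation model, a third-order critically damped system approaching a linear target with slope `m`, height `b`, and rate `lam`; that form, the function name `qta_segment`, and the example values are assumptions not stated in the abstract.

```python
import math

def qta_segment(m, b, lam, duration, state0, n=50):
    """Generate one syllable's F0 contour approaching the target m*t + b.

    state0 is (f0, velocity, acceleration) at syllable onset. Returns the
    sample times, the F0 values, and the state at syllable offset, which
    can be passed as state0 of the next syllable.
    """
    y0, v0, a0 = state0
    # Coefficients chosen so that F0 and its first two derivatives are
    # continuous with the onset state.
    c1 = y0 - b
    c2 = v0 + c1 * lam - m
    c3 = (a0 + 2.0 * c2 * lam - c1 * lam * lam) / 2.0

    def state(t):
        poly = c1 + c2 * t + c3 * t * t
        dpoly = c2 + 2.0 * c3 * t
        decay = math.exp(-lam * t)
        f0 = m * t + b + poly * decay
        vel = m + (dpoly - lam * poly) * decay
        acc = (2.0 * c3 - 2.0 * lam * dpoly + lam * lam * poly) * decay
        return f0, vel, acc

    times = [duration * k / (n - 1) for k in range(n)]
    f0s = [state(t)[0] for t in times]
    return times, f0s, state(duration)

if __name__ == "__main__":
    # Hypothetical values: a static target at 100 (arbitrary pitch units),
    # starting from 120 with zero initial velocity and acceleration.
    times, f0s, end = qta_segment(0.0, 100.0, 40.0, 0.3, (120.0, 0.0, 0.0))
    print(f"onset {f0s[0]:.2f}, offset {f0s[-1]:.2f}")
```

    Under this assumed form, the contour starts exactly at the onset state and converges on the target at a speed set by `lam`, so a tone is described by just the three numbers `m`, `b`, and `lam`.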