    Generative Adversarial Networks for Diverse and Explainable Text-to-Image Generation

    This thesis focuses on algorithms for text-to-image generation, which aim at yielding photo-realistic and semantically consistent pictures on the basis of a natural-language description. Chapter 1 provides a brief general introduction to research in image synthesis based on linguistic (textual) descriptions. In Chapter 2, we propose the Dual-Attention Generative-Adversarial Network (DTGAN), which can produce perceptually plausible pictures from given natural-language descriptions while employing only a single generator/discriminator network pair. Chapter 3 deals with the lack-of-diversity issue present in current single-stage text-to-image generation models. To tackle this problem, we improve on DTGAN with an efficient and effective single-stage framework (DiverGAN) that yields diverse, photo-realistic and semantically related images from a single natural-language description and different latent vectors. In Chapter 4, we construct novel data sets for ‘Good vs Bad’ data consisting of successful as well as unsuccessful synthesized samples of birds and of human faces. For these, special classifiers are trained to ensure that generated images are natural, realistic and believable. In Chapter 5 and Chapter 6, we investigate the latent space and the linguistic space of a conditional text-to-image GAN model for an improved explainability of the results in the generation process. More specifically, we explore the relationship between the latent control space and the obtained image variation by conducting an independent-component analysis algorithm on pretrained weight values of the generator. Furthermore, we qualitatively analyze the roles played by ‘linguistic’ embeddings in the synthetic-image semantic space, by using linear and triangular interpolation between keywords.

    The biological origin of linguistic diversity

    In contrast with animal communication systems, diversity is characteristic of almost every aspect of human language. Languages variously employ tones, clicks, or manual signs to signal differences in meaning; some languages lack the noun-verb distinction (e.g., Straits Salish), whereas others have a proliferation of fine-grained syntactic categories (e.g., Tzeltal); and some languages do without morphology (e.g., Mandarin), while others pack a whole sentence into a single word (e.g., Cayuga). A challenge for evolutionary biology is to reconcile the diversity of languages with the high degree of biological uniformity of their speakers. Here, we model processes of language change and geographical dispersion and find a consistent pressure for flexible learning, irrespective of the language being spoken. This pressure arises because flexible learners can best cope with the observed high rates of linguistic change associated with divergent cultural evolution following human migration. Thus, rather than genetic adaptations for specific aspects of language, such as recursion, the coevolution of genes and fast-changing linguistic structure provides the biological basis for linguistic diversity. Only biological adaptations for flexible learning combined with cultural evolution can explain how each child has the potential to learn any human language.

    Dialogue based interfaces for universal access.

    Conversation provides an excellent means of communication for almost all people. Consequently, a conversational interface is an excellent mechanism for allowing people to interact with systems. Conversational systems are an active research area, but a wide range of systems can be developed with current technology. More sophisticated interfaces can take considerable effort, but simple interfaces can be developed quite rapidly. This paper gives an introduction to the current state of the art of conversational systems and interfaces. It describes a methodology for developing conversational interfaces and gives an example of an interface for a state benefits web site. The paper discusses how this interface could improve access for a wide range of people, and how further development of this interface would allow a larger range of people to use the system and give them more functionality.