Generative Adversarial Networks for Diverse and Explainable Text-to-Image Generation
This thesis focuses on algorithms for text-to-image generation, which aim at yielding photo-realistic and semantically consistent pictures on the basis of a natural-language description. Chapter 1 provides a brief general introduction to research in image synthesis based on linguistic (textual) descriptions. In Chapter 2, we propose the Dual-Attention Generative-Adversarial Network (DTGAN), which can produce perceptually plausible pictures from given natural-language descriptions while employing only a single generator/discriminator network pair. Chapter 3 deals with the lack-of-diversity issue present in current single-stage text-to-image generation models. To tackle this problem, we improve on DTGAN with an efficient and effective single-stage framework (DiverGAN) that yields diverse, photo-realistic and semantically related images from a single natural-language description and different latent vectors. In Chapter 4, we construct novel ‘Good vs Bad’ data sets consisting of successful as well as unsuccessful synthesized samples of birds and of human faces. For these, special classifiers are trained to ensure that generated images are natural, realistic and believable. In Chapter 5 and Chapter 6, we investigate the latent space and the linguistic space of a conditional text-to-image GAN model to improve the explainability of the results of the generation process. More specifically, we explore the relationship between the latent control space and the obtained image variation by applying an independent-component analysis algorithm to the pretrained weight values of the generator. Furthermore, we qualitatively analyze the roles played by ‘linguistic’ embeddings in the synthetic-image semantic space, by using linear and triangular interpolation between keywords.
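As a rough illustration of the linear and triangular interpolation between keywords mentioned in the abstract above, the sketch below blends toy embedding vectors. It is not the thesis's implementation: the function names, the three-dimensional toy vectors, and the reading of "triangular" interpolation as a barycentric (convex) blend of three keyword embeddings are all assumptions made for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def linear_interpolation(a: Sequence[float], b: Sequence[float], t: float) -> Vector:
    """Return the point at fraction t along the segment from a to b (t in [0, 1])."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def triangular_interpolation(
    a: Sequence[float],
    b: Sequence[float],
    c: Sequence[float],
    w_a: float,
    w_b: float,
) -> Vector:
    """Blend three vectors with barycentric weights (w_a, w_b, 1 - w_a - w_b).

    Assumption: 'triangular' is read here as a convex combination of three
    keyword embeddings, i.e. a point inside the triangle they span.
    """
    w_c = 1.0 - w_a - w_b
    # Small tolerance so floating-point round-off does not reject valid weights.
    if min(w_a, w_b, w_c) < -1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return [w_a * x + w_b * y + w_c * z for x, y, z in zip(a, b, c)]


# Hypothetical toy 'keyword embeddings'; real ones would come from a text encoder.
red = [1.0, 0.0, 0.0]
blue = [0.0, 1.0, 0.0]
yellow = [0.0, 0.0, 1.0]

midpoint = linear_interpolation(red, blue, 0.5)
blend = triangular_interpolation(red, blue, yellow, 0.25, 0.25)
```

In a text-to-image setting, each interpolated vector would be fed to the generator in place of the original keyword embedding, so that the resulting images can be compared along the path between keywords.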
Introduction to the Special Issue on Software Architecture for Language Engineering
Every building, and every computer program, has an architecture: structural and organisational principles that underpin its design and construction. The garden shed once built by one of the authors had an ad hoc architecture, extracted (somewhat painfully) from the imagination during a slow and non-deterministic process that, luckily, resulted in a structure which keeps the rain on the outside and the mower on the inside (at least for the time being). As well as being ad hoc (i.e. not informed by analysis of similar practice or relevant science or engineering) this architecture is implicit: no explicit design was made, and no records or documentation kept of the construction process. The pyramid in the courtyard of the Louvre, by contrast, was constructed in a process involving explicit design performed by qualified engineers with a wealth of theoretical and practical knowledge of the properties of materials, the relative merits and strengths of different construction techniques, et cetera. So it is with software: sometimes it is thrown together by enthusiastic amateurs; sometimes it is architected, built to last, and intended to be 'not something you finish, but something you start' (to paraphrase Brand (1994)). A number of researchers argued in the early and middle 1990s that the field of computational infrastructure or architecture for human language computation merited an increase in attention. The reasoning was that the increasingly large-scale and technologically significant nature of language processing science was placing increasing burdens of an engineering nature on research and development workers seeking robust and practical methods (as was the increasingly collaborative nature of research in this field, which puts a large premium on software integration and interoperation). Over the intervening period a number of significant systems and practices have been developed in what we may call Software Architecture for Language Engineering (SALE).
This special issue represented an opportunity for practitioners in this area to report their work in a coordinated setting, and to present a snapshot of the state-of-the-art in infrastructural work, which may indicate where further development and further take-up of these systems can be of benefit.
The biological origin of linguistic diversity
In contrast with animal communication systems, diversity is characteristic of almost every aspect of human language. Languages variously employ tones, clicks, or manual signs to signal differences in meaning; some languages lack the noun-verb distinction (e.g., Straits Salish), whereas others have a proliferation of fine-grained syntactic categories (e.g., Tzeltal); and some languages do without morphology (e.g., Mandarin), while others pack a whole sentence into a single word (e.g., Cayuga). A challenge for evolutionary biology is to reconcile the diversity of languages with the high degree of biological uniformity of their speakers. Here, we model processes of language change and geographical dispersion and find a consistent pressure for flexible learning, irrespective of the language being spoken. This pressure arises because flexible learners can best cope with the observed high rates of linguistic change associated with divergent cultural evolution following human migration. Thus, rather than genetic adaptations for specific aspects of language, such as recursion, the coevolution of genes and fast-changing linguistic structure provides the biological basis for linguistic diversity. Only biological adaptations for flexible learning combined with cultural evolution can explain how each child has the potential to learn any human language.
Dialogue based interfaces for universal access.
Conversation provides an excellent means of communication for almost all people. Consequently, a conversational interface is an excellent mechanism for allowing people to interact with systems. Conversational systems are an active research area, but a wide range of systems can be developed with current technology. More sophisticated interfaces can take considerable effort, but simple interfaces can be developed quite rapidly. This paper gives an introduction to the current state of the art of conversational systems and interfaces. It describes a methodology for developing conversational interfaces and gives an example of an interface for a state benefits web site. The paper discusses how this interface could improve access for a wide range of people, and how further development of this interface would allow a larger range of people to use the system and give them more functionality.