1,276 research outputs found

    LoGAN: Generating Logos with a Generative Adversarial Neural Network Conditioned on color

    Get PDF
    Designing a logo is a long, complicated, and expensive process for any designer. However, recent advancements in generative algorithms provide models that could offer a possible solution. Logos are multi-modal, have very few categorical properties, and do not have a continuous latent space. Yet, conditional generative adversarial networks can be used to generate logos that could help designers in their creative process. We propose LoGAN: an improved auxiliary classifier Wasserstein generative adversarial neural network (with gradient penalty) that is able to generate logos conditioned on twelve different colors. In 768 generated instances (12 classes and 64 logos per class), when looking at the most prominent color, the conditional generation part of the model has an overall precision and recall of 0.8 and 0.7 respectively. LoGAN's results offer a first glance at how artificial intelligence can be used to assist designers in their creative process and open promising future directions, such as including more descriptive labels which will provide a more exhaustive and easy-to-use system.Comment: 6 page, ICMLA1

    Massive Open Online Courses Temporal Profiling for Dropout Prediction

    Get PDF
    Massive Open Online Courses (MOOCs) are attracting the attention of people all over the world. Regardless the platform, numbers of registrants for online courses are impressive but in the same time, completion rates are disappointing. Understanding the mechanisms of dropping out based on the learner profile arises as a crucial task in MOOCs, since it will allow intervening at the right moment in order to assist the learner in completing the course. In this paper, the dropout behaviour of learners in a MOOC is thoroughly studied by first extracting features that describe the behavior of learners within the course and then by comparing three classifiers (Logistic Regression, Random Forest and AdaBoost) in two tasks: predicting which users will have dropped out by a certain week and predicting which users will drop out on a specific week. The former has showed to be considerably easier, with all three classifiers performing equally well. However, the accuracy for the second task is lower, and Logistic Regression tends to perform slightly better than the other two algorithms. We found that features that reflect an active attitude of the user towards the MOOC, such as submitting their assignment, posting on the Forum and filling their Profile, are strong indicators of persistence.Comment: 8 pages, ICTAI1

    Adapting End-to-End Speech Recognition for Readable Subtitles

    Full text link
    Automatic speech recognition (ASR) systems are primarily evaluated on transcription accuracy. However, in some use cases such as subtitling, verbatim transcription would reduce output readability given limited screen size and reading time. Therefore, this work focuses on ASR with output compression, a task challenging for supervised approaches due to the scarcity of training data. We first investigate a cascaded system, where an unsupervised compression model is used to post-edit the transcribed speech. We then compare several methods of end-to-end speech recognition under output length constraints. The experiments show that with limited data far less than needed for training a model from scratch, we can adapt a Transformer-based ASR model to incorporate both transcription and compression capabilities. Furthermore, the best performance in terms of WER and ROUGE scores is achieved by explicitly modeling the length constraints within the end-to-end ASR system.Comment: IWSLT 202

    A retrieval-based dialogue system utilizing utterance and context embeddings

    Get PDF
    Finding semantically rich and computer-understandable representations for textual dialogues, utterances and words is crucial for dialogue systems (or conversational agents), as their performance mostly depends on understanding the context of conversations. Recent research aims at finding distributed vector representations (embeddings) for words, such that semantically similar words are relatively close within the vector-space. Encoding the "meaning" of text into vectors is a current trend, and text can range from words, phrases and documents to actual human-to-human conversations. In recent research approaches, responses have been generated utilizing a decoder architecture, given the vector representation of the current conversation. In this paper, the utilization of embeddings for answer retrieval is explored by using Locality-Sensitive Hashing Forest (LSH Forest), an Approximate Nearest Neighbor (ANN) model, to find similar conversations in a corpus and rank possible candidates. Experimental results on the well-known Ubuntu Corpus (in English) and a customer service chat dataset (in Dutch) show that, in combination with a candidate selection method, retrieval-based approaches outperform generative ones and reveal promising future research directions towards the usability of such a system.Comment: A shorter version is accepted at ICMLA2017 conference; acknowledgement added; typos correcte

    Accumulated Gradient Normalization

    Get PDF
    This work addresses the instability in asynchronous data parallel optimization. It does so by introducing a novel distributed optimizer which is able to efficiently optimize a centralized model under communication constraints. The optimizer achieves this by pushing a normalized sequence of first-order gradients to a parameter server. This implies that the magnitude of a worker delta is smaller compared to an accumulated gradient, and provides a better direction towards a minimum compared to first-order gradients, which in turn also forces possible implicit momentum fluctuations to be more aligned since we make the assumption that all workers contribute towards a single minima. As a result, our approach mitigates the parameter staleness problem more effectively since staleness in asynchrony induces (implicit) momentum, and achieves a better convergence rate compared to other optimizers such as asynchronous EASGD and DynSGD, which we show empirically.Comment: 16 pages, 12 figures, ACML201

    Low-Latency Sequence-to-Sequence Speech Recognition and Translation by Partial Hypothesis Selection

    Full text link
    Encoder-decoder models provide a generic architecture for sequence-to-sequence tasks such as speech recognition and translation. While offline systems are often evaluated on quality metrics like word error rates (WER) and BLEU, latency is also a crucial factor in many practical use-cases. We propose three latency reduction techniques for chunk-based incremental inference and evaluate their efficiency in terms of accuracy-latency trade-off. On the 300-hour How2 dataset, we reduce latency by 83% to 0.8 second by sacrificing 1% WER (6% rel.) compared to offline transcription. Although our experiments use the Transformer, the hypothesis selection strategies are applicable to other encoder-decoder models. To avoid expensive re-computation, we use a unidirectionally-attending encoder. After an adaptation procedure to partial sequences, the unidirectional model performs on-par with the original model. We further show that our approach is also applicable to low-latency speech translation. On How2 English-Portuguese speech translation, we reduce latency to 0.7 second (-84% rel.) while incurring a loss of 2.4 BLEU points (5% rel.) compared to the offline system

    Visual perception of colourful petals reminds us of classical fragments

    Get PDF
    Colour has attracted the interest and attention of many of the most gifted intellects of all time. Ideas of early thinkers were not -and could not have been- grasped on a scientific level without knowledge of a kind that lay far in the future. One character that is being considered is the colourful surfaces of living tissues, which could hardly have been visualized without a corresponding reference to the microscale parallel. Millions of years before man made manipulated synthetic structures, biological systems were using nanoscale architecture to produce striking optical effects. Here we show the microsculpture of the adaxial surface of flower petals from the asphodel, the Stork's-bill and the common poppy by using optical, scanning electron and atomic force microscopy. Microsculpture has been studied in leaves and pollen grains of higher plants. To the best of our knowledge imaging and nanoscale morphometry of petals has not been reported hitherto. Our findings on flower petals' microsculpture may be linked with aspects on colour revealed from ancient literature

    Tagging Scientific Publications using Wikipedia and Natural Language Processing Tools. Comparison on the ArXiv Dataset

    Full text link
    In this work, we compare two simple methods of tagging scientific publications with labels reflecting their content. As a first source of labels Wikipedia is employed, second label set is constructed from the noun phrases occurring in the analyzed corpus. We examine the statistical properties and the effectiveness of both approaches on the dataset consisting of abstracts from 0.7 million of scientific documents deposited in the ArXiv preprint collection. We believe that obtained tags can be later on applied as useful document features in various machine learning tasks (document similarity, clustering, topic modelling, etc.)

    Social Emotion Mining Techniques for Facebook Posts Reaction Prediction

    Full text link
    As of February 2016 Facebook allows users to express their experienced emotions about a post by using five so-called `reactions'. This research paper proposes and evaluates alternative methods for predicting these reactions to user posts on public pages of firms/companies (like supermarket chains). For this purpose, we collected posts (and their reactions) from Facebook pages of large supermarket chains and constructed a dataset which is available for other researches. In order to predict the distribution of reactions of a new post, neural network architectures (convolutional and recurrent neural networks) were tested using pretrained word embeddings. Results of the neural networks were improved by introducing a bootstrapping approach for sentiment and emotion mining on the comments for each post. The final model (a combination of neural network and a baseline emotion miner) is able to predict the reaction distribution on Facebook posts with a mean squared error (or misclassification rate) of 0.135.Comment: 10 pages, 13 figures and accepted at ICAART 2018. (Dataset: https://github.com/jerryspan/FacebookR
    corecore