4,429 research outputs found

    Multi-Content GAN for Few-Shot Font Style Transfer

    In this work, we focus on the challenge of taking partial observations of highly stylized text and generalizing the observations to generate unobserved glyphs in the ornamented typeface. To generate a set of multi-content images following a consistent style from very few examples, we propose an end-to-end stacked conditional GAN model considering content along channels and style along network layers. Our proposed network transfers the style of given glyphs to the contents of unseen ones, capturing highly stylized fonts found in the real world such as those on movie posters or infographics. We seek to transfer both the typographic stylization (e.g., serifs and ears) and the textual stylization (e.g., color gradients and effects). We base our experiments on our collected data set of 10,000 fonts with different styles and demonstrate effective generalization from a very small number of observed glyphs.

    Deep Image Compression Using Scene Text Quality Assessment

    Image compression is a fundamental technology for Internet communication engineering. However, a high compression rate with general methods may degrade images, resulting in unreadable text. In this paper, we propose an image compression method for maintaining text quality. We developed a scene text image quality assessment model to assess text quality in compressed images. The assessment model iteratively searches for the best compressed image that still holds high-quality text. Objective and subjective results showed that the proposed method was superior to existing methods. Furthermore, the proposed assessment model outperformed other deep-learning regression models. Comment: Accepted by Pattern Recognition, 202
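    The iterative search mentioned in this abstract can be pictured with a minimal sketch. The abstract does not give the actual procedure, so everything here is an assumption for illustration: `compress` and `score_text` are hypothetical callables standing in for a codec and the text quality assessment model, and the search is a plain scan from the strongest compression level to the weakest that stops at the first candidate whose text score meets a chosen threshold.

```python
def search_compression(image, levels, compress, score_text, min_score):
    """Return (level, compressed) for the strongest acceptable compression.

    levels     -- candidate settings ordered from strongest to weakest compression
    compress   -- hypothetical callable: (image, level) -> compressed image
    score_text -- hypothetical callable: compressed image -> text quality score
    min_score  -- lowest text quality score that is still acceptable

    Returns None when no candidate reaches min_score.
    """
    for level in levels:
        candidate = compress(image, level)
        # Stop at the first (most compressed) candidate with readable text.
        if score_text(candidate) >= min_score:
            return level, candidate
    return None


# Toy stand-ins: the "codec" tags the image with its quality setting,
# and the "assessment model" scores that setting on a 0..1 scale.
def toy_compress(image, level):
    return (image, level)


def toy_score(compressed):
    return compressed[1] / 100.0


result = search_compression("img", [10, 30, 50, 70, 90], toy_compress, toy_score, 0.5)
```

    A linear scan is used so that the sketch does not depend on the score rising monotonically with the quality setting; if that property held, a binary search over `levels` would need fewer evaluations.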

    Linking Representations with Multimodal Contrastive Learning

    Many applications require grouping instances contained in diverse document datasets into classes. Most widely used methods do not employ deep learning and do not exploit the inherently multimodal nature of documents. Notably, record linkage is typically conceptualized as a string-matching problem. This study develops CLIPPINGS (Contrastively Linking Pooled Pre-trained Embeddings), a multimodal framework for record linkage. CLIPPINGS employs end-to-end training of symmetric vision and language bi-encoders, aligned through contrastive language-image pre-training, to learn a metric space where the pooled image-text representation for a given instance is close to representations in the same class and distant from representations in different classes. At inference time, instances can be linked by retrieving their nearest neighbor from an offline exemplar embedding index or by clustering their representations. The study examines two challenging applications: constructing comprehensive supply chains for mid-20th century Japan by linking firm-level financial records (with each firm name represented by its crop in the document image and the corresponding OCR), and detecting which image-caption pairs in a massive corpus of historical U.S. newspapers came from the same underlying photo wire source. CLIPPINGS outperforms widely used string-matching methods by a wide margin and also outperforms unimodal methods. Moreover, a purely self-supervised model trained on only image-OCR pairs also outperforms popular string-matching methods without requiring any labels.
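    The inference step described in this abstract (linking an instance to its nearest neighbor in an exemplar embedding index) can be sketched as follows. This is an illustration under stated assumptions, not the CLIPPINGS implementation: the embeddings are plain lists of floats standing in for encoder outputs, pooling is assumed to be an average of the L2-normalized image and text vectors, and the index is a small in-memory dictionary searched by cosine similarity. The names `pool` and `link` are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def pool(image_emb, text_emb):
    """Pool one image and one text embedding (assumed rule: mean of unit vectors)."""
    if len(image_emb) != len(text_emb):
        raise ValueError("embeddings must have the same dimension")
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    return l2_normalize([(a + b) / 2.0 for a, b in zip(img, txt)])


def link(query, index):
    """Return (label, cosine similarity) of the exemplar closest to query.

    index maps a class label to its pooled, unit-length exemplar vector.
    Returns (None, None) for an empty index.
    """
    best_label, best_sim = None, None
    for label, exemplar in index.items():
        # Both vectors are unit length, so the dot product is the cosine.
        sim = sum(a * b for a, b in zip(query, exemplar))
        if best_sim is None or sim > best_sim:
            best_label, best_sim = label, sim
    return best_label, best_sim


# Toy two-dimensional embeddings; real encoder outputs would be much larger.
index = {
    "firm_a": pool([1.0, 0.0], [1.0, 0.1]),
    "firm_b": pool([0.0, 1.0], [0.1, 1.0]),
}
label, similarity = link(pool([0.9, 0.1], [1.0, 0.0]), index)
```

    For a large index, the exhaustive loop in `link` would normally be replaced by an approximate nearest-neighbor search; the brute-force version is kept here only because it is easy to verify.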

    Choreographic and Somatic Approaches for the Development of Expressive Robotic Systems

    As robotic systems are moved out of factory work cells into human-facing environments, questions of choreography become central to their design, placement, and application. With a human viewer or counterpart present, a system will automatically be interpreted within context, style of movement, and form factor by human beings as animate elements of their environment. The interpretation by this human counterpart is critical to the success of the system's integration: knobs on the system need to make sense to a human counterpart; an artificial agent should have a way of notifying a human counterpart of a change in system state, possibly through motion profiles; and the motion of a human counterpart may have important contextual clues for task completion. Thus, professional choreographers, dance practitioners, and movement analysts are critical to research in robotics. They have design methods for movement that align with human audience perception, can identify simplified features of movement for human-robot interaction goals, and have detailed knowledge of the capacity of human movement. This article provides approaches employed by one research lab, specific impacts on technical and artistic projects within, and principles that may guide future such work. The background section reports on choreography, somatic perspectives, improvisation, the Laban/Bartenieff Movement System, and robotics. From this context, methods including embodied exercises, writing prompts, and community-building activities have been developed to facilitate interdisciplinary research. The results of this work are presented as an overview of a smattering of projects in areas like high-level motion planning, software development for rapid prototyping of movement, artistic output, and user studies that help understand how people interpret movement. Finally, guiding principles for other groups to adopt are posited. Comment: Under review at MDPI Arts Special Issue "The Machine as Artist (for the 21st Century)" http://www.mdpi.com/journal/arts/special_issues/Machine_Artis

    Processing punctuation and word changes in different editions of prose fiction

    The digital era has brought with it a shift in the field of literary editing in terms of the amount and kind of textual variation that can reasonably be annotated by editors. However, questions remain about how far readers engage with textual variants, especially minor ones such as small-scale changes to punctuation. In this study we present an eye-tracking experiment investigating reader sensitivity to variations in surface textual features of prose fiction. We monitored eye movements while participants read textual variants from Dickens and James, hypothesising that readers may pay more attention to lexical changes than to punctuation changes. We found longer reading times for both types, but only lexical changes also increased reading times for the rest of the sentence. In addition, eye movement behaviour and conscious ability to report changes were highly correlated. We discuss the implications for how such methods might be applied to questions of “literary” significance and textual processing.

    Designing a nutritional packaging system for end stage renal disease patients on hemodialysis to maintain their diet and health

    End-stage renal disease, or ESRD, is a kidney disease that requires a strict dietary regimen in order to limit the life-threatening effects of a poor diet. This dietary regimen requires careful regulation of 8 essential nutrients: calories, proteins, sodium, phosphorous, calcium, potassium, vitamins, and minerals. This strict diet creates a number of complications in patients' lives, including malnutrition, fatigue, other diseases, and death. The purpose of this study was to develop an appropriate packaging system that provided nutritional information, was appetizing, and focused on the cognitive, affective, and emotional dietary needs of ESRD patients. The design research for this study included the examination of packaging design, labeling systems, principles of effective packaging design and typography. In addition, this study researched the unique dietary and physical needs of ESRD patients. The methodology includes case studies with three nutritional systems from the food industry and the development of an evaluation matrix that allows for an analysis of food packaging based on design principles and nutritional information.

    Chinmoku MQP

    This report details the development process of Chinmoku (“silence”), an educational game developed to fulfill the Major Qualifying Project requirement for Worcester Polytechnic Institute’s Interactive Media and Game Development (IMGD) and Computer Science majors. This project was developed over a three-month period at Ritsumeikan University’s Biwako-Kusatsu Campus in Shiga Prefecture, Japan. The game seeks to teach Hiragana, one of the Japanese writing systems, to a target audience of young adults familiar with gaming. This report covers all aspects of the team’s development process, research, playtesting, and the possibilities of future work on this project.

    Computer-assisted animation creation techniques for hair animation and shade, highlight, and shadow

    System: New; Report number: Kō No. 3062; Degree type: Doctor of Engineering; Date conferred: 2010/2/25; Waseda University diploma number: New 532

    Reviving Lucan: Marlowe, Tamburlaine, and Lucans First Booke

    Postprint. Peer reviewed