Multi-Content GAN for Few-Shot Font Style Transfer
In this work, we focus on the challenge of taking partial observations of
highly-stylized text and generalizing the observations to generate unobserved
glyphs in the ornamented typeface. To generate a set of multi-content images
following a consistent style from very few examples, we propose an end-to-end
stacked conditional GAN model considering content along channels and style
along network layers. Our proposed network transfers the style of given glyphs
to the contents of unseen ones, capturing highly stylized fonts found in the
real world, such as those on movie posters or infographics. We seek to transfer
both the typographic stylization (e.g., serifs and ears) and the textual
stylization (e.g., color gradients and effects). We base our experiments on our
collected data set including 10,000 fonts with different styles and demonstrate
effective generalization from a very small number of observed glyphs.
Deep Image Compression Using Scene Text Quality Assessment
Image compression is a fundamental technology for Internet communication
engineering. However, a high compression rate with general methods may degrade
images, resulting in unreadable texts. In this paper, we propose an image
compression method for maintaining text quality. We developed a scene text
image quality assessment model to assess text quality in compressed images. The
assessment model iteratively searches for the best-compressed image holding
high-quality text. Objective and subjective results showed that the proposed
method was superior to existing methods. Furthermore, the proposed assessment
model outperformed other deep-learning regression models.
Comment: Accepted by Pattern Recognition, 202
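The iterative search described in this abstract can be pictured with a minimal sketch. It assumes a simple linear scan from the strongest compression setting upward, stopping at the first candidate whose text-quality score clears a threshold. The names `search_best_compression`, `toy_compress`, and `make_toy_assess` are hypothetical, and the toy quantizer and error-based score are stand-ins, not the paper's codec or its learned assessment model.

```python
def search_best_compression(image, compress, assess, qualities, threshold):
    """Return (quality, candidate, score) for the most aggressive setting
    whose assessed text quality is at least `threshold`, or None."""
    for quality in sorted(qualities):  # lowest quality = strongest compression
        candidate = compress(image, quality)
        score = assess(candidate)
        if score >= threshold:
            return quality, candidate, score
    return None


def toy_compress(image, quality):
    """Stand-in codec (assumption): coarser quantization at lower quality."""
    step = max(1, (100 - quality) // 10)
    return [(pixel // step) * step for pixel in image]


def make_toy_assess(reference):
    """Stand-in scorer (assumption): 1 minus the normalized mean absolute error."""
    def assess(candidate):
        error = sum(abs(a - b) for a, b in zip(reference, candidate)) / len(reference)
        return 1.0 - error / 255.0
    return assess


image = [12, 200, 37, 255, 90, 143]  # toy grayscale pixel values
result = search_best_compression(
    image,
    toy_compress,
    make_toy_assess(image),
    qualities=[10, 30, 50, 70, 90],
    threshold=0.99,
)
```

A real system would replace the toy scorer with a learned model that rates text legibility directly from the compressed image; the loop structure is the only part this sketch is meant to convey.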
Linking Representations with Multimodal Contrastive Learning
Many applications require grouping instances contained in diverse document
datasets into classes. Most widely used methods do not employ deep learning and
do not exploit the inherently multimodal nature of documents. Notably, record
linkage is typically conceptualized as a string-matching problem. This study
develops CLIPPINGS (Contrastively Linking Pooled Pre-trained Embeddings), a
multimodal framework for record linkage. CLIPPINGS employs end-to-end training
of symmetric vision and language bi-encoders, aligned through contrastive
language-image pre-training, to learn a metric space where the pooled
image-text representation for a given instance is close to representations in
the same class and distant from representations in different classes. At
inference time, instances can be linked by retrieving their nearest neighbor
from an offline exemplar embedding index or by clustering their
representations. The study examines two challenging applications: constructing
comprehensive supply chains for mid-20th century Japan through linking firm-level
financial records (with each firm name represented by its crop in the
document image and the corresponding OCR) and detecting which image-caption
pairs in a massive corpus of historical U.S. newspapers came from the same
underlying photo wire source. CLIPPINGS outperforms widely used string matching
methods by a wide margin and also outperforms unimodal methods. Moreover, a
purely self-supervised model trained on only image-OCR pairs also outperforms
popular string-matching methods without requiring any labels.
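The inference step in this abstract (linking an instance to its nearest neighbor in an exemplar embedding index) can be illustrated with a small sketch. It assumes that pooling means averaging unit-normalized image and text embeddings and that similarity is cosine similarity; the abstract does not specify either. The functions `pool` and `link` and the three-dimensional toy vectors are hypothetical and stand in for real encoder outputs.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def pool(image_emb, text_emb):
    """Assumed pooling rule: average the two unit-normalized embeddings."""
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    return l2_normalize([(a + b) / 2.0 for a, b in zip(img, txt)])


def link(query, index):
    """Return (label, cosine similarity) of the nearest exemplar."""
    best_label, best_sim = None, -math.inf
    for label, exemplar in index.items():
        # Both vectors are unit length, so the dot product is the cosine.
        sim = sum(q * e for q, e in zip(query, exemplar))
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label, best_sim


# Offline exemplar index built from toy (image, OCR text) embeddings.
index = {
    "firm_a": pool([1.0, 0.0, 0.0], [0.9, 0.1, 0.0]),
    "firm_b": pool([0.0, 1.0, 0.0], [0.1, 0.9, 0.0]),
}
query = pool([0.8, 0.2, 0.0], [1.0, 0.0, 0.1])
label, sim = link(query, index)
```

At scale, the exhaustive loop in `link` would normally be replaced by an approximate nearest-neighbor index, but the matching rule stays the same.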
Choreographic and Somatic Approaches for the Development of Expressive Robotic Systems
As robotic systems are moved out of factory work cells into human-facing
environments, questions of choreography become central to their design,
placement, and application. With a human viewer or counterpart present, a
system will automatically be interpreted within context, style of movement, and
form factor by human beings as animate elements of their environment. The
interpretation by this human counterpart is critical to the success of the
system's integration: knobs on the system need to make sense to a human
counterpart; an artificial agent should have a way of notifying a human
counterpart of a change in system state, possibly through motion profiles; and
the motion of a human counterpart may have important contextual clues for task
completion. Thus, professional choreographers, dance practitioners, and
movement analysts are critical to research in robotics. They have design
methods for movement that align with human audience perception, can identify
simplified features of movement for human-robot interaction goals, and have
detailed knowledge of the capacity of human movement. This article provides
approaches employed by one research lab, specific impacts on technical and
artistic projects within, and principles that may guide future such work. The
background section reports on choreography, somatic perspectives,
improvisation, the Laban/Bartenieff Movement System, and robotics. From this
context, methods including embodied exercises, writing prompts, and
community-building activities have been developed to facilitate interdisciplinary
research. The results of this work are presented as an overview of a smattering
of projects in areas like high-level motion planning, software development for
rapid prototyping of movement, artistic output, and user studies that help
understand how people interpret movement. Finally, guiding principles for other
groups to adopt are posited.
Comment: Under review at MDPI Arts Special Issue "The Machine as Artist (for
the 21st Century)"
http://www.mdpi.com/journal/arts/special_issues/Machine_Artis
Processing punctuation and word changes in different editions of prose fiction
The digital era has brought with it a shift in the field of literary editing in terms of the amount and kind of textual variation that can reasonably be annotated by editors. However, questions remain about how far readers engage with textual variants, especially minor ones such as small-scale changes to punctuation. In this study we present an eye-tracking experiment investigating reader sensitivity to variations in surface textual features of prose fiction. We monitored eye movements while participants read textual variants from Dickens and James, hypothesising that readers may pay more attention to lexical changes than to punctuation changes. We found longer reading times for both types, but only lexical changes also increased reading times for the rest of the sentence. In addition, eye movement behaviour and conscious ability to report changes were highly correlated. We discuss the implications for how such methods might be applied to questions of “literary” significance and textual processing.
Designing a nutritional packaging system for end stage renal disease patients on hemodialysis to maintain their diet and health
End-stage renal disease, or ESRD, is a kidney disease that requires a strict dietary regimen in order to limit the life-threatening effects of a poor diet. This dietary regimen requires careful regulation of 8 essential nutrients: calories, proteins, sodium, phosphorus, calcium, potassium, vitamins, and minerals. This strict diet creates a number of complications in patients' lives, including malnutrition, fatigue, other diseases, and death. The purpose of this study was to develop an appropriate packaging system that provided nutritional information, was appetizing, and focused on the cognitive, affective, and emotional dietary needs of ESRD patients. The design research for this study included the examination of packaging design, labeling systems, principles of effective packaging design, and typography. In addition, this study researched the unique dietary and physical needs of ESRD patients. The methodology includes case studies with three nutritional systems from the food industry and the development of an evaluation matrix that allows for an analysis of food packaging based on design principles and nutritional information.
Chinmoku MQP
This report details the development process of Chinmoku (“silence”), an educational game developed to fulfill the Major Qualifying Project requirement for Worcester Polytechnic Institute’s Interactive Media and Game Development (IMGD) and Computer Science majors. This project was developed over a three-month period at Ritsumeikan University’s Biwako-Kusatsu Campus in Shiga Prefecture, Japan. The game seeks to teach Hiragana, one of the Japanese writing systems, to a target audience of young adults familiar with gaming. This report covers all aspects of the team’s development process, research, playtesting, and the possibilities of future work on this project.
Computer-assisted animation creation techniques for hair animation and shade, highlight, and shadow
System: New; Report number: Kō No. 3062; Degree type: Doctor of Engineering; Date conferred: 2010/2/25; Waseda University diploma number: Shin 532
Reviving Lucan: Marlowe, Tamburlaine, and Lucans First Booke
Postprint. Peer reviewed