
11 research outputs found

STEFANN: Scene Text Editor using Font Adaptive Neural Network

Authors: Bhattacharya Saumik; Ghosh Subhankar; Pal Umapada; Roy Prasun
Publication date: 25/04/2020

Textual information in a captured scene plays an important role in scene interpretation and decision making. Though there exist methods that can successfully detect and interpret complex text regions present in a scene, to the best of our knowledge, there is no significant prior work that aims to modify the textual information in an image. The ability to edit text directly on images has several advantages, including error correction, text restoration and image reusability. In this paper, we propose a method to modify text in an image at the character level. We approach the problem in two stages. First, the unobserved character (target) is generated from the observed character (source) being modified. We propose two different neural network architectures: (a) FANnet to achieve structural consistency with the source font and (b) Colornet to preserve the source color. Next, we replace the source character with the generated character, maintaining both geometric and visual consistency with neighboring characters. Our method works as a unified platform for modifying text in images. We present the effectiveness of our method on the COCO-Text and ICDAR datasets both qualitatively and quantitatively.

Comment: Accepted in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 202

arXiv.org e-Print Archive

Crossref

Multi-Content GAN for Few-Shot Font Style Transfer

Authors: Azadi Samaneh; Darrell Trevor; Fisher Matthew; Kim Vladimir; Shechtman Eli; Wang Zhaowen
Publication date: 01/12/2017

In this work, we focus on the challenge of taking partial observations of highly stylized text and generalizing the observations to generate unobserved glyphs in the ornamented typeface. To generate a set of multi-content images following a consistent style from very few examples, we propose an end-to-end stacked conditional GAN model considering content along channels and style along network layers. Our proposed network transfers the style of given glyphs to the contents of unseen ones, capturing highly stylized fonts found in the real world, such as those on movie posters or infographics. We seek to transfer both the typographic stylization (e.g., serifs and ears) and the textual stylization (e.g., color gradients and effects). We base our experiments on our collected data set of 10,000 fonts with different styles and demonstrate effective generalization from a very small number of observed glyphs.

arXiv.org e-Print Archive

Crossref

A software automation framework for image-typeface matching in graphic design

Author: Morris Taylor Javier
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2013

Thesis: S.M., Massachusetts Institute of Technology, Department of Mechanical Engineering, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (pages 43-44).

This research proposes the framework for an automation tool that facilitates the graphic design process of image-font pairing or matching. Considering traditional graphic design principles, a multi-step software algorithm was developed to emulate the process of determining proportions and visual axes of both images and fonts. The algorithm then matches these visual markers using a decision hierarchy to produce a ranking of appropriate fonts from an existing font dataset. To test the algorithm, 8 benchmark images were selected with varying proportions and visual axes. To build the font data set, each image was manually analyzed through a traditional graphic design process and then two fonts per image with similar, matching characteristics were manually selected. The 8 benchmark images and 16 fonts were then used as inputs into the proposed matching software program. The results of the manually prescribed font-image pairings and calculated matches were then compared. Two images had the intended font in the top 4, two images had one of the intended fonts in the top 4, and 4 images had neither of the intended fonts in the top 4. An additional step in image-font pairing includes detail matching by determining curvature similarities. This detail analysis will affect the pairing outcomes and should be further investigated. This research began to analyze these details, and makes recommendations for continuing this work. Additional future directions for this work include incorporating a user-interface to the matching algorithm, introducing expert testing, and down-selecting the first font pool based on deviation.

by Taylor Javier Morris. S.M

DSpace@MIT

AutoGraff: towards a computational understanding of graffiti writing and related art forms.

Author: Berio Daniel
Publication venue: Goldsmiths, University of London

The aim of this thesis is to develop a system that generates letters and pictures with a style that is immediately recognizable as graffiti art or calligraphy. The proposed system can be used similarly to, and in tight integration with, conventional computer-aided geometric design tools and can be used to generate synthetic graffiti content for urban environments in games and in movies, and to guide robotic or fabrication systems that can materialise the output of the system with physical drawing media. The thesis is divided into two main parts. The first part describes a set of stroke primitives, building blocks that can be combined to generate different designs that resemble graffiti or calligraphy. These primitives mimic the process typically used to design graffiti letters and exploit well-known principles of motor control to model the way in which an artist moves when incrementally tracing stylised letter forms. The second part demonstrates how these stroke primitives can be automatically recovered from input geometry defined in vector form, such as the digitised traces of writing made by a user, or the glyph outlines in a font. This procedure converts the input geometry into a seed that can be transformed into a variety of calligraphic and graffiti stylisations, which depend on parametric variations of the strokes.

Goldsmiths Research Online


Bridging the Gap Between People, Mobile Devices, and the Physical World

Author: Xiao Chang
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2021

Human-computer interaction (HCI) is being revolutionized by computational design and artificial intelligence. As the diversity of user interfaces shifts from personal desktops to mobile and wearable devices, yesterday’s tools and interfaces are insufficient to meet the demands of tomorrow’s devices. This dissertation describes my research on leveraging different physical channels (e.g., vibration, light, capacitance) to enable novel interaction opportunities. We first introduce FontCode, an information embedding technique for text documents. Given a text document with specific fonts, our method can embed user-specified information (e.g., URLs and metadata) in the text by perturbing the glyphs of text characters while preserving the text content. The embedded information can later be retrieved using a smartphone in real time. Then, we present Vidgets, a family of mechanical widgets, specifically push buttons and rotary knobs, that augment mobile devices with tangible user interfaces. When these widgets are attached to a mobile device and a user interacts with them, the nonlinear mechanical response of the widgets shifts the device slightly and quickly. This subtle motion can then be detected by the Inertial Measurement Units (IMUs) that are commonly installed on mobile devices. Next, we propose BackTrack, a trackpad placed on the back of a smartphone to track fine-grained finger motions. Our system has a small form factor, with all the circuits encapsulated in a thin layer attached to a phone case. It can be used with any off-the-shelf smartphone, requiring no power supply or modification of the operating system. BackTrack simply extends the finger tracking area of the front screen, without interrupting the use of the front screen. Lastly, we demonstrate MoiréBoard, a new camera tracking method that leverages a seemingly irrelevant visual phenomenon, the moiré effect.
Based on a systematic analysis of the moiré effect under camera projection, MoiréBoard requires neither power nor camera calibration. It can easily be made at a low cost (e.g., through 3D printing) and is ready to use with any stock mobile device with a camera. Its tracking algorithm is computationally efficient and can run at a high frame rate. It is not only simple to implement, but also tracks devices with high accuracy, comparable to state-of-the-art commercial VR tracking systems.

Columbia University Academic Commons