
    Integration of an actor-critic model and generative adversarial networks for a Chinese calligraphy robot

    As a combination of robotic motion planning and Chinese calligraphy culture, robotic calligraphy plays a significant role in the inheritance and education of Chinese calligraphy culture. Most existing calligraphy robots focus on enabling the robots to learn writing through human participation, such as human–robot interactions and manually designed evaluation functions. However, because of the subjectivity of art aesthetics, these existing methods require a large amount of implementation work from human engineers. In addition, the written results cannot be accurately evaluated. To overcome these limitations, in this paper, we propose a robotic calligraphy model that combines a generative adversarial network (GAN) and deep reinforcement learning to enable a calligraphy robot to learn to write Chinese character strokes directly from images captured from Chinese calligraphic textbooks. In our proposed model, to automatically establish an aesthetic evaluation system for Chinese calligraphy, a GAN is first trained to understand and reconstruct stroke images. Then, the discriminator network is independently extracted from the trained GAN and embedded into a variant of the reinforcement learning method, the “actor-critic model”, as a reward function. Thus, a calligraphy robot adopts the improved actor-critic model to learn to write multiple character strokes. The experimental results demonstrate that the proposed model successfully allows a calligraphy robot to write Chinese character strokes based on input stroke images. The performance of our model, compared with the state-of-the-art deep reinforcement learning method, shows the efficacy of the combination approach. In addition, the key technology in this work shows promise as a solution for robotic autonomous assembly.

    GANCCRobot: Generative Adversarial Nets based Chinese Calligraphy Robot

    Robotic calligraphy, as a typical application of robot movement planning, is of great significance for the inheritance and education of calligraphy culture. Existing implementations of such robots often suffer from a limited ability for font generation and evaluation, leading to poor writing style diversity and writing quality. This paper proposes a calligraphic robotic framework based on generative adversarial nets (GAN) to address this limitation. A robot implemented using this framework is able to learn to write fundamental Chinese character strokes with rich diversity and good quality that is close to the human level, without requiring specifically designed evaluation functions, thanks to the use of a revised GAN. In particular, the type information of the stroke is introduced as condition information, and the latent codes are applied to maximize the style quality of the generated strokes. Experimental results demonstrate that the proposed model enables a calligraphic robot to successfully write fundamental Chinese strokes based on a given type and style, with overall good quality. Although the proposed model was evaluated in this report using calligraphy writing, the underpinning research is readily applicable to many other applications, such as robotic graffiti and character style conversion.

    A data-driven robotic Chinese calligraphy system using convolutional auto-encoder and differential evolution

    The Chinese stroke evaluation and generation systems required in an autonomous calligraphy robot play a crucial role in producing high-quality writing results with good diversity. These systems often suffer from inefficiency and suboptimal results despite intensive research effort from the robotics community. This paper proposes a new learning system that allows a robot to automatically learn to write Chinese calligraphy effectively. In the proposed system, the writing quality evaluation subsystem assesses written strokes using a convolutional auto-encoder (CAE) network, which enables the generation of aesthetic strokes with various writing styles. The trained CAE network effectively excludes poorly written strokes through stroke reconstruction, while guaranteeing the inheritance of information from well-written ones. With the support of the evaluation subsystem, the writing trajectory model generation subsystem is realized by multivariate normal distributions optimized by differential evolution (DE), a type of heuristic optimization search algorithm. The proposed approach was validated and evaluated using a dataset of nine stroke categories; it produced high-quality written strokes with good diversity, which shows the robustness and efficacy of the proposed approach and its potential in autonomous action-state space exploration for other real-world applications.
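
To make the optimizer named in this abstract concrete, the following is a minimal sketch of generic differential evolution (the common DE/rand/1/bin variant), not the authors' implementation. The objective `sphere`, the bounds, and all parameter values (`pop_size`, `f`, `cr`, `generations`) are illustrative assumptions; in the paper the fitness would come from the CAE-based evaluation subsystem and the parameters being tuned would describe multivariate normal distributions.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=100, seed=0):
    """Minimise `objective` with a basic DE/rand/1/bin loop (illustrative)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the box constraints.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: combine three distinct individuals other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            # Binomial crossover; j_rand guarantees at least one mutated gene.
            j_rand = rng.randrange(dim)
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    value = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    value = pop[i][j]
                trial.append(value)
            # Greedy selection: keep the trial only if it is no worse.
            trial_score = objective(trial)
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score

    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def sphere(x):
    """Hypothetical stand-in objective; the paper's fitness is CAE-based."""
    return sum(v * v for v in x)


best_x, best_score = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

Because selection is greedy, the best score in the population never gets worse from one generation to the next, which is one reason DE is a convenient choice when each fitness evaluation is expensive.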

    A Robot Calligraphy System: From Simple to Complex Writing by Human Gestures

    Robotic writing is a very challenging task that involves complicated kinematic control algorithms and image processing work. This paper proposes an alternative: a robot calligraphy system that first applies human arm gestures to establish a font database of Chinese character elementary strokes and English letters, then uses the created database and human gestures to write Chinese characters and English words. A three-dimensional motion sensing input device is deployed to capture the human arm trajectories, which are used to build the font database and to train a classifier ensemble. 26 types of human gesture are used for writing English letters, and 5 types of gesture are used to generate 5 elementary strokes for writing Chinese characters. By using the font database, the robot calligraphy system acquires a basic ability to write simple strokes and letters. Then, the robot can progress to writing complex Chinese characters and English words by following human body movements. The classifier ensemble, which is used to identify each gesture, is implemented using feature selection techniques and the harmony search algorithm, thereby achieving better classification performance. Experimental evaluations are carried out to demonstrate the feasibility and performance of the proposed method. By following the motion trajectories of the human right arm, the end-effector of the robot can successfully write the English words or Chinese characters that correspond to the arm trajectories.

    Stroke Extraction of Chinese Character Based on Deep Structure Deformable Image Registration

    Stroke extraction of Chinese characters plays an important role in the field of character recognition and generation. Most existing character stroke extraction methods focus on image morphological features. These methods usually lead to errors in cross-stroke extraction and stroke matching because they rarely use stroke semantics and prior information. In this paper, we propose a deep learning-based character stroke extraction method that takes semantic features and prior information of strokes into consideration. This method consists of three parts: image registration-based stroke registration that establishes the rough registration of the reference strokes and the target as prior information; image semantic segmentation-based stroke segmentation that preliminarily separates target strokes into seven categories; and high-precision extraction of single strokes. In the stroke registration, we propose a structure deformable image registration network to achieve structure-deformable transformation while maintaining the stable morphology of single strokes for character images with complex structures. To verify the effectiveness of the method, we construct two datasets, one for calligraphy characters and one for regular handwriting characters. The experimental results show that our method strongly outperforms the baselines. Code is available at https://github.com/MengLi-l1/StrokeExtraction. Comment: 10 pages, 8 figures, published to AAAI-23 (oral)

    Solving Robotic Trajectory Sequential Writing Problem via Learning Character’s Structural and Sequential Information

    The writing sequence of numerals or letters often affects the aesthetic aspects of the writing outcomes. As such, it remains a challenge for robotic calligraphy systems to mimic human writers’ implicit intentions. This article presents a new robot calligraphy system that is able to learn writing sequences with limited sequential information, producing writing results comparable to those of human writers with good diversity. In particular, the system applies a gated recurrent unit (GRU) network to generate robotic writing actions with the support of a prelabeled trajectory sequence vector. Also, a new evaluation method is proposed that considers the shape, trajectory sequence, and structural information of the writing outcome, thereby helping ensure the writing quality. A swarm optimization algorithm is exploited to create an optimal set of parameters for the proposed system. The proposed approach is evaluated using Arabic numerals, and the experimental results demonstrate the competitive writing performance of the system against state-of-the-art approaches regarding multiple criteria (including FID, MAE, PSNR, SSIM, and PerLoss), as well as diversity performance concerning variance and entropy. Importantly, the proposed GRU-based robotic motion planning system, supported by swarm optimization, can learn from a small dataset while producing diverse and aesthetically pleasing calligraphy writing.
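
Two of the criteria listed in this abstract, MAE and PSNR, have standard pixel-wise definitions, sketched below under stated assumptions. Images are taken to be equally sized 2D lists of grayscale values in the range 0 to 255; the function names and the tiny `target` and `written` images are hypothetical and do not come from the paper.

```python
import math


def _flatten(image):
    """Flatten a 2D list of pixel values into a single list."""
    return [pixel for row in image for pixel in row]


def mae(image_a, image_b):
    """Mean absolute error between two equally sized grayscale images."""
    a, b = _flatten(image_a), _flatten(image_b)
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def psnr(image_a, image_b, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    a, b = _flatten(image_a), _flatten(image_b)
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 reference stroke image and robot-written result.
target = [[0, 255], [255, 0]]
written = [[0, 250], [255, 10]]

stroke_mae = mae(target, written)    # (0 + 5 + 0 + 10) / 4 = 3.75
stroke_psnr = psnr(target, written)  # MSE = 31.25, about 33.2 dB
```

Lower MAE and higher PSNR both indicate that the written result is closer to the reference image, although neither captures stroke order or structure, which is why the abstract pairs them with sequence-aware criteria.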

    A Review on Human-Computer Interaction and Intelligent Robots

    In the field of artificial intelligence, human–computer interaction (HCI) technology and its related intelligent robot technologies are essential and interesting research topics. From the perspectives of software algorithms and hardware systems, these technologies aim to build a natural HCI environment. The purpose of this research is to provide an overview of HCI and intelligent robots. This research highlights the existing technologies of listening, speaking, reading, writing, and other senses, which are widely used in human interaction. Based on these same technologies, this research introduces some intelligent robot systems and platforms. This paper also forecasts some vital challenges in researching HCI and intelligent robots. The authors hope that this work will help researchers in the field to acquire the necessary information and technologies to conduct more advanced research.

    DEsignBench: Exploring and Benchmarking DALL-E 3 for Imagining Visual Design

    We introduce DEsignBench, a text-to-image (T2I) generation benchmark tailored for visual design scenarios. Recent T2I models like DALL-E 3 and others have demonstrated remarkable capabilities in generating photorealistic images that align closely with textual inputs. While the allure of creating visually captivating images is undeniable, our emphasis extends beyond mere aesthetic pleasure. We aim to investigate the potential of using these powerful models in authentic design contexts. In pursuit of this goal, we develop DEsignBench, which incorporates test samples designed to assess T2I models on both "design technical capability" and "design application scenario." Each of these two dimensions is supported by a diverse set of specific design categories. We explore DALL-E 3 together with other leading T2I models on DEsignBench, resulting in a comprehensive visual gallery for side-by-side comparisons. For DEsignBench benchmarking, we perform human evaluations on generated images in the DEsignBench gallery, against the criteria of image-text alignment, visual aesthetic, and design creativity. Our evaluation also considers other specialized design capabilities, including text rendering, layout composition, color harmony, 3D design, and medium style. In addition to human evaluations, we introduce the first automatic image generation evaluator powered by GPT-4V. This evaluator provides ratings that align well with human judgments, while being easily replicable and cost-efficient. A high-resolution version is available at https://github.com/design-bench/design-bench.github.io/raw/main/designbench.pdf?download= Comment: Project page at https://design-bench.github.io

    Machine learning model selection with multi-objective Bayesian optimization and reinforcement learning

    A machine learning system, including when used in reinforcement learning, is usually fed with only limited data, while aiming to train a model with good predictive performance that can generalize to an underlying data distribution. Within certain hypothesis classes, model selection chooses a model based on selection criteria calculated from available data, which usually serve as estimators of the generalization performance of the model. One major challenge for model selection that has drawn increasing attention is the discrepancy between the data distribution from which training data is sampled and the data distribution at deployment. The model can over-fit to the training distribution and fail to extrapolate to unseen deployment distributions, which can greatly harm the reliability of a machine learning system. Such a distribution shift challenge can become even more pronounced in high-dimensional data types like gene expression data, functional data and image data, especially in a decentralized learning scenario. Another challenge for model selection is efficient search in the hypothesis space. Since training a machine learning model usually takes a fair amount of resources, searching for an appropriate model with favorable configurations is inherently an expensive process, thus calling for efficient optimization algorithms. To tackle the challenge of distribution shift, novel resampling methods for evaluating the robustness of neural networks were proposed, as well as a domain generalization method using multi-objective Bayesian optimization in a decentralized learning scenario and variational inference in a domain-unsupervised manner. To tackle the expensive model search problem, combining Bayesian optimization and reinforcement learning in an interleaved manner was proposed for efficient search in a hierarchical conditional configuration space. Additionally, the effectiveness of using multi-objective Bayesian optimization for model search in a decentralized learning scenario was proposed and verified. A model selection perspective on reinforcement learning was proposed, with associated contributions in tackling the problem of exploration in high-dimensional state-action spaces with sparse rewards. Connections between statistical inference and control were summarized. Additionally, contributions were made to open source software development in related machine learning sub-topics, such as feature selection and functional data analysis, with advanced tuning methods and extensive benchmarking.
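
Multi-objective model selection, as mentioned in this abstract, compares candidates on several criteria at once rather than on a single score. The sketch below shows only the Pareto-dominance filter that such comparisons rely on; it is not Bayesian optimization and not the thesis's method. The candidate names and the two objectives (an in-distribution error and an error under distribution shift, both to be minimised) are hypothetical values chosen for illustration.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` on every objective and strictly
    better on at least one (all objectives are minimised)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return {
        name: scores
        for name, scores in candidates.items()
        if not any(dominates(other, scores)
                   for other_name, other in candidates.items()
                   if other_name != name)
    }


# Hypothetical (in-distribution error, shifted-distribution error) pairs.
candidates = {
    "model_a": (0.10, 0.30),
    "model_b": (0.20, 0.15),
    "model_c": (0.25, 0.35),  # worse than model_a on both objectives
    "model_d": (0.12, 0.28),
}

front = pareto_front(candidates)
```

Here `model_c` is filtered out because `model_a` beats it on both objectives, while the remaining three models represent different trade-offs between fitting the training distribution and holding up under shift.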