Open Set Chinese Character Recognition using Multi-typed Attributes

Authors: He Sheng, Schomaker Lambert
Publication date: 27/08/2018

Dissertations of the University of Groningen

Recognition of off-line Chinese characters is still a challenging problem, especially in historical documents: not only is the number of classes extremely large in comparison to contemporary image retrieval methods, but new unseen classes can also be expected under open learning conditions (even for CNNs). Chinese character recognition with zero or only a few training samples is a difficult problem and has not yet been studied. In this paper, we propose a new Chinese character recognition method based on multi-typed attributes, which are derived from the pronunciation, structure and radicals of Chinese characters, and apply it to character recognition in historical books. This intermediate attribute code has a strong advantage over the common 'one-hot' class representation because it allows complex and unseen patterns to be understood symbolically through their attributes. First, each character is represented by four groups of attribute types to cover a wide range of character possibilities: Pinyin label, layout structure, number of strokes, three input-method encodings (Cangjie, Zhengma and Wubi), as well as a four-corner encoding method. A convolutional neural network (CNN) is trained to learn these attributes. Subsequently, characters can be recognized from these attributes using a distance metric and a complete lexicon that is encoded in attribute space. We evaluate the proposed method on two open data sets (printed Chinese characters for zero-shot learning and historical characters for few-shot learning) and one closed set (handwritten Chinese characters). Experimental results show good general classification of seen classes as well as a very promising ability to generalize to unseen characters.

Comment: 29 pages, submitted to Pattern Recognition
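
The attribute-space lookup described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon, the 6-bit codes, the 0.5 threshold, the choice of Hamming distance, and the names `LEXICON`, `hamming` and `decode` are all assumptions made for illustration. The abstract only states that "a distance metric" and a lexicon encoded in attribute space are used.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical toy lexicon: every character maps to a fixed-length binary
# attribute code. The codes below are made up for illustration and are not
# real Pinyin, stroke, Cangjie, Zhengma, Wubi or four-corner encodings.
LEXICON: Dict[str, Tuple[int, ...]] = {
    "木": (1, 0, 0, 1, 0, 1),
    "林": (1, 1, 0, 1, 0, 0),
    "森": (1, 1, 1, 0, 0, 1),
}


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("attribute codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def decode(
    scores: Sequence[float],
    lexicon: Dict[str, Tuple[int, ...]],
    threshold: float = 0.5,
) -> Tuple[str, int]:
    """Map per-attribute scores to the nearest lexicon entry.

    `scores` stands in for the per-attribute outputs of a trained network
    (assumed here to lie in [0, 1]). They are binarized at `threshold`
    (an assumed value) and compared against every code in the lexicon.
    Returns the closest character and its Hamming distance.
    """
    bits = tuple(int(s >= threshold) for s in scores)
    best = min(lexicon, key=lambda ch: hamming(bits, lexicon[ch]))
    return best, hamming(bits, lexicon[best])


if __name__ == "__main__":
    # One attribute is predicted wrongly; the nearest code still wins.
    print(decode([0.9, 0.8, 0.9, 0.2, 0.1, 0.3], LEXICON))
```

Because the lookup depends only on the lexicon, a character that never appeared in the training images can still be returned as long as its attribute code is listed, which is the intuition behind the zero-shot setting the abstract mentions.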

arXiv.org e-Print Archive

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Violence Detection in Social Media-Review

Authors: Dikwatta U., Fernando T.G.I.
Publication venue: 'University of Sri Jayewardenepura'
Publication date: 10/12/2019

Social media has become a vital part of people's day-to-day lives. Different users engage with social media differently. With the increased usage of social media, many researchers have investigated its different aspects. Many examples from the recent past show that content on social media can incite violence in the user community. Violence in social media can be categorised into aggression in comments, cyber-bullying, and incidents such as protests and murders. Identifying violent content in social media is a challenging task: posts contain both visual and textual content, and they may carry hidden meaning that depends on the users' context and other background information. This paper summarizes the different categories of violence in social media and the existing methods for detecting violent content.

Keywords: Machine learning, natural language processing, violence, social media, convolution neural network

University of Sri Jayewardenepura: Journals & Proceedings