    Scene text dataset in Turkish

    25th Signal Processing and Communications Applications Conference (SIU 2017), Antalya, Turkey, 15–18 May 2017.

    Scene text localization and recognition keep attracting increasing interest from researchers because of their value in extracting meaningful content from real-world images and in retrieving sought text from large collections of such images. Nevertheless, because the majority of the image datasets commonly used in this field contain English text, studies in the area have mostly been limited to a single language. Motivated by this, a Turkish scene text dataset has been collected, for the first time in the literature, in order to apply the technologies developed for scene text detection and recognition to Turkish text, to analyze their performance, and to develop algorithms specific to Turkish. This paper describes the contents of this dataset, called STRIT (Scene Text Recognition In Turkish) for short. Additionally, two baseline methods for detecting and recognizing Turkish scene text are evaluated and preliminary results are presented. Funding: TUBITAK (BIDEB-114C025).
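
    The abstract includes no code, but one Turkish-specific pitfall any such benchmark has to handle is case folding: standard Unicode lowercasing maps 'I' to 'i', whereas Turkish maps 'I' to dotless 'ı' and dotted 'İ' to 'i'. The sketch below, with hypothetical function names, shows Turkish-aware folding when scoring recognition output; it is an illustration, not part of the STRIT baselines.

```python
# Minimal sketch (not from the paper): Turkish-aware case folding for
# scoring scene text recognition output against ground-truth transcriptions.

def turkish_lower(text: str) -> str:
    """Lowercase with Turkish casing rules: 'I' -> 'ı' and 'İ' -> 'i'.

    Plain str.lower() maps 'I' to 'i' and 'İ' to 'i' plus a combining
    dot, both wrong for Turkish, so the two letters are rewritten first.
    """
    return text.replace("İ", "i").replace("I", "ı").lower()

def word_accuracy(predictions: list[str], ground_truth: list[str]) -> float:
    """Fraction of words recognized exactly, compared case-insensitively
    under Turkish casing rules."""
    hits = sum(turkish_lower(p) == turkish_lower(g)
               for p, g in zip(predictions, ground_truth))
    return hits / len(ground_truth)

print(turkish_lower("ISPARTA"))                   # 'ısparta', not 'isparta'
print(word_accuracy(["istanbul"], ["İstanbul"]))  # 1.0
```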

    Integrating Document Clustering and Topic Modeling

    Document clustering and topic modeling are two closely related tasks that can benefit each other. Topic modeling can project documents into a topic space, which facilitates effective document clustering; cluster labels discovered by document clustering can in turn be incorporated into topic models to extract local topics specific to each cluster and global topics shared by all clusters. In this paper, we propose a multi-grain clustering topic model (MGCTM) that integrates document clustering and topic modeling into a unified framework and performs the two tasks jointly to achieve the best overall performance. Our model tightly couples two components: a mixture component used for discovering latent groups in the document collection, and a topic model component used for mining multi-grain topics, namely local topics specific to each cluster and global topics shared across clusters. We employ variational inference to approximate the posterior over the hidden variables and to learn the model parameters. Experiments on two datasets demonstrate the effectiveness of our model. Comment: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI 2013).
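
    MGCTM itself is not reproduced here; as a point of contrast with the joint model, the sketch below runs the decoupled two-step pipeline the paper improves upon: fit LDA topics with scikit-learn, then cluster the per-document topic proportions with k-means. The toy corpus and all parameter values are placeholders.

```python
# Decoupled baseline (not the paper's MGCTM): project documents into an
# LDA topic space, then cluster the topic proportions with k-means.
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "topic models discover latent themes in text collections",
    "k-means partitions points into a fixed number of clusters",
    "variational inference approximates intractable posteriors",
    "document clustering groups similar articles together",
]

# Bag-of-words term counts, the standard input representation for LDA.
counts = CountVectorizer(stop_words="english").fit_transform(docs)

# fit_transform returns per-document topic proportions, i.e. the
# documents projected into topic space.
lda = LatentDirichletAllocation(n_components=3, random_state=0)
theta = lda.fit_transform(counts)

# Cluster documents using their topic proportions as features.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(theta)
print(labels)
```

    In MGCTM, by contrast, the grouping variables and the topics share one probabilistic model and are inferred jointly under a single variational objective rather than in sequence.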

    Rotation-invariant features for multi-oriented text detection in natural images

    Text in natural scenes carries rich semantic information, which can be used to assist a wide range of applications such as object recognition, image/video retrieval, mapping/navigation, and human-computer interaction. However, most existing systems are designed to detect and recognize only horizontal (or near-horizontal) text. Due to the increasing popularity of mobile computing devices and applications, detecting text of varying orientations in natural images under less controlled conditions has become an important but challenging task. In this paper, we propose a new algorithm for detecting text of varying orientations. Our algorithm is based on a two-level classification scheme and two sets of features specially designed to capture the intrinsic characteristics of text. To better evaluate the proposed method and compare it with competing algorithms, we generate a comprehensive dataset with various types of text in diverse real-world scenes. We also propose a new evaluation protocol that is better suited to benchmarking algorithms for detecting text of varying orientations. Experiments on benchmark datasets demonstrate that our system compares favorably with state-of-the-art algorithms on horizontal text and achieves significantly better performance on multi-oriented text in complex natural scenes.
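
    The abstract does not specify the features, so the snippet below only illustrates the generic idea behind rotation invariance: estimate a candidate region's dominant orientation from its second-order central moments and rotate the patch to a canonical orientation before computing any descriptor. OpenCV is assumed, and the paper's actual features differ.

```python
# Generic rotation-normalization sketch (not the paper's exact features):
# estimate a component's orientation from second-order central moments,
# then rotate it upright before descriptors are extracted.
import cv2
import numpy as np

def dominant_angle(binary_patch: np.ndarray) -> float:
    """Dominant axis angle in degrees, via the standard moment formula
    theta = 0.5 * atan2(2*mu11, mu20 - mu02)."""
    m = cv2.moments(binary_patch, binaryImage=True)
    return 0.5 * np.degrees(np.arctan2(2.0 * m["mu11"], m["mu20"] - m["mu02"]))

def rotate_upright(patch: np.ndarray) -> np.ndarray:
    """Rotate a patch so its dominant axis is horizontal; descriptors
    computed afterwards are insensitive to the original orientation."""
    h, w = patch.shape[:2]
    rot = cv2.getRotationMatrix2D((w / 2, h / 2), dominant_angle(patch), 1.0)
    return cv2.warpAffine(patch, rot, (w, h))

# Example: an elongated blob drawn at 30 degrees normalizes to ~0 degrees.
patch = np.zeros((100, 100), np.uint8)
cv2.ellipse(patch, (50, 50), (40, 10), 30, 0, 360, 255, -1)
print(dominant_angle(rotate_upright(patch)))  # close to 0
```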

    Visual Affect Around the World: A Large-scale Multilingual Visual Sentiment Ontology

    Every culture and language is unique. Our work expressly focuses on the uniqueness of culture and language in relation to human affect, specifically sentiment and emotion semantics, and how they manifest in social multimedia. We develop sets of sentiment- and emotion-polarized visual concepts by adapting semantic structures called adjective-noun pairs, originally introduced by Borth et al. (2013), to a multilingual context. We propose a new language-dependent method for the automatic discovery of these adjective-noun constructs, and show how this pipeline can be applied to a social multimedia platform to create a large-scale multilingual visual sentiment ontology (MVSO). Unlike the flat structure of Borth et al. (2013), our unified ontology is organized hierarchically into multilingual clusters of visually detectable nouns and subclusters of emotionally biased versions of these nouns. In addition, we present an image-based prediction task to show how well language-specific models generalize in a multilingual context. A new, publicly available dataset of more than 15.6K sentiment-biased visual concepts across 12 languages, with language-specific detector banks and more than 7.36M images with their metadata, is also released. Comment: 11 pages, to appear at ACM MM'15.
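
    The paper's discovery method is language-dependent and tied to social multimedia metadata; the sketch below is only an English-language toy version of the underlying adjective-noun pair (ANP) construct, mining adjective-noun bigrams from captions with NLTK part-of-speech tagging. The captions and resource names are placeholders.

```python
# English-only toy version of adjective-noun pair (ANP) mining (the
# paper's pipeline is language-dependent and far more involved):
# POS-tag captions and count adjective-noun bigrams as candidate ANPs.
from collections import Counter
import nltk

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

captions = [
    "a beautiful sunset over the calm sea",
    "creepy house at the end of a dark street",
    "beautiful sunset on a lonely beach",
]

anp_counts = Counter()
for caption in captions:
    tagged = nltk.pos_tag(nltk.word_tokenize(caption))
    # Adjacent (adjective, noun) token pairs become candidate ANPs.
    for (w1, t1), (w2, t2) in zip(tagged, tagged[1:]):
        if t1.startswith("JJ") and t2.startswith("NN"):
            anp_counts[w1 + " " + w2] += 1

print(anp_counts.most_common(3))  # e.g. [('beautiful sunset', 2), ...]
```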