Extraction of Semantic Relations from Wikipedia Text Corpus
This paper proposes an algorithm for the automatic extraction of semantic relations using a rule-based approach. The authors suggest identifying certain verbs (predicates) between the subject and the object of an expression to obtain a sequence of semantic relations in a text corpus built from Wikipedia articles. WordNet synsets are used to extract semantic relations between concepts and their synonyms from the text corpus.
Object Proposals for Text Extraction in the Wild
Object Proposals is a recent computer vision technique receiving increasing
interest from the research community. Its main objective is to generate a
relatively small set of bounding box proposals that are most likely to contain
objects of interest. The use of Object Proposals techniques in the scene text
understanding field is innovative. Motivated by the success of powerful but
expensive techniques to recognize words in a holistic way, Object Proposals
techniques emerge as an alternative to the traditional text detectors.
In this paper we study to what extent the existing generic Object Proposals
methods may be useful for scene text understanding. Also, we propose a new
Object Proposals algorithm that is specifically designed for text and compare
it with other generic methods in the state of the art. Experiments show that
our proposal is superior in its ability to produce good-quality word
proposals in an efficient way. The source code of our method is made publicly
available.
Comment: 13th International Conference on Document Analysis and Recognition (ICDAR 2015)
PDF Text Extraction
This bachelor's thesis deals with text extraction from PDF documents that contain mainly multi-column text. It describes the structure of PDF and analyzes how text is obtained from a PDF document. The thesis then focuses on the design and implementation of an algorithm that improves text extraction.