
Document retrieval based on content and MPEG-7 metadata

By Κωνσταντίνος Ζαγόρης

Abstract

inside the image. From every such block a descriptor is extracted, constructed from a set of Document Structure Elements (DSEs). The length of the descriptor can be reduced from the initial 510 DSEs to any number using an algorithm called Feature Standard Deviation Analysis of Structure Elements (FSDASE). Finally, a Support Vector Machine (SVM) uses the descriptors to classify each block as text or non-text, so that the text blocks can be extracted from the original image or located on it. The proposed technique adapts to the peculiarities of each document image database, since the features adjust to it. It also allows the text localization speed to be increased or decreased by changing the block descriptor length.

The fourth technique addresses the document retrieval problem using a word matching procedure. It performs the word matching directly on the document images, bypassing OCR and using word images as queries. The entire system consists of an Offline and an Online procedure. In the Offline procedure, which is transparent to the user, the document images are analyzed and the results are stored in a database. This procedure consists of three main stages. First, the document images pass through a preprocessing stage, which consists of a Median filter, to cope with noise (e.g. in historical or poorly preserved documents), followed by the Otsu binarization method. The word segmentation stage comes next. Its primary goal is to detect the word boundaries, which is accomplished using the Connected Components Labeling and Filtering method. The word matching process then uses a set of features that capture the word shape while discarding detailed differences due to noise or font. These features are: Width to Height Ratio, Word Area Density, Center of Gravity, Vertical Projection, Top - Bottom Shape Projections, Upper Grid Features, Down Grid Features.
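As a rough illustration of the Offline stages described above, the following minimal Python sketch applies Otsu binarization to a grayscale word image and computes four of the listed features. It is not the thesis implementation: the function names, the assumption of dark text on a light background, the normalization of the center of gravity, and the omission of the Median filter and Connected Components Labeling steps are all choices made here for brevity, and the exact feature definitions in the thesis may differ.

```python
from typing import Dict, List

Image = List[List[int]]  # rows of grayscale values in 0..255


def otsu_threshold(gray: Image) -> int:
    """Return the threshold t that maximizes the between-class variance."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of the intensities of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # no pixels in the dark class yet
        if w1 == 0:
            break     # no pixels left in the bright class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        score = w0 * w1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def binarize(gray: Image) -> Image:
    """Map pixels to 1 (ink) or 0 (background); assumes dark text on light paper."""
    t = otsu_threshold(gray)
    return [[1 if value <= t else 0 for value in row] for row in gray]


def word_features(word: Image) -> Dict[str, object]:
    """Compute a few illustrative shape features of a binary word image."""
    height, width = len(word), len(word[0])
    ink = [(x, y) for y, row in enumerate(word) for x, v in enumerate(row) if v]
    n = len(ink)
    return {
        "width_to_height": width / height,
        "area_density": n / (width * height),
        # Mean ink position, scaled by the box size (an assumption of this sketch).
        "center_of_gravity": (
            sum(x for x, _ in ink) / (n * width) if n else 0.0,
            sum(y for _, y in ink) / (n * height) if n else 0.0,
        ),
        # Number of ink pixels in each column.
        "vertical_projection": [sum(col) for col in zip(*word)],
    }
```

In a fuller pipeline, `word_features` would be called once per word box produced by the segmentation stage, and the projection would presumably be resampled to a fixed length so that all descriptors have the same dimension.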
Finally, these features form a 93-dimensional vector, the word descriptor, which is stored in a database. In the Online procedure, the user enters a query word and the system renders it as an image whose font height equals the average height of all the word boxes obtained in the Offline procedure. The system then calculates the descriptor of the query word image. Finally, using the Minkowski L1 distance, the system presents the documents containing the words whose descriptors are closest to the query descriptor. The experimental results show that the proposed system performs better than a commercial OCR package.

The last method involves an MPEG-like compact shape descriptor that combines conventional contour and region shape features and applies widely, from arbitrary shapes to document retrieval through word spotting. It is called the Compact Shape Portrayal Descriptor, and its computation can be easily parallelized because each feature can be calculated separately. The features are the Width to Height Ratio, Vertical - Horizontal Projections, and Top - Bottom Shape Projections, which together form a 41-dimensional descriptor.
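The ranking step of the Online procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory dictionary standing in for the database, the word identifiers, the `rank_words` name, and the `top_k` parameter are hypothetical, and the three-element vectors stand in for the 93-dimensional word descriptors.

```python
from typing import Dict, List, Sequence, Tuple


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Minkowski distance of order 1: the sum of absolute differences."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_words(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k stored words closest to the query descriptor."""
    scored = [(word_id, l1_distance(query, desc)) for word_id, desc in index.items()]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:top_k]


# Hypothetical index: word identifier -> stored descriptor.
index = {"a": [1, 2, 3], "b": [10, 10, 10], "c": [2, 4, 4]}
closest = rank_words([1, 2, 4], index, top_k=2)  # [("a", 1), ("c", 3)]
```

A linear scan like this is the simplest choice; for a large collection of word descriptors an indexed nearest-neighbour search could be substituted without changing the distance measure.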

Topics: Image retrieval, Document retrieval, Color reduction, Neuro-fuzzy classifiers, Relevance feedback
Publisher: Democritus University of Thrace (DUTH)
Year: 2009
DOI identifier: 10.12681/eadd/18470
Sorry, we are unable to provide the full text but you may find it at the following location(s):
  • http://hdl.handle.net/10442/he... (external link)