9 research outputs found

    Detection of Text Regions in Video Sequences

    The article considers the problem of detecting text regions on a non-uniform background in video sequences. A two-stage scheme, an algorithm, and a technique for detecting text regions are proposed, based on the continuous wavelet transform with either automatic scale selection or iterative processing at multiple scales.
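The automatic scale selection described above can be sketched as a toy example: apply a 1-D Ricker ("Mexican hat") wavelet row-wise at several scales and keep the scale whose response has the highest energy. The wavelet choice, the energy criterion, and all names here are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def ricker(points, scale):
    """Ricker ('Mexican hat') wavelet sampled at `points` positions."""
    t = np.arange(points) - (points - 1) / 2.0
    a = t / scale
    return (1.0 - a**2) * np.exp(-0.5 * a**2)

def best_scale(image, scales):
    """Pick the wavelet scale whose row-wise response has the most energy.

    Text strokes respond strongly when the wavelet scale matches the
    stroke width, so the argmax is a crude automatic scale selector."""
    energies = []
    for s in scales:
        kernel = ricker(int(6 * s) | 1, s)   # odd-length kernel
        resp = np.apply_along_axis(
            lambda row: np.convolve(row, kernel, mode="same"), 1, image)
        energies.append(np.sum(resp**2))
    return scales[int(np.argmax(energies))]

# Synthetic "text" image: thin vertical strokes on a flat background
img = np.zeros((32, 64))
img[:, ::8] = 1.0
print(best_scale(img, [1, 2, 4, 8]))
```

In the iterative variant mentioned in the abstract, one would instead run the detector at every scale and merge the resulting candidate regions.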

    Text Localization in Video Data Using Discrete Wavelet Transform

    Abstract: Text provides important information about images and video sequences, but a static document image is difficult to modify. To edit any of the text, it must first be segmented out of the document image so that it can be analyzed further. For video, isolating text from an individual frame is even more difficult because of the variable nature of the content. Various methods have been proposed for isolating text from document images; among them, wavelet transforms are widely used as an effective tool for text segmentation. Document images usually contain three types of texture information, and various wavelet transforms have been proposed to decompose such images into their fundamental features. Selecting a proper wavelet family and decomposition level for text isolation remains a difficult task. This paper implements an efficient text isolation algorithm for extracting text from documented video clips. The implemented system carries out a performance analysis of various wavelet transforms with multi-level decomposition in order to select a suitable one. Morphological operators are then applied to the wavelet coefficients of the selected transform to isolate the text, and the contribution of decomposition levels and wavelet functions to the segmentation result is evaluated. Finally, a neural network recognizes the text characters in the isolated text image, making the text available for further editing and analysis.
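As a rough illustration of the decomposition-plus-morphology pipeline, the sketch below computes one level of a 2-D Haar DWT in plain NumPy, thresholds the combined detail energy, and dilates the result into candidate text blobs. The Haar wavelet, the threshold value, and the 3x3 structuring element are assumptions chosen for illustration; the paper compares several wavelet families and levels.

```python
import numpy as np

def haar_dwt2(img):
    """One level of the 2-D Haar DWT in plain NumPy (PyWavelets' dwt2
    with wavelet='haar' yields equivalent subbands up to scaling)."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # row low-pass
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # row high-pass
    cA = (a[:, 0::2] + a[:, 1::2]) / 2.0      # LL: approximation
    cH = (d[:, 0::2] + d[:, 1::2]) / 2.0      # horizontal detail
    cV = (a[:, 0::2] - a[:, 1::2]) / 2.0      # vertical detail
    cD = (d[:, 0::2] - d[:, 1::2]) / 2.0      # diagonal detail
    return cA, (cH, cV, cD)

def text_mask(img, thresh=0.1):
    """Threshold combined detail energy, then dilate with a 3x3 square
    structuring element so nearby strokes merge into text blobs."""
    _, (cH, cV, cD) = haar_dwt2(img)
    mask = (cH**2 + cV**2 + cD**2) > thresh
    p = np.pad(mask, 1)                       # 3x3 dilation as OR of shifts
    out = np.zeros_like(mask)
    m, n = mask.shape
    for dy in (0, 1, 2):
        for dx in (0, 1, 2):
            out |= p[dy:dy+m, dx:dx+n]
    return out

img = np.zeros((16, 16))
img[3:9, 3:13] = 1.0                          # a bright "text" bar
print(text_mask(img).any())                   # → True
```

The detail subbands are strong only along stroke edges, which is why text (many dense edges) survives the threshold while smooth background does not.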

    Content-Adaptive Non-Stationary Projector Resolution Enhancement

    For any projection system, one goal will surely be to maximize the quality of the projected imagery at a minimized hardware cost, which is a challenging engineering problem. Experience in applying different image filters and enhancements to projected video suggests quite clearly that the quality of an enhanced projected video is very much a function of the content of the video itself. That is, to first order, whether the video contains moving rather than still content plays an important role in perceived quality, since the human visual system tolerates much more blur in moving imagery but is at the same time highly sensitive to the flickering and aliasing caused by moving sharp textures. Furthermore, the spatial and statistical characteristics of text and non-text images are quite distinct. We therefore assert that the text-like, moving, and background pixels of a given video stream should be enhanced differently, using class-dependent video enhancement filters, to achieve maximum visual quality. In this thesis, we present a novel text-dependent content enhancement scheme, a novel motion-dependent content enhancement scheme, and a novel content-adaptive resolution enhancement scheme based on a text-like / non-text-like classification and a pixel-wise moving / non-moving classification, with the actual enhancement obtained via class-dependent Wiener deconvolution filtering. Given an input image, the text and motion detection methods are used to generate binary masks that indicate the locations of the text and moving regions in the video stream. Enhanced images are then obtained by applying a set of class-dependent enhancement filters, with text-like regions sharpened more than the background and moving regions sharpened less than the background. The resulting enhanced images are combined into a composite output image according to the corresponding feature masks.
Finally, a higher-resolution projected video stream is produced by controlling one or more projectors to project the output frame streams in rapid overlapping succession. Experimental results on test images and videos show that the proposed schemes all offer improved visual quality over projection without enhancement, as well as over a recent state-of-the-art enhancement method. In particular, the proposed content-adaptive resolution enhancement scheme increases PSNR by at least 18.2% and decreases MSE by at least 25%.
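A minimal sketch of Wiener deconvolution, the enhancement filter named in the abstract: the restoration filter is W = conj(H) / (|H|² + NSR), where H is the blur transfer function and NSR the assumed noise-to-signal power ratio. Varying NSR per class (small for text-like regions, large for moving regions) yields the class-dependent sharpening behaviour described; the box PSF and the NSR value below are illustrative assumptions, not the thesis's calibrated parameters.

```python
import numpy as np

def wiener_deconvolve(blurred, psf, nsr):
    """Frequency-domain Wiener deconvolution: W = conj(H) / (|H|^2 + nsr).
    A larger nsr sharpens less (suitable for moving regions); a smaller
    nsr sharpens more (suitable for text-like regions)."""
    H = np.fft.fft2(psf, s=blurred.shape)
    W = np.conj(H) / (np.abs(H)**2 + nsr)
    return np.real(np.fft.ifft2(np.fft.fft2(blurred) * W))

# Circularly blur a synthetic frame with a 3x3 box PSF, then restore it.
rng = np.random.default_rng(0)
frame = rng.random((32, 32))
psf = np.ones((3, 3)) / 9.0
H = np.fft.fft2(psf, s=frame.shape)
blurred = np.real(np.fft.ifft2(np.fft.fft2(frame) * H))
restored = wiener_deconvolve(blurred, psf, nsr=1e-3)
print(np.abs(restored - frame).mean() < np.abs(blurred - frame).mean())  # → True
```

In the composite step described above, each class's deconvolved image would be blended according to its binary mask rather than used directly.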

    Finding Objects of Interest in Images using Saliency and Superpixels

    The ability to automatically find objects of interest in images is useful in the areas of compression, indexing and retrieval, re-targeting, and so on. There are two classes of such algorithms – those that find any object of interest with no prior knowledge, independent of the task, and those that find specific objects of interest known a priori. The former class of algorithms tries to detect objects that stand out, i.e. are salient, by virtue of being different from the rest of the image and consequently capturing our attention. The detection is generic in this case, as there is no specific object we are trying to locate. The latter class of algorithms detects specific known objects of interest and often requires training on features extracted from known examples. In this thesis we address various aspects of finding objects of interest under the topics of saliency detection and object detection. We present two saliency detection algorithms that rely on the principle of center-surround contrast. These two algorithms are shown to be superior to several state-of-the-art techniques in terms of precision and recall measures with respect to a ground truth. They output full-resolution saliency maps, are simpler to implement, and are computationally more efficient than most existing algorithms. We further establish the relevance of our saliency detection algorithms by using them for the known applications of object segmentation and image re-targeting. We first present three different techniques for salient object segmentation using our saliency maps, based on clustering, graph cuts, and geodesic-distance-based labeling. We then demonstrate the use of our saliency maps for a popular technique of content-aware image resizing and compare the result with that of existing methods. Our saliency maps prove to be a much more effective replacement for conventional gradient maps in providing automatic content-awareness.
    Just as it is important to find regions of interest in images, it is also important to find interesting images within a large collection of images. We therefore extend the notion of saliency detection from images to image databases and propose an algorithm for finding salient images in a database. Apart from finding such images, we also present two novel techniques for creating visually appealing summaries in the form of collages and mosaics. Finally, we address the problem of finding specific known objects of interest in images. Specifically, we deal with the feature extraction step that is a prerequisite for any technique in this domain. In this context, we first present a superpixel segmentation algorithm that outperforms previous algorithms in terms of the quantitative measures of under-segmentation error and boundary recall. Our superpixel segmentation algorithm also offers several other advantages over existing algorithms, such as compactness, uniform size, control over the number of superpixels, and computational efficiency. We prove the effectiveness of our superpixels by deploying them in existing algorithms, specifically an object class detection technique and a graph-based algorithm, and improving their performance. We also present the result of using our superpixels in a technique for detecting mitochondria in noisy medical images.
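The center-surround contrast principle behind the saliency detectors above can be illustrated with a minimal grayscale sketch: compare a slightly blurred version of each pixel (center) against the global mean (surround), in the spirit of frequency-tuned saliency. The thesis methods operate on colour and are more elaborate; the blur radius and the grayscale simplification here are assumptions for illustration only.

```python
import numpy as np

def box_blur(img, r=1):
    """Mean filter via padded shifted sums (the 'center' estimate)."""
    p = np.pad(img, r, mode="edge")
    acc = np.zeros_like(img, dtype=float)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            acc += p[dy:dy+img.shape[0], dx:dx+img.shape[1]]
    return acc / (2 * r + 1)**2

def saliency(img):
    """Center-surround saliency: distance between a pixel's slightly
    blurred value (center) and the global mean (surround). A full-
    resolution map falls out directly, one value per pixel."""
    return np.abs(box_blur(img, r=1) - img.mean())

img = np.zeros((20, 20))
img[8:12, 8:12] = 1.0                  # small bright blob: salient
s = saliency(img)
print(s[10, 10] > s[0, 0])             # → True
```

Because the "surround" is the whole image, regions that differ from the dominant appearance score high, which matches the abstract's description of salient objects as those that differ from the rest of the image.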

    Computer-Assisted Content Analysis of Digital Video Archives

    The transition from analog to digital video has brought major changes to film archives in recent years. Digitization in particular opens up new possibilities for the archives. Wear and aging of the film reels is ruled out, so quality is preserved unchanged. In addition, network-based, and thus considerably simpler, access to the videos in the archives becomes possible. Additional services are available to archivists and users, providing extended search capabilities and easing navigation during playback. Searching within the video archives relies on metadata that provides further information about the videos. A large share of this metadata is entered manually by archivists, which is time-consuming and expensive. Computer-assisted analysis of a digital video makes it possible to reduce the effort of generating metadata for video archives. The first part of this dissertation presents new methods for recognizing important semantic content in videos, in particular newly developed algorithms for shot-cut detection, camera motion analysis, object segmentation and classification, text recognition, and face recognition. The automatically extracted semantic information is very valuable, as it eases working with digital video archives. The information not only supports searching the archives but also leads to new applications, which are presented in the second part of the dissertation. For example, computer-generated video summaries can be created, or videos can be automatically adapted to the properties of a playback device. A further focus of this dissertation is the analysis of historical films.
Four European film archives provided a large number of historical video documentaries, shot in the early to mid twentieth century and digitized in recent years. Decades of storage and wear of the film reels have left many of the videos heavily degraded by noise and clearly visible image defects. The image quality of these historical black-and-white films differs significantly from that of current videos, so reliable analysis with existing methods is often impossible. This dissertation presents new algorithms that enable reliable recognition of semantic content in historical videos as well.
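Shot-cut detection, one of the semantic analyses the dissertation covers, is commonly bootstrapped from histogram differences between consecutive frames. The sketch below is such a baseline in NumPy; the bin count and threshold are illustrative assumptions, and the dissertation's detectors for noisy historical film are necessarily more robust than this.

```python
import numpy as np

def cut_indices(frames, bins=16, thresh=0.5):
    """Flag a cut wherever the L1 distance between normalized grayscale
    histograms of consecutive frames exceeds `thresh`. Frames are assumed
    to hold intensities in [0, 1]."""
    hists = [np.histogram(f, bins=bins, range=(0.0, 1.0))[0] / f.size
             for f in frames]
    return [i + 1 for i in range(len(hists) - 1)
            if np.abs(hists[i + 1] - hists[i]).sum() > thresh]

# Two dark frames followed by two bright frames: one cut at index 2.
rng = np.random.default_rng(1)
dark = [rng.random((24, 24)) * 0.3 for _ in range(2)]
bright = [0.7 + rng.random((24, 24)) * 0.3 for _ in range(2)]
print(cut_indices(dark + bright))      # → [2]
```

Histograms ignore spatial layout, so this baseline tolerates camera motion but can miss cuts between scenes with similar brightness distributions, one reason more elaborate detectors are needed for archive material.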

    A video text detection and recognition system

    In this paper, we propose a new system for extracting text information from news videos. First, a method that integrates text detection and text tracking is developed to locate text areas in the key frames, together with a scheme to evaluate the performance of this approach. To obtain better recognition results, we then enhance the quality of the detected text blocks by multi-frame averaging. Finally, we use an adaptive thresholding method to binarize the text blocks and recognize the text with an off-the-shelf OCR module. The detection and recognition rates of the proposed system are 94.7% and 67.5%, respectively.
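The multi-frame averaging and adaptive thresholding steps can be sketched as follows. The mean-based local rule, the window size, and the offset constant are illustrative assumptions, since the paper does not specify its exact thresholding method.

```python
import numpy as np

def mean_threshold(block, win=5, c=0.05):
    """Adaptive binarization: a pixel is foreground (text) when it is
    darker than the mean of its win x win neighbourhood by more than c.
    A stand-in rule, not the paper's own method."""
    r = win // 2
    p = np.pad(block, r, mode="edge")
    local = np.zeros_like(block, dtype=float)
    for dy in range(win):
        for dx in range(win):
            local += p[dy:dy+block.shape[0], dx:dx+block.shape[1]]
    local /= win * win
    return block < local - c

# Multi-frame averaging suppresses zero-mean noise before binarization.
rng = np.random.default_rng(2)
clean = np.ones((12, 40))
clean[4:8, 5:35] = 0.2                         # dark "text" stroke
frames = [clean + rng.normal(0, 0.05, clean.shape) for _ in range(8)]
avg = np.mean(frames, axis=0)
binary = mean_threshold(avg)
print(binary[4:8, 6:34].mean())                # mostly True inside the stroke
```

Averaging N frames reduces the noise standard deviation by a factor of sqrt(N), which is why the paper averages the tracked text block across frames before thresholding and OCR.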