3 research outputs found

    Automatic Identification of Text In Digital Video Key Frames

    No full text
    Scene and graphic text can provide important supplemental index information in video sequences. In this paper we address the problem of automatically identifying text regions in digital video key frames. The text contained in video frames is typically very noisy because it is aliased and/or digitized at a much lower resolution than typical document images, making identification, extraction and recognition difficult. The proposed method is based on the use of a hybrid wavelet/neural network segmenter on a series of overlapping small windows to classify regions which contain text. To detect text over a wide range of font sizes, the method is applied to a pyramid of images and the regions identified at each level are integrated. 1. Introduction The increasing availability of online digital imagery and video has rekindled interest in the problems of how to index multimedia information sources automatically and how to browse and manipulate them efficiently. Traditionally, images and video seq..
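The window-and-pyramid scanning described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the window size, step, 2x2 averaging pyramid, and all function names are assumptions, and the `is_text` callback stands in for the hybrid wavelet/neural network segmenter, which the abstract does not specify in detail.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]          # grey-level image as rows of pixel values
Box = Tuple[int, int, int, int]    # (row, col, height, width) in base-image pixels


def downsample(img: Image) -> Image:
    """Halve the resolution by averaging each 2x2 block (assumed pyramid scheme)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * r][2 * c] + img[2 * r][2 * c + 1]
             + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(w)
        ]
        for r in range(h)
    ]


def build_pyramid(img: Image, min_size: int) -> List[Image]:
    """Return the image plus coarser levels, stopping before a level is smaller than min_size."""
    levels = [img]
    while len(levels[-1]) // 2 >= min_size and len(levels[-1][0]) // 2 >= min_size:
        levels.append(downsample(levels[-1]))
    return levels


def detect_text_windows(
    img: Image,
    is_text: Callable[[Image], bool],  # hypothetical stand-in for the wavelet/NN classifier
    win: int = 16,                     # assumed window size
    step: int = 8,                     # assumed step; step < win gives overlapping windows
) -> List[Box]:
    """Classify overlapping windows at every pyramid level and collect hits in base coordinates."""
    boxes: List[Box] = []
    for level, layer in enumerate(build_pyramid(img, win)):
        scale = 2 ** level  # a window at this level covers scale x as many base pixels per side
        for r in range(0, len(layer) - win + 1, step):
            for c in range(0, len(layer[0]) - win + 1, step):
                window = [row[c:c + win] for row in layer[r:r + win]]
                if is_text(window):
                    boxes.append((r * scale, c * scale, win * scale, win * scale))
    return boxes
```

A fixed-size window at a coarse level corresponds to a larger region of the original frame, which is how a single classifier can respond to both small and large fonts. Merging the collected boxes into final regions is left out here because the abstract does not say how the levels are integrated.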

    Scene text area detection from video

    No full text
    Text detection from videos is a well-known research area. In particular, the detection of static superimposed text such as captions has been researched successfully, but it relies on many assumptions that call into question the applicability of those algorithms to moving scene text. In this dissertation, I propose a scene text area detection approach that includes a simple key frame extraction, feature extraction, feature code generation and text area classification. Common edge and variance based features of scene text areas are evaluated and combined into a "combined feature scheme". For the detection of text areas, two classifiers using a self-organising map and a feed-forward neural network are compared. Ground truth video data with different characteristics is used to compare the neural computing methods. A combination of detection performance measures and changing features shows the applicability of edge and variance based features and leads to the proposal of improvements of the "combined feature scheme". Car license plates serve as sample text areas in this research. Unpublished.
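As a rough illustration of what edge and variance based features of an image block might look like, the sketch below computes one feature of each kind. It is not the dissertation's "combined feature scheme": the specific features (intensity variance and the fraction of strong neighbouring-pixel differences), the threshold, and the function names are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Block = List[List[float]]  # grey-level image block as rows of pixel values


def block_variance(block: Block) -> float:
    """Population variance of the pixel intensities in the block."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def edge_density(block: Block, threshold: float = 30.0) -> float:
    """Fraction of horizontally or vertically adjacent pixel pairs whose
    absolute difference exceeds the (assumed) threshold."""
    h, w = len(block), len(block[0])
    strong = 0
    total = 0
    for r in range(h):
        for c in range(w):
            if c + 1 < w:  # right-hand neighbour
                total += 1
                strong += abs(block[r][c + 1] - block[r][c]) > threshold
            if r + 1 < h:  # neighbour below
                total += 1
                strong += abs(block[r + 1][c] - block[r][c]) > threshold
    return strong / total if total else 0.0


def block_features(block: Block) -> Tuple[float, float]:
    """Feature vector (variance, edge density) that a classifier could consume."""
    return (block_variance(block), edge_density(block))
```

Text areas such as license plates tend to show both high contrast and many sharp transitions, so a flat background block would score near zero on both features while a block of vertical strokes would score high on both. A feature vector of this kind could then be passed to either of the two classifier types named in the abstract.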