An effective trajectory-based algorithm for ball detection and tracking with application to the analysis of broadcast sports video
Ph.D. (Doctor of Philosophy)
Automatic Identification of Text In Digital Video Key Frames
Scene and graphic text can provide important supplemental index information in video sequences. In this paper we address the problem of automatically identifying text regions in digital video key frames. The text contained in video frames is typically very noisy because it is aliased and/or digitized at a much lower resolution than typical document images, making identification, extraction and recognition difficult. The proposed method is based on the use of a hybrid wavelet/neural network segmenter on a series of overlapping small windows to classify regions which contain text. To detect text over a wide range of font sizes, the method is applied to a pyramid of images and the regions identified at each level are integrated.
1. Introduction
The increasing availability of online digital imagery and video has rekindled interest in the problems of how to index multimedia information sources automatically and how to browse and manipulate them efficiently. Traditionally, images and video seq..
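The pyramid-and-window scan described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 2x2 averaging used to build the pyramid, the window size of 16 and the step of 4 are all assumptions made for illustration, and `classify` is a hypothetical placeholder standing in for the hybrid wavelet/neural network segmenter. Merging the boxes found at different levels is left out; the sketch only maps them back to original-image coordinates.

```python
def downsample(image):
    """Halve each dimension by averaging 2x2 pixel blocks (assumed reduction)."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image, min_size):
    """Return the image followed by successively halved copies.

    Stops before a level would become smaller than min_size in either dimension.
    """
    levels = [image]
    while (len(levels[-1]) // 2 >= min_size
           and len(levels[-1][0]) // 2 >= min_size):
        levels.append(downsample(levels[-1]))
    return levels


def detect_text_windows(image, classify, size=16, step=4):
    """Scan overlapping size x size windows at every pyramid level.

    `classify` is a hypothetical callable that takes a window (list of rows)
    and returns True if the window is judged to contain text.
    Returns (top, left, bottom, right) boxes in original-image coordinates.
    """
    boxes = []
    for level, img in enumerate(build_pyramid(image, size)):
        scale = 2 ** level  # each level halves the resolution
        for top in range(0, len(img) - size + 1, step):
            for left in range(0, len(img[0]) - size + 1, step):
                window = [row[left:left + size] for row in img[top:top + size]]
                if classify(window):
                    boxes.append((top * scale, left * scale,
                                  (top + size) * scale, (left + size) * scale))
    return boxes
```

Because the window size is fixed while the image shrinks, a window at a coarser level covers a larger area of the original frame, which is how a single classifier can respond to a range of font sizes.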
Scene text area detection from video
Text detection from videos is a well-known research area. In particular, the detection of static superimposed text such as captions has been researched successfully, but these methods make many assumptions that call into question their applicability to moving scene text. In this dissertation, I propose a scene text area detection approach that includes a simple key frame extraction, feature extraction, feature code generation and text area classification.
Common edge- and variance-based features of scene text areas are evaluated and brought together in a "combined feature scheme". For the detection of text areas, two classifiers using a self-organising map and a feed-forward neural network are compared. Ground truth video data with different characteristics is used to compare the neural computing methods. A combination of detection performance measures and changing features shows the applicability of edge- and variance-based features and leads to the proposal of
improvements of the "combined feature scheme". Car license plates serve as sample text areas in this research.
Unpublished
Scene Text Area Detection from Videos Research Report INF0480
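The edge- and variance-based features mentioned in the abstract of "Scene text area detection from video" can be illustrated with a minimal Python sketch. The abstract does not specify the actual features or the feature code generation, so everything here is an assumption: `block_features` is a hypothetical helper that computes the grey-level variance of a square block and a crude edge strength, taken as the mean absolute difference between horizontally and vertically adjacent pixels.

```python
def block_features(image, top, left, size):
    """Return (variance, edge_strength) for one size x size grayscale block.

    Hypothetical illustration only; not the feature scheme of the dissertation.
    """
    block = [row[left:left + size] for row in image[top:top + size]]
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)

    # Mean absolute difference between neighbouring pixels inside the block.
    total, count = 0.0, 0
    for r in range(size):
        for c in range(size):
            if c + 1 < size:  # right-hand neighbour
                total += abs(block[r][c + 1] - block[r][c])
                count += 1
            if r + 1 < size:  # neighbour below
                total += abs(block[r + 1][c] - block[r][c])
                count += 1
    edge_strength = total / count if count else 0.0
    return variance, edge_strength
```

A flat block yields zero for both values, while a block with high-contrast strokes, such as the characters on a license plate, scores high on both. A classifier such as a self-organising map or a feed-forward network would then take vectors of this kind as input.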