4,254 research outputs found

    Multi-scale Discriminant Saliency with Wavelet-based Hidden Markov Tree Modelling

    The bottom-up saliency, an early stage of human visual attention, can be treated as a binary classification problem between centre and surround classes. The discriminant power of features for this classification is measured as the mutual information between the distributions of image features and the corresponding classes. Because the estimated discrepancy depends strongly on the scale level considered, multi-scale structure and discriminant power are integrated by employing discrete wavelet features and a Hidden Markov Tree (HMT). With wavelet coefficients and Hidden Markov Tree parameters, quad-tree-like label structures are constructed and used in maximum a posteriori (MAP) estimation of hidden class variables at the corresponding dyadic sub-squares. Then, a saliency value for each square block at each scale level is computed using the discriminant power principle. Finally, the final saliency map is integrated across multiple scales by an information maximization rule. Both standard quantitative tools such as NSS, LCC and AUC and qualitative assessments are used to evaluate the proposed multi-scale discriminant saliency (MDIS) method against the well-known information-based approach AIM on its released image collection with eye-tracking data. Simulation results are presented and analysed to verify the validity of MDIS and to point out its limitations for further research. Comment: arXiv admin note: substantial text overlap with arXiv:1301.396
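
The discriminant-power measure in the abstract above, mutual information between feature values and centre/surround class labels, can be illustrated with a minimal sketch. It assumes features have already been quantized into discrete bins; the function name `mutual_information` and the toy labels are hypothetical, and the sketch does not reproduce the wavelet or HMT stages of MDIS.

```python
import math
from collections import Counter

def mutual_information(features, labels):
    """Mutual information I(X; C) in bits between discrete feature values
    and class labels (for example 'centre' versus 'surround').

    Assumption: features are already quantized into discrete bins.
    """
    n = len(features)
    joint = Counter(zip(features, labels))   # counts of (feature, class) pairs
    p_feature = Counter(features)            # marginal counts of feature bins
    p_class = Counter(labels)                # marginal counts of classes
    mi = 0.0
    for (x, c), count in joint.items():
        p_xc = count / n
        p_x = p_feature[x] / n
        p_c = p_class[c] / n
        mi += p_xc * math.log2(p_xc / (p_x * p_c))
    return mi

# A feature that separates the two classes perfectly carries 1 bit;
# a feature that is independent of the class carries 0 bits.
classes = ["centre", "centre", "surround", "surround"]
separating = mutual_information([0, 0, 1, 1], classes)
uninformative = mutual_information([0, 1, 0, 1], classes)
```

Under this measure, a location is more salient when its features discriminate the centre from the surround more strongly.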

    Unsupervised Texture Segmentation


    Modeling of evolving textures using granulometries

    This chapter describes a statistical approach to classification of dynamic texture images, called parallel evolution functions (PEFs). Traditional classification methods predict texture class membership using comparisons with a finite set of predefined texture classes and identify the closest class. However, where texture images arise from a dynamic texture evolving over time, estimation of a time state in a continuous evolutionary process is required instead. The PEF approach does this using regression modeling techniques to predict time state. It is a flexible approach which may be based on any suitable image features. Many textures are well suited to a morphological analysis and the PEF approach uses image texture features derived from a granulometric analysis of the image. The method is illustrated using both simulated images of Boolean processes and real images of corrosion. The PEF approach has particular advantages for training sets containing limited numbers of observations, which is the case in many real world industrial inspection scenarios and for which other methods can fail or perform badly.
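
The granulometric analysis mentioned in the abstract above can be illustrated with a small sketch: a binary signal is opened with structuring elements of increasing size, and the amount of foreground that survives each opening forms a size distribution. The sketch below is a one-dimensional toy under stated assumptions (binary input, flat line-segment structuring element); the function names are hypothetical, and it does not implement the PEF regression step.

```python
def erode(signal, k):
    """Binary erosion by a segment of length k anchored at its left end:
    position i survives only if signal[i .. i+k-1] is entirely foreground."""
    n = len(signal)
    return [1 if i + k <= n and all(signal[i:i + k]) else 0 for i in range(n)]

def dilate(signal, k):
    """Binary dilation by the same segment: every foreground position i
    paints positions i .. i+k-1 (clipped to the signal length)."""
    n = len(signal)
    out = [0] * n
    for i, value in enumerate(signal):
        if value:
            for j in range(i, min(i + k, n)):
                out[j] = 1
    return out

def opening(signal, k):
    """Morphological opening: erosion followed by dilation.
    In 1-D this keeps exactly the foreground runs of length >= k."""
    return dilate(erode(signal, k), k)

def granulometry(signal, max_size):
    """Foreground count remaining after openings of size 1 .. max_size."""
    return [sum(opening(signal, k)) for k in range(1, max_size + 1)]

# Toy signal with foreground runs of length 3, 1 and 2.
toy = [1, 1, 1, 0, 1, 0, 1, 1]
size_distribution = granulometry(toy, 4)  # [6, 5, 3, 0]
```

The differences between successive entries (the pattern spectrum) indicate how much foreground disappears at each size, which is the kind of texture feature a regression model could take as input.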

    Audio-visual football video analysis, from structure detection to attention analysis

    Sport video is an important video genre. Content-based sports video analysis attracts great interest from both industry and academia. A sports video is characterised by repetitive temporal structures, relatively plain contents, and strong spatio-temporal variations, such as quick camera switches and swift local motions. Specific techniques are needed for content-based sports video analysis to exploit these characteristics. For an efficient and effective sports video analysis system, there are three fundamental questions: (1) what the key stories of sports videos are; (2) what attracts viewers' interest; and (3) how to identify game highlights. This thesis is developed around these questions. We approach them from two different perspectives and present three research contributions, namely, replay detection, attack temporal structure decomposition, and attention-based highlight identification. Replay segments convey the most important contents in sports videos, so detecting replay segments is an efficient way to collect game highlights. However, replay is an artefact of editing, which improves with advances in video editing tools. The composition of a replay is complex and includes logo transitions, slow motions, viewpoint switches and normal speed video clips. Since logo transition clips are pervasive in game collections of FIFA World Cup 2002, FIFA World Cup 2006 and UEFA Championship 2006, we take logo transition detection as an effective replacement for replay detection. A two-pass system was developed, comprising a five-layer AdaBoost classifier and logo template matching across an entire video. The five-layer AdaBoost classifier uses shot duration, average game pitch ratio, average motion, sequential colour histogram and shot frequency between two neighbouring logo transitions to filter out logo transition candidates. 
Subsequently, a logo template is constructed and employed to find all transition logo sequences. The precision and recall of this system in replay detection are both 100% on a five-game evaluation collection. An attack structure is a team competition for a score. Hence, this structure is a conceptually fundamental unit of a football video as well as of other sports videos. We review the literature on content-based temporal structures, such as the play-break structure, and develop a three-step system for automatic attack structure decomposition. Four content-based shot classes, namely, play, focus, replay and break, were identified from low-level visual features. A four-state hidden Markov model was trained to simulate transition processes among these shot classes. Since attack structures are the longest repetitive temporal unit in a sports video, a suffix tree is proposed to find the longest repetitive substring in the label sequence of shot class transitions. The occurrences of this substring are regarded as the kernel of an attack hidden Markov process. Therefore, the decomposition of attack structure becomes a boundary likelihood comparison between two Markov chains. Highlights are what attract notice. Attention is a psychological measurement of “notice”. A brief survey of the psychological background of attention, attention estimation from visual and auditory signals, and multiple modality attention fusion is presented. We propose two attention models for sports video analysis, namely, the role-based attention model and the multiresolution autoregressive framework. The role-based attention model is based on the perception structure while watching video. This model removes reflection bias among modality salient signals and combines these signals by reflectors. The multiresolution autoregressive framework (MAR) treats salient signals as a group of smooth random processes, which follow a similar trend but are filled with noise. 
This framework tries to estimate a noise-free signal from these coarse noisy observations by a multiple resolution analysis. Related algorithms are developed, such as event segmentation on a MAR tree and real-time event detection. Experiments show that these attention-based approaches can find goal events with high precision. Moreover, results of MAR-based highlight detection on the final games of FIFA 2002 and 2006 are highly similar to highlights professionally labelled by the BBC and FIFA.
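
The search for the longest repetitive substring in a shot-class label sequence, described in the abstract above, can be sketched as follows. The thesis proposes a suffix tree; this sketch swaps in a simpler sorted-suffix comparison, which finds the same substring but is slower on long sequences. The single-letter labels (P = play, F = focus, R = replay, B = break) and the function name are assumptions made for illustration.

```python
def longest_repeated_substring(labels):
    """Longest substring that occurs at least twice in `labels`
    (occurrences may overlap).

    Sorts all suffixes and compares neighbours: the longest repeated
    substring is the longest common prefix of two adjacent suffixes.
    This stands in for the suffix tree named in the abstract.
    """
    n = len(labels)
    order = sorted(range(n), key=lambda i: labels[i:])
    best = ""
    for a, b in zip(order, order[1:]):
        k = 0
        while a + k < n and b + k < n and labels[a + k] == labels[b + k]:
            k += 1
        if k > len(best):
            best = labels[a:a + k]
    return best

# Hypothetical shot-class sequence: P = play, F = focus, R = replay, B = break.
shots = "PFRPBPFRB"
kernel = longest_repeated_substring(shots)  # "PFR"
```

In the setting described above, the occurrences of such a substring would serve as the kernel from which attack boundaries are then estimated.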

    General highlight detection in sport videos

    Attention is a psychological measurement of human reaction to stimuli. We propose a general framework for highlight detection that compares attention intensity while sports videos are being watched. Three steps are involved: adaptive selection of salient features, unified attention estimation and highlight identification. Adaptive selection computes feature correlation to decide on an optimal set of salient features. Unified estimation combines these features using the multi-resolution autoregressive (MAR) technique and thus creates a temporal curve of attention intensity. We rank the intensity of attention to discriminate the boundaries of highlights. Such a framework alleviates semantic uncertainty around sport highlights and leads to efficient and effective highlight detection. The advantages are as follows: (1) the capability of using data at coarse temporal resolutions; (2) robustness against noise caused by modality asynchronism, perception uncertainty and feature mismatch; (3) the employment of Markovian constraints on content presentation; and (4) multi-resolution estimation of attention intensity, which enables the precise allocation of event boundaries.