
    Multi-stream convolutional neural networks for action recognition in video sequences based on spatio-temporal information (original title: Redes neurais convolucionais de múltiplos canais para reconhecimento de ações em sequências de vídeos baseado em informações espaço-temporais)

    Advisor: Hélio Pedrini. Master's dissertation, Universidade Estadual de Campinas, Instituto de Computação.
    Abstract: Advances in digital technology have increased event recognition capabilities through the development of devices with high resolution, small physical dimensions and high sampling rates. The recognition of complex events in videos has several relevant applications, particularly due to the widespread availability of digital cameras in environments such as airports, banks and roads. The large amount of data produced is the ideal scenario for the development of automatic methods based on deep learning. Despite the significant progress achieved with deep neural networks applied to images, video content understanding still faces challenges in modeling spatio-temporal relations. In this dissertation, we address the problem of human action recognition in videos. A multi-stream network is our architecture of choice to incorporate temporal information, since it can benefit from deep networks pre-trained on images and from hand-crafted features for initialization; furthermore, its training cost is usually lower than that of video-based networks. We explore visual rhythm images, since they encode longer-term information than still frames and optical flow, and we propose a novel method based on point tracking for deciding the best visual rhythm direction for each video. In addition, we experiment with recurrent neural networks trained on features extracted from the streams of the proposed architecture. Experiments conducted on the challenging UCF101 and HMDB51 public datasets demonstrate that our approach improves network performance, achieving accuracy rates comparable to state-of-the-art methods. Even though visual rhythms are originally created from RGB images, other sources and strategies for their creation, such as optical flow, image gradients and color histograms, are explored and discussed.
    Degree: Mestre em Ciência da Computação (Master's in Computer Science). Funding: grant 1736920, CAPES.
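The visual rhythm representation that the dissertation builds on can be sketched briefly: a 2D image is formed by sampling one line of pixels from every frame (for example the central row, the central column or the main diagonal) and stacking those lines over time. The sketch below assumes that classical formulation; the function and parameter names are illustrative, and it does not reproduce the point-tracking direction selection or the multi-stream network described in the abstract.

```python
# Minimal sketch of a visual rhythm image, assuming the classical
# formulation: one pixel line sampled per frame, stacked over time.
import cv2
import numpy as np

def visual_rhythm(video_path: str, direction: str = "horizontal") -> np.ndarray:
    """Return an image whose rows index time and columns index position."""
    cap = cv2.VideoCapture(video_path)
    lines = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        h, w = gray.shape
        if direction == "horizontal":        # central row of the frame
            line = gray[h // 2, :]
        elif direction == "vertical":        # central column of the frame
            line = gray[:, w // 2]
        else:                                # main diagonal of a square resize
            side = min(h, w)
            line = np.diagonal(cv2.resize(gray, (side, side)))
        lines.append(line)
    cap.release()
    return np.stack(lines, axis=0)
```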

    Rapid Cut Detection on Compressed Video

    The temporal segmentation of a video sequence is one of the most important steps for video processing, analysis, indexing and retrieval. Most existing techniques for identifying the boundary between consecutive shots operate in the uncompressed domain; however, decoding and analyzing a video sequence are both extremely time-consuming tasks. Since video data are usually available in compressed form, it is desirable to process the material directly, without decoding. In this paper, we present a novel approach for video cut detection that works in the compressed domain. The proposed method is based both on visual features extracted from the video stream and on a simple, fast algorithm to detect video transitions. Experiments on a real-world video dataset spanning several genres show that our approach achieves high accuracy relative to state-of-the-art solutions, in a computational time that makes it suitable for online use. © 2011 Springer-Verlag.
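The compressed-domain features that make the method fast depend on partial decoding of the MPEG stream and are not reproduced here. As a rough, hedged illustration of the underlying shot-boundary idea only (a per-frame feature compared between consecutive frames against a threshold), the sketch below swaps in a pixel-domain HSV histogram; the names and the threshold value are assumptions, not the paper's method.

```python
# Histogram-difference cut detection on decoded frames; this is an
# illustrative substitute for the paper's compressed-domain features.
import cv2
import numpy as np

def detect_cuts(video_path: str, threshold: float = 0.5) -> list[int]:
    """Return frame indices where an abrupt transition (cut) is suspected."""
    cap = cv2.VideoCapture(video_path)
    cuts, prev_hist, idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        hist = cv2.calcHist([hsv], [0, 1], None, [32, 32], [0, 180, 0, 256])
        hist = cv2.normalize(hist, None).flatten()
        if prev_hist is not None:
            # Bhattacharyya distance is near 0 for similar frames and grows
            # toward 1 on abrupt content changes (candidate cuts).
            d = cv2.compareHist(prev_hist, hist, cv2.HISTCMP_BHATTACHARYYA)
            if d > threshold:
                cuts.append(idx)
        prev_hist, idx = hist, idx + 1
    cap.release()
    return cuts
```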

    Visual Rhythm-Based Time Series Analysis for Phenology Studies

    Plant phenology has gained importance in the context of global change research, stimulating the development of new technologies for phenological observation. In this context, digital cameras have been successfully used as multi-channel imaging sensors, providing measures to estimate changes in phenological events, such as leaf flushing and senescence. We monitored leaf-changing patterns of cerrado-savanna vegetation by taking daily digital images, extracting leaf color information and correlating it with phenological changes. In this way, time series associated with plant species are obtained, raising the need for appropriate tools to mine patterns of interest. In this paper, we present a novel approach for representing phenological patterns of plant species. The proposed method is based on encoding the time series as a visual rhythm, which is then characterized by color description algorithms. A comparative analysis of different descriptors is conducted and discussed. Experimental results show that our approach achieves high accuracy in identifying plant species. © 2013 IEEE.
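The encoding step described in the abstract (time series rendered as a visual rhythm that is then summarized by a descriptor) can be sketched as follows, assuming each series holds one relative-greenness value per day for one region of interest. Both the row-per-series encoding and the plain histogram descriptor are illustrative simplifications rather than the paper's exact choices.

```python
# Encode phenology time series as a visual-rhythm-like image and describe
# it with a simple normalized histogram (illustrative simplification).
import numpy as np

def series_to_rhythm(series: list[np.ndarray]) -> np.ndarray:
    """Stack one row per time series; assumes all series have equal length."""
    rows = []
    for s in series:
        s = np.asarray(s, dtype=np.float64)
        s = (s - s.min()) / (s.max() - s.min() + 1e-9)   # rescale to [0, 1]
        rows.append((s * 255).astype(np.uint8))          # one value per day
    return np.stack(rows, axis=0)                        # (num_series, num_days)

def intensity_histogram(rhythm: np.ndarray, bins: int = 64) -> np.ndarray:
    """Global normalized histogram used as a compact image descriptor."""
    hist, _ = np.histogram(rhythm, bins=bins, range=(0, 255))
    return hist / hist.sum()
```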