323 research outputs found

    Detecção de eventos complexos em vídeos baseada em ritmos visuais

    Get PDF
    Orientador: Hélio PedriniDissertação (mestrado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: O reconhecimento de eventos complexos em vídeos possui várias aplicações práticas relevantes, alavancadas pela grande disponibilidade de câmeras digitais instaladas em aeroportos, estações de ônibus e trens, centros de compras, estádios, hospitais, escolas, prédios, estradas, entre vários outros locais. Avanços na tecnologia digital têm aumentado as capacidades dos sistemas em reconhecer eventos em vídeos por meio do desenvolvimento de dispositivos com alta resolução, dimensões físicas pequenas e altas taxas de amostragem. Muitos trabalhos disponíveis na literatura têm explorado o tema a partir de diferentes pontos de vista. Este trabalho apresenta e avalia uma metodologia para extrair características dos ritmos visuais no contexto de detecção de eventos em vídeos. Um ritmo visual pode ser visto com a projeção de um vídeo em uma imagem, tal que a tarefa de análise de vídeos é reduzida a um problema de análise de imagens, beneficiando-se de seu baixo custo de processamento em termos de tempo e complexidade. Para demonstrar o potencial do ritmo visual na análise de vídeos complexos, três problemas da área de visão computacional são selecionados: detecção de eventos anômalos, classificação de ações humanas e reconhecimento de gestos. No primeiro problema, um modelo e? aprendido com situações de normalidade a partir dos rastros deixados pelas pessoas ao andar, enquanto padro?es representativos das ações são extraídos nos outros dois problemas. Nossa hipo?tese e? de que vídeos similares produzem padro?es semelhantes, tal que o problema de classificação de ações pode ser reduzido a uma tarefa de classificação de imagens. Experimentos realizados em bases públicas de dados demonstram que o método proposto produz resultados promissores com baixo custo de processamento, tornando-o possível aplicar em tempo real. Embora os padro?es dos ritmos visuais sejam extrai?dos como histograma de gradientes, algumas tentativas para adicionar características do fluxo o?tico são discutidas, além de estratégias para obter ritmos visuais alternativosAbstract: The recognition of complex events in videos has currently several important applications, particularly due to the wide availability of digital cameras in environments such as airports, train and bus stations, shopping centers, stadiums, hospitals, schools, buildings, roads, among others. Moreover, advances in digital technology have enhanced the capabilities for detection of video events through the development of devices with high resolution, small physical size, and high sampling rates. Many works available in the literature have explored the subject from different perspectives. This work presents and evaluates a methodology for extracting a feature descriptor from visual rhythms of video sequences in order to address the video event detection problem. A visual rhythm can be seen as the projection of a video onto an image, such that the video analysis task can be reduced into an image analysis problem, benefiting from its low processing cost in terms of time and complexity. To demonstrate the potential of the visual rhythm in the analysis of complex videos, three computer vision problems are selected in this work: abnormal event detection, human action classification, and gesture recognition. The former problem learns a normalcy model from the traces that people leave when they walk, whereas the other two problems extract representative patterns from actions. Our hypothesis is that similar videos produce similar patterns, therefore, the action classification problem is reduced into an image classification task. Experiments conducted on well-known public datasets demonstrate that the method produces promising results at high processing rates, making it possible to work in real time. Even though the visual rhythm features are mainly extracted as histogram of gradients, some attempts for adding optical flow features are discussed, as well as strategies for obtaining alternative visual rhythmsMestradoCiência da ComputaçãoMestre em Ciência da Computação1570507, 1406910, 1374943CAPE

    Audio-visual multi-modality driven hybrid feature learning model for crowd analysis and classification

    Get PDF
    The high pace emergence in advanced software systems, low-cost hardware and decentralized cloud computing technologies have broadened the horizon for vision-based surveillance, monitoring and control. However, complex and inferior feature learning over visual artefacts or video streams, especially under extreme conditions confine majority of the at-hand vision-based crowd analysis and classification systems. Retrieving event-sensitive or crowd-type sensitive spatio-temporal features for the different crowd types under extreme conditions is a highly complex task. Consequently, it results in lower accuracy and hence low reliability that confines existing methods for real-time crowd analysis. Despite numerous efforts in vision-based approaches, the lack of acoustic cues often creates ambiguity in crowd classification. On the other hand, the strategic amalgamation of audio-visual features can enable accurate and reliable crowd analysis and classification. Considering it as motivation, in this research a novel audio-visual multi-modality driven hybrid feature learning model is developed for crowd analysis and classification. In this work, a hybrid feature extraction model was applied to extract deep spatio-temporal features by using Gray-Level Co-occurrence Metrics (GLCM) and AlexNet transferrable learning model. Once extracting the different GLCM features and AlexNet deep features, horizontal concatenation was done to fuse the different feature sets. Similarly, for acoustic feature extraction, the audio samples (from the input video) were processed for static (fixed size) sampling, pre-emphasis, block framing and Hann windowing, followed by acoustic feature extraction like GTCC, GTCC-Delta, GTCC-Delta-Delta, MFCC, Spectral Entropy, Spectral Flux, Spectral Slope and Harmonics to Noise Ratio (HNR). Finally, the extracted audio-visual features were fused to yield a composite multi-modal feature set, which is processed for classification using the random forest ensemble classifier. The multi-class classification yields a crowd-classification accurac12529y of (98.26%), precision (98.89%), sensitivity (94.82%), specificity (95.57%), and F-Measure of 98.84%. The robustness of the proposed multi-modality-based crowd analysis model confirms its suitability towards real-world crowd detection and classification tasks

    Sound Classification and Processing of Urban Environments: A Systematic Literature Review

    Get PDF
    Audio recognition can be used in smart cities for security, surveillance, manufacturing, autonomous vehicles, and noise mitigation, just to name a few. However, urban sounds are everyday audio events that occur daily, presenting unstructured characteristics containing different genres of noise and sounds unrelated to the sound event under study, making it a challenging problem. Therefore, the main objective of this literature review is to summarize the most recent works on this subject to understand the current approaches and identify their limitations. Based on the reviewed articles, it can be realized that Deep Learning (DL) architectures, attention mechanisms, data augmentation techniques, and pretraining are the most crucial factors to consider while creating an efficient sound classification model. The best-found results were obtained by Mushtaq and Su, in 2020, using a DenseNet-161 with pretrained weights from ImageNet, and NA-1 and NA-2 as augmentation techniques, which were of 97.98%, 98.52%, and 99.22% for UrbanSound8K, ESC-50, and ESC-10 datasets, respectively. Nonetheless, the use of these models in real-world scenarios has not been properly addressed, so their effectiveness is still questionable in such situations

    Spatiotemporal enabled Content-based Image Retrieval

    Full text link

    The Nexus Between Security Sector Governance/Reform and Sustainable Development Goal-16

    Get PDF
    This Security Sector Reform (SSR) Paper offers a universal and analytical perspective on the linkages between Security Sector Governance (SSG)/SSR (SSG/R) and Sustainable Development Goal-16 (SDG-16), focusing on conflict and post-conflict settings as well as transitional and consolidated democracies. Against the background of development and security literatures traditionally maintaining separate and compartmentalized presence in both academic and policymaking circles, it maintains that the contemporary security- and development-related challenges are inextricably linked, requiring effective measures with an accurate understanding of the nature of these challenges. In that sense, SDG-16 is surely a good step in the right direction. After comparing and contrasting SSG/R and SDG-16, this SSR Paper argues that human security lies at the heart of the nexus between the 2030 Agenda of the United Nations (UN) and SSG/R. To do so, it first provides a brief overview of the scholarly and policymaking literature on the development-security nexus to set the background for the adoption of The Agenda 2030. Next, it reviews the literature on SSG/R and SDGs, and how each concept evolved over time. It then identifies the puzzle this study seeks to address by comparing and contrasting SSG/R with SDG-16. After making a case that human security lies at the heart of the nexus between the UN’s 2030 Agenda and SSG/R, this book analyses the strengths and weaknesses of human security as a bridge between SSG/R and SDG-16 and makes policy recommendations on how SSG/R, bolstered by human security, may help achieve better results on the SDG-16 targets. It specifically emphasizes the importance of transparency, oversight, and accountability on the one hand, and participative approach and local ownership on the other. It concludes by arguing that a simultaneous emphasis on security and development is sorely needed for addressing the issues under the purview of SDG-16

    Big Data in Bioeconomy

    Get PDF
    This edited open access book presents the comprehensive outcome of The European DataBio Project, which examined new data-driven methods to shape a bioeconomy. These methods are used to develop new and sustainable ways to use forest, farm and fishery resources. As a European initiative, the goal is to use these new findings to support decision-makers and producers – meaning farmers, land and forest owners and fishermen. With their 27 pilot projects from 17 countries, the authors examine important sectors and highlight examples where modern data-driven methods were used to increase sustainability. How can farmers, foresters or fishermen use these insights in their daily lives? The authors answer this and other questions for our readers. The first four parts of this book give an overview of the big data technologies relevant for optimal raw material gathering. The next three parts put these technologies into perspective, by showing useable applications from farming, forestry and fishery. The final part of this book gives a summary and a view on the future. With its broad outlook and variety of topics, this book is an enrichment for students and scientists in bioeconomy, biodiversity and renewable resources

    Proceedings of the Detection and Classification of Acoustic Scenes and Events 2016 Workshop (DCASE2016)

    Get PDF
    corecore