11 research outputs found

    Fast and Effective Bag-of-Visual-Word Model to Pornographic Images Recognition Using the FREAK Descriptor

    Get PDF
    Recently, the Bag of Visual Words (BoVW) model has gained enormous popularity among researchers for object recognition. Pornographic image recognition with respect to computational complexity, adequate accuracy, and memory consumption is a major challenge in applications with time constraints, such as internet pornography filtering. Most existing research based on the BoVW uses the very popular SIFT and SURF algorithms to describe and match the keypoints detected in an image. The main problem of these methods is high computational complexity due to the construction of high-dimensional feature vectors. This research proposes a BoVW-based model that adopts the very fast and simple binary descriptor FREAK to speed up the pornography recognition process. In addition, keypoints are detected only in the region of interest (ROI) of each image, which improves recognition speed by eliminating many noisy keypoints located in the image background. Finally, in order to find the most representative visual vocabulary, vocabularies of sizes ranging from 150 to 500 are generated for the BoVW model. Compared with similar works, the experimental results show that the proposed model achieves a remarkable improvement in terms of computational complexity.
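    As an illustration of the kind of pipeline this abstract describes, the sketch below detects FAST keypoints restricted to an ROI mask, computes FREAK binary descriptors, and builds a BoVW histogram by nearest-visual-word assignment under Hamming distance. It is a minimal sketch, not the paper's implementation: the use of opencv-contrib-python, the FAST detector, and a precomputed binary vocabulary `vocab` are assumptions.

```python
# Minimal sketch: FAST keypoints inside an ROI, FREAK binary descriptors,
# and a Bag-of-Visual-Words histogram under Hamming distance.
# Assumptions (not from the paper): opencv-contrib-python is installed,
# `vocab` is a (k x 64) uint8 array of visual words built beforehand,
# and `roi_mask` marks the region of interest.
import cv2
import numpy as np

def bovw_histogram(image_bgr, roi_mask, vocab):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    detector = cv2.FastFeatureDetector_create()
    freak = cv2.xfeatures2d.FREAK_create()

    # Detect keypoints only inside the ROI to discard background noise.
    keypoints = detector.detect(gray, roi_mask)
    keypoints, descriptors = freak.compute(gray, keypoints)
    hist = np.zeros(len(vocab), dtype=np.float32)
    if descriptors is None:
        return hist

    # Assign each binary descriptor to its nearest visual word (Hamming distance).
    for d in descriptors:
        dists = np.unpackbits(np.bitwise_xor(vocab, d), axis=1).sum(axis=1)
        hist[int(np.argmin(dists))] += 1
    return hist / max(hist.sum(), 1.0)  # L1-normalized BoVW vector
```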

    An overview of NuDetective Forensic Tool and its usage to combat child pornography in Brazil

    Get PDF
    In many countries, the possession of files containing child and teen pornography is a heinous crime, and it is desirable for law enforcement to be able to detect such files in a timely manner at crime scenes. However, especially at crime scenes, it is impossible to manually examine all the files that may be stored on digital storage devices. The NuDetective Forensic Tool was developed to assist forensic examiners in identifying child pornography at crime scenes. NuDetective uses automatic nudity detection in images and videos, file name comparison, and hash values to reduce the number of files to be analyzed by forensic examiners. Despite the high detection rates achieved in past experiments, the authors had not received any formal feedback from NuDetective users about its performance in real forensic cases. This work therefore presents a detailed review of the four main features of the NuDetective Forensic Tool, including all techniques and methods implemented, as well as the results of a previously unpublished survey conducted to evaluate the real effectiveness of NuDetective among its Brazilian users. The results obtained show that NuDetective is helping to arrest pedophiles and to combat child sexual exploitation in the digital era. Sociedad Argentina de Informática e Investigación Operativa (SADIO)
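    The hash-based and filename-based pre-filtering mentioned above can be pictured with the generic sketch below. It does not reproduce NuDetective's actual code; the SHA-1 choice, the hash-set semantics, and the keyword list are illustrative placeholders that an examiner would supply.

```python
# Generic sketch of hash-value and file-name triage (illustrative only,
# not NuDetective's implementation).
import hashlib
from pathlib import Path

def sha1_of(path, chunk=1 << 20):
    h = hashlib.sha1()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def triage(root, known_benign_hashes, known_illicit_hashes, name_keywords):
    flagged, to_review = [], []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        digest = sha1_of(path)
        if digest in known_benign_hashes:
            continue                      # skip files already known to be harmless
        if digest in known_illicit_hashes:
            flagged.append(path)          # exact match against a known-bad hash set
        elif any(k in path.name.lower() for k in name_keywords):
            to_review.append(path)        # suspicious file name, needs manual review
    return flagged, to_review
```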

    Action Recognition in Videos: from Motion Capture Labs to the Web

    Full text link
    This paper presents a survey of human action recognition approaches based on visual data recorded from a single video camera. We propose an organizing framework which highlights the evolution of the area, with techniques moving from heavily constrained motion capture scenarios towards more challenging, realistic, "in the wild" videos. The proposed organization is based on the representation used as input for the recognition task, emphasizing the hypotheses assumed and, thus, the constraints imposed on the type of video that each technique is able to address. Making the hypotheses and constraints explicit renders the framework particularly useful for selecting a method for a given application. Another advantage of the proposed organization is that it allows the newest approaches to be categorized seamlessly alongside traditional ones, while providing an insightful perspective on the evolution of the action recognition task up to now. That perspective is the basis for the discussion at the end of the paper, where we also present the main open issues in the area. Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 tables

    Automatic flagging of offensive video content using Deep Learning

    Get PDF
    Thanks to our visual system, it takes no effort for us humans to tell a cat apart from an eagle, recognize our family's faces, or read a sign. But these are actually hard problems to solve with a computer: the difference lies in how the human brain and computers process images. With the rise of the Internet and mobile smartphones, the amount of visual content available online has grown well beyond what manual analysis can handle. Offensive classification of images is one of the major tasks in the semantic analysis of visual content. In the last few years, the field of machine learning has made tremendous progress on addressing these difficult problems. In particular, we have found that a kind of model called a deep convolutional neural network (CNN) can achieve reasonable performance on hard visual recognition tasks, matching or exceeding human performance in some domains. CNNs are now being used to tackle one of the core problems in computer vision: image classification. In this master's thesis, Automatic flagging of offensive video content using Deep Learning, deep learning is the key enabler for addressing the offensive video classification challenges posed by the Internet age. Deep learning is a new paradigm that aims to overcome the limitations of current approaches, which are complex and require manual intervention. We design a system that automatically analyses video files and detects violent and/or adult content using a deep learning framework. The classification is based on a prior segmentation of the video files from which the most representative shot key frames are extracted. The extracted frames are then classified by a deep neural network. This project includes the training and testing of the system. The training process consists of finding or creating a database of images and adapting the parameters of the neural network.
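    A minimal sketch of the frame-based pipeline described above follows: sample representative frames from a video and score each with a CNN. Uniform frame sampling (instead of the thesis's shot segmentation), the torchvision preprocessing, and the assumption that a fine-tuned classifier's output index 1 means "offensive" are illustrative choices, not the thesis's implementation.

```python
# Sketch: sample frames from a video and score each with a CNN classifier.
# `model` is assumed to be any fine-tuned torchvision classifier whose
# output index 1 corresponds to the "offensive" class.
import cv2
import torch
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def flag_video(path, model, num_frames=16, threshold=0.5):
    cap = cv2.VideoCapture(path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    scores = []
    for idx in range(0, max(total, 1), max(total // num_frames, 1)):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)   # jump to the sampled frame
        ok, frame = cap.read()
        if not ok:
            break
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        batch = preprocess(rgb).unsqueeze(0)
        with torch.no_grad():
            prob = torch.softmax(model(batch), dim=1)[0, 1].item()
        scores.append(prob)
    cap.release()
    # Flag the video if any sampled frame looks offensive.
    return max(scores, default=0.0) >= threshold
```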

    Making an image worth a thousand visual words

    Get PDF
    The automatic dissimilarity analysis between images depends heavily on the use of descriptors to characterize the images' content as compact and discriminative features. This work investigates the use of visual dictionaries to represent and retrieve local image features using the popular Bag-of-Visual-Words modeling approach. We evaluated the impact of different parameters in the construction of this modeling approach, showing that an image can be effectively described using fewer than a thousand words. Funding: FAPESP, CAPES, STIC-AmSud; RESCUER project, funded by the European Commission (Grant 614154) and by the National Council for Scientific and Technological Development (CNPq/MCT).
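    For context, a visual dictionary of the kind evaluated in this work is typically built by clustering local descriptors into k visual words and encoding each image as a normalized word histogram, as in the hedged sketch below. The use of MiniBatchKMeans and k = 300 are assumptions for illustration, not the parameter choices of the paper.

```python
# Sketch of visual-dictionary construction and BoVW encoding.
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def build_vocabulary(descriptor_list, k=300, seed=0):
    # descriptor_list: one (n_i x d) float array of local descriptors per image.
    all_desc = np.vstack(descriptor_list)
    return MiniBatchKMeans(n_clusters=k, random_state=seed).fit(all_desc)

def encode(descriptors, vocabulary):
    # Map each descriptor to its nearest visual word and count occurrences.
    words = vocabulary.predict(descriptors)
    hist, _ = np.histogram(words, bins=np.arange(vocabulary.n_clusters + 1))
    return hist.astype(np.float32) / max(hist.sum(), 1.0)
```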

    Survey On Nudity Detection: Opportunities And Challenges Based On ‘Awrah Concept In Islamic Shari’a

    Get PDF
    Nudity or nakedness, known as 'awrah in Islam, refers to the parts of the human body that, in principle, should not be seen by other people, except by those qualified as one's mahram or in cases of emergency or urgent need. Nudity detection techniques have long received considerable attention from researchers worldwide due to their importance, particularly to the global Muslim community. In this paper, the techniques are grouped into four categories: methods based on body structure, image retrieval, skin-region features, and bag-of-visual-words (BoVW). All of these techniques rely on the areas of skin on the body, as well as the sexual organs, that must be visible in order to determine whether an image is nude or not. The concept of nakedness in Islamic Shari'a, however, has different rules for men and women: the limit of male 'awrah is between the navel and the knees, while the limit of female 'awrah is the entire body except the face and hands, which should be covered using the hijab. In general, existing techniques can be used to detect nakedness as defined by Islamic Shari'a; their selection depends on the areas of skin and the sexual organs used to indicate whether an image falls into the nude category or not. Islamic Shari'a, though, imposes different 'awrah rules for men and women, such as the limits of 'awrah, the clothing requirements for covering the 'awrah, and the various shapes and styles of hijab worn in different countries (for women only). These issues present opportunities and challenges for researchers to propose an 'awrah detection technique in accordance with Islamic Shari'a.

    Detecção de pornografia em vídeos através de técnicas de aprendizado profundo e informações de movimento (Pornography detection in videos using deep learning techniques and motion information)

    Get PDF
    Advisors: Anderson de Rezende Rocha, Vanessa Testoni. Master's dissertation, Universidade Estadual de Campinas, Instituto de Computação. Abstract: With the exponential growth of video footage available online, human manual moderation of sensitive scenes, e.g., pornography, violence and crowds, became infeasible, increasing the necessity for automated filtering. In this vein, a great number of works have explored the pornography detection problem, using approaches ranging from skin and nudity detection to local features and bags of visual words. Yet, these techniques suffer from ambiguous cases (e.g., beach scenes, wrestling), producing too many false positives. This is possibly related to the fact that these approaches are somewhat outdated, and that few authors have used the motion information present in videos, which could be crucial for the visual disambiguation of these cases. Setting forth to overcome these issues, in this work, we explore deep learning solutions to the problem of pornography detection in videos, taking into account both the static and the motion information available for each questioned video. When incorporating the complementary static and motion features, the proposed method outperforms the existing solutions in the literature. Although deep learning approaches, more specifically Convolutional Neural Networks (CNNs), have achieved striking results on other vision-related problems, such promising methods are still not sufficiently explored in pornography detection while incorporating motion information. We also propose novel ways of combining the static and the motion information using CNNs, which have not been explored before, neither in pornography detection nor in other action recognition tasks. More specifically, we explore two distinct sources of motion information herein: Optical Flow displacement fields, which have traditionally been used for video classification; and MPEG Motion Vectors. Although Motion Vectors have already been used for pornography detection in the literature, in this work we adapt them by finding an appropriate visual representation before feeding a convolutional neural network for feature learning and extraction. Our experiments show that although the MPEG Motion Vectors technique has inferior performance by itself compared with its Optical Flow counterpart, it yields a similar performance when complementing the static information, with the advantage of being present, by construction, in the video while decoding the frames, avoiding the need for the more expensive Optical Flow calculations. Our best approach outperforms existing methods in the literature on different datasets. For the Pornography 800 dataset, it yields a classification accuracy of 97.9%, an error reduction of 64.4% compared to the state of the art (94.1% on this dataset). Considering the more challenging Pornography 2k dataset, our best method yields a classification accuracy of 96.4%, reducing the classification error by 14.3% compared to the state of the art (95.8% on the same dataset). Master's in Computer Science. Funding: Funcamp, CAPES.
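    The two-stream idea discussed in the abstract, a spatial CNN over RGB key frames combined with a temporal CNN over stacked motion maps (optical flow or rendered MPEG motion vectors), can be sketched as below. The ResNet-18 backbone, the 2*L-channel flow stack, and the simple feature concatenation are assumptions for illustration, not the thesis's exact architecture.

```python
# Hedged sketch of static + motion fusion with CNNs (two-stream style).
import torch
import torch.nn as nn
from torchvision import models

class TwoStreamFusion(nn.Module):
    def __init__(self, flow_stack=10, num_classes=2):
        super().__init__()
        # Spatial stream: a single RGB key frame.
        self.rgb = models.resnet18(weights=None)
        self.rgb.fc = nn.Identity()
        # Temporal stream: stacked horizontal/vertical flow (or rendered
        # MPEG motion vectors) as a 2*flow_stack-channel input.
        self.motion = models.resnet18(weights=None)
        self.motion.conv1 = nn.Conv2d(2 * flow_stack, 64, kernel_size=7,
                                      stride=2, padding=3, bias=False)
        self.motion.fc = nn.Identity()
        # Late fusion by concatenating the two 512-d feature vectors.
        self.classifier = nn.Linear(512 + 512, num_classes)

    def forward(self, frame, flow):
        feats = torch.cat([self.rgb(frame), self.motion(flow)], dim=1)
        return self.classifier(feats)

# Example: one RGB frame plus a 10-frame flow stack (20 channels).
model = TwoStreamFusion()
logits = model(torch.randn(1, 3, 224, 224), torch.randn(1, 20, 224, 224))
```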

    Análise de vídeo sensível (Sensitive video analysis)

    Get PDF
    Advisors: Anderson de Rezende Rocha, Siome Klein Goldenstein. Doctoral thesis, Universidade Estadual de Campinas, Instituto de Computação. Abstract: Sensitive video can be defined as any motion picture that may pose threats to its audience. Typical representatives include, but are not limited to, pornography, violence, child abuse, cruelty to animals, etc. Nowadays, with the ever more pervasive role of digital data in our lives, sensitive-content analysis represents a major concern to law enforcers, companies, tutors, and parents, due to the potential harm of such content to minors, students, workers, etc. Notwithstanding, the employment of human mediators for constantly analyzing huge troves of sensitive data often leads to stress and trauma, justifying the search for computer-aided analysis. In this work, we tackle this problem in two ways. In the first one, we aim at deciding whether or not a video stream presents sensitive content, which we refer to as sensitive-video classification. In the second one, we aim at finding the exact moments a stream starts and ends displaying sensitive content, at frame level, which we refer to as sensitive-content localization. For both cases, we aim at designing and developing effective and efficient methods, with a low memory footprint and suitable for deployment on mobile devices. In this vein, we provide four major contributions. The first one is a novel Bag-of-Visual-Words-based pipeline for efficient, time-aware sensitive-video classification. The second is a novel high-level multimodal fusion pipeline for sensitive-content localization. The third, in turn, is a novel spatio-temporal video interest point detector and video content descriptor. Finally, the fourth contribution comprises a frame-level annotated 140-hour pornographic video dataset, which is the first one in the literature that is appropriate for pornography localization. An important aspect of the first three contributions is their generalizable nature, in the sense that they can be employed, without modifications to their steps, for the detection of diverse sensitive content types, such as the previously mentioned ones. For validation, we choose pornography and violence, two of the commonest types of inappropriate material, as target representatives of sensitive content. We therefore perform classification and localization experiments, and report results for both types of content. The proposed solutions present an accuracy of 93% in pornography classification, and allow the correct localization of 91% of pornographic content within a video stream. The results for violence are also compelling: with the proposed approaches, we reached second place in an international competition of violent scenes detection. Putting both in perspective, we learned that pornography detection is easier than its violence counterpart, opening several opportunities for additional investigation by the research community. The main reason for such a difference is related to the distinct levels of subjectivity that are inherent to each concept. While pornography is usually more explicit, violence presents a broader spectrum of possible manifestations. Doctorate in Computer Science. Grants 1572763 and 1197473; CAPES.
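    The high-level (late) multimodal fusion for frame-level localization mentioned above can be pictured with the sketch below: per-frame scores from two modalities are combined, smoothed over time, and thresholded into sensitive segments. The fusion weights, smoothing window, and threshold are illustrative assumptions, not the thesis's tuned values.

```python
# Sketch of late fusion for frame-level sensitive-content localization.
import numpy as np

def localize(visual_scores, audio_scores, w_visual=0.6, window=25, threshold=0.5):
    # Weighted combination of per-frame scores from two modalities.
    scores = w_visual * np.asarray(visual_scores) + (1 - w_visual) * np.asarray(audio_scores)
    # Temporal smoothing with a moving average to suppress isolated spikes.
    kernel = np.ones(window) / window
    smooth = np.convolve(scores, kernel, mode="same")
    sensitive = smooth >= threshold
    # Turn the boolean mask into (start_frame, end_frame) segments.
    segments, start = [], None
    for i, flag in enumerate(sensitive):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(sensitive) - 1))
    return segments
```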