31 research outputs found

    Fast and Effective Bag-of-Visual-Word Model to Pornographic Images Recognition Using the FREAK Descriptor

    Recently, the Bag of Visual Words (BoVW) model has gained enormous popularity among researchers in object recognition. Recognizing pornographic images with low computational complexity, adequate accuracy, and modest memory consumption is a major challenge in time-constrained applications such as Internet pornography filtering. Most existing BoVW-based work uses the popular SIFT and SURF algorithms to describe and match the keypoints detected in an image. The main problem with these methods is their high computational complexity, which stems from constructing high-dimensional feature vectors. This research proposes a BoVW-based model that adopts the fast and simple binary descriptor FREAK to speed up pornographic image recognition. In addition, keypoints are detected only in the region of interest (ROI) of each image, which improves recognition speed by eliminating many noisy keypoints located in the image background. Finally, to find the most representative visual vocabulary, vocabularies of different sizes, from 150 to 500 words, are generated for BoVW. Compared with similar work, the experimental results show that the proposed model achieves a remarkable improvement in terms of computational complexity.
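    As a rough illustration of why a binary descriptor keeps the BoVW step cheap, the sketch below assigns each descriptor to its nearest visual word by Hamming distance and builds a normalised histogram. It is a minimal sketch and not the authors' pipeline: the names `hamming` and `bovw_histogram`, the 8-bit toy descriptors, and the three-word vocabulary are all assumptions made for the example.

```python
from collections import Counter

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")

def bovw_histogram(descriptors, vocabulary):
    """L1-normalised histogram of nearest visual words for one image."""
    counts = Counter(
        # Index of the visual word closest to descriptor d in Hamming distance.
        min(range(len(vocabulary)), key=lambda k: hamming(d, vocabulary[k]))
        for d in descriptors
    )
    total = sum(counts.values()) or 1  # avoid dividing by zero for empty images
    return [counts[k] / total for k in range(len(vocabulary))]

# Toy 8-bit values standing in for real binary descriptors (hypothetical data).
vocabulary = [0b00000000, 0b11110000, 0b11111111]
descriptors = [0b00000001, 0b11100000, 0b11110001, 0b01111111]
histogram = bovw_histogram(descriptors, vocabulary)
print(histogram)
```

    The distance is a single XOR plus a bit count, which is the property that makes binary descriptors attractive when recognition time is constrained. The resulting histogram would then be passed to a classifier, which the abstract does not specify.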

    Enhancing music information retrieval by incorporating image-based local features

    This paper presents a novel approach to Music Information Retrieval. Having represented the music tracks as two-dimensional images, we apply the "bag of visual words" method from visual IR to classify the songs into 19 genres. By switching to the visual domain, we can abstract from musical concepts such as melody, timbre, and rhythm. We obtained a classification accuracy of 46% (against a theoretical baseline of 5% for random classification), which is comparable with existing state-of-the-art approaches. Moreover, the novel features characterize different properties of the signal than standard methods do. Combining the two should therefore further improve the performance of existing techniques.

    Survey On Nudity Detection: Opportunities And Challenges Based On ‘Awrah Concept In Islamic Shari’a

    Nudity or nakedness, known as ‘awrah in Islam, refers to the parts of the human body that in principle should not be seen by other people, except by those who qualify as the person's mahram or in an emergency or urgent need. Nudity detection techniques have long received a great deal of attention from researchers worldwide because of their importance, particularly to the global Muslim community. In this paper, the techniques are separated into four classes: methods based on body structure, image retrieval, the features of skin regions, and bag-of-visual-words (BoVW). All of these techniques rely on areas of skin on the body, as well as the sexual organs, being visible in order to decide whether an image is nude. The concept of nakedness in Islamic Shari'a, however, has different rules for men and women: the male ‘awrah lies between the navel and the knees, while the female ‘awrah is the entire body except the face and hands, and should be covered with the hijab. In general, existing techniques can be used to detect the nakedness that Islamic Shari'a is concerned with. These techniques are selected according to the areas of skin on the body, or the sexual organs, that indicate whether an image falls into the nude category. Islamic Shari'a, in contrast, imposes different ‘awrah rules for men and women, such as the limits of the ‘awrah, the requirements for clothes that cover the ‘awrah, and the shapes and shades of hijabs worn in various countries (for women only). These problems are the opportunities and challenges for researchers who wish to propose an ‘awrah detection technique in accordance with Islamic Shari'a.

    Smart Content Recognition from Images Using a Mixture of Convolutional Neural Networks

    With the rapid development of the Internet, the volume of web content has become huge. Most websites are publicly available, and anyone can access their content from anywhere, including the workplace, home, and even schools. Nevertheless, not all web content is appropriate for all users, especially children. One example is pornographic images, which should be restricted to certain age groups. Such images are also not safe for work (NSFW), in the sense that employees should not be seen accessing them during work. Recently, convolutional neural networks have been successfully applied to many computer vision problems. Inspired by these successes, we propose a mixture of convolutional neural networks for adult content recognition. Unlike other works, our method is formulated as a weighted sum of multiple deep neural network models. The weights of the CNN models are expressed as a linear regression problem and learned using Ordinary Least Squares (OLS). Experimental results demonstrate that the proposed model outperforms both a single CNN model and the averaged sum of CNN models in adult content recognition.
    Comment: To be published in LNEE, Code: github.com/mundher/NSF
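    The weighted-sum idea can be sketched as an ordinary least-squares fit of per-model scores to the ground-truth labels. This is a dependency-free illustration under stated assumptions, not the authors' code: the helper names `solve_linear`, `ols_weights`, and `mixture_score`, the two-model setup, and the validation scores are all hypothetical, and details such as an intercept term are not given in the abstract.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols_weights(scores, labels):
    """Weights w minimising the squared error of (scores @ w) against labels."""
    k = len(scores[0])
    # Normal equations: (S^T S) w = S^T y.
    gram = [[sum(row[i] * row[j] for row in scores) for j in range(k)]
            for i in range(k)]
    rhs = [sum(row[i] * y for row, y in zip(scores, labels)) for i in range(k)]
    return solve_linear(gram, rhs)

def mixture_score(model_scores, weights):
    """Weighted sum of the individual model scores for one image."""
    return sum(w * s for w, s in zip(weights, model_scores))

# Hypothetical validation scores from two models (rows are images).
scores = [[0.9, 0.6], [0.8, 0.9], [0.2, 0.4], [0.1, 0.3]]
labels = [1.0, 1.0, 0.0, 0.0]
weights = ols_weights(scores, labels)
print(weights, mixture_score([0.7, 0.8], weights))
```

    In practice a library routine such as a least-squares solver would replace the hand-written elimination. The sketch also assumes the models' scores are not perfectly collinear, since that would make the normal equations singular.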

    Análise de vídeo sensível (Sensitive-video analysis)

    Advisors: Anderson de Rezende Rocha, Siome Klein Goldenstein. Thesis (doctorate), Universidade Estadual de Campinas, Instituto de Computação.
    Abstract: Sensitive video can be defined as any motion picture that may pose threats to its audience. Typical representatives include, but are not limited to, pornography, violence, child abuse, and cruelty to animals. Nowadays, with the ever more pervasive role of digital data in our lives, sensitive-content analysis is a major concern for law enforcers, companies, tutors, and parents, because of the potential harm such content can cause to minors, students, workers, and others. Moreover, employing human mediators to constantly analyze huge troves of sensitive data often leads to stress and trauma, which justifies the search for computer-aided analysis. In this work, we tackle this problem in two ways. In the first, we aim to decide whether or not a video stream presents sensitive content, which we refer to as sensitive-video classification. In the second, we aim to find the exact moments at which a stream starts and stops displaying sensitive content, at frame level, which we refer to as sensitive-content localization. For both cases, we aim to design and develop effective and efficient methods that have a low memory footprint and are suitable for deployment on mobile devices. In this vein, we provide four major contributions. The first is a novel Bag-of-Visual-Words-based pipeline for efficient time-aware sensitive-video classification. The second is a novel high-level multimodal fusion pipeline for sensitive-content localization. The third is a novel space-temporal video interest point detector and video content descriptor. Finally, the fourth contribution is a frame-level annotated, 140-hour pornographic video dataset, which is the first in the literature that is appropriate for pornography localization. An important aspect of the first three contributions is their generality, in the sense that they can be applied, without changes to their steps, to the detection of diverse types of sensitive content, such as those mentioned above. For validation, we choose pornography and violence, two of the most common types of inappropriate material, as target representatives of sensitive content. We therefore perform classification and localization experiments, and report results for both types of content. The proposed solutions achieve an accuracy of 93% in pornography classification, and correctly localize 91% of the pornographic content within a video stream. The results for violence are also compelling: with the proposed approaches, we reached second place in an international competition on violent-scene detection. Putting both in perspective, we learned that pornography detection is easier than violence detection, which opens several opportunities for additional investigation by the research community. The main reason for this difference is the distinct level of subjectivity inherent to each concept. While pornography is usually more explicit, violence presents a broader spectrum of possible manifestations.
    Doctorate. Computer Science. Doctor in Computer Science. 1572763, 1197473. CAPE

    Pornographic Image Recognition via Weighted Multiple Instance Learning

    In the Internet era, recognizing pornographic images is of great significance for protecting children's physical and mental health. However, this task is very challenging because the key pornographic contents (e.g., breasts and private parts) in an image often lie in small local regions. In this paper, we model each image as a bag of regions, and follow a multiple instance learning (MIL) approach to train a generic region-based recognition model. Specifically, we take into account each region's degree of pornography, and make three main contributions. First, we show that, based on very few annotations of the key pornographic contents in a training image, we can generate a bag of properly sized regions, among which the potential positive regions usually contain useful context that can aid recognition. Second, we present a simple quantitative measure of a region's degree of pornography, which can be used to weigh the importance of different regions in a positive image. Third, we formulate the recognition task as a weighted MIL problem under the convolutional neural network framework, with a bag probability function introduced to combine the importance of different regions. Experiments on our newly collected large-scale dataset demonstrate the effectiveness of the proposed method, which achieves a 97.52% true positive rate at a 1% false positive rate when tested on 100K pornographic images and 100K normal images.
    Comment: 9 pages, 3 figures

    An overview of NuDetective Forensic Tool and its usage to combat child pornography in Brazil

    In many countries, the possession of files containing child and teen pornography is a heinous crime, and it is desirable for law enforcement to be able to detect such files in a timely manner at crime scenes. However, particularly at crime scenes, it is impossible to manually examine all the files that can be stored on digital storage devices. The NuDetective Forensic Tool was developed to help forensic examiners identify child pornography at crime scenes. NuDetective uses automatic nudity detection in images and videos, file name comparison, and hash values to reduce the number of files that forensic examiners must analyze. Despite the high detection rates achieved in past experiments, the authors had not received any formal feedback from NuDetective users about its performance in real forensic cases. This work therefore presents a detailed review of the four main features of the NuDetective Forensic Tool, including all the techniques and methods implemented, together with the results of a previously unpublished survey conducted to evaluate the real effectiveness of NuDetective among its Brazilian users. The results show that NuDetective is helping to arrest pedophiles and to combat child sexual exploitation in the digital era.
    Sociedad Argentina de Informática e Investigación Operativa (SADIO)
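    The hash-based reduction step can be illustrated generically: files whose digest appears in a reference set are flagged automatically, and only the remainder needs further analysis. This is a minimal sketch and not NuDetective's implementation. The choice of SHA-256, the function names, and the in-memory byte strings standing in for files are all assumptions, since the abstract does not say which hash function or data source the tool uses.

```python
import hashlib

def file_digest(data: bytes) -> str:
    """Hex digest of a file's contents (SHA-256 is an assumption here)."""
    return hashlib.sha256(data).hexdigest()

def split_by_known_hashes(files, known_hashes):
    """Split {name: bytes} into names matching the reference set and the rest."""
    matched, remaining = [], []
    for name, data in files.items():
        (matched if file_digest(data) in known_hashes else remaining).append(name)
    return matched, remaining

# Placeholder contents standing in for files read from a storage device.
known_hashes = {file_digest(b"reference file contents")}
files = {
    "a.bin": b"reference file contents",
    "b.bin": b"some unrelated contents",
}
matched, remaining = split_by_known_hashes(files, known_hashes)
print(matched, remaining)
```

    Set membership makes each lookup constant time on average, so the cost is dominated by reading and hashing the files themselves.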

    Analysis of Using Metric Access Methods for Visual Search of Objects in Video Databases

    This article presents an approach to object retrieval that searches for and localizes all the occurrences of an object in a video database, given a query image of the object. Our proposal is based on text-retrieval methods in which video key frames are represented by a dense set of viewpoint-invariant region descriptors that enable recognition to proceed successfully despite changes in camera viewpoint, lighting, and partial occlusions. Vector quantizing these region descriptors provides a visual analogy of a word: a visual word. These words are grouped into a visual vocabulary, which is used to index all key frames from the video database. Efficient retrieval is then achieved by employing methods from statistical text retrieval, including inverted file systems and text-document frequency weightings. Whereas previous work in the literature has only used a simple sequential scan during search, we investigate the use of different metric access methods (MAMs), namely M-tree, Slim-tree, and D-index, to accelerate the processing of similarity queries. In addition, a ranking strategy based on the spatial layout of the regions (spatial consistency) is fully described and evaluated. Experimental results show that adopting MAMs not only improves search performance but also reduces the influence of the vocabulary size on test results, which may improve the scalability of our proposal. Finally, applying spatial consistency produces a very significant improvement in the results.
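    The text-retrieval analogy can be made concrete with a small inverted file over visual words and a tf-idf style score. This is a simplified sketch under stated assumptions, not the article's system: the key frames are represented by made-up visual-word IDs, the function names are hypothetical, and the score is a plain sum of tf-idf terms rather than the article's exact weighting. Metric access methods and spatial consistency are not shown.

```python
import math
from collections import Counter, defaultdict

def build_inverted_index(frames):
    """Map each visual word to {frame_id: occurrence count}."""
    index = defaultdict(dict)
    for frame_id, words in frames.items():
        for word, count in Counter(words).items():
            index[word][frame_id] = count
    return index

def rank_frames(query_words, index, frames):
    """Rank key frames by summed tf-idf of the visual words shared with the query."""
    n_frames = len(frames)
    scores = defaultdict(float)
    for word in set(query_words):
        postings = index.get(word, {})
        if not postings:
            continue  # word never seen in the database
        idf = math.log(n_frames / len(postings))
        for frame_id, count in postings.items():
            tf = count / len(frames[frame_id])
            scores[frame_id] += tf * idf
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical key frames, each a list of visual-word IDs.
frames = {
    "f1": [3, 3, 7, 9],
    "f2": [1, 2, 7, 7],
    "f3": [1, 4, 5, 6],
}
index = build_inverted_index(frames)
ranking = rank_frames([3, 7], index, frames)
print(ranking)
```

    Only frames that share at least one visual word with the query are touched, which is what makes the inverted file efficient compared with scoring every key frame.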