
    User profiles’ image clustering for digital investigations

    Sharing images on Social Network (SN) platforms is one of the most widespread online behaviors, and it may cause privacy-intrusive and illegal content to be widely distributed. Clustering the images shared through SN platforms according to the acquisition cameras embedded in smartphones is regarded as a significant task in forensic investigations of cybercrimes. The Sensor Pattern Noise (SPN) caused by camera sensor imperfections introduced during the manufacturing process has been proven to be an effective and robust camera fingerprint that can be used for several tasks, such as digital evidence analysis, smartphone fingerprinting and user profile linking. Clustering the images uploaded by users on their profiles is a way of fingerprinting the camera sources, and it is a challenging task since users may upload different types of images, i.e., images taken by their own smartphones (taken images) as well as single images from other sources, cropped images, or generic images from the Web (shared images). The shared images perturb the clustering task, as they usually do not carry sufficient SPN characteristics of their source cameras. Moreover, they are not directly referable to the user's device, so they have to be detected and removed from the clustering process. In this paper, we propose a method for clustering the images of user profiles without prior knowledge of the type and number of camera sources. The hierarchical graph-based method clusters both types of images, taken images and shared images. The strengths of our method include its ability to handle large-scale image datasets, the presence of shared images that perturb the clustering process, and the loss of image detail caused by content compression on SN platforms. The method is evaluated on the VISION dataset, a public benchmark including images from 35 smartphones. The dataset is perturbed with 3,000 images simulating shared images from sources other than the users' smartphones. Experimental results confirm the robustness of the proposed method against perturbed datasets and its effectiveness in the image clustering task.
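
    A minimal sketch of the core computation behind SPN-based clustering (not the paper's algorithm): extract a crude noise residual from each image, correlate the residuals pairwise, and cut a hierarchical clustering of the correlation graph at a threshold so that the number of cameras is never fixed in advance. The Gaussian denoiser and the corr_thresh value are illustrative placeholders, not the authors' choices, and equal-sized grayscale images are assumed.

        import numpy as np
        from scipy.ndimage import gaussian_filter
        from scipy.cluster.hierarchy import linkage, fcluster
        from scipy.spatial.distance import squareform

        def noise_residual(img, sigma=1.0):
            # Crude SPN proxy: image minus a low-pass (denoised) version,
            # zero-meaned and normalised to unit energy.
            img = img.astype(np.float64)
            res = img - gaussian_filter(img, sigma)
            res -= res.mean()
            return res / (np.linalg.norm(res) + 1e-12)

        def cluster_by_spn(images, corr_thresh=0.01):
            residuals = [noise_residual(im).ravel() for im in images]
            R = np.stack(residuals)                # one unit-norm residual per row
            corr = R @ R.T                         # normalised cross-correlations
            dist = np.clip(1.0 - corr, 0.0, None)  # turn similarity into distance
            np.fill_diagonal(dist, 0.0)
            Z = linkage(squareform(dist, checks=False), method="average")
            # Cut the dendrogram at 1 - corr_thresh: images whose residuals
            # correlate above the threshold end up in the same cluster.
            return fcluster(Z, t=1.0 - corr_thresh, criterion="distance")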

    Classifiers and machine learning techniques for image processing and computer vision

    Advisor: Siome Klein Goldenstein. Thesis (doctorate) - Universidade Estadual de Campinas, Instituto da Computação. Resumo (translated from Portuguese): In this doctoral work, we propose the use of classifiers and machine learning techniques to extract relevant information from a data set (e.g., images) in order to solve several problems in Image Processing and Computer Vision. The problems of interest are: categorization of images into two or more classes, detection of hidden messages, distinction between digitally tampered and natural images, authentication, and multi-classification, among others. We first present a comparative and critical review of the state of the art in image forensic analysis and in the detection of hidden messages in images. Our goal is to show the potential of existing techniques and, more importantly, to point out their limitations. With this study, we show that a good part of the problems in this area point to two common issues: the selection of features and the learning techniques to be used. In this study, we also discuss legal questions associated with image forensic analysis, such as the use of digital photographs by criminals. Next, we introduce a technique for image forensic analysis tested in the context of hidden-message detection and of general image classification into categories such as indoors, outdoors, computer-generated and works of art. In studying this multi-classification problem, some questions arise: how can a multi-class problem be solved so as to combine, for example, image classification features based on color, texture, shape and silhouette, without worrying too much about how to normalize the resulting combined feature vector? How can several different classifiers be used, each one specialized in and best configured for a given set of features or set of confused classes? In this direction, we present a technique for fusing classifiers and features in the multi-class scenario through the combination of binary classifiers. We validate our approach in a real application for the automatic classification of fruits and vegetables. Finally, we face one more interesting problem: how to make the use of powerful binary classifiers in the multi-class context more efficient and effective? We thus introduce a technique for combining binary classifiers (called base classifiers) to solve problems in the general multi-classification setting. Abstract: In this work, we propose the use of classifiers and machine learning techniques to extract useful information from data sets (e.g., images) to solve important problems in Image Processing and Computer Vision. We are particularly interested in: two- and multi-class image categorization, hidden message detection, discrimination between natural and forged images, authentication, and multi-classification. To start with, we present a comparative survey of the state of the art in digital image forensics as well as in hidden message detection. Our objective is to show the importance of the existing solutions and discuss their limitations. In this study, we show that most of these techniques strive to solve two common problems in Machine Learning: the feature selection and the classification techniques to be used. Furthermore, we discuss the legal and ethical aspects of image forensics analysis, such as the use of digital images by criminals.
We introduce a technique for image forensics analysis in the context of hidden message detection and image classification into categories such as indoors, outdoors, computer generated, and art works. From this multi-class classification problem, some important questions arise: how to solve a multi-class problem in order to combine, for instance, several different features such as color, texture, shape, and silhouette without worrying about the pre-processing and normalization of the combined feature vector? How to take advantage of different classifiers, each one custom-tailored to a specific set of classes in confusion? To cope with most of these problems, we present a feature and classifier fusion technique based on combinations of binary classifiers. We validate our solution with a real application for automatic produce classification. Finally, we address another interesting problem: how to combine powerful binary classifiers in the multi-class scenario more effectively, and how to boost their efficiency? In this context, we present a solution that boosts the efficiency and effectiveness of multi-class classification built from binary techniques. Doutorado. Engenharia de Computação. Doutor em Ciência da Computação.
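
    As a rough illustration of the multi-class-from-binary idea in the abstract (not the thesis's actual method), the sketch below trains one binary SVM per class pair and per feature set ("view") and combines their votes at prediction time; the function names and the choice of SVC are assumptions made for illustration only.

        import itertools
        import numpy as np
        from sklearn.svm import SVC

        def train_pairwise(features_by_view, y):
            # features_by_view: dict mapping view name -> (n_samples, n_dims) array.
            classes = np.unique(y)
            models = {}
            for a, b in itertools.combinations(classes, 2):
                mask = np.isin(y, [a, b])
                for view, X in features_by_view.items():
                    # One binary classifier per (class pair, feature view).
                    models[(a, b, view)] = SVC(kernel="rbf", gamma="scale").fit(X[mask], y[mask])
            return classes, models

        def predict_by_voting(features_by_view, classes, models):
            n_samples = next(iter(features_by_view.values())).shape[0]
            votes = np.zeros((n_samples, len(classes)))
            index = {c: i for i, c in enumerate(classes)}
            for (a, b, view), clf in models.items():
                for i, p in enumerate(clf.predict(features_by_view[view])):
                    votes[i, index[p]] += 1   # each binary classifier casts one vote
            return classes[votes.argmax(axis=1)]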

    The COST292 experimental framework for TRECVID 2007

    In this paper, we give an overview of the four tasks submitted to TRECVID 2007 by COST292. In the shot boundary (SB) detection task, four SB detectors have been developed and their results are merged using two merging algorithms. The framework developed for the high-level feature extraction task comprises four systems. The first system transforms a set of low-level descriptors into the semantic space using Latent Semantic Analysis and utilises neural networks for feature detection. The second system uses a Bayesian classifier trained with a “bag of subregions”. The third system uses a multi-modal classifier based on SVMs and several descriptors. The fourth system uses two image classifiers based on ant colony optimisation and particle swarm optimisation, respectively. The system submitted to the search task is an interactive retrieval application combining retrieval functionalities in various modalities with a user interface supporting automatic and interactive search over all submitted queries. Finally, the rushes task submission is based on a video summarisation and browsing system comprising two different interest curve algorithms and three features.
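
    A minimal sketch of the idea behind the first high-level feature extraction system, using scikit-learn as a stand-in (this is not the COST292 code): low-level descriptors are projected into a latent semantic space via truncated SVD (LSA) and a small neural network is trained per semantic concept. The matrix X, the labels y and all parameters shown are hypothetical.

        import numpy as np
        from sklearn.decomposition import TruncatedSVD
        from sklearn.neural_network import MLPClassifier
        from sklearn.pipeline import make_pipeline

        def build_concept_detector(n_latent=64):
            # LSA projection followed by a small neural-network detector.
            return make_pipeline(
                TruncatedSVD(n_components=n_latent),
                MLPClassifier(hidden_layer_sizes=(32,), max_iter=500),
            )

        # Hypothetical usage: X is an (n_shots, n_descriptor_dims) matrix of
        # low-level descriptors, y a binary label for one semantic concept.
        # detector = build_concept_detector().fit(X, y)
        # scores = detector.predict_proba(X_new)[:, 1]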

    Classification and Clustering of Shared Images on Social Networks and User Profile Linking

    The ever-increasing prevalence of smartphones and the popularity of social network platforms have facilitated instant sharing of multimedia content through social networks. However, the ease of taking and sharing photos and videos through social networks also allows privacy-intrusive and illegal content to be widely distributed. As such, images captured and shared by users on their profiles are considered significant digital evidence for social network data analysis. The Sensor Pattern Noise (SPN) caused by camera sensor imperfections introduced during the manufacturing process mainly consists of the Photo-Response Non-Uniformity (PRNU) noise, which can be extracted from taken images without hacking the device. It has been proven to be an effective and robust device fingerprint that can be used for important digital image forensic tasks, such as image forgery detection, source device identification and device linking. In particular, by fingerprinting the camera sources that captured a set of shared images on social networks, User Profile Linking (UPL) can be performed across social network platforms. The aim of this thesis is to present effective and robust methods and algorithms for SPN-based analysis of shared images. We propose clustering- and classification-based methods to achieve the Smartphone Identification (SI) and UPL tasks, given a set of images captured by a known number of smartphones and shared on a set of known user profiles. An important outcome of the proposed methods is UPL across different social networks, where the clustered images from one social network are used to fingerprint the related smartphones and link user profiles on the other social network. We also propose two methods for large-scale clustering of the different types of images shared by users, without prior knowledge of the types and number of smartphones.
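
    The fingerprint-matching step underlying SI and UPL can be sketched as follows (an illustrative simplification, not the thesis pipeline): average the noise residuals of a profile's images into a PRNU-like fingerprint and correlate fingerprints across networks. The median-filter denoiser, the threshold-free score and the variable names are assumptions.

        import numpy as np
        from scipy.ndimage import median_filter

        def residual(img, size=3):
            # Crude denoiser standing in for the usual wavelet-based filter.
            img = img.astype(np.float64)
            return img - median_filter(img, size=size)

        def profile_fingerprint(images):
            # Average the residuals of all images attributed to one profile.
            K = np.mean([residual(im) for im in images], axis=0)
            K -= K.mean()
            return K / (np.linalg.norm(K) + 1e-12)

        def link_score(fp_a, fp_b):
            # Normalised correlation; a high value suggests the same smartphone.
            return float(np.sum(fp_a * fp_b))

        # Hypothetical usage with image sets from two social networks:
        # score = link_score(profile_fingerprint(imgs_net1), profile_fingerprint(imgs_net2))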

    Two and three dimensional segmentation of multimodal imagery

    The role of segmentation in image understanding/analysis, computer vision, pattern recognition, remote sensing and medical imaging has grown significantly in recent years owing to rapid advances in the acquisition of image data. This low-level analysis step is critical to numerous applications, with the primary goal of expediting and improving the effectiveness of subsequent high-level operations by providing a condensed and pertinent representation of image information. In this research, we propose a novel unsupervised segmentation framework for meaningful partitioning of 2-D/3-D image data across multiple modalities (color, remote sensing and biomedical imaging) into non-overlapping regions using several spatial-spectral attributes. Initially, our framework exploits the information obtained from detecting edges inherent in the data: using a vector gradient detection technique, edge-free pixels are grouped and individually labeled to partition an initial portion of the input image. Pixels with higher gradient densities are incorporated through the dynamic generation of segments as the algorithm progresses, yielding an initial region map. Subsequently, texture modeling is performed, and the obtained gradient, texture and intensity information, together with the initial partition map, are used in a multivariate refinement procedure that fuses groups with similar characteristics to produce the final segmentation. Experimental results, compared against published state-of-the-art segmentation techniques for color as well as multi/hyperspectral imagery, demonstrate the advantages of the proposed method. Furthermore, to improve computational efficiency, we propose an extension of this methodology in a multi-resolution framework, demonstrated on color images. Finally, this research also encompasses a 3-D extension of the algorithm, demonstrated on medical (Magnetic Resonance Imaging / Computed Tomography) volumes.
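
    The edge-driven initialization described above can be sketched roughly as follows (illustrative only, not the authors' implementation): a multichannel "vector" gradient magnitude is computed, low-gradient pixels are kept as edge-free seeds, and their connected components become the initial regions. The Sobel operator and the percentile threshold are placeholder choices.

        import numpy as np
        from scipy.ndimage import sobel, label

        def vector_gradient(img):
            # img: (H, W, C) float array; per-pixel gradient energy summed over channels.
            g = np.zeros(img.shape[:2])
            for c in range(img.shape[2]):
                gx = sobel(img[..., c], axis=1)
                gy = sobel(img[..., c], axis=0)
                g += gx**2 + gy**2
            return np.sqrt(g)

        def initial_regions(img, grad_percentile=60):
            grad = vector_gradient(img.astype(np.float64))
            flat = grad < np.percentile(grad, grad_percentile)  # "edge-free" pixels
            labels, n = label(flat)      # each connected component = one seed region
            return labels, n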

    Automatic region-of-interest extraction in low depth-of-field images

    PhD Thesis. Automatic extraction of focused regions from images with low depth-of-field (DOF) is a problem that still lacks an efficient solution. The capability of extracting focused regions can help to bridge the semantic gap by integrating image regions which are meaningfully relevant yet generally do not exhibit uniform visual characteristics. There are two main difficulties in extracting focused regions from low DOF images using high-frequency based techniques: computational complexity and performance. A novel unsupervised segmentation approach based on ensemble clustering is proposed to extract the focused regions from low DOF images in two stages. The first stage clusters image blocks in a joint contrast-energy feature space into three constituent groups. To achieve this, we use a normal mixture-based model along with the standard expectation-maximization (EM) algorithm at two consecutive levels of block size. To avoid the problem of local optima experienced by many such models, an ensemble EM clustering algorithm is proposed. As a result, relevant blocks, i.e., a block-based region-of-interest (ROI) closely conforming to image objects, are extracted. In stage two, two different approaches have been developed to extract the pixel-based ROI. In the first approach, a binary saliency map is constructed from the relevant blocks at the pixel level using difference-of-Gaussian (DOG) and binarization methods; a set of morphological operations is then employed to create the pixel-based ROI from the map. Experimental results demonstrate that this approach achieves an average segmentation performance of 91.3% and is computationally 3 times faster than the best existing approach. In the second approach, a minimal graph cut is constructed using the max-flow method with object/background seeds provided by the ensemble clustering algorithm. Experimental results demonstrate an average segmentation performance of 91.7% and approximately a 50% reduction in average computational time for the proposed colour-based approach compared with existing unsupervised approaches.
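
    A rough sketch of the ensemble EM idea (not the thesis algorithm): EM-based Gaussian mixture clustering is run several times on the block features, a co-association matrix records how often two blocks land in the same cluster, and a consensus three-way partition is obtained by cutting a hierarchical clustering of that matrix. The variable block_features, the number of runs and the average-linkage choice are illustrative assumptions.

        import numpy as np
        from sklearn.mixture import GaussianMixture
        from scipy.cluster.hierarchy import linkage, fcluster
        from scipy.spatial.distance import squareform

        def ensemble_em(block_features, n_components=3, n_runs=10, seed=0):
            # block_features: (n_blocks, n_features) array, e.g. contrast and energy.
            rng = np.random.RandomState(seed)
            n = block_features.shape[0]
            coassoc = np.zeros((n, n))
            for _ in range(n_runs):
                gm = GaussianMixture(n_components=n_components,
                                     random_state=rng.randint(1 << 30))
                labels = gm.fit_predict(block_features)
                coassoc += (labels[:, None] == labels[None, :])
            dist = 1.0 - coassoc / n_runs   # blocks that co-cluster often are "close"
            np.fill_diagonal(dist, 0.0)
            Z = linkage(squareform(dist, checks=False), method="average")
            return fcluster(Z, t=n_components, criterion="maxclust")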

    The SDSS-III Baryon Oscillation Spectroscopic Survey: Quasar Target Selection for Data Release Nine

    The SDSS-III Baryon Oscillation Spectroscopic Survey (BOSS), a five-year spectroscopic survey of 10,000 deg^2, achieved first light in late 2009. One of the key goals of BOSS is to measure the signature of baryon acoustic oscillations (BAO) in the distribution of Ly-alpha absorption from the spectra of a sample of ~150,000 z > 2.2 quasars. Along with measuring the angular diameter distance at z ≈ 2.5, BOSS will provide the first direct measurement of the expansion rate of the Universe at z > 2. One of the biggest challenges in achieving this goal is an efficient target selection algorithm for quasars over 2.2 < z < 3.5, where their colors overlap those of stars. During the first year of the BOSS survey, quasar target selection methods were developed and tested to meet the requirement of delivering at least 15 quasars deg^-2 in this redshift range, out of 40 targets deg^-2. To achieve these surface densities, the magnitude limit of the quasar targets was set at g <= 22.0 or r <= 21.85. While detection of the BAO signature in the Ly-alpha absorption in quasar spectra does not require a uniform target selection, many other astrophysical studies do. We therefore defined a uniformly-selected subsample of 20 targets deg^-2, for which the selection efficiency is just over 50%. This "CORE" subsample will be fixed for Years Two through Five of the survey. In this paper we describe the evolution and implementation of the BOSS quasar target selection algorithms during the first two years of BOSS operations. We analyze the spectra obtained during the first year; 11,263 new z > 2.2 quasars were spectroscopically confirmed by BOSS. Our current algorithms select an average of 15 z > 2.2 quasars deg^-2 from 40 targets deg^-2 using single-epoch SDSS imaging. Multi-epoch optical data and data at other wavelengths can further improve the efficiency and completeness of BOSS quasar target selection. [Abridged] Comment: 33 pages, 26 figures, 12 tables and a whole bunch of quasars. Submitted to Ap
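
    Only the magnitude limit quoted above lends itself to a one-line cut; the sketch below implements just that limit and deliberately ignores the color- and probability-based selection methods that do the real work of the BOSS algorithms. The catalog column names in the usage comment are hypothetical.

        import numpy as np

        def passes_magnitude_limit(g_mag, r_mag):
            # Boolean mask of objects bright enough to be considered as targets
            # under the quoted limit g <= 22.0 or r <= 21.85.
            g_mag = np.asarray(g_mag, dtype=float)
            r_mag = np.asarray(r_mag, dtype=float)
            return (g_mag <= 22.0) | (r_mag <= 21.85)

        # Hypothetical usage with arrays of magnitudes from a catalog:
        # candidates = catalog[passes_magnitude_limit(catalog["g"], catalog["r"])]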

    Source identification in image forensics

    Source identification is one of the most important tasks in digital image forensics. In fact, the ability to reliably associate an image with its acquisition device may be crucial both during investigations and before a court of law. For example, one may be interested in proving that a certain photo was taken by his/her camera in order to claim intellectual property. Conversely, law enforcement agencies may be interested in tracing back the origin of some images, because the images themselves violate the law (e.g., they do not respect privacy laws) or because they point to subjects involved in unlawful and dangerous activities (such as terrorism or child pornography). More generally, proving beyond reasonable doubt that a photo was taken by a given camera may be an important element for decisions in court. The key assumption of forensic source identification is that acquisition devices leave traces in the acquired content, and that instances of these traces are specific to the respective device (or class of devices). These traces constitute the so-called device fingerprint, a name that stems from the forensic value of human fingerprints. Motivated by the importance of source identification in the digital image forensics community and the need for reliable techniques based on device fingerprints, the work developed in this Ph.D. thesis addresses different levels of source identification, using both feature-based and PRNU-based approaches for model and device identification. It is also shown that counter-forensic methods can easily attack machine learning techniques for image forgery detection. For model identification, hand-crafted local features and deep learning features are analysed for the basic two-class classification problem, and comparisons under the limited-knowledge and blind scenarios are presented. Finally, an application of camera model identification to various iris sensor models is conducted. A blind-scenario technique that faces the problem of device source identification using the PRNU-based approach is also proposed. Using the correlation between single-image sensor noise residuals, a blind two-step source clustering method is proposed: in the first step, correlation clustering combined with an ensemble method is used to obtain an initial partition, which is then refined in the second step by means of a Bayesian approach. Experimental results show that this proposal outperforms state-of-the-art techniques and still gives acceptable performance when considering images downloaded from Facebook.
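
    The two-step blind clustering can be caricatured as follows (an illustrative sketch, not the thesis method): a greedy pass over the pairwise correlation matrix of sensor noise residuals yields an initial partition, and a refinement sweep reassigns each residual to the cluster whose aggregated fingerprint it matches best. The fixed threshold and the single centroid-based sweep are simplifying stand-ins for the correlation-clustering and Bayesian refinement steps.

        import numpy as np

        def greedy_partition(corr, thresh=0.02):
            # corr: (n, n) matrix of pairwise residual correlations.
            n = corr.shape[0]
            labels = -np.ones(n, dtype=int)
            next_label = 0
            for i in range(n):
                if labels[i] >= 0:
                    continue
                labels[i] = next_label
                # Pull in every still-unlabelled image strongly correlated with image i.
                labels[(labels < 0) & (corr[i] > thresh)] = next_label
                next_label += 1
            return labels

        def refine(residuals, labels):
            # residuals: (n, d) unit-norm rows; one reassignment sweep over cluster centroids.
            centroids = {c: residuals[labels == c].mean(axis=0) for c in np.unique(labels)}
            for c, k in centroids.items():
                centroids[c] = k / (np.linalg.norm(k) + 1e-12)
            keys = list(centroids)
            sims = residuals @ np.stack([centroids[c] for c in keys]).T
            return np.array(keys)[sims.argmax(axis=1)]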