148 research outputs found

    Towards visualization and searching: a dual-purpose video coding approach

    In modern video applications, the role of the decoded video is much more than filling a screen for visualization. To offer powerful video-enabled applications, it is increasingly critical not only to visualize the decoded video but also to provide efficient searching capabilities for similar content. Video surveillance and personal communication applications are critical examples of these dual visualization and searching requirements. However, current video coding solutions are strongly biased towards visualization needs. In this context, the goal of this work is to propose a dual-purpose video coding solution targeting both visualization and searching needs by adopting a hybrid coding framework in which the usual pixel-based coding approach is combined with a novel feature-based coding approach. In this dual-purpose solution, some frames are coded using a set of keypoint matches, which not only allow decoding for visualization but also provide the decoder with valuable feature-related information, extracted at the encoder from the original frames, that is instrumental for efficient searching. The proposed solution is based on a flexible joint Lagrangian optimization framework in which pixel-based and feature-based processing are combined to find the most appropriate trade-off between visualization and searching performance. Extensive experimental results for the assessment of the proposed solution under meaningful test conditions are presented. The results show the flexibility of the proposed coding solution to achieve different optimization trade-offs, notably competitive performance relative to the state-of-the-art HEVC standard in terms of both visualization and searching performance.
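The abstract above does not give the cost function, so the following is only a minimal sketch of how a joint Lagrangian mode decision of this kind could look. The cost form `D_vis + mu * D_search + lam * R`, the `Candidate` fields, the mode labels, and all numbers are assumptions made for illustration, not the authors' formulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One hypothetical way of coding a frame (all fields are illustrative)."""
    mode: str                 # assumed label, e.g. "pixel" or "feature"
    rate_bits: float          # bits spent by this candidate
    visual_distortion: float  # lower is better for visualization
    search_distortion: float  # lower is better for searching


def joint_cost(c: Candidate, lam: float, mu: float) -> float:
    # Assumed joint Lagrangian cost: visual distortion, plus a weighted
    # search distortion, plus a rate penalty.
    return c.visual_distortion + mu * c.search_distortion + lam * c.rate_bits


def choose_mode(candidates, lam: float, mu: float) -> Candidate:
    # Pick the candidate with the smallest joint cost.
    return min(candidates, key=lambda c: joint_cost(c, lam, mu))


# Made-up numbers, only to show how the weights shift the decision.
candidates = [
    Candidate("pixel", rate_bits=1000.0, visual_distortion=10.0, search_distortion=8.0),
    Candidate("feature", rate_bits=1200.0, visual_distortion=14.0, search_distortion=2.0),
]

visual_only = choose_mode(candidates, lam=0.01, mu=0.0)
search_aware = choose_mode(candidates, lam=0.01, mu=2.0)
print(visual_only.mode, search_aware.mode)  # pixel feature
```

With `mu = 0` the search term is ignored and the cheaper pixel-coded candidate wins; raising `mu` shifts the choice towards the candidate that preserves feature information, which is the kind of trade-off the abstract describes.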

    Irish Machine Vision and Image Processing Conference Proceedings 2017

    Copy-move forgery detection using combined features and transitive matching

    Recently, research on the Internet of Things (IoT) and Multimedia Big Data (MBD) has been growing tremendously. Both IoT and MBD involve large amounts of multimedia data, which can be tampered with easily. Research on multimedia forensics is therefore necessary, and copy-move forgery detection is an important branch of it. In this paper, a novel copy-move forgery detection scheme using combined features and transitive matching is proposed. First, SIFT and LIOP are extracted as combined features from the input image. Second, transitive matching is used to improve the matching relationship. Third, a filtering approach using image segmentation is proposed to filter out false matches. Fourth, affine transformations are estimated between these image patches. Finally, duplicated regions are located based on those affine transformations. The experimental results demonstrate that the proposed scheme can achieve much better detection results on the public database under various attacks.
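The abstract does not define transitive matching in detail. One plausible reading is that if keypoint A matches B and B matches C, then A and C are also treated as matched. The sketch below implements that reading with a union-find structure; the function names and the integer keypoint ids are hypothetical, and it does not reproduce the paper's SIFT/LIOP extraction or its filtering step.

```python
def transitive_groups(matches):
    """Group keypoint ids that are connected through any chain of matches."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in matches:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return [sorted(g) for g in groups.values() if len(g) > 1]


def closed_matches(matches):
    """Return every pair implied by transitivity, as (smaller_id, larger_id)."""
    pairs = set()
    for group in transitive_groups(matches):
        for i in range(len(group)):
            for j in range(i + 1, len(group)):
                pairs.add((group[i], group[j]))
    return pairs


# Keypoints 0-1 and 1-2 were matched directly; 0-2 is added by transitivity.
print(sorted(closed_matches([(0, 1), (1, 2), (5, 6)])))
# [(0, 1), (0, 2), (1, 2), (5, 6)]
```

A full closure like this can also propagate a single false match across a whole group, which may be one reason the abstract follows the matching step with a segmentation-based filter.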

    Comparative Analysis of Techniques Used to Detect Copy-Move Tampering for Real-World Electronic Images

    The evolution of computationally powerful computers, the easy availability of innovative editing software packages, and high-definition image capture tools have made image forgeries effortless to produce. Threats to security and the misinterpretation of digital images and scenes have been observed for a long time, and a great deal of research has gone into developing diverse techniques to authenticate digital images. Research in this area is not limited to checking the validity of digital photos; it also explores the specific signs of distortion or forgery. Such analysis requires neither prior information about the intrinsic content of the image nor the prior embedding of watermarks. This paper discusses recent progress in digital image tampering identification and presents a benchmarking study with qualitative and quantitative results. Different applications of forgery detection, covering a variety of methodologies and concepts, are discussed together with their outcomes, especially those using machine learning and deep learning methods to build efficient automated forgery detection systems. Future applications and the development of advanced soft-computing techniques for detecting digital image forgery are also discussed.

    Multi-Directional Multi-Level Dual-Cross Patterns for Robust Face Recognition

    To perform unconstrained face recognition robust to variations in illumination, pose and expression, this paper presents a new scheme to extract 'Multi-Directional Multi-Level Dual-Cross Patterns' (MDML-DCPs) from face images. Specifically, the MDML-DCPs scheme exploits the first derivative of Gaussian operator to reduce the impact of differences in illumination and then computes the DCP feature at both the holistic and component levels. DCP is a novel face image descriptor inspired by the unique textural structure of human faces. It is computationally efficient and only doubles the cost of computing local binary patterns, yet is extremely robust to pose and expression variations. MDML-DCPs comprehensively yet efficiently encodes the invariant characteristics of a face image from multiple levels into patterns that are highly discriminative of inter-personal differences but robust to intra-personal variations. Experimental results on the FERET, CAS-PEAL-R1, FRGC 2.0, and LFW databases indicate that DCP outperforms the state-of-the-art local descriptors (e.g., LBP, LTP, LPQ, POEM, tLBP, and LGXP) for both face identification and face verification tasks. More impressively, the best performance is achieved on the challenging LFW and FRGC 2.0 databases by deploying MDML-DCPs in a simple recognition scheme.
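For context on the cost comparison in the abstract above, here is a minimal sketch of the baseline it refers to: the basic 3x3 local binary pattern (LBP). This is plain LBP, not DCP or MDML-DCPs. The neighbour ordering, the `>=` threshold convention, and the list-of-lists image format are assumptions chosen for illustration.

```python
# Clockwise 8-neighbourhood starting at the top-left pixel (an assumed ordering).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """8-bit LBP code of the pixel at (r, c); image is a list of rows of ints."""
    center = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        # Set the bit when the neighbour is at least as bright as the centre.
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


patch = [
    [9, 9, 9],
    [1, 5, 1],
    [1, 1, 1],
]
print(lbp_code(patch, 1, 1))  # 7: only the three top neighbours are >= 5
```

Each pixel costs eight comparisons here, which gives a rough sense of the per-pixel budget the abstract uses as its point of comparison.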

    Contributions to the content-based image retrieval using pictorial queries

    Resource description: 2 November 2010. Mass access to digital cameras, personal computers, and the Internet has led to the creation of large volumes of data in digital form. In this context, tools designed to organize information and make it easier to search are becoming increasingly relevant. Images are a particular kind of data that requires specific description and indexing techniques. The area of computer vision that studies these techniques is called Content-Based Image Retrieval (CBIR). CBIR systems do not use text-based descriptions; they rely on features extracted from the images themselves. In contrast to the more than 6000 languages spoken in the world, descriptions based on visual features are a universal means of expression. Intense research on CBIR systems has been applied in very diverse fields of knowledge. CBIR applications have thus been developed for medicine, intellectual property protection, journalism, graphic design, Internet information search, cultural heritage preservation, and so on. One important aspect of a CBIR application is the design of the user's functions. The user is responsible for formulating the queries from which the image search is carried out. We have focused on systems in which the query is formulated from a pictorial representation. We have proposed a taxonomy of query systems composed of four different paradigms: Query-by-Selection, Query-by-Iconic-Composition, Query-by-Sketch, and Query-by-Illustration. Each paradigm incorporates a different level of expressive potential for the user. From the simple selection of an image to the creation of a color illustration, it is the user who takes control of the system's input data. Throughout the chapters of this thesis we have analyzed the influence that each query paradigm exerts on the internal processes of a CBIR system. We have also proposed a set of contributions, which we illustrate from a practical point of view through a final application.