59,627 research outputs found

    An Efficient and Cost Effective FPGA Based Implementation of the Viola-Jones Face Detection Algorithm

    Get PDF
    We present an field programmable gate arrays (FPGA) based implementation of the popular Viola-Jones face detection algorithm, which is an essential building block in many applications such as video surveillance and tracking. Our implementation is a complete system level hardware design described in a hardware description language and validated on the affordable DE2-115 evaluation board. Our primary objective is to study the achievable performance with a low-end FPGA chip based implementation. In addition, we release to the public domain the entire project. We hope that this will enable other researchers to easily replicate and compare their results to ours and that it will encourage and facilitate further research and educational ideas in the areas of image processing, computer vision, and advanced digital design and FPGA prototyping

    An examination of automatic video retrieval technology on access to the contents of an historical video archive

    Get PDF
    Purpose – This paper aims to provide an initial understanding of the constraints that historical video collections pose to video retrieval technology and the potential that online access offers to both archive and users. Design/methodology/approach – A small and unique collection of videos on customs and folklore was used as a case study. Multiple methods were employed to investigate the effectiveness of technology and the modality of user access. Automatic keyframe extraction was tested on the visual content while the audio stream was used for automatic classification of speech and music clips. The user access (search vs browse) was assessed in a controlled user evaluation. A focus group and a survey provided insight on the actual use of the analogue archive. The results of these multiple studies were then compared and integrated (triangulation). Findings – The amateur material challenged automatic techniques for video and audio indexing, thus suggesting that the technology must be tested against the material before deciding on a digitisation strategy. Two user interaction modalities, browsing vs searching, were tested in a user evaluation. Results show users preferred searching, but browsing becomes essential when the search engine fails in matching query and indexed words. Browsing was also valued for serendipitous discovery; however the organisation of the archive was judged cryptic and therefore of limited use. This indicates that the categorisation of an online archive should be thought of in terms of users who might not understand the current classification. The focus group and the survey showed clearly the advantage of online access even when the quality of the video surrogate is poor. The evidence gathered suggests that the creation of a digital version of a video archive requires a rethinking of the collection in terms of the new medium: a new archive should be specially designed to exploit the potential that the digital medium offers. Similarly, users' needs have to be considered before designing the digital library interface, as needs are likely to be different from those imagined. Originality/value – This paper is the first attempt to understand the advantages offered and limitations held by video retrieval technology for small video archives like those often found in special collections

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Get PDF
    Based on the information provided by European projects and national initiatives related to multimedia search as well as domains experts that participated in the CHORUS Think-thanks and workshops, this document reports on the state of the art related to multimedia content search from, a technical, and socio-economic perspective. The technical perspective includes an up to date view on content based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark inititiatives to measure the performance of multimedia search engines. From a socio-economic perspective we inventorize the impact and legal consequences of these technical advances and point out future directions of research

    Boosting Image Forgery Detection using Resampling Features and Copy-move analysis

    Full text link
    Realistic image forgeries involve a combination of splicing, resampling, cloning, region removal and other methods. While resampling detection algorithms are effective in detecting splicing and resampling, copy-move detection algorithms excel in detecting cloning and region removal. In this paper, we combine these complementary approaches in a way that boosts the overall accuracy of image manipulation detection. We use the copy-move detection method as a pre-filtering step and pass those images that are classified as untampered to a deep learning based resampling detection framework. Experimental results on various datasets including the 2017 NIST Nimble Challenge Evaluation dataset comprising nearly 10,000 pristine and tampered images shows that there is a consistent increase of 8%-10% in detection rates, when copy-move algorithm is combined with different resampling detection algorithms

    A framework for automatic semantic video annotation

    Get PDF
    The rapidly increasing quantity of publicly available videos has driven research into developing automatic tools for indexing, rating, searching and retrieval. Textual semantic representations, such as tagging, labelling and annotation, are often important factors in the process of indexing any video, because of their user-friendly way of representing the semantics appropriate for search and retrieval. Ideally, this annotation should be inspired by the human cognitive way of perceiving and of describing videos. The difference between the low-level visual contents and the corresponding human perception is referred to as the ‘semantic gap’. Tackling this gap is even harder in the case of unconstrained videos, mainly due to the lack of any previous information about the analyzed video on the one hand, and the huge amount of generic knowledge required on the other. This paper introduces a framework for the Automatic Semantic Annotation of unconstrained videos. The proposed framework utilizes two non-domain-specific layers: low-level visual similarity matching, and an annotation analysis that employs commonsense knowledgebases. Commonsense ontology is created by incorporating multiple-structured semantic relationships. Experiments and black-box tests are carried out on standard video databases for action recognition and video information retrieval. White-box tests examine the performance of the individual intermediate layers of the framework, and the evaluation of the results and the statistical analysis show that integrating visual similarity matching with commonsense semantic relationships provides an effective approach to automated video annotation
    • 

    corecore