3,110 research outputs found

    Pandora: Description of a Painting Database for Art Movement Recognition with Baselines and Perspectives

    To facilitate computer analysis of visual art, in the form of paintings, we introduce the Pandora (Paintings Dataset for Recognizing the Art movement) database, a collection of digitized paintings labelled with respect to artistic movement. Noting that few databases are available as benchmarks for evaluation, and that most existing ones are limited in variability and number of images, we propose a novel large-scale dataset of digital paintings. The database consists of more than 7700 images from 12 art movements. Each genre is illustrated by a number of images varying from 250 to nearly 1000. We investigate how well local and global features and classification systems are able to recognize the art movement. Our experimental results suggest that accurate recognition is achievable by combining several categories of features.
    Comment: 11 pages, 1 figure, 6 tables

    Video genre categorization and representation using audio-visual information

    We propose an audio-visual approach to video genre classification using content descriptors that exploit audio, color, temporal, and contour information. Audio information is extracted at block-level, which has the advantage of capturing local temporal information. At the temporal structure level, we consider action content in relation to human perception. Color perception is quantified using statistics of color distribution, elementary hues, color properties, and relationships between colors. Further, we compute statistics of contour geometry and relationships. The main contribution of our work lies in harnessing the descriptive power of the combination of these descriptors in genre classification. Validation was carried out on over 91 h of video footage encompassing 7 common video genres, yielding average precision and recall ratios of 87% to 100% and 77% to 100%, respectively, and an overall average correct classification of up to 97%. Also, experimental comparison as part of the MediaEval 2011 benchmarking campaign demonstrated the efficiency of the proposed audio-visual descriptors over other existing approaches. Finally, we discuss a 3-D video browsing platform that displays movies using feature-based coordinates and thus regroups them according to genre.

    Closing the loop: assisting archival appraisal and information retrieval in one sweep

    In this article, we examine the similarities between the concept of appraisal, a process that takes place within archives, and the concept of relevance judgement, a process fundamental to the evaluation of information retrieval systems. More specifically, we revisit selection criteria proposed as a result of archival research and work within the digital curation communities, and compare them to relevance criteria as discussed within information retrieval's literature-based discovery. We illustrate how closely these criteria relate to each other and discuss how understanding the relationships between these disciplines could form a basis for proposing automated selection for archival processes and initiating multi-objective learning with respect to information retrieval.

    An audio-visual approach to web video categorization

    In this paper we address the issue of automatic video genre categorization of web media using an audio-visual approach. To this end, we propose content descriptors which exploit audio, temporal structure and color information. The potential of our descriptors is experimentally validated both from the perspective of a classification system and as an information retrieval approach. Validation is carried out on a real scenario, namely on more than 288 hours of video footage and 26 video genres specific to the blip.tv media platform. Additionally, to reduce the semantic gap, we propose a new relevance feedback technique which is based on hierarchical clustering. Experimental tests show that retrieval performance can be significantly increased in this case, becoming comparable to that obtained with high-level semantic textual descriptors.

    Content-Based Video Description for Automatic Video Genre Categorization

    In this paper, we propose an audio-visual approach to video genre categorization. Audio information is extracted at block-level, which has the advantage of capturing local temporal information. At the temporal structure level, we assess action content with respect to human perception. Further, color perception is quantified with statistics of color distribution, elementary hues, color properties and relationships between colors. The last category of descriptors determines statistics of contour geometry. An extensive evaluation of this multi-modal approach based on more than 91 hours of video footage is presented. We obtain average precision and recall ratios within [87% − 100%] and [77% − 100%], respectively, while average correct classification is up to 97%. Additionally, movies displayed according to feature-based coordinates in a virtual 3D browsing environment tend to regroup with respect to genre, which has potential applications in real content-based browsing systems.

    An in-depth evaluation of multimodal video genre categorization

    In this paper we propose an in-depth evaluation of the performance of video descriptors for multimodal video genre categorization. We discuss the perspective of designing appropriate late fusion techniques that would make it possible to attain very high categorization accuracy, close to that achieved with user-based text information. Evaluation is carried out in the context of the 2012 Video Genre Tagging Task of the MediaEval Benchmarking Initiative for Multimedia Evaluation, using a data set of up to 15,000 videos (3,200 hours of footage) and 26 video genre categories specific to web media. Results show that the proposed approach significantly improves genre categorization performance, outperforming other existing approaches. The main contribution of this paper is in the experimental part: several valuable findings are reported that motivate further research on video genre classification.

    Multimodal music information processing and retrieval: survey and future challenges

    To improve performance in various music information processing tasks, recent studies exploit different modalities able to capture diverse aspects of music. Such modalities include audio recordings, symbolic music scores, mid-level representations, motion and gestural data, video recordings, editorial or cultural tags, lyrics and album cover art. This paper critically reviews the various approaches adopted in Music Information Processing and Retrieval and highlights how multimodal algorithms can help Music Computing applications. First, we categorize the related literature based on the application it addresses. Subsequently, we analyze existing information fusion approaches, and we conclude with the set of challenges that the Music Information Retrieval and Sound and Music Computing research communities should focus on in the coming years.