278 research outputs found

    Smartphone picture organization: a hierarchical approach

    Get PDF
    We live in a society where the large majority of the population has a camera-equipped smartphone. In addition, hard drives and cloud storage are getting cheaper and cheaper, leading to a tremendous growth in stored personal photos. Unlike photo collections captured by a digital camera, which typically are pre-processed by the user who organizes them into event-related folders, smartphone pictures are automatically stored in the cloud. As a consequence, photo collections captured by a smartphone are highly unstructured and because smartphones are ubiquitous, they present a larger variability compared to pictures captured by a digital camera. To solve the need of organizing large smartphone photo collections automatically, we propose here a new methodology for hierarchical photo organization into topics and topic-related categories. Our approach successfully estimates latent topics in the pictures by applying probabilistic Latent Semantic Analysis, and automatically assigns a name to each topic by relying on a lexical database. Topic-related categories are then estimated by using a set of topic-specific Convolutional Neuronal Networks. To validate our approach, we ensemble and make public a large dataset of more than 8,000 smartphone pictures from 40 persons. Experimental results demonstrate major user satisfaction with respect to state of the art solutions in terms of organization.Peer ReviewedPreprin

    Measuring concept similarities in multimedia ontologies: analysis and evaluations

    Get PDF
    The recent development of large-scale multimedia concept ontologies has provided a new momentum for research in the semantic analysis of multimedia repositories. Different methods for generic concept detection have been extensively studied, but the question of how to exploit the structure of a multimedia ontology and existing inter-concept relations has not received similar attention. In this paper, we present a clustering-based method for modeling semantic concepts on low-level feature spaces and study the evaluation of the quality of such models with entropy-based methods. We cover a variety of methods for assessing the similarity of different concepts in a multimedia ontology. We study three ontologies and apply the proposed techniques in experiments involving the visual and semantic similarities, manual annotation of video, and concept detection. The results show that modeling inter-concept relations can provide a promising resource for many different application areas in semantic multimedia processing

    Hybrid image representation methods for automatic image annotation: a survey

    Get PDF
    In most automatic image annotation systems, images are represented with low level features using either global methods or local methods. In global methods, the entire image is used as a unit. Local methods divide images into blocks where fixed-size sub-image blocks are adopted as sub-units; or into regions by using segmented regions as sub-units in images. In contrast to typical automatic image annotation methods that use either global or local features exclusively, several recent methods have considered incorporating the two kinds of information, and believe that the combination of the two levels of features is beneficial in annotating images. In this paper, we provide a survey on automatic image annotation techniques according to one aspect: feature extraction, and, in order to complement existing surveys in literature, we focus on the emerging image annotation methods: hybrid methods that combine both global and local features for image representation

    A Review on Video Search Engine Ranking

    Get PDF
    Search reranking is considered as a best and basic approach to enhance recovery accuracy. The recordings are recovered utilizing the related literary data, for example, encompassing content from the website page. The execution of such frameworks basically depends on the importance between the content and the recordings. In any case, they may not generally coordinate all around ok, which causes boisterous positioning results. For example, outwardly comparative recordings may have altogether different positions. So reranking has been proposed to tackle the issue. Video reranking, as a compelling approach to enhance the consequences of electronic video look however the issue is not paltry particularly when we are thinking about different elements or modalities for pursuit in video and video recovery. This paper proposes another sort of reranking calculation, the round reranking, that backings the common trade of data over numerous modalities for enhancing seek execution and takes after the rationality of solid performing methodology could gain from weaker ones

    On image auto-annotation with latent space models

    Get PDF

    Multimodal Probabilistic Latent Semantic Analysis for Sentinel-1 and Sentinel-2 Image Fusion

    Get PDF
    Probabilistic topic models have recently shown a great potential in the remote sensing image fusion field, which is particularly helpful in land-cover categorization tasks. This letter first studies the application of probabilistic latent semantic analysis (pLSA) and latent Dirichlet allocation to remote sensing synthetic aperture radar (SAR) and multispectral imaging (MSI) unsupervised land-cover categorization. Then, a novel pLSA-based image fusion approach is presented, which pursues to uncover multimodal feature patterns from SAR and MSI data in order to effectively fuse and categorize Sentinel-1 and Sentinel-2 remotely sensed data. Experiments conducted over two different data sets reveal the advantages of the proposed approach for unsupervised land-cover categorization tasks

    Semantic multimedia analysis using knowledge and context

    Get PDF
    PhDThe difficulty of semantic multimedia analysis can be attributed to the extended diversity in form and appearance exhibited by the majority of semantic concepts and the difficulty to express them using a finite number of patterns. In meeting this challenge there has been a scientific debate on whether the problem should be addressed from the perspective of using overwhelming amounts of training data to capture all possible instantiations of a concept, or from the perspective of using explicit knowledge about the concepts’ relations to infer their presence. In this thesis we address three problems of pattern recognition and propose solutions that combine the knowledge extracted implicitly from training data with the knowledge provided explicitly in structured form. First, we propose a BNs modeling approach that defines a conceptual space where both domain related evi- dence and evidence derived from content analysis can be jointly considered to support or disprove a hypothesis. The use of this space leads to sig- nificant gains in performance compared to analysis methods that can not handle combined knowledge. Then, we present an unsupervised method that exploits the collective nature of social media to automatically obtain large amounts of annotated image regions. By proving that the quality of the obtained samples can be almost as good as manually annotated images when working with large datasets, we significantly contribute towards scal- able object detection. Finally, we introduce a method that treats images, visual features and tags as the three observable variables of an aspect model and extracts a set of latent topics that incorporates the semantics of both visual and tag information space. By showing that the cross-modal depen- dencies of tagged images can be exploited to increase the semantic capacity of the resulting space, we advocate the use of all existing information facets in the semantic analysis of social media
    • …
    corecore