145 research outputs found

    Editorial: Recent advances in image and video retrieval

    Get PDF

    Low-Light Image Enhancement Based on U-Net and Haar Wavelet Pooling

    Get PDF
    The inevitable environmental and technical limitations of image capturing has as a consequence that many images are frequently taken in inadequate and unbalanced lighting conditions. Low-light image enhancement has been very popular for improving the visual quality of image representations, while low-light images often require advanced techniques to improve the perception of information for a human viewer. One of the main objectives in increasing the lighting conditions is to retain patterns, texture, and style with minimal deviations from the considered image. To this direction, we propose a low-light image enhancement method with Haar wavelet-based pooling to preserve texture regions and increase their quality. The presented framework is based on the U-Net architecture to retain spatial information, with a multi-layer feature aggregation (MFA) method. The method obtains the details from the low-level layers in the stylization processing. The encoder is based on dense blocks, while the decoder is the reverse of the encoder, and extracts features that reconstruct the image. Experimental results show that the combination of the U-Net architecture with dense blocks and the wavelet-based pooling mechanism comprises an efficient approach in low-light image enhancement applications. Qualitative and quantitative evaluation demonstrates that the proposed framework reaches state-of-the-art accuracy but with less resources than LeGAN

    Predicting elections for multiple countries using Twitter and polls

    Get PDF
    The authors' work focuses on predicting the 2014 European Union elections in three different countries using Twitter and polls. Past works in this domain relying strictly on Twitter data have been proven ineffective. Others, using polls as their ground truth, have raised questions regarding the contribution of Twitter data for this task. Here, the authors treat this task as a multivariate time-series forecast, extracting Twitter- and poll-based features and training different predictive algorithms. They've achieved better results than several past works and the commercial baseline

    Advanced content-based semantic scene analysis and information retrieval: the SCHEMA project

    Get PDF
    The aim of the SCHEMA Network of Excellence is to bring together a critical mass of universities, research centers, industrial partners and end users, in order to design a reference system for content-based semantic scene analysis, interpretation and understanding. Relevant research areas include: content-based multimedia analysis and automatic annotation of semantic multimedia content, combined textual and multimedia information retrieval, semantic -web, MPEG-7 and MPEG-21 standards, user interfaces and human factors. In this paper, recent advances in content-based analysis, indexing and retrieval of digital media within the SCHEMA Network are presented. These advances will be integrated in the SCHEMA module-based, expandable reference system

    Video semantic content analysis framework based on ontology combined MPEG-7

    Get PDF
    The rapid increase in the available amount of video data is creating a growing demand for efficient methods for understanding and managing it at the semantic level. New multimedia standard, MPEG-7, provides the rich functionalities to enable the generation of audiovisual descriptions and is expressed solely in XML Schema which provides little support for expressing semantic knowledge. In this paper, a video semantic content analysis framework based on ontology combined MPEG-7 is presented. Domain ontology is used to define high level semantic concepts and their relations in the context of the examined domain. MPEG-7 metadata terms of audiovisual descriptions and video content analysis algorithms are expressed in this ontology to enrich video semantic analysis. OWL is used for the ontology description. Rules in Description Logic are defined to describe how low-level features and algorithms for video analysis should be applied according to different perception content. Temporal Description Logic is used to describe the semantic events, and a reasoning algorithm is proposed for events detection. The proposed framework is demonstrated in sports video domain and shows promising results

    Geotagging Text Content With Language Models and Feature Mining

    Get PDF

    Cycle-Consistent Adversarial Networks and Fast Adaptive Bi-Dimensional Empirical Mode Decomposition for Style Transfer

    Get PDF
    Recently, research endeavors have shown the potentiality of Cycle-Consistent Adversarial Networks (CycleGAN) in style transfer. In Cycle-Consistent Adversarial Networks, the consistency loss is introduced to measure the difference between the original images and the reconstructed in both directions, forward and backward. In this work, the combination of Cycle-Consistent Adversarial Networks with Fast and Adaptive Bidimensional Empirical Mode Decomposition (FABEMD) is proposed to perform style transfer on images. In the proposed approach the cycle-consistency loss is modified to include the differences between the extracted Intrinsic Mode Functions (BIMFs) images. Instead of an estimation of pixel-to-pixel difference between the produced and input images, the FABEMD is applied and the extracted BIMFs are involved in the computation of the total cycle loss. This method enriches the computation of the total loss in a content-to-content and style-to-style comparison by connecting the spatial information to the frequency components. The experimental results reveal that the proposed method is efficient and produces qualitative results comparable to state-of-the-art methods

    Hierarchical representation and coding of surfaces using 3-D polygon meshes

    Full text link

    FIVR: Fine-Grained Incident Video Retrieval

    Get PDF
    This paper introduces the problem of Fine-grained Incident Video Retrieval (FIVR). Given a query video, the objective is to retrieve all associated videos, considering several types of associations that range from duplicate videos to videos from the same incident. FIVR offers a single framework that contains several retrieval tasks as special cases. To address the benchmarking needs of all such tasks, we construct and present a large-scale annotated video dataset, which we call FIVR-200K, and it comprises 225,960 videos. To create the dataset, we devise a process for the collection of YouTube videos based on major news events from recent years crawled from Wikipedia and deploy a retrieval pipeline for the automatic selection of query videos based on their estimated suitability as benchmarks. We also devise a protocol for the annotation of the dataset with respect to the four types of video associations defined by FIVR. Finally, we report the results of an experimental study on the dataset comparing five state-of-the-art methods developed based on a variety of visual descriptors, highlighting the challenges of the current problem

    SocialSensor: sensing user generated input for improved media discovery and experience

    Get PDF
    SocialSensor will develop a new framework for enabling real-time multimedia indexing and search in the Social Web. The project moves beyond conventional text-based indexing and retrieval models by mining and aggregating user inputs and content over multiple social networking sites. Social Indexing will incorporate information about the structure and activity of the users‟ social network directly into the multimedia analysis and search process. Furthermore, it will enhance the multimedia consumption experience by developing novel user-centric media visualization and browsing paradigms. For example, SocialSensor will analyse the dynamic and massive user contributions in order to extract unbiased trending topics and events and will use social connections for improved recommendations. To achieve its objectives, SocialSensor introduces the concept of Dynamic Social COntainers (DySCOs), a new layer of online multimedia content organisation with particular emphasis on the real-time, social and contextual nature of content and information consumption. Through the proposed DySCOs-centered media search, SocialSensor will integrate social content mining, search and intelligent presentation in a personalized, context and network-aware way, based on aggregation and indexing of both UGC and multimedia Web content
    • 

    corecore