Low-Light Image Enhancement Based on U-Net and Haar Wavelet Pooling
The inevitable environmental and technical limitations of image capturing mean that many images are taken under inadequate and unbalanced lighting conditions. Low-light image enhancement has become popular for improving the visual quality of such images, since they often require advanced techniques to improve the perception of information for a human viewer. One of the main objectives when correcting the lighting conditions is to retain the patterns, texture, and style of the original image with minimal deviation. In this direction, we propose a low-light image enhancement method with Haar wavelet-based pooling to preserve texture regions and increase their quality. The presented framework is based on the U-Net architecture to retain spatial information, combined with a multi-layer feature aggregation (MFA) method that obtains details from the low-level layers during the stylization processing. The encoder is based on dense blocks, while the decoder mirrors the encoder and extracts the features that reconstruct the image. Experimental results show that the combination of the U-Net architecture with dense blocks and the wavelet-based pooling mechanism constitutes an efficient approach to low-light image enhancement. Qualitative and quantitative evaluation demonstrates that the proposed framework reaches state-of-the-art accuracy with fewer resources than LeGAN.
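A minimal sketch of how a Haar wavelet pooling layer of the kind described above can be implemented, here as a fixed stride-2 depthwise convolution in PyTorch. This is an illustrative reconstruction, not the authors' code: the choice to return all four subbands and the filter normalisation are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HaarWaveletPool2d(nn.Module):
    """Downsample by a factor of 2 with the 2-D Haar transform.

    Returns all four subbands (LL, LH, HL, HH) so that, unlike max pooling,
    high-frequency texture information is kept alongside the low-pass
    approximation.
    """
    def __init__(self):
        super().__init__()
        ll = torch.tensor([[0.5, 0.5], [0.5, 0.5]])
        lh = torch.tensor([[0.5, 0.5], [-0.5, -0.5]])
        hl = torch.tensor([[0.5, -0.5], [0.5, -0.5]])
        hh = torch.tensor([[0.5, -0.5], [-0.5, 0.5]])
        # Shape (4, 1, 2, 2): one fixed 2x2 filter per subband.
        self.register_buffer("filters", torch.stack([ll, lh, hl, hh]).unsqueeze(1))

    def forward(self, x):
        b, c, h, w = x.shape
        # Apply each Haar filter to every channel independently (depthwise).
        weight = self.filters.repeat(c, 1, 1, 1)        # (4*C, 1, 2, 2)
        return F.conv2d(x, weight, stride=2, groups=c)  # (B, 4*C, H/2, W/2)

pool = HaarWaveletPool2d()
feats = torch.randn(1, 64, 128, 128)
print(pool(feats).shape)  # torch.Size([1, 256, 64, 64])
```

Because the Haar transform is invertible, the four subbands together retain the detail that a max- or average-pooling layer would discard, which is what makes texture preservation through the encoder plausible.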
Predicting elections for multiple countries using Twitter and polls
The authors' work focuses on predicting the 2014 European Union elections in three different countries using Twitter and polls. Past works in this domain relying strictly on Twitter data have proven ineffective, while others, using polls as their ground truth, have raised questions about the contribution of Twitter data to this task. Here, the authors treat the task as a multivariate time-series forecast, extracting Twitter- and poll-based features and training different predictive algorithms. They achieved better results than several past works and the commercial baseline.
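A minimal sketch of the multivariate time-series framing described above, assuming hypothetical weekly features (poll share, tweet volume, tweet sentiment) and a random-forest regressor; the paper's actual features, forecast horizon, and models may differ.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def make_lagged(X, y, n_lags=3):
    """Turn a multivariate series X of shape (T, F) into lagged samples."""
    Xs, ys = [], []
    for t in range(n_lags, len(X)):
        Xs.append(X[t - n_lags:t].ravel())  # last n_lags observations
        ys.append(y[t])                     # next value to predict
    return np.array(Xs), np.array(ys)

# Columns: [poll_share, tweet_volume, tweet_sentiment] per time step
# (random stand-ins for real weekly observations).
rng = np.random.default_rng(0)
series = rng.random((52, 3))
target = series[:, 0]                 # predict the poll share

X, y = make_lagged(series, target, n_lags=3)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[:-4], y[:-4])             # hold out the last four weeks
print(model.predict(X[-4:]))          # forecasts for the held-out weeks
```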
Advanced content-based semantic scene analysis and information retrieval: the SCHEMA project
The aim of the SCHEMA Network of Excellence is to bring together a critical mass of universities, research centers, industrial partners, and end users in order to design a reference system for content-based semantic scene analysis, interpretation, and understanding. Relevant research areas include content-based multimedia analysis and automatic annotation of semantic multimedia content, combined textual and multimedia information retrieval, the Semantic Web, the MPEG-7 and MPEG-21 standards, and user interfaces and human factors. In this paper, recent advances in content-based analysis, indexing, and retrieval of digital media within the SCHEMA Network are presented. These advances will be integrated in the SCHEMA module-based, expandable reference system.
Video semantic content analysis framework based on ontology combined MPEG-7
The rapid increase in the available amount of video data is creating a growing demand for efficient methods of understanding and managing it at the semantic level. The new multimedia standard, MPEG-7, provides rich functionalities for generating audiovisual descriptions, but it is expressed solely in XML Schema, which offers little support for expressing semantic knowledge. In this paper, a video semantic content analysis framework based on an ontology combined with MPEG-7 is presented. A domain ontology is used to define high-level semantic concepts and their relations in the context of the examined domain. MPEG-7 metadata terms of audiovisual descriptions and video content analysis algorithms are expressed in this ontology to enrich video semantic analysis, and OWL is used for the ontology description. Rules in Description Logic are defined to describe how low-level features and algorithms for video analysis should be applied according to different perceptual content. Temporal Description Logic is used to describe the semantic events, and a reasoning algorithm is proposed for event detection. The proposed framework is demonstrated in the sports video domain and shows promising results.
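A minimal sketch of the rule-driven idea, reduced to plain Python: rules map low-level shot features to semantic concepts, and a temporal pattern over consecutive concepts yields an event. The concept names, thresholds, and the "goal" pattern are invented for illustration; the framework itself expresses these as OWL ontologies and (Temporal) Description Logic rules.

```python
from dataclasses import dataclass

@dataclass
class Shot:
    dominant_color: str   # low-level feature from video analysis
    motion: float         # average motion activity in [0, 1]
    crowd_sound: bool     # audio cue

def classify(shot: Shot) -> str:
    """Rule layer: map low-level features to a semantic concept."""
    if shot.dominant_color == "green" and shot.motion > 0.6:
        return "Attack"
    if shot.crowd_sound and shot.motion < 0.2:
        return "Celebration"
    return "Play"

def detect_events(shots: list[Shot]) -> list[str]:
    """Temporal layer: an Attack immediately followed by a
    Celebration is reported as a Goal event."""
    concepts = [classify(s) for s in shots]
    return ["Goal" for a, b in zip(concepts, concepts[1:])
            if (a, b) == ("Attack", "Celebration")]

shots = [Shot("green", 0.8, False), Shot("green", 0.1, True)]
print(detect_events(shots))  # ['Goal']
```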
Cycle-Consistent Adversarial Networks and Fast Adaptive Bi-Dimensional Empirical Mode Decomposition for Style Transfer
Recent research has shown the potential of Cycle-Consistent Adversarial Networks (CycleGAN) in style transfer. In CycleGAN, a cycle-consistency loss is introduced to measure the difference between the original and the reconstructed images in both directions, forward and backward. In this work, the combination of CycleGAN with Fast and Adaptive Bidimensional Empirical Mode Decomposition (FABEMD) is proposed to perform style transfer on images. In the proposed approach, the cycle-consistency loss is modified to include the differences between the extracted Bidimensional Intrinsic Mode Functions (BIMFs). Instead of estimating a pixel-to-pixel difference between the produced and input images, FABEMD is applied and the extracted BIMFs are involved in the computation of the total cycle loss. This enriches the total loss with a content-to-content and style-to-style comparison, connecting the spatial information to the frequency components. The experimental results reveal that the proposed method is efficient and produces qualitative results comparable to state-of-the-art methods.
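A minimal sketch of a cycle-consistency loss computed over BIMFs rather than raw pixels. The decomposition below is a crude smoothing-and-differencing stand-in for FABEMD (a real implementation extracts BIMFs via envelope estimation with order-statistics filters), and the uniform per-mode weighting is an assumption, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def bimfs(image: torch.Tensor, n_modes: int = 4) -> list:
    """Stand-in for FABEMD: peel off progressively coarser detail layers
    by differencing increasingly smoothed versions of the image."""
    modes, residue = [], image
    for k in range(n_modes):
        size = 2 * k + 3                  # growing smoothing window
        smooth = F.avg_pool2d(residue, size, stride=1, padding=size // 2)
        modes.append(residue - smooth)    # detail layer ("BIMF")
        residue = smooth
    modes.append(residue)                 # final low-frequency residue
    return modes

def bimf_cycle_loss(real: torch.Tensor, reconstructed: torch.Tensor,
                    n_modes: int = 4) -> torch.Tensor:
    """L1 distance accumulated over corresponding BIMFs, so frequency
    components are compared mode by mode rather than pixel to pixel."""
    loss = real.new_zeros(())
    for m_r, m_c in zip(bimfs(real, n_modes), bimfs(reconstructed, n_modes)):
        loss = loss + torch.mean(torch.abs(m_r - m_c))
    return loss

x = torch.rand(1, 3, 64, 64)
x_rec = torch.rand(1, 3, 64, 64)
print(bimf_cycle_loss(x, x_rec))
```

In CycleGAN training this term would take the place of the usual pixel-wise cycle loss in both directions, i.e. between x and G_BA(G_AB(x)) and between y and G_AB(G_BA(y)).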
FIVR: Fine-Grained Incident Video Retrieval
This paper introduces the problem of Fine-grained Incident Video Retrieval (FIVR). Given a query video, the objective is to retrieve all associated videos, considering several types of associations that range from duplicate videos to videos from the same incident. FIVR offers a single framework that contains several retrieval tasks as special cases. To address the benchmarking needs of all such tasks, we construct and present a large-scale annotated video dataset, which we call FIVR-200K, comprising 225,960 videos. To create the dataset, we devise a process for the collection of YouTube videos based on major news events from recent years crawled from Wikipedia, and deploy a retrieval pipeline for the automatic selection of query videos based on their estimated suitability as benchmarks. We also devise a protocol for the annotation of the dataset with respect to the four types of video associations defined by FIVR. Finally, we report the results of an experimental study on the dataset comparing five state-of-the-art methods developed based on a variety of visual descriptors, highlighting the challenges of the problem.
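A minimal sketch of the descriptor-based retrieval setting that FIVR-200K benchmarks: each video is represented by a global descriptor, and database videos are ranked by cosine similarity to the query. The 512-dimensional random descriptors are stand-ins for what a real system would obtain by aggregating frame-level features.

```python
import numpy as np

rng = np.random.default_rng(0)
database = rng.normal(size=(1000, 512))   # one descriptor per video
query = rng.normal(size=512)

# L2-normalise so the dot product equals cosine similarity.
database /= np.linalg.norm(database, axis=1, keepdims=True)
query /= np.linalg.norm(query)

scores = database @ query                 # similarity to every video
ranking = np.argsort(-scores)             # most similar first
print(ranking[:10])                       # top-10 retrieved video ids
```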
SocialSensor: sensing user generated input for improved media discovery and experience
SocialSensor will develop a new framework for enabling real-time multimedia indexing and search in the Social Web. The project moves beyond conventional text-based indexing and retrieval models by mining and aggregating user inputs and content over multiple social networking sites. Social Indexing will incorporate information about the structure and activity of the users' social network directly into the multimedia analysis and search process. Furthermore, it will enhance the multimedia consumption experience by developing novel user-centric media visualization and browsing paradigms. For example, SocialSensor will analyse the dynamic and massive user contributions in order to extract unbiased trending topics and events and will use social connections for improved recommendations. To achieve its objectives, SocialSensor introduces the concept of Dynamic Social COntainers (DySCOs), a new layer of online multimedia content organisation with particular emphasis on the real-time, social and contextual nature of content and information consumption. Through the proposed DySCOs-centered media search, SocialSensor will integrate social content mining, search and intelligent presentation in a personalized, context- and network-aware way, based on aggregation and indexing of both UGC and multimedia Web content.