24,886 research outputs found
Information Extraction, Data Integration, and Uncertain Data Management: The State of The Art
Information extraction, data integration, and uncertain data management are distinct areas of research that have received considerable attention over the last two decades. Many studies have tackled these areas individually; however, information extraction systems should be integrated with data integration methods to make use of the extracted information. Handling uncertainty in the extraction and integration processes is an important issue for enhancing the quality of the data in such integrated systems. This article presents the state of the art of these research areas, shows their common ground, and describes how information extraction and data integration can be combined under the umbrella of uncertainty management.
Exploiting multimedia in creating and analysing multimedia Web archives
The data contained on the web and the social web are inherently multimedia and consist of a mixture of textual, visual and audio modalities. Community memories embodied on the web and social web contain a rich mixture of data from these modalities. In many ways, the web is the greatest resource ever created by humankind. However, due to the dynamic and distributed nature of the web, its content changes, appears and disappears on a daily basis. Web archiving provides a way of capturing snapshots of (parts of) the web for preservation and future analysis. This paper provides an overview of techniques we have developed within the context of the EU-funded ARCOMEM (ARchiving COmmunity MEMories) project to allow multimedia web content to be leveraged during the archival process and for post-archival analysis. Through a set of use cases, we explore several practical applications of multimedia analytics within the realm of web archiving, web archive analysis and multimedia data on the web in general.
A Data-Driven Approach for Tag Refinement and Localization in Web Videos
Tagging of visual content is becoming more and more widespread as web-based
services and social networks have popularized tagging functionalities among
their users. These user-generated tags are used to ease browsing and
exploration of media collections, e.g. using tag clouds, or to retrieve
multimedia content. However, not all media are equally tagged by users. With
current systems it is easy to tag a single photo, and even tagging part of a
photo, such as a face, has become common on sites like Flickr and Facebook. On the
other hand, tagging a video sequence is more complicated and time consuming, so
that users just tag the overall content of a video. In this paper we present a
method for automatic video annotation that increases the number of tags
originally provided by users, and localizes them temporally, associating tags
to keyframes. Our approach exploits collective knowledge embedded in
user-generated tags and web sources, and visual similarity of keyframes and
images uploaded to social sites like YouTube and Flickr, as well as web sources
like Google and Bing. Given a keyframe, our method is able to select on the fly
from these visual sources the training exemplars that should be the most
relevant for this test sample, and proceeds to transfer labels across similar
images. Compared to existing video tagging approaches that require training
classifiers for each tag, our system has few parameters, is easy to implement
and can deal with an open vocabulary scenario. We demonstrate the approach on
tag refinement and localization on DUT-WEBV, a large dataset of web videos, and
show state-of-the-art results.
Comment: Preprint submitted to Computer Vision and Image Understanding (CVIU).
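The core mechanism described above — selecting visually similar tagged web images for a keyframe and transferring their labels — can be sketched as a nearest-neighbour label transfer. This is a minimal illustration, not the authors' implementation: the feature vectors, the tag-scoring rule (similarity-weighted votes) and the `min_score` threshold are all assumptions for the sake of the example.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def transfer_tags(keyframe_feat, web_images, k=3, min_score=0.5):
    """Transfer tags to a keyframe from its k most similar tagged web
    images, scoring each tag by the summed similarity of the images
    that carry it, and keeping tags above a threshold."""
    ranked = sorted(web_images,
                    key=lambda im: cosine(keyframe_feat, im["feat"]),
                    reverse=True)
    scores = {}
    for im in ranked[:k]:
        s = cosine(keyframe_feat, im["feat"])
        for tag in im["tags"]:
            scores[tag] = scores.get(tag, 0.0) + s
    return {t: s for t, s in scores.items() if s >= min_score}
```

In the paper's setting the "training exemplars" are retrieved on the fly per keyframe, which is what makes the approach parameter-light and open-vocabulary: no per-tag classifier is ever trained.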
Historical collaborative geocoding
The latest developments in digital technologies have provided large data sets that can
increasingly easily be accessed and used. These data sets often contain
indirect localisation information, such as historical addresses. Historical
geocoding is the process of transforming the indirect localisation information
to direct localisation that can be placed on a map, which enables spatial
analysis and cross-referencing. Many efficient geocoders exist for current
addresses, but they do not deal with the temporal aspect and are based on a
strict hierarchy (..., city, street, house number) that is hard or impossible
to use with historical data. Indeed, historical data are full of uncertainties
(temporal aspect, semantic aspect, spatial precision, confidence in the
historical source, ...) that cannot be resolved, as there is no way to go back in time to
check. We propose an open source, open data, extensible solution for geocoding
that is based on the building of gazetteers composed of geohistorical objects
extracted from historical topographical maps. Once the gazetteers are
available, geocoding a historical address is a matter of finding the
geohistorical object in the gazetteers that best matches the historical
address. The matching criteria are customisable and include several dimensions
(fuzzy semantic, fuzzy temporal, scale, spatial precision ...). As the goal is
to facilitate historical work, we also propose web-based user interfaces that
help geocode (one address or batch mode) and display over current or historical
topographical maps, so that they can be checked and collaboratively edited. The
system is tested on the city of Paris for the 19th-20th centuries, shows a high
return rate, and is fast enough to be used interactively.
Comment: WORKING PAPER.
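The matching step described above can be sketched as a weighted score over the dimensions the abstract lists (fuzzy semantic, fuzzy temporal, spatial precision). This is a hypothetical illustration, not the project's actual scoring function: the weights, the interval-overlap definition of temporal match, and the `precision_m` field are all assumed for the example.

```python
from difflib import SequenceMatcher

def temporal_overlap(query_years, valid_years):
    """Fraction of the query interval (years) covered by the
    geohistorical object's validity interval."""
    lo = max(query_years[0], valid_years[0])
    hi = min(query_years[1], valid_years[1])
    span = query_years[1] - query_years[0]
    return max(0.0, hi - lo) / span if span > 0 else 0.0

def match_score(query_name, query_years, obj,
                w_sem=0.6, w_temp=0.3, w_prec=0.1):
    """Combine fuzzy name similarity, temporal overlap and spatial
    precision (finer source maps score higher) into one score."""
    sem = SequenceMatcher(None, query_name.lower(), obj["name"].lower()).ratio()
    temp = temporal_overlap(query_years, obj["valid"])
    prec = 1.0 / (1.0 + obj["precision_m"] / 100.0)
    return w_sem * sem + w_temp * temp + w_prec * prec

def geocode(query_name, query_years, gazetteer):
    """Return the gazetteer object that best matches the query."""
    return max(gazetteer, key=lambda o: match_score(query_name, query_years, o))
```

For instance, two gazetteer entries for the same street name but different validity periods are disambiguated by the temporal dimension alone — the behaviour a purely hierarchical, present-day geocoder cannot provide.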
The Ultraviolet Sky: An Overview from the GALEX Surveys
The Galaxy Evolution Explorer (GALEX) has performed the first surveys of the
sky in the Ultraviolet (UV). Its legacy is an unprecedented database with more
than 200 million source measurements in far-UV (FUV) and near-UV (NUV), as well
as wide-field imaging of extended objects, filling an important gap in our view
of the sky across the electromagnetic spectrum. The UV surveys offer unique
sensitivity for identifying and studying selected classes of astrophysical
objects, both stellar and extra-galactic. We examine the overall content and
distribution of UV sources over the sky, and with magnitude and color. For this
purpose, we have constructed final catalogs of UV sources with homogeneous
quality, eliminating duplicate measurements of the same source. Such catalogs
can facilitate a variety of investigations on UV-selected samples, as well as
planning of observations with future missions.
We describe the criteria used to build the catalogs, their coverage and
completeness. We included observations in which both the far-UV and near-UV
detectors were exposed; 28,707 fields from the All-Sky Imaging survey (AIS)
cover a unique area of 22,080 square degrees (after we restrict the catalogs to
the central 1-degree diameter of the field), with a typical depth of about
20/21 mag (FUV/NUV, in the AB mag system), and 3,008 fields from the
Medium-depth Imaging Survey (MIS) cover a total of 2,251 square degrees at a
depth of about 22.7 mag. The catalogs contain about 71 and 16.6 million sources
respectively. The density of hot stars reflects the Galactic structure, and the
number counts of both Galactic and extra-galactic sources are modulated by the
Milky Way dust extinction, to which the UV data are very sensitive.
Comment: J. Adv. Space Res. (2013). Full resolution figures can be found in
the original published article (open access) at
http://www.sciencedirect.com/science/article/pii/S0273117713004742 or from
http://dolomiti.pha.jhu.edu/publgoto.html ; catalogs are posted on MAS
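The catalog construction above hinges on "eliminating duplicate measurements of the same source" across overlapping fields. A minimal sketch of one common approach follows — greedy positional cross-matching that keeps the detection from the deepest exposure within a small match radius. The 2.5-arcsecond radius and the keep-deepest rule are assumptions for illustration, not the actual GALEX pipeline criteria.

```python
from math import cos, radians, sqrt

MATCH_RADIUS_ARCSEC = 2.5  # hypothetical merge radius, not the GALEX value

def sep_arcsec(a, b):
    """Approximate small-angle separation, in arcsec, between two
    (ra, dec) positions in degrees; valid away from the poles."""
    dra = (a[0] - b[0]) * cos(radians((a[1] + b[1]) / 2.0))
    ddec = a[1] - b[1]
    return sqrt(dra * dra + ddec * ddec) * 3600.0

def dedup(sources):
    """Keep, for each group of coincident detections, only the one
    with the longest exposure time. Each source is a dict with
    'ra', 'dec' (degrees) and 'exptime' (seconds)."""
    kept = []
    for s in sorted(sources, key=lambda s: s["exptime"], reverse=True):
        if all(sep_arcsec((s["ra"], s["dec"]), (k["ra"], k["dec"]))
               > MATCH_RADIUS_ARCSEC for k in kept):
            kept.append(s)
    return kept
```

A production version would use a spatial index rather than the quadratic all-pairs check shown here, but the selection logic — one homogeneous-quality entry per physical source — is the same idea the catalogs implement at the scale of tens of millions of measurements.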