    Information Extraction, Data Integration, and Uncertain Data Management: The State of The Art

    Information extraction, data integration, and uncertain data management are distinct research areas that have received considerable attention over the last two decades. Much of this research has tackled the areas individually. However, information extraction systems should be integrated with data integration methods to make use of the extracted information, and handling uncertainty in the extraction and integration process is important for improving data quality in such integrated systems. This article presents the state of the art in these areas, shows their common ground, and discusses how to integrate information extraction and data integration under the umbrella of uncertainty management.

    Exploiting multimedia in creating and analysing multimedia Web archives

    The data contained on the web and the social web are inherently multimedia, consisting of a mixture of textual, visual and audio modalities. Community memories embodied on the web and social web contain a rich mixture of data from these modalities. In many ways, the web is the greatest resource ever created by humankind. However, due to the dynamic and distributed nature of the web, its content changes, appears and disappears on a daily basis. Web archiving provides a way of capturing snapshots of (parts of) the web for preservation and future analysis. This paper provides an overview of techniques we have developed within the context of the EU-funded ARCOMEM (ARchiving COmmunity MEMories) project to allow multimedia web content to be leveraged during the archival process and for post-archival analysis. Through a set of use cases, we explore several practical applications of multimedia analytics within the realm of web archiving, web archive analysis and multimedia data on the web in general.

    A Data-Driven Approach for Tag Refinement and Localization in Web Videos

    Tagging of visual content is becoming more and more widespread as web-based services and social networks have popularized tagging functionalities among their users. These user-generated tags are used to ease browsing and exploration of media collections, e.g. using tag clouds, or to retrieve multimedia content. However, not all media are equally tagged by users. With current systems it is easy to tag a single photo, and even tagging a part of a photo, like a face, has become common on sites like Flickr and Facebook. On the other hand, tagging a video sequence is more complicated and time-consuming, so users tend to tag only the overall content of a video. In this paper we present a method for automatic video annotation that increases the number of tags originally provided by users, and localizes them temporally, associating tags to keyframes. Our approach exploits collective knowledge embedded in user-generated tags and web sources, and visual similarity of keyframes and images uploaded to social sites like YouTube and Flickr, as well as web sources like Google and Bing. Given a keyframe, our method selects on the fly from these visual sources the training exemplars most relevant for this test sample, and proceeds to transfer labels across similar images. Compared to existing video tagging approaches that require training classifiers for each tag, our system has few parameters, is easy to implement and can deal with an open-vocabulary scenario. We demonstrate the approach on tag refinement and localization on DUT-WEBV, a large dataset of web videos, and show state-of-the-art results. Comment: Preprint submitted to Computer Vision and Image Understanding (CVIU).
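    The label-transfer idea described in this abstract — ranking retrieved web images by visual similarity to a keyframe and letting their tags vote — can be illustrated with a minimal sketch. This is not the authors' implementation; the function name, the similarity-weighted voting scheme, and the feature representation are illustrative assumptions.

    ```python
    import numpy as np

    def transfer_tags(keyframe_feat, exemplar_feats, exemplar_tags, k=5):
        """Rank tags for a keyframe by voting from its k most similar exemplars.

        keyframe_feat: (d,) feature vector of the test keyframe
        exemplar_feats: (n, d) feature vectors of images retrieved from the web
        exemplar_tags: list of n tag sets attached to those images
        """
        # Cosine similarity between the keyframe and every exemplar
        a = keyframe_feat / np.linalg.norm(keyframe_feat)
        b = exemplar_feats / np.linalg.norm(exemplar_feats, axis=1, keepdims=True)
        sims = b @ a
        # Keep only the k most similar exemplars as on-the-fly "training" samples
        top = np.argsort(sims)[::-1][:k]
        # Each tag accumulates the similarity of the exemplars that carry it,
        # so tags supported by closer images score higher
        scores = {}
        for i in top:
            for tag in exemplar_tags[i]:
                scores[tag] = scores.get(tag, 0.0) + float(sims[i])
        return sorted(scores.items(), key=lambda kv: -kv[1])
    ```

    Because no per-tag classifier is trained, the same routine works for any tag that appears among the retrieved exemplars, which is what makes the open-vocabulary setting tractable.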

    Historical collaborative geocoding

    Recent developments in digitisation have provided large data sets that can be accessed and used increasingly easily. These data sets often contain indirect localisation information, such as historical addresses. Historical geocoding is the process of transforming indirect localisation information into direct localisation that can be placed on a map, which enables spatial analysis and cross-referencing. Many efficient geocoders exist for current addresses, but they do not deal with the temporal aspect and are based on a strict hierarchy (..., city, street, house number) that is hard or impossible to use with historical data. Indeed, historical data are full of uncertainties (temporal aspect, semantic aspect, spatial precision, confidence in the historical source, ...) that cannot be resolved, as there is no way to go back in time to check. We propose an open source, open data, extensible solution for geocoding that is based on building gazetteers composed of geohistorical objects extracted from historical topographical maps. Once the gazetteers are available, geocoding a historical address is a matter of finding the geohistorical object in the gazetteers that best matches the historical address. The matching criteria are customisable and include several dimensions (fuzzy semantic, fuzzy temporal, scale, spatial precision, ...). As the goal is to facilitate historical work, we also propose web-based user interfaces that help geocode (a single address or in batch mode) and display results over current or historical topographical maps, so that they can be checked and collaboratively edited. The system is tested on the city of Paris for the 19th and 20th centuries, shows a high return rate and is fast enough to be used interactively. Comment: WORKING PAPER
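    The multi-dimensional matching this abstract describes — scoring each gazetteer entry on fuzzy name similarity and temporal plausibility, then returning the best match — can be sketched as follows. The `GeoObject` fields, the linear score combination, and the 0.02-per-year temporal decay are illustrative assumptions, not the paper's actual scoring model.

    ```python
    from dataclasses import dataclass
    from difflib import SequenceMatcher

    @dataclass
    class GeoObject:
        name: str    # normalised historical name, e.g. a street name
        start: int   # first year the object is attested on a map
        end: int     # last year the object is attested
        x: float     # representative coordinates of the object
        y: float

    def match_score(query_name, query_year, obj, temporal_weight=0.5):
        """Combine fuzzy name similarity with temporal plausibility."""
        semantic = SequenceMatcher(None, query_name.lower(),
                                   obj.name.lower()).ratio()
        # Full credit inside the object's attested lifespan,
        # decaying by an assumed 0.02 per year outside it
        if obj.start <= query_year <= obj.end:
            temporal = 1.0
        else:
            gap = min(abs(query_year - obj.start), abs(query_year - obj.end))
            temporal = max(0.0, 1.0 - 0.02 * gap)
        return (1 - temporal_weight) * semantic + temporal_weight * temporal

    def geocode(query_name, query_year, gazetteer):
        """Return the best-matching geohistorical object and its score."""
        best = max(gazetteer, key=lambda o: match_score(query_name, query_year, o))
        return best, match_score(query_name, query_year, best)
    ```

    Spatial precision and map scale would enter as further weighted terms in the same score; returning the score alongside the match lets an interface flag low-confidence results for collaborative checking.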

    The Ultraviolet Sky: An Overview from the GALEX Surveys

    The Galaxy Evolution Explorer (GALEX) has performed the first surveys of the sky in the ultraviolet (UV). Its legacy is an unprecedented database with more than 200 million source measurements in far-UV (FUV) and near-UV (NUV), as well as wide-field imaging of extended objects, filling an important gap in our view of the sky across the electromagnetic spectrum. The UV surveys offer unique sensitivity for identifying and studying selected classes of astrophysical objects, both stellar and extra-galactic. We examine the overall content and distribution of UV sources over the sky, and with magnitude and color. For this purpose, we have constructed final catalogs of UV sources with homogeneous quality, eliminating duplicate measurements of the same source. Such catalogs can facilitate a variety of investigations on UV-selected samples, as well as planning of observations with future missions. We describe the criteria used to build the catalogs, their coverage and completeness. We included observations in which both the far-UV and near-UV detectors were exposed; 28,707 fields from the All-Sky Imaging Survey (AIS) cover a unique area of 22,080 square degrees (after we restrict the catalogs to the central 1-degree diameter of the field), with a typical depth of about 20/21 mag (FUV/NUV, in the AB mag system), and 3,008 fields from the Medium-depth Imaging Survey (MIS) cover a total of 2,251 square degrees at a depth of about 22.7 mag. The catalogs contain about 71 and 16.6 million sources respectively. The density of hot stars reflects the Galactic structure, and the number counts of both Galactic and extra-galactic sources are modulated by the Milky Way dust extinction, to which the UV data are very sensitive. Comment: J. Adv. Space Res. (2013). Full-resolution figures can be found in the original published article (open access) at http://www.sciencedirect.com/science/article/pii/S0273117713004742 or from http://dolomiti.pha.jhu.edu/publgoto.html ; catalogs are posted on MAST.
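    The duplicate elimination described in this abstract — keeping one measurement per source when the same object appears in overlapping fields — amounts to positional cross-matching within a small angular radius. A minimal sketch of the idea, assuming a 3-arcsecond matching radius and a simple greedy scan (the actual GALEX catalog pipeline is more involved):

    ```python
    import math

    def angsep_deg(ra1, dec1, ra2, dec2):
        """Great-circle separation in degrees (haversine formula)."""
        r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
        h = (math.sin((d2 - d1) / 2) ** 2
             + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
        return math.degrees(2 * math.asin(math.sqrt(h)))

    def deduplicate(sources, radius_arcsec=3.0):
        """Keep one measurement per sky position.

        sources: list of (ra_deg, dec_deg, nuv_mag) tuples. Any source within
        the matching radius of an already-kept source is treated as a repeat
        observation and dropped; sorting the input by preferred exposure first
        decides which measurement survives.
        """
        radius_deg = radius_arcsec / 3600.0
        kept = []
        for ra, dec, mag in sources:
            if all(angsep_deg(ra, dec, kra, kdec) > radius_deg
                   for kra, kdec, _ in kept):
                kept.append((ra, dec, mag))
        return kept
    ```

    The quadratic scan here is only workable for small lists; at the catalog's scale of tens of millions of sources, a spatial index over sky coordinates would be required.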