34 research outputs found

    BreakingNews: article annotation by image and text processing

    Building upon recent deep neural network architectures, current approaches at the intersection of Computer Vision and Natural Language Processing have achieved unprecedented breakthroughs in tasks like automatic captioning and image retrieval. Most of these learning methods, though, rely on large training sets of images paired with human annotations that specifically describe the visual content. In this paper we go a step further and explore the more complex case where textual descriptions are only loosely related to the images. We focus on the domain of news articles, whose text often expresses connotative and ambiguous relations that are merely suggested by, not directly inferable from, the images. We introduce an adaptive CNN architecture that shares most of its structure across multiple tasks, including source detection, article illustration and article geolocation. Deep Canonical Correlation Analysis is deployed for article illustration, and a new loss function based on Great Circle Distance is proposed for geolocation. Furthermore, we present BreakingNews, a novel dataset of approximately 100K news articles including images, text and captions, enriched with heterogeneous metadata (such as GPS coordinates and user comments). We show this dataset to be well suited to all the aforementioned problems, for which we provide baseline performance using various deep learning architectures and different representations of the textual and visual features. We report very promising results and bring to light several limitations of the current state of the art in this domain, which we hope will help spur progress in the field.
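    The abstract names a loss based on Great Circle Distance for geolocation. A minimal sketch of such a loss follows, assuming the standard haversine formulation over (latitude, longitude) pairs in degrees; the function name and array layout are illustrative, not the paper's exact implementation:

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def great_circle_loss(pred, target):
    """Mean great-circle (haversine) distance in km between predicted
    and true (lat, lon) pairs given in degrees. Shapes: (N, 2)."""
    lat1, lon1 = np.radians(pred[:, 0]), np.radians(pred[:, 1])
    lat2, lon2 = np.radians(target[:, 0]), np.radians(target[:, 1])
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    # Clip guards against tiny numerical overshoots above 1.0.
    dist = 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))
    return dist.mean()

# Example: predictions near Paris vs. ground truth in Paris.
print(great_circle_loss(np.array([[48.9, 2.4]]), np.array([[48.8566, 2.3522]])))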

    BlogForever D3.2: Interoperability Prospects

    This report evaluates the interoperability prospects of the BlogForever platform. To this end, existing interoperability models are reviewed; a Delphi study is conducted to identify aspects crucial to the interoperability of web archives and digital libraries; technical interoperability standards and protocols are reviewed for their relevance to BlogForever; a simple approach for considering interoperability in specific usage scenarios is proposed; and a tangible approach is presented for developing a succession plan that would allow a reliable transfer of content from the current digital archive to other digital repositories.

    Hierarchical categorisation of web tags for Delicious

    In the scenario of social bookmarking, a user browsing the Web bookmarks web pages and assigns free-text labels (i.e., tags) to them according to their personal preferences. The benefits of social tagging are clear – tags enhance Web content browsing and search. However, since these tags may be publicly available to any Internet user, a privacy attacker may collect this information and extract an accurate snapshot of a user's interests (a user profile), containing sensitive information such as health conditions, political preferences, salary or religion. In order to hinder attackers in their efforts to profile users, this report focuses on the practical aspects of capturing user interests from tagging activity. More precisely, we study how to categorise a collection of tags posted by users of one of the most popular bookmarking services, Delicious (http://delicious.com).
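    To make the profiling risk concrete, here is a toy sketch of deriving a user-interest profile from categorised tags. The tag-to-category mapping is invented for illustration; the report's actual hierarchical categorisation is far richer:

```python
from collections import Counter

# Hypothetical flat mapping; a real system would derive this from a
# hierarchical categorisation of Delicious tags.
TAG_CATEGORIES = {
    "python": "programming", "webdev": "programming",
    "diabetes": "health", "fitness": "health",
    "election": "politics",
}

def user_profile(tags):
    """Aggregate a user's tags into normalised category frequencies."""
    cats = [TAG_CATEGORIES[t] for t in tags if t in TAG_CATEGORIES]
    total = len(cats) or 1
    return {c: n / total for c, n in Counter(cats).items()}

print(user_profile(["python", "webdev", "fitness", "recipes"]))
# {'programming': 0.666..., 'health': 0.333...}
```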

    Image annotation and retrieval based on multi-modal feature clustering and similarity propagation.

    The performance of content-based image retrieval systems has proved to be inherently constrained by the low-level features used, and cannot give satisfactory results when the user's high-level concepts cannot be expressed by low-level features. In an attempt to bridge this semantic gap, recent approaches have started integrating both low-level visual features and high-level textual keywords. Unfortunately, manual image annotation is a tedious process and may not be possible for large image databases. In this thesis we propose a system for image retrieval that has three main components. The first component is a novel possibilistic clustering and feature weighting algorithm based on robust modeling of the Generalized Dirichlet (GD) finite mixture. Robust estimation of the mixture model parameters is achieved by incorporating two complementary types of membership degrees. The first is a posterior probability that indicates the degree to which a point fits the estimated distribution. The second represents the degree of typicality and is used to identify and discard noise points. Robustness to noisy and irrelevant features is achieved by transforming the data to make the features independent and Beta-distributed, and by learning an optimal relevance weight for each feature subset within each cluster. We extend our algorithm to find the optimal number of clusters in an unsupervised and efficient way by exploiting properties of the possibilistic membership function. We also outline a semi-supervised version of the proposed algorithm. The second component is a novel approach to unsupervised image annotation based on: (i) the proposed semi-supervised possibilistic clustering; (ii) a greedy selection and joining (GSJ) algorithm; (iii) Bayes' rule; and (iv) a probabilistic model based on possibilistic membership degrees to annotate an image. The third component is an image retrieval framework based on multi-modal similarity propagation. The framework is designed to deal with two data modalities: low-level visual features and high-level textual keywords generated by our image annotation algorithm. The multi-modal similarity propagation system exploits the mutual reinforcement of relational data and results in a nonlinear combination of the different modalities. Specifically, it is used to learn the semantic similarities between images by leveraging the relationships between features from the different modalities. The proposed image annotation and retrieval approaches are implemented and tested on a standard benchmark dataset. We show the effectiveness of our clustering algorithm in handling high-dimensional and noisy data, and we compare our image annotation approach to three state-of-the-art methods, demonstrating the effectiveness of the proposed image retrieval system.
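    The mutual reinforcement behind multi-modal similarity propagation can be sketched as an iteration in which each modality re-scores image pairs through the current fused similarity. The update rule and normalisation below are illustrative assumptions, not the thesis's exact formulation:

```python
import numpy as np

def propagate_similarity(S_vis, S_txt, alpha=0.5, iters=10):
    """Fuse visual and textual image-image similarity matrices (N x N)
    by letting each modality reinforce the other iteratively."""
    S = (S_vis + S_txt) / 2.0
    for _ in range(iters):
        S_v = S_vis @ S @ S_vis.T   # visual view of the fused similarity
        S_t = S_txt @ S @ S_txt.T   # textual view of the fused similarity
        S = alpha * S_v + (1.0 - alpha) * S_t
        norm = np.abs(S).max()
        S = S / norm if norm > 0 else S  # rescale to keep iteration stable
    return S
```

    The fused matrix is nonlinear in both inputs, which is what lets textual co-occurrence sharpen visually ambiguous pairs and vice versa.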

    AXMEDIS 2008

    The AXMEDIS International Conference series aims to explore all subjects and topics related to cross-media and digital-media content production, processing, management, standards, representation, sharing, protection and rights management; to address the latest developments and future trends of the technologies; and to examine their applications, impacts and exploitation. The AXMEDIS events offer venues for exchanging concepts, requirements, prototypes, research ideas and findings that can contribute to academic research and also benefit business and industrial communities. In the Internet and digital era, cross-media production and distribution represent key developments and innovations, fostered by emergent technologies to ensure better value for money while optimising productivity and market coverage.

    Development of a conceptual graphical user interface framework for the creation of XML metadata for digital archives

    This dissertation is motivated by the DFG-sponsored digitization project for the Jonas Cohn Archive at Steinheim-Institut, whose aim was to preserve and provide digital access to structured handwritten historical archive material on Neo-Kantian philosophy scattered across the correspondence, diaries and private journals kept by and written to Jonas Cohn. The dissertation describes a framework for processing and presenting multi-standard digital archive material. A set of standard markup schemata and semantic bibliographic descriptions has been chosen to illustrate the multi-standard, and hence semantically heterogeneous, digital archiving process. The standards include the Text Encoding Initiative (TEI), the Metadata Encoding and Transmission Standard (METS) and the Metadata Object Description Schema (MODS); they best illustrate the structural contrast between the systematic archive, the digitized archive and digitized-text standards. Furthermore, combined digital preservation and presentation approaches offer not only the digitized texts but also variably sized, metadata-structured images of the archive documents, enabling virtual visualization. State-of-the-art applications focus solely on one of these structural areas, neglecting the compound idea of a virtual digital archive. This work describes the requirements analysis for managing multi-structured, and therefore multi-standard, digital archival artefacts in textual and image form. In addition to the architecture and design, an infrastructure suitable for processing, managing and presenting such scholarly archives is sought as a digital framework for the preservation of, and access to, digitized cultural resources. The proposed solution therefore instruments a conglomerate of existing and novel XML technologies for transformations, based in a centralized application. The archive can then be managed via a client-server application, focusing archival activities on structured data collection and information preservation, illustrated in the dissertation by the:
    • Development of a prototype data model allowing the integration of the relevant markup schemata
    • Implementation of a prototype client-server application, based on this data model, handling archive processing, management and presentation
    • Development, implementation and expert evaluation of a role-based archive access user interface
    Furthermore, as an infrastructural development serving expert archivists from the humanities, the dissertation explores methods of binding the existing XML metadata creation process to other programming languages. Doing so opens further channels for simplifying the metadata creation process by integrating graphical user interfaces. To this end, the Java programming language, its Swing and AWT graphical user interface libraries, associated relational persistence and enterprise client-server architecture provide a suitable environment for integrating XML metadata into mainstream computing. Hence the implementation of Java XML Data Binding as part of the metadata creation framework is part and parcel of the proposed solution.
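    To keep the examples in one language, here is a minimal Python sketch of the kind of programmatic MODS record creation the framework automates via Java XML Data Binding. The element names follow the published MODS schema; the record content is invented:

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

def mods_record(title, author):
    """Build a minimal MODS record for one archival item."""
    mods = ET.Element(f"{{{MODS_NS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = title
    name = ET.SubElement(mods, f"{{{MODS_NS}}}name", type="personal")
    ET.SubElement(name, f"{{{MODS_NS}}}namePart").text = author
    return ET.tostring(mods, encoding="unicode")

# Hypothetical item from a handwritten-correspondence archive.
print(mods_record("Letter to Jonas Cohn", "Cohn, Jonas"))
```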

    Semantic multimedia analysis using knowledge and context

    The difficulty of semantic multimedia analysis can be attributed to the extended diversity in form and appearance exhibited by the majority of semantic concepts and the difficulty of expressing them using a finite number of patterns. In meeting this challenge there has been a scientific debate on whether the problem should be addressed by using overwhelming amounts of training data to capture all possible instantiations of a concept, or by using explicit knowledge about the concepts' relations to infer their presence. In this thesis we address three pattern recognition problems and propose solutions that combine the knowledge extracted implicitly from training data with the knowledge provided explicitly in structured form. First, we propose a Bayesian network (BN) modeling approach that defines a conceptual space where both domain-related evidence and evidence derived from content analysis can be jointly considered to support or disprove a hypothesis. The use of this space leads to significant gains in performance compared to analysis methods that cannot handle combined knowledge. Then, we present an unsupervised method that exploits the collective nature of social media to automatically obtain large amounts of annotated image regions. By proving that the quality of the obtained samples can be almost as good as that of manually annotated images when working with large datasets, we contribute significantly towards scalable object detection. Finally, we introduce a method that treats images, visual features and tags as the three observable variables of an aspect model and extracts a set of latent topics that incorporates the semantics of both the visual and the tag information space. By showing that the cross-modal dependencies of tagged images can be exploited to increase the semantic capacity of the resulting space, we advocate the use of all existing information facets in the semantic analysis of social media.
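    The evidence fusion described in the first contribution can be illustrated with a toy odds-form Bayes update that combines a content-analysis score with domain metadata. The priors and likelihoods below are invented numbers, and this naive two-source fusion only gestures at what a full Bayesian network does:

```python
def posterior(prior, likelihoods):
    """Fuse independent evidence sources given as pairs
    (P(evidence | concept), P(evidence | not concept))."""
    odds = prior / (1.0 - prior)
    for p_given_c, p_given_not_c in likelihoods:
        odds *= p_given_c / p_given_not_c  # multiply likelihood ratios
    return odds / (1.0 + odds)

# Hypothetical: a 'beach' detector fires (0.7 vs 0.2) and domain
# metadata places the photo in a coastal region (0.6 vs 0.3).
print(posterior(0.1, [(0.7, 0.2), (0.6, 0.3)]))  # ~0.44, up from 0.1
```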

    Format-independent media resource adaptation and delivery


    Web Archive Services Framework for Tighter Integration Between the Past and Present Web

    Web archives have preserved the cultural history of the web for many years, but they still offer only limited means of access. Most web archiving research has focused on crawling and preservation activities, with little attention to delivery methods. The current access methods are tightly coupled to web archive infrastructure, hard to replicate or integrate with other web archives, and do not cover all users' needs. In this dissertation, we focus on access methods for archived web data that enable users, third-party developers, researchers and others to gain knowledge from web archives. We build ArcSys, a new service framework that extracts, preserves, and exposes APIs for the web archive corpus. The dissertation introduces a novel categorization technique that divides the archived corpus into four levels; for each level, we propose suitable services and APIs that enable both users and third-party developers to build new interfaces. The first level is the content level, which extracts the content from the archived web data; we develop ArcContent to expose web archive content processed through various filters. The second level is the metadata level, where we extract metadata from the archived web data and make it available to users; we implement two services, ArcLink for the temporal web graph and ArcThumb for optimizing thumbnail creation in web archives. The third level is the URI level, which uses the URI HTTP redirection status to enhance user queries. Finally, the highest level in the web archiving service framework pyramid is the archive level, at which we define a web archive by the characteristics of its corpus and build Web Archive Profiles; the profiles are used by the Memento Aggregator for query optimization.
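    The Memento Aggregator mentioned above answers cross-archive queries through TimeMaps. A hedged sketch of fetching a link-format TimeMap for a URI follows, using the public Time Travel aggregator endpoint for illustration; ArcSys's own APIs are not reproduced here:

```python
import urllib.request

def fetch_timemap(uri, aggregator="http://timetravel.mementoweb.org"):
    """Fetch the link-format TimeMap listing known mementos of a URI."""
    url = f"{aggregator}/timemap/link/{uri}"
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")

# Each line of the response links one archived capture (memento),
# carrying rel="memento" and a datetime attribute.
print(fetch_timemap("http://example.com/")[:300])
```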