2,045 research outputs found

    Extraction and representation of semantic information in digital media

    Unsupervised quantification of entity consistency between photos and text in real-world news

    In today’s information age, the World Wide Web and social media are important sources for news and information. Different modalities (in the sense of information encoding) such as photos and text are typically used to communicate news more effectively or to attract attention.
Communication scientists, linguists, and semioticians have studied the complex interplay between modalities for decades and investigated, e.g., how their combination can carry additional information or add a new level of meaning. The number of shared concepts or entities (e.g., persons, locations, and events) between photos and text is an important aspect to evaluate the overall message and meaning of an article. Computational models for the quantification of image-text relations can enable many applications. For example, they allow for more efficient exploration of news, facilitate semantic search and multimedia retrieval in large (web) archives, or assist human assessors in evaluating news for credibility. To date, only a few approaches have been suggested that quantify relations between photos and text. However, they either do not explicitly consider the cross-modal relations of entities – which are important in the news – or rely on supervised deep learning approaches that can only detect the cross-modal presence of entities covered in the labeled training data. To address this research gap, this thesis proposes an unsupervised approach that can quantify entity consistency between photos and text in multimodal real-world news articles. The first part of this thesis presents novel approaches based on deep learning for information extraction from photos to recognize events, locations, dates, and persons. These approaches are an important prerequisite to measure the cross-modal presence of entities in text and photos. First, an ontology-driven event classification approach that leverages new loss functions and weighting schemes is presented. It is trained on a novel dataset of 570,540 photos and an ontology with 148 event types. The proposed system outperforms approaches that do not use structured ontology information. 
Second, a novel deep learning approach for geolocation estimation is proposed that uses additional contextual information on the environmental setting (indoor, urban, natural) and from earth partitions of different granularity. The proposed solution outperforms state-of-the-art approaches, which are trained with significantly more photos. Third, we introduce the first large-scale dataset for date estimation with more than one million photos taken between 1930 and 1999, along with two deep learning approaches that treat date estimation as a classification and regression problem. Both approaches achieve very good results that are superior to human annotations. Finally, a novel approach is presented that identifies public persons and their co-occurrences in news photos extracted from the Internet Archive, which collects time-versioned snapshots of web pages that are rarely enriched with metadata relevant to multimedia retrieval. Experimental results confirm the effectiveness of the deep learning approach for person identification. The second part of this thesis introduces an unsupervised approach capable of quantifying image-text relations in real-world news. Unlike related work, the proposed solution automatically provides novel measures of cross-modal consistency for different entity types (persons, locations, and events) as well as the overall context. The approach does not rely on any predefined datasets to cope with the large amount and diversity of entities and topics covered in the news. State-of-the-art tools for natural language processing are applied to extract named entities from the text. Example photos for these entities are automatically crawled from the Web. The proposed methods for information extraction from photos are applied to both news images and example photos to quantify the cross-modal consistency of entities. Two tasks are introduced to assess the quality of the proposed approach in real-world applications. 
Experimental results for document verification and for the retrieval of news with either low (potential misinformation) or high cross-modal similarity demonstrate the feasibility of the approach and its potential to support human assessors in studying news.
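The comparison of news photos against crawled example photos described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes each photo has already been reduced to a numeric feature vector, it swaps in plain cosine similarity as the comparison, and it takes the best match per entity as the score. The names `cosine_similarity`, `entity_consistency`, and `cross_modal_consistency` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def entity_consistency(news_vec, example_vecs):
    """Assumed score: best similarity between the news photo and any example photo of one entity."""
    if not example_vecs:
        return 0.0
    return max(cosine_similarity(news_vec, e) for e in example_vecs)


def cross_modal_consistency(news_vec, examples_by_entity):
    """Score every entity named in the text; the overall value is the best entity score."""
    scores = {
        name: entity_consistency(news_vec, vecs)
        for name, vecs in examples_by_entity.items()
    }
    overall = max(scores.values(), default=0.0)
    return scores, overall


# Toy vectors standing in for features of a news photo and crawled example photos.
news_photo = [1.0, 0.0]
examples = {
    "entity_a": [[1.0, 0.0], [0.0, 1.0]],
    "entity_b": [[0.0, 1.0]],
}
scores, overall = cross_modal_consistency(news_photo, examples)
```

Taking the maximum over example photos is one design choice among several; an average or a quantile would be less sensitive to a single lucky match.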

    Web based public participation in visual impact assessment of urban landscape.

    Zhang Zongyu. Thesis (M.Phil.)--Chinese University of Hong Kong, 2001. Includes bibliographical references (leaves 101-108). Abstracts in English and Chinese.
    ABSTRACT IN ENGLISH
    ABSTRACT IN CHINESE
    ACKNOWLEDGEMENTS
    TABLE OF CONTENTS
    LIST OF TABLES
    LIST OF FIGURES
    CHAPTER ONE: INTRODUCTION
      1.1 Landscape and landscape assessment
        1.1.1 The descriptive inventory approach
        1.1.2 Public preference models
      1.2 Urban landscape
      1.3 Relationship between professional and public
        1.3.1 Inherent conflicts
        1.3.2 Roles of both sides
        1.3.3 Collaboration between professionals and the public
    CHAPTER TWO: VISUAL IMPACT ASSESSMENT
      2.1 The needs for visual impact assessment
      2.2 The visual impact assessment process
      2.3 The information inventory in the visual impact assessment
        2.3.1 Landscape simulation
        2.3.2 Visual impacts identification
      2.4 Public participation
        2.4.1 Public preference in the urban landscape
        2.4.2 Public accessibility to the urban landscape planning process
    CHAPTER THREE: CAPTURING THE SYSTEM SPECIFICATIONS
      3.1 General considerations
        3.1.1 Function requirements
        3.1.2 Project management
        3.1.3 User interface
        3.1.4 Web access
        3.1.5 Qualification of public participation in urban planning
      3.2 Envisioning the proposed web based system
        3.2.1 Proposed virtual collaboration
          3.2.1.1 Improving participants' access to the web based visual impact assessment
          3.2.1.2 Capturing the public appreciation
        3.2.2 Collaboration between planners and public
    CHAPTER FOUR: SYSTEM DESIGN
      4.1 Main software or tools for developing the proposed web based system
        4.1.1 Arcview 3.1 or Arc/Info with 3D analyst and Internet mapping server extensions
        4.1.2 VRML 2.0 and Java
        4.1.3 Java3D API
      4.2 System configuration
        4.2.1 System architecture
        4.2.2 Data management
          4.2.2.1 Urban landscape information management
          4.2.2.2 Public participation
        4.2.3 User interface design
    CHAPTER FIVE: PROTOTYPE SYSTEM AND PILOT STUDY
      5.1 General description
      5.2 Implementation
        5.2.1 Connecting the two-dimensional world with a three-dimensional virtual urban environment
        5.2.2 Data flow of the system for interactions between the GIS and the VRML browser
      5.3 Data preparation
        5.3.1 Constructing the terrain model
        5.3.2 Retrieving the landscape themes
      5.4 Public oriented user interface design
      5.5 Participation log
    CHAPTER SIX: CONCLUSION
    APPENDI

    Evolutionary Computation for Digital Artefact Design

    This thesis presents novel systems for the automatic and semi-automatic design of digital artefacts. Currently, users wanting to create digital models, such as three-dimensional (3D) digital landscapes and website colour schemes, need to possess significant expertise, as the tools involved demand a high level of knowledge and skill. By developing an intuitive algorithmic process, founded on evolutionary computation (EC), this research enables non-specialist human designers to create digital assets more efficiently. This is achieved by replacing design activities that require significant manual input with algorithmic functions, thereby greatly improving the efficiency and accessibility of the practices involved. The research focuses first on the generation of 3D landscapes and then on the identification of text and background colour combinations that are easier to read, particularly for readers with vision impairments. Choosing an ideal combination of colours requires knowledge of the cognitive and psychological processes involved. Designers need to be aware of colour contrast ratios, brightness, and variations, which would require a series of aesthetic measurements if tested manually. To provide a colour design facility, this research offers algorithms that generate colour schemes based on these principles, from which an optimum scheme for a website can be derived. This research demonstrates a novel interactive genetic algorithm (IGA), coupled with computational aesthetics, suitable for evolving terrain and digital landscape designs. It also provides a tool for automatically creating EC-driven colour palettes for web design via evolutionary searches. Experimental trials use the EC framework developed in this research, with both the IGA technique and the computational aesthetic measures.
Results indicate that end-users can build any target digital landscape design with fewer inputs and more comfort and, if required, can also automate the whole process to evolve aesthetically pleasing landscape designs. The results obtained for website colour schemes show that end-users can quickly develop a colour scheme without fine-tuning colour combinations. The resulting schemes can compete in quality with those designed by professional website developers.
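The abstract mentions colour contrast ratios without saying how they are measured. As a hedged illustration only, the sketch below computes the contrast ratio as defined in the WCAG 2.x accessibility guidelines; treating that definition as the relevant one here is an assumption, and the function names are hypothetical rather than taken from the thesis.

```python
def _linearize(channel):
    """Convert one 8-bit sRGB channel (0-255) to linear light, per the WCAG 2.x formula."""
    c = channel / 255.0
    if c <= 0.03928:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) colour with 8-bit channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """WCAG contrast ratio: ranges from 1 (identical colours) to 21 (black on white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


# Black text on a white background gives the maximum possible ratio.
black_on_white = contrast_ratio((0, 0, 0), (255, 255, 255))
```

A measure like this could serve as one term in a fitness function for an evolutionary search over text and background colours. WCAG recommends a ratio of at least 4.5 for normal-size body text.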

    THE ART AND SCIENCE OF WOOD: FROM PYROGRAPHY TO TERMITES AND WOOD DECOMPOSITION

    Wood is vital to many natural ecosystems, as it provides energy, nutrients, and habitat for organisms from the micro- to the macro-scale. Wood is also critical to humans for similar reasons, and can be an important medium of art and education. This dissertation addresses three diverse aspects of wood within the contexts of science, art, and education. First, we explored the impact of timber harvest techniques and site preparation on microbial wood decay and subterranean termite responses on a forest-stand scale. The amount of coarse woody debris removed post-harvest, coupled with the location and species of the test wood stakes, significantly affected both termite and microbial-mediated decomposition after two and a half years of exposure. These findings help to better understand the impact of timber harvest practices on carbon cycling and associated modes of decay. We then explored effects of wood species and wood surface preparation on pyrography, the art of woodburning. The species of wood and the surface preparation significantly affected line and shading work in pyrography, with more detailed linework produced on hardwoods (Acer rubrum, Populus tremuloides) than on softwoods (Pinus taeda, Pinus strobus). Lastly, placing wood into an educational context, high school level lesson plans that address several state and federal science curriculum benchmarks were developed, to be taught through the active learning technique of pyrography. A general “Introduction to woodburning” lesson plan is included, followed by lesson plans for cellular respiration, human impacts on the environment, photosynthesis, and the carbon cycle. Lesson plans provide instructors with the resources needed to teach across both science and art curriculums. Each lesson plan includes background material, vocabulary, assignments, instructional videos, and PowerPoint presentations. These three chapters weave together science, art, and education using wood as the common thread.

    Computer Game Innovation

    Faculty of Technical Physics, Information Technology and Applied Mathematics. Institute of Information Technology (Wydział Fizyki Technicznej, Informatyki i Matematyki Stosowanej. Instytut Informatyki). The "Computer Game Innovations" series is an international forum designed to enable the exchange of knowledge and expertise in the field of video game development. Addressing both academic research and industrial needs, the series aims to advance innovative industry-academia collaboration. The monograph provides a unique set of articles presenting original research conducted in the leading academic centres which specialise in video games education. One goal of the publication is to enhance networking opportunities for industry and university representatives seeking to form R&D partnerships. This publication covers the key focus areas specified in the GAMEINN sectoral programme supported by the National Centre for Research and Development.

    Video Abstracting at a Semantical Level

    One of the most common forms of video abstract is the movie trailer. Contemporary movie trailers share a common structure across genres, which allows for automatic generation and also reflects the corresponding movie's composition. In this thesis, a system for the automatic generation of trailers is presented. In addition to action trailers, the system is able to deal with further genres such as horror and comedy trailers, which were first manually analyzed in order to identify their basic structures. To simplify the modeling of trailers and the abstract generation itself, a new video abstracting application was developed. This application is capable of performing all steps of the abstract generation automatically and allows for previews and manual optimizations. Based on this system, new abstracting models for horror and comedy trailers were created, and the corresponding trailers were generated automatically using these models. In an evaluation, the automatic trailers were compared to the original trailers and showed a similar structure. However, the automatically generated trailers still do not match the polish of the Hollywood originals, as they lack intentional storylines across shots.

    I Directed Macbeth, and So Can You

    Designed in the second year of my graduate studies, and rehearsed and performed in the autumn of my third, Macbeth served as my thesis production at Lindenwood University. In this paper, I will address the details of how the production came to light, my approach to developing the performance, what was learned throughout the nearly year-long endeavor, and how those lessons have shaped my relationship with the art of storytelling

    CGAMES'2009
