205 research outputs found

    Data driven approaches for investigating molecular heterogeneity of the brain

    Get PDF
    It has been proposed that one of the clearest organizing principles of most sensory systems is the existence of parallel subcircuits and processing streams that form orderly, systematic mappings from stimulus space to neurons. Although the spatial heterogeneity of the early olfactory circuitry has long been recognized, comparatively little is known about the circuits that propagate sensory signals downstream. Investigating the potential modularity of the bulb’s intrinsic circuits is difficult because tracing the termination patterns of converging projections, as has been done for the bulb’s inputs, is not feasible. Thus, if such circuit motifs exist, their detection essentially relies on identifying differential gene expression, or “molecular signatures,” that may demarcate functional subregions. With the arrival of comprehensive (whole-genome, cellular-resolution) datasets in biology and neuroscience, large-scale investigations are now possible; in particular, the densely catalogued whole-genome expression maps of the Allen Brain Atlas allow systematic study of the molecular topography of the olfactory bulb’s intrinsic circuits. To address the challenges associated with high-throughput, high-dimensional datasets, a deep learning approach will form the backbone of our informatic pipeline. In the proposed work, we test the hypothesis that the bulb’s intrinsic circuits are parceled into distinct, parallel modules that can be defined by genome-wide patterns of expression. In pursuit of this aim, our deep learning framework will facilitate the group registration of the mitral cell layers across ~50,000 in situ images of olfactory bulb circuits.
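    The abstract names the ingredients of the pipeline (whole-genome expression maps, deep learning, group registration) without procedural detail, but the hypothesis test itself can be illustrated with a minimal sketch under stated assumptions: treat each voxel of the registered mitral cell layer as a genome-wide expression vector, reduce its dimensionality, and cluster. The synthetic data and the choice of PCA plus k-means below are illustrative assumptions, not the authors' method.

```python
# Minimal sketch: do genome-wide expression patterns parcel a structure
# into spatial modules? (Synthetic data; PCA + k-means are illustrative
# choices, not the pipeline described in the abstract.)
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Assume each voxel of the mitral cell layer carries a genome-wide
# expression vector (n_voxels x n_genes); here we fabricate two modules.
n_voxels, n_genes = 1000, 200
labels_true = rng.integers(0, 2, n_voxels)
expression = rng.normal(size=(n_voxels, n_genes))
expression[labels_true == 1, :20] += 2.0  # module-specific signature

# Reduce dimensionality, then cluster voxels by expression profile.
reduced = PCA(n_components=10).fit_transform(expression)
modules = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(reduced)

# If modules exist, cluster assignments should align with the hidden labels
# (up to label permutation).
agreement = max((modules == labels_true).mean(), (modules != labels_true).mean())
print(f"cluster/label agreement: {agreement:.2f}")
```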

    Towards Content-based Pixel Retrieval in Revisited Oxford and Paris

    Full text link
    This paper introduces the first two pixel retrieval benchmarks. Pixel retrieval is segmented instance retrieval: just as semantic segmentation extends classification to the pixel level, pixel retrieval extends image retrieval and indicates which pixels are related to the query object. In addition to retrieving images for a given query, it helps users quickly identify the query object in true-positive images and exclude false-positive images by denoting the correlated pixels. Our user study shows that pixel-level annotation can significantly improve the user experience. Compared with semantic and instance segmentation, pixel retrieval requires fine-grained recognition of variable-granularity targets. To this end, we propose two pixel retrieval benchmarks, PROxford and PRParis, based on the widely used image retrieval datasets ROxford and RParis. Three professional annotators labeled 5,942 images with two rounds of double-checking and refinement. Furthermore, we conduct extensive experiments and analysis of state-of-the-art methods in image search, image matching, detection, segmentation, and dense matching on our pixel retrieval benchmarks. The results show that the pixel retrieval task is challenging for these approaches and distinct from existing problems, suggesting that further research can advance content-based pixel retrieval and thus the user search experience. The datasets can be downloaded from \href{https://github.com/anguoyuan/Pixel_retrieval-Segmented_instance_retrieval}{this link}.
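    The abstract does not spell out an evaluation protocol, but the core idea of pixel retrieval, namely that a retrieved image is only as useful as its localisation of the query object, can be sketched with an IoU-weighted ranking score. The data layout and the metric below are illustrative assumptions, not the benchmark's official measure.

```python
# Minimal sketch of pixel-retrieval scoring: a retrieved image counts only
# as much as its predicted mask overlaps the ground-truth object pixels.
# (IoU-weighted precision is an illustrative metric, not necessarily the
# benchmark's official protocol.)
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection-over-union between two boolean pixel masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter / union) if union else 0.0

def pixel_retrieval_score(ranked, gt_masks):
    """ranked: list of (image_id, predicted_mask) in retrieval order;
    gt_masks: dict image_id -> ground-truth mask (absent = false positive)."""
    scores = []
    for image_id, pred in ranked:
        gt = gt_masks.get(image_id)
        scores.append(mask_iou(pred, gt) if gt is not None else 0.0)
    return float(np.mean(scores)) if scores else 0.0

# Toy example: one true positive with an imperfect mask, one false positive.
gt = np.zeros((4, 4), bool); gt[1:3, 1:3] = True
pred = np.zeros((4, 4), bool); pred[1:3, 1:4] = True
print(pixel_retrieval_score([("a", pred), ("b", pred)], {"a": gt}))
```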

    ‘Doing Food-Knowing Food: An Exploration of Allotment Practices and the Production of Knowledge Through Visceral Engagement’

    Get PDF
    The original contribution of this thesis lies in its conceptualisations of human and more-than-human encounters on the allotment that break down the boundaries of subjectivities. This work extends knowledge of cultural food geography by investigating how people engage with the matter of the plot and learn to grow food. The conceptual tool by which this occurs is set out as processes of visceral learning within a framework of mattering. The work therefore follows the material transformations of matter across production-consumption cycles of allotment produce, examined through processes of bodily adaptation to the matter of the plot. The process of growing one’s own food affords an opportunity to focus on doing and becoming, allowing the how of food growing to take centre stage (Crouch 2003, Ingold 2010, Grosz 1999). Procuring and producing food for consumption is enacted through the human and more-than-human interface of bodily engagement, which disrupts dualisms and reveals their complex inter-relationships, as well as the potential of visceral research (Roe 2006, Whatmore 2006, Hayes-Conroy 2008). This is therefore an immersive account of the procurement of food and the development of food knowledge through material, sensory and visceral becomings, which occur within a contextual frame of everyday food experiences. The study is contextualised in the complexities of contemporary food issues, where matters of access, foodism and sustainability shape the enquiry; the research itself, however, is carried out through a micro-geographical lens of bodily engagement with food matter in grow-your-own practices on allotments. Growing food on new allotments is the locus of procurement, reflecting a resurgence in such activities following the recent rise of interest in local food, alternative food networks (AFNs) and food as a conduit for celebrity in the media (Dupuis & Goodman 2005, Lockie & Kitto 2000, Winter 2003). Moreover, the current spread of the allotment is examined as transgressing urban/rural divides and disrupting traditional perceptions of plot users. This allows investigation of spaces where community processes can unfold, providing a richly observed insight into the broadened demographics of recent allotment life.

    Unsupervised quantification of entity consistency between photos and text in real-world news

    Get PDF
    In today’s information age, the World Wide Web and social media are important sources of news and information. Different modalities (in the sense of information encoding) such as photos and text are typically used to communicate news more effectively or to attract attention. Communication scientists, linguists, and semioticians have studied the complex interplay between modalities for decades and investigated, for example, how their combination can carry additional information or add a new level of meaning. The number of shared concepts or entities (e.g., persons, locations, and events) between photos and text is an important aspect in evaluating the overall message and meaning of an article. Computational models for the quantification of image-text relations enable many applications: they allow for more efficient exploration of news, facilitate semantic search and multimedia retrieval in large (web) archives, and assist human assessors in evaluating news for credibility. To date, only a few approaches have been suggested that quantify relations between photos and text. They either do not explicitly consider the cross-modal relations of entities, which are important in the news, or rely on supervised deep learning approaches that can only detect the cross-modal presence of entities covered in the labeled training data.
    To address this research gap, this thesis proposes an unsupervised approach that quantifies entity consistency between photos and text in multimodal real-world news articles. The first part of the thesis presents novel deep learning approaches for information extraction from photos to recognize events, locations, dates, and persons. These approaches are an important prerequisite for measuring the cross-modal presence of entities in text and photos. First, an ontology-driven event classification approach that leverages new loss functions and weighting schemes is presented. It is trained on a novel dataset of 570,540 photos and an ontology with 148 event types, and it outperforms approaches that do not use structured ontology information. Second, a novel deep learning approach for geolocation estimation is proposed that uses additional contextual information on the environmental setting (indoor, urban, natural) and on earth partitions of different granularity. The proposed solution outperforms state-of-the-art approaches, even though they are trained with significantly more photos. Third, we introduce the first large-scale dataset for date estimation, with more than one million photos taken between 1930 and 1999, along with two deep learning approaches that treat date estimation as a classification and a regression problem, respectively. Both approaches achieve very good results, superior to human annotations. Finally, a novel approach is presented that identifies public persons and their co-occurrences in news photos extracted from the Internet Archive, which collects time-versioned snapshots of web pages that are rarely enriched with metadata relevant to multimedia retrieval. Experimental results confirm the effectiveness of the underlying deep learning approach for person identification.
    The second part of the thesis introduces an unsupervised system for quantifying image-text relations in real-world news. Unlike related work, the proposed solution automatically provides novel measures of cross-modal consistency for different entity types (persons, locations, and events) as well as for the overall context. The approach does not rely on predefined datasets and can therefore cope with the large number and diversity of entities and topics covered in the news. State-of-the-art natural language processing tools are applied to extract named entities from the text, and example photos for these entities are automatically crawled from the Web. The proposed methods for information extraction from photos are applied to both news images and example photos to quantify the cross-modal consistency of entities. Two tasks are introduced to assess the quality of the proposed approach in real-world applications. Experimental results for document verification and for retrieval of news with either low (potential misinformation) or high cross-modal consistency demonstrate the feasibility of the approach and its potential to support human assessors in studying news.
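    As a rough illustration of the consistency measure described above, the sketch below scores each entity named in the text by the best visual match between its crawled example photos and the news photo, then averages across entities. The `embed` stub stands in for the trained person/location/event models from the thesis; the function names, shapes, and random "images" are hypothetical, for illustration only.

```python
# Minimal sketch of cross-modal entity consistency: for each entity named
# in the text, compare example photos of that entity against the news
# photo and aggregate. The embedding function is a stand-in for the
# trained visual models described in the thesis (hypothetical).
import numpy as np

def embed(image: np.ndarray) -> np.ndarray:
    """Placeholder visual embedding; a real system would use a trained
    person/location/event model here."""
    v = image.astype(float).ravel()[:64]
    return v / (np.linalg.norm(v) + 1e-9)

def entity_consistency(news_photo, example_photos_per_entity):
    """example_photos_per_entity: dict entity -> list of example images.
    Returns per-entity best similarity and the overall mean."""
    news_vec = embed(news_photo)
    per_entity = {}
    for entity, examples in example_photos_per_entity.items():
        sims = [float(np.dot(news_vec, embed(img))) for img in examples]
        per_entity[entity] = max(sims) if sims else 0.0
    overall = float(np.mean(list(per_entity.values()))) if per_entity else 0.0
    return per_entity, overall

# Toy usage with random arrays standing in for photos.
rng = np.random.default_rng(1)
news = rng.random((8, 8))
examples = {"Angela Merkel": [rng.random((8, 8)) for _ in range(3)]}
print(entity_consistency(news, examples))
```

    A low overall score would flag a potential image-text mismatch for a human assessor, which is the document-verification use case the abstract describes.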

    The development and application of the use of encased voids within the body of glass artefacts as a means of drawing and expression

    Get PDF
    This practice-led thesis is based on a study of the use of encased voids, or bubbles, in glass. The study is grounded in practice and, drawing on antecedents in philosophy, psychology and epistemology, develops a methodology called Reflective Risk. It shows that, through a rigorous analysis of practice using video and personal reflection, new insights emerge. The study is framed by craft practice (the word craft here denoting a collection of 'genres' of which glass is part). The thesis uses experiential learning as a tool and a means of understanding the practice of creating and controlling encased voids in glass in the context of contemporary applied arts practice. The framework, Reflective Risk, is constructivist in approach: it is based on Experiential Learning Theory (ELT) but also draws on epistemological theories of tacit knowledge. The thesis shows that, through an understanding of technique and material qualities, process can be deconstructed to reveal new insights. It documents how an understanding of ELT and a range of self-regulatory antecedents can influence the cognitive process of craft practice through praxis. The results of this study are directed, on the one hand, to glass practitioners and, on the other, to providing a theoretical approach appropriate for the reflective practitioner working in other media through a parallel method of enquiry.

    Text-to-Video: Image Semantics and NLP

    Get PDF
    When aiming to automatically translate an arbitrary text into a visual story, the main challenge is to find a semantically close visual representation whose displayed meaning remains the same as in the given text. In addition, the appearance of an image itself largely influences how its meaningful information is conveyed to an observer. This thesis demonstrates that investigating both image semantics and the semantic relatedness between visual and textual sources enables us to tackle the challenging semantic gap and to find a semantically close translation from natural language to a corresponding visual representation. In recent years, social networking has attracted great interest, leading to an enormous and still increasing amount of data available online. Photo-sharing sites such as Flickr allow users to associate textual information with their uploaded imagery; this thesis exploits this huge source of user-generated data, which provides initial links between images, words, and other meaningful data. To approach visual semantics, this work presents various methods for analyzing the visual structure and appearance of images in terms of meaningful similarities, aesthetic appeal, and emotional effect on an observer. In detail, our GPU-based approach efficiently finds visual similarities between images in large datasets across visual domains and identifies the various meanings of ambiguous words by exploring similarity in online search results. Further, we investigate the highly subjective aesthetic appeal of images and use deep learning to learn aesthetic rankings directly from a broad diversity of user reactions in online social behavior. To gain even deeper insights into the influence of visual appearance on an observer, we explore how simple image processing can actually change emotional perception, and we derive a simple but effective image filter. To identify meaningful connections between written text and visual representations, we employ methods from Natural Language Processing (NLP). Extensive textual processing allows us to create semantically relevant illustrations for simple text elements as well as for complete storylines. More precisely, we present an approach that resolves dependencies in textual descriptions in order to arrange 3D models correctly. Further, we develop a method that finds semantically relevant illustrations for texts of different types, based on a novel hierarchical querying algorithm. Finally, we present an optimization-based framework that can generate picture stories in different styles that are not only semantically relevant but also visually coherent.
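    One concrete step mentioned above is resolving dependencies in textual descriptions to arrange 3D models. A minimal sketch of that idea, using spaCy's dependency parser as an illustrative choice (the thesis does not name its parser here), extracts (subject, preposition, object) triples that a scene builder could turn into relative placements; it assumes the `en_core_web_sm` model is installed.

```python
# Minimal sketch of resolving spatial dependencies in a description so a
# scene builder knows which object goes where. spaCy is an illustrative
# choice, not necessarily the tooling used in the thesis.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the model is installed

def spatial_triples(sentence: str):
    """Extract (subject, preposition, object) triples such as
    ('lamp', 'on', 'table') from a scene description."""
    doc = nlp(sentence)
    triples = []
    for tok in doc:
        if tok.dep_ == "prep":  # e.g. "on" attached to a verb or noun
            pobj = next((c for c in tok.children if c.dep_ == "pobj"), None)
            subj = next((c for c in tok.head.children
                         if c.dep_ in ("nsubj", "nsubjpass")), None)
            if pobj is not None and subj is not None:
                triples.append((subj.lemma_, tok.text, pobj.lemma_))
    return triples

print(spatial_triples("The lamp stands on the table next to the sofa."))
# -> [('lamp', 'on', 'table')], which a layout engine could map to a
#    relative placement constraint between the two 3D models.
```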

    The integrated sound, space and movement environment : The uses of analogue and digital technologies to correlate topographical and gestural movement with sound

    Get PDF
    This thesis investigates correlations between auditory parameters and parameters associated with movement in a sensitised space. The research examines those aspects of sound that form correspondences with the movement, force or position of a body or bodies in a space sensitised by devices for acquiring gestural or topographical data. A wide range of digital technologies is scrutinised to establish which are most effective for obtaining detailed and accurate information about movement in a given space, and to identify methods and procedures for its analysis, transposition and synthesis into sound. The thesis describes pertinent work in the field from the last 20 years, the issues that have been raised in those works, and the issues raised by my own work in the area. It draws conclusions that point towards further development of an integrated model of a space that is sensitised to movement and responds in sound in such a way that it can be appreciated by performers and audiences. The artistic and research practices cited are principally from the areas of dance-and-technology, sound installation, and alternative gestural controllers for musical applications.
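    As a toy illustration of the kind of movement-to-sound correlation the thesis examines, the sketch below maps normalised position and speed in a sensitised space to pitch, amplitude and stereo pan. The ranges and mapping curves are arbitrary assumptions for illustration, not a system described in the thesis.

```python
# Minimal sketch of a movement-to-sound mapping: position in a sensitised
# space drives pitch and pan, speed drives amplitude. All ranges and
# mapping curves are illustrative assumptions.
def movement_to_sound(x: float, y: float, speed: float):
    """x, y in [0, 1] (normalised room coordinates); speed in m/s.
    Returns (frequency_hz, amplitude, pan)."""
    low_hz, high_hz = 110.0, 880.0                 # a two-octave pitch range
    frequency = low_hz * (high_hz / low_hz) ** y   # exponential pitch axis
    amplitude = min(speed / 2.0, 1.0)              # clip at 2 m/s
    pan = 2.0 * x - 1.0                            # -1 = left, +1 = right
    return frequency, amplitude, pan

# A performer moving slowly across the middle of the space:
print(movement_to_sound(x=0.25, y=0.5, speed=0.6))
```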