Zero-Shot Visual Recognition using Semantics-Preserving Adversarial Embedding Networks
We propose a novel framework called Semantics-Preserving Adversarial
Embedding Network (SP-AEN) for zero-shot visual recognition (ZSL), where test
images and their classes are both unseen during training. SP-AEN aims to tackle
the inherent problem --- semantic loss --- in the prevailing family of
embedding-based ZSL, where some semantics would be discarded during training if
they are non-discriminative for training classes, but could become critical for
recognizing test classes. Specifically, SP-AEN prevents the semantic loss by
introducing an independent visual-to-semantic space embedder which disentangles
the semantic space into two subspaces for the two arguably conflicting
objectives: classification and reconstruction. Through adversarial learning of
the two subspaces, SP-AEN can transfer the semantics from the reconstructive
subspace to the discriminative one, achieving improved zero-shot
recognition of unseen classes. Compared with prior work, SP-AEN can not only
improve classification but also generate photo-realistic images, demonstrating
the effectiveness of semantic preservation. On four popular benchmarks (CUB,
AWA, SUN, and aPY), SP-AEN considerably outperforms other state-of-the-art
methods, with absolute gains of 12.2%, 9.3%, 4.0%, and 3.6% in terms of
harmonic mean.
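The harmonic mean reported above is the standard metric in generalized zero-shot learning: it balances accuracy on seen and unseen classes, so a model that only does well on seen classes scores poorly. A minimal sketch (the accuracy values below are illustrative, not the paper's results):

```python
def harmonic_mean(acc_seen: float, acc_unseen: float) -> float:
    """Harmonic mean of seen- and unseen-class accuracies,
    the standard generalized zero-shot learning metric."""
    if acc_seen + acc_unseen == 0:
        return 0.0
    return 2 * acc_seen * acc_unseen / (acc_seen + acc_unseen)

# Illustrative values only: a classifier strong on seen classes but
# weak on unseen ones is penalized by the harmonic mean.
print(harmonic_mean(0.70, 0.30))  # 0.42
```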
Towards automated infographic design: Deep learning-based auto-extraction of extensible timeline
Designers need to consider not only perceptual effectiveness but also visual
styles when creating an infographic. This process can be difficult and
time-consuming for professional designers, not to mention non-expert users,
leading to the demand for automated infographic design. As a first step, we focus on
timeline infographics, which have been widely used for centuries. We contribute
an end-to-end approach that automatically extracts an extensible timeline
template from a bitmap image. Our approach adopts a deconstruction and
reconstruction paradigm. At the deconstruction stage, we propose a multi-task
deep neural network that simultaneously parses two kinds of information from a
bitmap timeline: 1) the global information, i.e., the representation, scale,
layout, and orientation of the timeline, and 2) the local information, i.e.,
the location, category, and pixels of each visual element on the timeline. At
the reconstruction stage, we propose a pipeline with three techniques, i.e.,
Non-Maximum Merging, Redundancy Recover, and DL GrabCut, to extract an
extensible template from the infographic, by utilizing the deconstruction
results. To evaluate the effectiveness of our approach, we synthesize a
timeline dataset (4296 images) and collect a real-world timeline dataset (393
images) from the Internet. We first report quantitative evaluation results of
our approach over the two datasets. Then, we present examples of automatically
extracted templates and timelines automatically generated based on these
templates to qualitatively demonstrate the performance. The results confirm
that our approach can effectively extract extensible templates from real-world
timeline infographics.
Comment: 10 pages, Automated Infographic Design, Deep Learning-based Approach,
Timeline Infographics, Multi-task Mode
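The abstract names Non-Maximum Merging as one of its reconstruction techniques but does not define it. A plausible reading is a variant of non-maximum suppression that merges overlapping detections instead of discarding them; the sketch below is a hypothetical stand-in under that assumption, not the paper's actual algorithm:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def merge_boxes(boxes, iou_thresh=0.5):
    """Greedily merge boxes whose IoU exceeds the threshold by
    taking their union (hypothetical stand-in for the paper's
    Non-Maximum Merging step)."""
    merged = []
    for box in boxes:
        for i, m in enumerate(merged):
            if iou(box, m) > iou_thresh:
                merged[i] = (min(box[0], m[0]), min(box[1], m[1]),
                             max(box[2], m[2]), max(box[3], m[3]))
                break
        else:
            merged.append(tuple(box))
    return merged

# Two heavily overlapping detections collapse into one union box;
# the distant third box survives on its own.
print(merge_boxes([(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]))
```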
Joint Video and Text Parsing for Understanding Events and Answering Queries
We propose a framework for parsing video and text jointly for understanding
events and answering user queries. Our framework produces a parse graph that
represents the compositional structures of spatial information (objects and
scenes), temporal information (actions and events) and causal information
(causalities between events and fluents) in the video and text. The knowledge
representation of our framework is based on a spatial-temporal-causal And-Or
graph (S/T/C-AOG), which jointly models possible hierarchical compositions of
objects, scenes and events as well as their interactions and mutual contexts,
and specifies the prior probabilistic distribution of the parse graphs. We
present a probabilistic generative model for joint parsing that captures the
relations between the input video/text, their corresponding parse graphs and
the joint parse graph. Based on the probabilistic model, we propose a joint
parsing system consisting of three modules: video parsing, text parsing and
joint inference. Video parsing and text parsing produce two parse graphs from
the input video and text respectively. The joint inference module produces a
joint parse graph by performing matching, deduction and revision on the video
and text parse graphs. The proposed framework has three objectives. First, we
aim at deep semantic parsing of video and text that goes beyond traditional
bag-of-words approaches. Second, we perform parsing and reasoning across the
spatial, temporal, and causal dimensions based on the joint S/T/C-AOG
representation. Third, we show that deep joint parsing facilitates subsequent
applications such as generating narrative text descriptions and answering
queries in the forms of who, what, when, where, and why. We empirically
evaluated our system by comparing its output against ground truth and by
measuring query-answering accuracy, and obtained satisfactory results.
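The core structure here, an And-Or graph, composes entities with And-nodes (all children must be present) and alternatives with Or-nodes (exactly one child is selected). A toy sketch of how such a graph induces multiple candidate parse trees, with illustrative node labels that are not the paper's actual ontology:

```python
# Minimal And-Or graph sketch: nodes are (kind, label, children) triples.
# "and" requires all children, "or" selects one, "leaf" is terminal.
def parses(node):
    """Enumerate all parse trees rooted at `node` as nested tuples."""
    kind, label, children = node
    if kind == "leaf":
        return [label]
    if kind == "or":
        return [p for child in children for p in parses(child)]
    # "and": cartesian product of the child parses
    results = [(label,)]
    for child in children:
        results = [r + (p,) for r in results for p in parses(child)]
    return results

# A "drink" event composed of a person and one of two vessels
# (labels are illustrative only).
event = ("and", "drink", [
    ("leaf", "person", []),
    ("or", "vessel", [("leaf", "cup", []), ("leaf", "bottle", [])]),
])
print(parses(event))  # two alternative parses of the same event
```

Joint inference in the paper's sense would then score such candidate parse graphs against both the video and text evidence and keep the most probable one.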
Image Segmentation of Bacterial Cells in Biofilms
Bacterial biofilms are three-dimensional cell communities that live embedded in a self-produced extracellular matrix. Because of the protective properties of this dense coexistence of microorganisms, single bacteria inside the communities are hard to eradicate with antibacterial agents and bacteriophages. This increased resilience gives rise to severe problems in medical and technological settings. To fight the bacterial cells, a detailed understanding of the underlying mechanisms of biofilm formation and development is required. Because of spatio-temporal variations in environmental conditions inside a single biofilm, these mechanisms can only be investigated by probing single cells at different locations over time. Currently, the mechanistic information is primarily encoded in volumetric image data gathered with confocal fluorescence microscopy. To quantify features of single-cell behaviour, single objects need to be detected. This identification of objects inside biofilm image data is called segmentation and is a key step towards understanding the biological processes inside biofilms.
In the first part of this work, a user-friendly computer program is presented which simplifies the analysis of bacterial biofilms. It provides a comprehensive set of tools to segment, analyse, and visualize fluorescence microscopy data without writing a single line of analysis code. This enables faster feedback loops between experiment and analysis and quick insights into the gathered data.
The single-cell segmentation accuracy of a recent segmentation algorithm is then discussed in detail. In this discussion, points for improvement are identified and a new, optimized segmentation approach is presented. The improved algorithm achieves superior segmentation accuracy on bacterial biofilms compared to current state-of-the-art algorithms.
Finally, the possibility of deep learning-based end-to-end segmentation of biofilm data is investigated. A method for the quick generation of training data is presented, and two single-cell segmentation approaches for eukaryotic cells are adapted for the segmentation of bacterial biofilms.
Bacterial biofilms are three-dimensional cell clusters that produce their own matrix. This self-produced matrix offers the cells communal protection against external stress factors. These stress factors can be abiotic, such as temperature and nutrient fluctuations, or biotic, such as antibiotic treatment or bacteriophage infections. As a result, individual cells within these microbial communities exhibit increased resilience and pose a major challenge for medicine and technical applications. To combat biofilms effectively, the mechanisms underlying their growth and development must be deciphered.
Owing to the high cell density within the communities, these mechanisms are not spatially and temporally invariant but depend, for example, on metabolite, nutrient, and oxygen gradients. Observations at the single-cell level are therefore indispensable for their description. Non-invasive investigation of individual cells inside a biofilm relies on confocal fluorescence microscopy. To extract cell properties from the collected three-dimensional image data, the individual cells must be detected. The digital reconstruction of cell morphology plays a particularly important role here; it is obtained by segmenting the image data, whereby individual image elements are assigned to the depicted objects. This makes it possible to distinguish the individual objects from one another and to extract their properties.
In the first part of this work, a user-friendly computer program is presented that substantially simplifies the segmentation and analysis of fluorescence microscopy data. It offers an extensive selection of traditional segmentation algorithms, parameter calculations, and visualization options. All functions are accessible without programming knowledge, making them available to a large group of users. The implemented functions significantly shorten the time between a completed experiment and the finished data analysis. Through a rapid succession of continuously adapted experiments, scientific insights into biofilms can be gained quickly.
Complementing the existing methods for single-cell segmentation in biofilms, an improvement is presented that surpasses the accuracy of previous filter-based algorithms and represents a further step towards temporally and spatially resolved single-cell tracking within bacterial biofilms.
Finally, the applicability of deep learning algorithms for segmentation in biofilms is evaluated. For this purpose, a method is presented that drastically reduces the annotation effort for training data compared to fully manual annotation. The generated data are used to train the algorithms, and the segmentation accuracy is examined on experimental data.
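The segmentation step described above assigns image elements to individual objects so that each cell can be analysed separately. A toy sketch of that idea, labeling connected foreground regions on a binarized 2D grid (real biofilm data is volumetric and fluorescence-based; this is purely illustrative, not the thesis's algorithm):

```python
# Toy sketch of segmentation as connected-component labeling:
# each connected group of foreground pixels gets its own object id.
def label_components(grid):
    """4-connected component labeling via iterative flood fill."""
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    current = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and not labels[r][c]:
                current += 1
                stack = [(r, c)]
                while stack:
                    y, x = stack.pop()
                    if (0 <= y < rows and 0 <= x < cols
                            and grid[y][x] and not labels[y][x]):
                        labels[y][x] = current
                        stack += [(y + 1, x), (y - 1, x),
                                  (y, x + 1), (y, x - 1)]
    return labels, current

# A binarized toy image containing two separate "cells".
image = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
labels, n_cells = label_components(image)
print(n_cells)  # 2
```

Once every pixel carries an object id, per-cell properties (size, position, morphology) can be computed by grouping pixels by label.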