Object recognition using multi-view imaging
Single-view imaging data has been used in most previous research in computer vision and
image understanding, and many techniques have been developed for it. Recently, with the rapid
development and falling cost of multiple cameras, it has become possible to use many
more views for image processing tasks. This thesis considers how to use the
resulting multiple images for target object recognition.
In this context, we present two algorithms for object recognition based on
scale-invariant feature points. The first is a single-view object recognition method (SOR), which
operates on single images and uses a chirality constraint to reduce the recognition errors
that arise when only a small number of feature points are matched. The procedure is
extended in the second multi-view object recognition algorithm (MOR) which operates on
a multi-view image sequence and, by tracking feature points using a dynamic programming
method in the plenoptic domain subject to the epipolar constraint, is able to fuse feature
point matches from all the available images, resulting in more robust recognition.
We evaluated these algorithms using a number of data sets of real images capturing
both indoor and outdoor scenes. We demonstrate that MOR outperforms SOR, particularly for
noisy and low-resolution images, and that, when combined with segmentation techniques, it can
also recognize partially occluded objects.
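The epipolar constraint mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's tracking method: the function names, the tolerance, and the example fundamental matrix are all assumptions made for illustration. It only shows the standard test that a candidate match (x1, x2) between two views should satisfy x2^T F x1 ≈ 0 for the fundamental matrix F.

```python
def epipolar_residual(F, p1, p2):
    """Algebraic residual |x2^T F x1| for one candidate match.

    F is a 3x3 fundamental matrix given as nested lists; p1 and p2 are
    (x, y) pixel coordinates in the first and second view.
    """
    x1 = (p1[0], p1[1], 1.0)  # homogeneous point in view 1
    x2 = (p2[0], p2[1], 1.0)  # homogeneous point in view 2
    Fx1 = [sum(F[r][c] * x1[c] for c in range(3)) for r in range(3)]
    return abs(sum(x2[r] * Fx1[r] for r in range(3)))


def filter_matches(F, matches, tol):
    """Keep only the matches whose residual is below tol (hypothetical threshold)."""
    return [m for m in matches if epipolar_residual(F, m[0], m[1]) < tol]


# Assumed example: two rectified cameras related by a horizontal translation,
# for which the residual reduces to |y1 - y2|.
F_RECTIFIED = [[0.0, 0.0, 0.0],
               [0.0, 0.0, -1.0],
               [0.0, 1.0, 0.0]]

candidates = [((10.0, 20.0), (15.0, 20.0)),   # same image row: consistent
              ((30.0, 40.0), (35.0, 47.0))]   # rows differ by 7: rejected
kept = filter_matches(F_RECTIFIED, candidates, tol=1.0)
```

The residual here is algebraic rather than a geometric distance in pixels, so a practical threshold would depend on how F is scaled.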
Systems analysis of guard cell membrane transport for enhanced stomatal dynamics and water use efficiency
Stomatal transpiration is at the centre of a crisis in water availability and crop production that is expected to unfold over the next 20-30 years. Global water usage has increased 6-fold in the past 100 years, twice as fast as the human population, and is expected to double again before 2030, driven mainly by irrigation and agriculture. Guard cell membrane transport is integral to controlling stomatal aperture and offers important targets for genetic manipulation to improve crop performance. However, its complexity presents a formidable barrier to exploring such possibilities. With few exceptions, mutations that increase water use efficiency commonly have been found to do so with substantial costs to the rate of carbon assimilation, reflecting the trade-off in CO2 availability with suppressed stomatal transpiration. One approach yet to be explored in any detail relies on quantitative systems analysis of the guard cell. Our deep knowledge of transport and homeostasis in these cells gives real substance to the prospect for ‘reverse engineering’ of stomatal responses, using in silico design in directing genetic manipulation for improved water use and crop yields. Here we address this problem with a focus on stomatal kinetics, taking advantage of the OnGuard software and models of the stomatal guard cell (www.psrg.org.uk) recently developed for exploring stomatal physiology. Our analysis suggests that manipulations of single transporter populations are likely to have unforeseen consequences. Channel gating, especially of the dominant K+ channels, appears the most favorable target for experimental manipulation.
Harvesting Discriminative Meta Objects with Deep CNN Features for Scene Classification
Recent work on scene classification still makes use of generic CNN features
in a rudimentary manner. In this ICCV 2015 paper, we present a novel pipeline
built upon deep CNN features to harvest discriminative visual objects and parts
for scene classification. We first use a region proposal technique to generate
a set of high-quality patches potentially containing objects, and apply a
pre-trained CNN to extract generic deep features from these patches. Then we
perform both unsupervised and weakly supervised learning to screen these
patches and discover discriminative ones representing category-specific objects
and parts. We further apply discriminative clustering enhanced with local CNN
fine-tuning to aggregate similar objects and parts into groups, called meta
objects. A scene image representation is constructed by pooling the feature
response maps of all the learned meta objects at multiple spatial scales. We
have confirmed that the scene image representation obtained using this new
pipeline is capable of delivering state-of-the-art performance on two popular
scene benchmark datasets, MIT Indoor 67~\cite{MITIndoor67} and
Sun397~\cite{Sun397}.
Comment: To Appear in ICCV 201
Probing the ligand receptor interface of TNF ligand family members RANKL and TRAIL
During the last two decades, research has shown that the tumor necrosis factor (TNF) superfamily is important in numerous biological activities, such as mediating cellular apoptosis, survival, differentiation or proliferation. The binding between TNF superfamily ligands and receptors regulates normal physiological processes, while its deregulation may cause harmful effects. Therefore, targeting TNF superfamily ligands or receptors with either agonistic or antagonistic molecules may provide novel approaches for therapy. The work described in this thesis focuses on the ligand-receptor interface of the TNF superfamily members Ligand of Receptor Activator of Nuclear Factor κB (RANKL) and TNF-Related Apoptosis Inducing Ligand (TRAIL), with the aim of designing and characterizing novel recombinant RANKL and TRAIL variants for use as potential therapeutics.
Collaborative Deep Reinforcement Learning for Joint Object Search
We examine the problem of joint top-down active search of multiple objects
under interaction, e.g., a person riding a bicycle, cups held by the table, etc.
Such objects under interaction often can provide contextual cues to each other
to facilitate more efficient search. By treating each detector as an agent, we
present the first collaborative multi-agent deep reinforcement learning
algorithm to learn the optimal policy for joint active object localization,
which effectively exploits such beneficial contextual information. We learn
inter-agent communication through cross connections with gates between the
Q-networks, which is facilitated by a novel multi-agent deep Q-learning
algorithm with joint exploitation sampling. We verify our proposed method on
multiple object detection benchmarks. Not only does our model help to improve
the performance of state-of-the-art active localization models, it also reveals
interesting co-detection patterns that are intuitively interpretable.
Speaker-following Video Subtitles
We propose a new method for improving the presentation of subtitles in video
(e.g. TV and movies). With conventional subtitles, the viewer has to constantly
look away from the main viewing area to read the subtitles at the bottom of the
screen, which disrupts the viewing experience and causes unnecessary eyestrain.
Our method places on-screen subtitles next to the respective speakers to allow
the viewer to follow the visual content while simultaneously reading the
subtitles. We use novel identification algorithms to detect the speakers based
on audio and visual information. Then the placement of the subtitles is
determined using global optimization. A comprehensive usability study indicated
that our subtitle placement method outperformed both conventional
fixed-position subtitling and another previous dynamic subtitling method in
terms of enhancing the overall viewing experience and reducing eyestrain.
Stable Feature Selection from Brain sMRI
Neuroimage analysis usually involves learning thousands or even millions of
variables using only a limited number of samples. In this regard, sparse
models, e.g. the lasso, are applied to select the optimal features and achieve
high diagnosis accuracy. The lasso, however, usually results in independent
unstable features. Stability, a manifestation of the reproducibility of statistical
results subject to reasonable perturbations to data and the model, is an
important focus in statistics, especially in the analysis of high dimensional
data. In this paper, we explore a nonnegative generalized fused lasso model for
stable feature selection in the diagnosis of Alzheimer's disease. In addition
to sparsity, our model incorporates two important pathological priors: the
spatial cohesion of lesion voxels and the positive correlation between the
features and the disease labels. To optimize the model, we propose an efficient
algorithm by proving a novel link between total variation and fast network flow
algorithms via conic duality. Experiments show that the proposed nonnegative
model performs much better in exploring the intrinsic structure of data via
selecting stable features compared with other state-of-the-art methods.
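For orientation, a nonnegative generalized fused lasso is commonly written in the generic form below. This is a sketch under stated assumptions, not necessarily the exact objective used in the paper: the loss $\ell$, the neighbourhood graph $E$ over voxels, and the weights $\lambda_1, \lambda_2$ are notation introduced here for illustration.

```latex
\min_{\beta \ge 0} \;
  \ell(\beta; X, y)
  \;+\; \lambda_1 \lVert \beta \rVert_1
  \;+\; \lambda_2 \sum_{(i,j) \in E} \lvert \beta_i - \beta_j \rvert
```

In this form the $\ell_1$ term encourages sparsity, the pairwise term over neighbouring voxels $(i,j) \in E$ encourages spatially cohesive selections, and the constraint $\beta \ge 0$ encodes a positive correlation between the selected features and the disease label.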