Object Tracking
Object tracking is the estimation of the trajectories of moving objects in a sequence of images. Automating object tracking on a computer is a difficult task: the dynamics of the many changing parameters that represent the features and motion of the objects, as well as temporary partial or full occlusion of the tracked objects, have to be considered. This monograph presents the development of object tracking algorithms, methods and systems. Both the state of the art of object tracking methods and the new trends in research are described in this book. Fourteen chapters are split into two sections: Section 1 presents new theoretical ideas, whereas Section 2 presents real-life applications. Despite the variety of topics in this monograph, it constitutes a consistent body of knowledge in the field of computer object tracking. The intention of the editor was to follow up on the very quick progress in the development of methods as well as the extension of their applications.
Scene understanding for interactive applications
In order to interact with the environment, it is necessary to understand what is happening in the scene where the action is occurring. Decades of research in the computer vision field have contributed towards automatically achieving this scene understanding from visual information. Scene understanding is a very broad area of research within the computer vision field. We could say that it tries to replicate the human capability of extracting plenty of information from visual data.
For example, we would like to understand how people perceive the world in three dimensions, or how they can quickly recognize places or objects despite substantial appearance variation. One of the basic tasks in scene understanding from visual data is to assign a semantic meaning to every element of the image, i.e., to assign a concept or object label to every pixel. This problem can be formulated as a dense image labeling problem which assigns specific values (labels) to each pixel or region in the image. Depending on the application, the labels can represent very different concepts, from a physical magnitude, such as depth information, to high-level semantic information, such as an object category. The general goal of this thesis is to investigate and develop new ways to automatically incorporate human feedback or prior knowledge into intelligent systems that require scene understanding capabilities. In particular, this thesis explores two common sources of prior information from users: human interaction and human labeling of sample data. The first part of this thesis is focused on learning complex scene information from interactive human knowledge. Interactive user solutions impose limitations on performance, since the feedback to the user must be produced at interactive rates. This thesis presents an efficient interaction paradigm that approximates any per-pixel magnitude from a few user strokes, propagating the sparse user input to every pixel of the image. We demonstrate the suitability of the proposed paradigm through three interactive image editing applications which require per-pixel knowledge of a certain magnitude: simulating the effect of depth of field, dehazing, and HDR tone mapping. Another common strategy to learn from user prior knowledge is to design supervised machine-learning approaches. In recent years, Convolutional Neural Networks (CNNs) have pushed the state of the art on a broad variety of visual recognition problems.
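The stroke-propagation paradigm described earlier in this abstract can be illustrated with a deliberately minimal sketch (numpy only; the function and variable names are hypothetical, and the simple neighbour-averaging below stands in for the thesis's actual edge-aware optimisation): sparse user strokes are diffused to every pixel while the stroked pixels stay clamped.

```python
import numpy as np

def propagate_strokes(values, mask, iterations=300):
    """Spread sparse per-pixel values (defined where mask is True) to every
    pixel by repeated 4-neighbour averaging, re-imposing the user strokes
    after each pass.  A much-simplified stand-in for edge-aware propagation."""
    out = np.where(mask, values, 0.0).astype(float)
    for _ in range(iterations):
        padded = np.pad(out, 1, mode="edge")
        out = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
               padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
        out[mask] = values[mask]          # strokes act as hard constraints
    return out

# toy 5x5 "image": the user marks the left column as depth 0.0
# and the right column as depth 1.0; propagation fills in the rest
mask = np.zeros((5, 5), dtype=bool)
mask[:, 0] = mask[:, -1] = True
strokes = np.zeros((5, 5))
strokes[:, -1] = 1.0
depth = propagate_strokes(strokes, mask)
```

On this toy example the result converges to a smooth ramp between the two stroked columns; a real implementation would additionally weight the diffusion by image edges so that the propagated magnitude respects object boundaries.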
However, for new tasks, enough training data is not always available, and therefore training from scratch is not always feasible. The second part of this thesis investigates how to improve systems that learn dense semantic labeling of images from user-labeled examples. In particular, we present and validate strategies, based on common transfer learning approaches, for semantic segmentation. The goal of these strategies is to learn new specific classes when there is not enough labeled data to train from scratch. We evaluate these strategies across different environments, such as autonomous driving scenes, aerial images and underwater ones.
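One of the common transfer approaches, reusing a frozen pretrained feature extractor and training only a new classifier head on scarce labels, can be sketched as follows (numpy stand-ins; a real system would use a pretrained CNN backbone and a segmentation head, and all names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-in for a pretrained feature extractor: a fixed random projection
# with a ReLU; in real transfer learning these weights come from a network
# trained on a large dataset and are kept frozen
W_frozen = rng.normal(size=(4, 16))

def backbone(x):
    return np.maximum(x @ W_frozen, 0.0)

def train_head(feats, labels, lr=0.5, steps=500):
    """Train only a new linear head (logistic regression) on frozen features."""
    w, b = np.zeros(feats.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))
        grad = p - labels
        w -= lr * feats.T @ grad / len(labels)
        b -= lr * grad.mean()
    return w, b

# a small labeled set for a new task (not enough to train a full network)
x = rng.normal(size=(60, 4))
y = (x[:, 0] + x[:, 1] > 0).astype(float)
feats = backbone(x)
w, b = train_head(feats, y)
train_acc = (((feats @ w + b) > 0).astype(float) == y).mean()
```

Only the head's few parameters are fitted, which is why this strategy remains viable when labeled data is limited in quantity or precision.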
Advances in Bioengineering
The technological approach and the high level of innovation make bioengineering extremely dynamic, forcing researchers to update continuously and to publish the results of the latest scientific research. This book covers a wide range of aspects and issues related to advances in bioengineering research, with a particular focus on innovative technologies and applications. The book consists of 13 scientific contributions divided into four sections: Materials Science; Biosensors, Electronics and Telemetry; Light Therapy; and Computing and Analysis Techniques.
Fast algorithm for real-time rings reconstruction
The GAP project is dedicated to studying the application of GPUs in several contexts in which real-time response is important for decision making. The definition of real-time depends on the application under study, ranging from response times of microseconds up to several hours in the case of very compute-intensive tasks. During this conference we presented our work on low-level triggers [1] [2] and high-level triggers [3] in high-energy physics experiments, and on specific applications for nuclear magnetic resonance (NMR) [4] [5] and cone-beam CT [6]. Apart from the study of dedicated solutions to decrease the latency due to data transport and preparation, the computing algorithms play an essential role in any GPU application. In this contribution, we show an original algorithm developed for trigger applications to accelerate ring reconstruction in RICH detectors when seeds for reconstruction from external trackers are not available.
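The seedless setting can be illustrated with the classic Hough-transform approach to ring finding (a simplified sketch assuming a known ring radius; this is not the GAP algorithm itself, whose details are in the contribution): every hit votes for all candidate centres lying at the ring radius from it, and the vote maximum gives the centre with no external seed required.

```python
import numpy as np

def hough_ring_centre(hits, radius, grid=64, extent=10.0):
    """Seedless ring finding: each hit votes for every candidate centre
    at distance `radius` from it; the accumulator maximum is the centre."""
    acc = np.zeros((grid, grid))
    axis = np.linspace(-extent, extent, grid)
    thetas = np.linspace(0.0, 2.0 * np.pi, 120, endpoint=False)
    for x, y in hits:
        cx = x - radius * np.cos(thetas)
        cy = y - radius * np.sin(thetas)
        ix = np.clip(np.round((cx + extent) / (2 * extent) * (grid - 1)), 0, grid - 1).astype(int)
        iy = np.clip(np.round((cy + extent) / (2 * extent) * (grid - 1)), 0, grid - 1).astype(int)
        np.add.at(acc, (ix, iy), 1)   # unbuffered accumulation of votes
    i, j = np.unravel_index(np.argmax(acc), acc.shape)
    return axis[i], axis[j]

# synthetic ring: 24 photon hits on a circle of radius 3 centred at (2, -1)
t = np.linspace(0.0, 2.0 * np.pi, 24, endpoint=False)
hits = np.stack([2 + 3 * np.cos(t), -1 + 3 * np.sin(t)], axis=1)
cx, cy = hough_ring_centre(hits, radius=3.0)
```

On a GPU, each hit's votes can be computed by an independent thread with the accumulator held in shared or global memory, which is what makes this class of algorithm attractive for low-latency triggers.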
Multimodal Navigation for Accurate Space Rendezvous Missions
© Cranfield University 2021. All rights reserved. No part of this publication may be reproduced without the written permission of the copyright owner.
Relative navigation is paramount in space missions that involve rendezvousing
between two spacecraft. It demands accurate and continuous estimation of the six
degree-of-freedom relative pose, as this stage involves close-proximity, fast-reaction
operations that can last up to five orbits. This has been routinely achieved thanks to
active sensors such as lidar, but their large size, cost, power and limited operational
range remain a stumbling block for en masse on-board integration. With the onset
of faster processing units, lighter and cheaper passive optical sensors are emerging as
the suitable alternative for autonomous rendezvous in combination with computer
vision algorithms. Current vision-based solutions, however, are limited by adverse
illumination conditions such as solar glare, shadowing, and eclipse. These effects are
exacerbated when the target does not hold cooperative markers to accommodate the
estimation process and is incapable of controlling its rotational state.
This thesis explores novel model-based methods that exploit sequences of monocular images acquired by an on-board camera to accurately carry out spacecraft
relative pose estimation for non-cooperative close-range rendezvous with a known
artificial target. The proposed solutions tackle the current challenges of imaging in
the visible spectrum and investigate the contribution of the long wavelength infrared
(or “thermal”) band towards a combined multimodal approach.
As part of the research, a visible-thermal synthetic dataset of a rendezvous
approach with the defunct satellite Envisat is generated from the ground up using a
realistic orbital camera simulator. From the rendered trajectories, the performance
of several state-of-the-art feature detectors and descriptors is first evaluated for
both modalities in a tailored scenario for short and wide baseline image processing
transforms. Multiple combinations, including the pairing of algorithms with their
non-native counterparts, are tested. Computational runtimes are assessed in an
embedded hardware board.
From the insight gained, a method to estimate the pose on the visible band is
derived from minimising geometric constraints between online local point and edge
contour features matched to keyframes generated offline from a 3D model of the
target. The combination of both feature types is demonstrated to achieve a pose
solution for a tumbling target using a sparse set of training images, bypassing the
need for hardware-accelerated real-time renderings of the model.
The proposed algorithm is then augmented with an extended Kalman filter
which processes each feature-induced minimisation output as an individual pseudo-measurement, fusing them to estimate the relative pose and velocity states at
each time-step. Both the minimisation and filtering are established using Lie group
formalisms, allowing for the covariance of the solution computed by the former to be automatically incorporated as measurement noise in the latter, providing
an automatic weighting of each feature type directly related to the quality of the
matches. The predicted states are then used to search for new feature matches in the
subsequent time-step. Furthermore, a method to derive a coarse viewpoint estimate
to initialise the nominal algorithm is developed based on probabilistic modelling of
the target’s shape. The robustness of the complete approach is demonstrated for
several synthetic and laboratory test cases involving two types of target undergoing
extreme illumination conditions.
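The fusion idea, each feature type entering the filter as a pseudo-measurement weighted by its own covariance, can be reduced to a scalar sketch (plain Python; the state here is a single number standing in for the full Lie-group pose state, and the numeric values are purely illustrative):

```python
def fuse(x, P, z, R):
    """One Kalman update on a scalar state: measurement z with variance R.
    A large R (poor feature matches) yields a small gain, so that
    measurement barely moves the state: the automatic weighting idea."""
    K = P / (P + R)                   # Kalman gain
    return x + K * (z - x), (1.0 - K) * P

# prior pose estimate (scalar stand-in for the full pose state)
x, P = 0.0, 1.0
# two feature-induced pseudo-measurements with their own covariances
x, P = fuse(x, P, z=1.0, R=0.5)       # point features: confident matches
x, P = fuse(x, P, z=3.0, R=4.0)       # edge features: noisier matches
```

The second, noisier measurement moves the estimate far less than the first, which is exactly the match-quality weighting described above; the posterior variance P also shrinks with every fused measurement.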
Lastly, an innovative deep learning-based framework is developed by processing
the features extracted by a convolutional front-end with long short-term memory cells,
thus proposing the first deep recurrent convolutional neural network for spacecraft
pose estimation. The framework is used to compare the performance achieved by
visible-only and multimodal input sequences, where the addition of the thermal band
is shown to greatly improve the performance during sunlit sequences. Potential
limitations of this modality are also identified, such as when the target’s thermal
signature is comparable to Earth's during eclipse.
NOVEL ALGORITHMS AND TOOLS FOR LIGAND-BASED DRUG DESIGN
Computer-aided drug design (CADD) has become an indispensable component in modern drug discovery projects. The prediction of physicochemical and pharmacological properties of candidate compounds effectively increases the probability for drug candidates to pass later phases of clinical trials. Ligand-based virtual screening exhibits advantages over structure-based drug design in terms of its wide applicability and high computational efficiency. The established chemical repositories and reported bioassays form a gigantic knowledge base from which to derive quantitative structure-activity relationships (QSAR) and structure-property relationships (QSPR). In addition, the rapid advance of machine learning techniques suggests new solutions for data-mining huge compound databases. In this thesis, a novel ligand classification algorithm, Ligand Classifier of Adaptively Boosting Ensemble Decision Stumps (LiCABEDS), is reported for the prediction of diverse categorical pharmacological properties. LiCABEDS was successfully applied to model 5-HT1A ligand functionality, ligand selectivity of cannabinoid receptor subtypes, and blood-brain-barrier (BBB) passage. LiCABEDS was implemented and integrated with a graphical user interface, data import/export, automated model training/prediction, and project management. In addition, a non-linear ligand classifier was proposed, using a novel Topomer kernel function in a support vector machine. With the emphasis on green high-performance computing, graphics processing units are alternative platforms for computationally expensive tasks. A novel GPU algorithm was designed and implemented to accelerate the calculation of chemical similarities with dense-format molecular fingerprints. Finally, a compound acquisition algorithm is reported to construct a structurally diverse screening library in order to enhance hit rates in high-throughput screening.
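The chemical-similarity kernel that the GPU algorithm accelerates can be written compactly on the CPU (numpy sketch with toy 8-bit fingerprints; production fingerprints are hundreds to thousands of bits, and the GPU version computes many molecule pairs in parallel):

```python
import numpy as np

def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two dense binary fingerprints:
    shared on-bits / (on-bits of a + on-bits of b - shared on-bits)."""
    shared = int(np.count_nonzero(fp_a & fp_b))
    total = int(np.count_nonzero(fp_a)) + int(np.count_nonzero(fp_b))
    return shared / (total - shared)

# two toy 8-bit fingerprints (each bit flags a substructure's presence)
a = np.array([1, 1, 0, 1, 0, 0, 1, 0], dtype=np.uint8)
b = np.array([1, 0, 0, 1, 1, 0, 1, 0], dtype=np.uint8)
sim = tanimoto(a, b)   # 3 shared bits / (4 + 4 - 3) = 0.6
```

The per-pair work reduces to bitwise AND and population counts, which is why dense-format fingerprints map so naturally onto GPU hardware.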
Measurement of the ZH(H -> bb) associated production with Z l+l- in pp collisions at square root of s = 13 TeV with the ATLAS Experiment
The measurement of the signal strength of the pp → VH(H → bb) process at a centre-of-mass energy of 13 TeV is presented in this thesis. The data were collected with the ATLAS detector during the 2015, 2016 and 2017 data taking, corresponding to an integrated luminosity of 79.8 fb⁻¹. The analysis is performed in three channels, distinguished according to the number of charged leptons in the final state coming from the leptonic decay of the associated vector boson. The signal strength measured from the combination of the three channels, relative to the Standard Model expectation, is reported. A phenomenological study of the evolution of the total hadronic cross section and of the ρ-parameter (defined as the ratio of the real to imaginary part of the elastic scattering amplitude in the forward direction) is also presented.
LUCID-2 is the reference detector for online and offline luminosity measurements in ATLAS. It is described with particular attention to the PMT gain monitoring system, for which I developed the code used to analyse the calibration data.
Semi-Supervised Learning for Scalable and Robust Visual Search
Unlike textual document retrieval, search over visual data is still far from satisfactory. There exist major gaps between the available solutions and practical needs, in both accuracy and computational cost. This thesis aims at the development of robust and scalable solutions for visual search and retrieval. Specifically, we investigate two classes of approaches: graph-based semi-supervised learning and hashing techniques. The graph-based approaches are used to improve accuracy, while hashing approaches are used to improve efficiency and cope with large-scale applications. A common theme shared between these two subareas of our work is the focus on the semi-supervised learning paradigm, in which a small set of labeled data is complemented with large unlabeled datasets. Graph-based approaches have emerged as the methods of choice for general semi-supervised tasks when no parametric information is available about the data distribution. They treat both labeled and unlabeled samples as vertices in a graph and then instantiate pairwise edges between these vertices to capture the affinity between the corresponding samples. A quadratic regularization framework has been widely used for label prediction over such graphs. However, most existing graph-based semi-supervised learning methods are sensitive to the graph construction process and to the initial labels. We propose a new bivariate graph transduction formulation and an efficient solution via an alternating minimization procedure. Based on this bivariate framework, we also develop new methods to filter unreliable and noisy labels. Extensive experiments over diverse benchmark datasets demonstrate the superior performance of our proposed methods. However, graph-based approaches suffer from a critical bottleneck in scalability, since graph construction has quadratic complexity and the inference procedure costs even more.
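The quadratic-regularization label prediction mentioned above has a well-known iterative form, sketched here on a toy graph (numpy; the chain graph and the names are illustrative, not the bivariate formulation proposed in the thesis):

```python
import numpy as np

def propagate_labels(W, y, labeled, iters=100):
    """Quadratic-regularization label prediction in its simplest iterative
    form: each vertex takes the affinity-weighted average of its neighbours'
    scores, while labeled vertices stay clamped to their given labels."""
    D = W.sum(axis=1)                 # vertex degrees
    f = y.astype(float).copy()
    for _ in range(iters):
        f = (W @ f) / D
        f[labeled] = y[labeled]       # clamp supervised vertices
    return f

# chain graph 0-1-2-3-4: vertex 0 labeled +1, vertex 4 labeled -1
W = np.zeros((5, 5))
for i in range(4):
    W[i, i + 1] = W[i + 1, i] = 1.0
y = np.array([1.0, 0.0, 0.0, 0.0, -1.0])
labeled = np.array([True, False, False, False, True])
f = propagate_labels(W, y, labeled)   # converges to [1, 0.5, 0, -0.5, -1]
```

The clamped harmonic solution interpolates the two labels along the chain; the sensitivity to graph construction and initial labels that motivates the bivariate formulation is easy to see here, since W and y fully determine the result.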
The widely used graph construction method relies on nearest neighbor search, which is prohibitive for large-scale applications. In addition, most large-scale visual search problems involve handling high-dimensional visual descriptors, which causes a further challenge: excessive storage requirements. To handle the scalability issues of both computation and storage, the second part of the thesis focuses on efficient techniques for approximate nearest neighbor (ANN) search, which is key to many machine learning algorithms, including graph-based semi-supervised learning and clustering. Specifically, we propose Semi-Supervised Hashing (SSH) methods that leverage semantic similarity over a small set of labeled data while preventing overfitting. We derive a rigorous formulation in which a supervised term minimizes the empirical errors on the labeled data and an unsupervised term provides effective regularization by maximizing the variance and independence of individual bits. Experiments on several large datasets demonstrate a clear performance gain over several state-of-the-art methods without a significant increase in computational cost. The main contributions of the thesis include the following. Bivariate graph transduction: a) a bivariate formulation for graph-based semi-supervised learning with an efficient solution by alternating optimization; b) a theoretical analysis, from the graph-cut point of view, of the bivariate optimization procedure; c) novel applications of the proposed techniques, such as interactive image retrieval, automatic re-ranking for text-based image search, and a brain-computer interface (BCI) for image retrieval.
Semi-supervised hashing: a) a rigorous semi-supervised paradigm for hash function learning with a tradeoff between empirical fitness on pairwise label consistency and an information-theoretic regularizer; b) several efficient solutions for deriving semi-supervised hash functions, including an orthogonal solution using eigen-decomposition, a revised strategy for learning non-orthogonal hash functions, a sequential learning algorithm to derive boosted hash functions, and an extension to unsupervised cases by using pseudo-labels. The two parts of the thesis - bivariate graph transduction and semi-supervised hashing - are complementary and can be combined to achieve significant performance improvements in both speed and accuracy. Hash methods can help build sparse graphs in linear time and greatly reduce the data size, but they lack sufficient accuracy. Graph-based methods provide unique capabilities to handle non-linear data structures with noisy labels, but suffer from high computational complexity. The synergistic combination of the two offers great potential for advancing the state of the art in large-scale visual search and many other applications.
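The unsupervised ingredient of the hashing formulation, maximizing the variance of individual bits, corresponds to a simple PCA-style hashing baseline, sketched below (numpy; the supervised pairwise term of SSH is deliberately omitted here, and all names are illustrative):

```python
import numpy as np

def train_pca_hash(X, n_bits):
    """Learn hash projections whose bits have maximal variance: project the
    centred data onto its top principal directions and threshold at zero."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:n_bits]

def hash_codes(X, mu, proj):
    """Compress each vector into a short binary code."""
    return ((X - mu) @ proj.T > 0).astype(np.uint8)

def hamming(a, b):
    """Distance between two codes: the number of differing bits."""
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 16))          # 200 descriptors, 16 dimensions each
mu, proj = train_pca_hash(X, n_bits=8)  # 16 floats -> 8 bits per item
codes = hash_codes(X, mu, proj)
```

Codes are then compared by Hamming distance, e.g. `hamming(codes[0], codes[1])`, turning nearest neighbor search over floating-point descriptors into cheap bit operations.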