
    Robust and Distributed Cluster Enumeration and Object Labeling

    This dissertation contributes to the area of cluster analysis by providing principled methods to determine the number of data clusters and cluster memberships, even in the presence of outliers. The main theoretical contributions are summarized in two theorems on Bayesian cluster enumeration based on modeling the data as a family of Gaussian and t distributions. Real-world applicability is demonstrated by considering advanced signal processing applications, such as distributed camera networks and radar-based person identification. In particular, a new cluster enumeration criterion, which is applicable to a broad class of data distributions, is derived by utilizing Bayes' theorem and asymptotic approximations. This serves as a starting point when deriving cluster enumeration criteria for specific data distributions. Along this line, a Bayesian cluster enumeration criterion is derived by modeling the data as a family of multivariate Gaussian distributions. In real-world applications, the observed data is often subject to heavy-tailed noise and outliers which obscure the true underlying structure of the data. Consequently, estimating the number of data clusters becomes challenging. To this end, a robust cluster enumeration criterion is derived by modeling the data as a family of multivariate t distributions. The family of t distributions is flexible by variation of its degree of freedom parameter (ν), and it contains, as special cases, the heavy-tailed Cauchy distribution for ν = 1 and the Gaussian distribution for ν → ∞. Given that ν is sufficiently small, the robust criterion accounts for outliers by giving them less weight in the objective function. A further contribution of this dissertation lies in refining the penalty terms of both the robust and the Gaussian criterion for the finite sample regime. The derived cluster enumeration criteria require a clustering algorithm that partitions the data according to the number of clusters specified by each candidate model and provides an estimate of the cluster parameters. Hence, a model-based unsupervised learning method is applied to partition the data prior to the calculation of an enumeration criterion, resulting in a two-step algorithm. The proposed algorithm provides a unified framework for the estimation of the number of clusters and cluster memberships. The developed algorithms are applied to two advanced signal processing use cases. Specifically, the cluster enumeration criteria are extended to a distributed sensor network setting by proposing two distributed and adaptive Bayesian cluster enumeration algorithms. The proposed algorithms are applied to a camera network use case, where the task is to estimate the number of pedestrians based on streaming data collected by multiple cameras filming a non-stationary scene from different viewpoints. A further research focus of this dissertation is the cluster membership assignment of individual data points and their associated cluster labels, given that the number of clusters is either prespecified by the user or estimated by one of the methods described earlier. Solving this task is required in a broad range of applications, such as distributed sensor networks and radar-based person identification. For this purpose, an adaptive joint object labeling and tracking algorithm is proposed and applied to a real-data use case of pedestrian labeling in a calibration-free multi-object multi-camera setup with low video resolution and frequent object occlusions.
    The proposed algorithm is well suited for ad hoc networks, as it requires neither registration of camera views nor a fusion center. Finally, a joint cluster enumeration and labeling algorithm is proposed to deal with the combined problem of estimating the number of clusters and cluster memberships at the same time. The proposed algorithm is applied to person labeling in a real-data application of radar-based person identification without prior information on the number of individuals. It achieves comparable performance to a supervised approach that requires knowledge of the number of persons and a considerable amount of training data with known cluster labels. The proposed unsupervised method is advantageous in the considered application of smart assisted living, as it extracts the missing information from the data. Based on these examples, and also considering the comparably low computational cost, we conjecture that the proposed methods provide a useful set of robust cluster analysis tools for data science with many potential application areas, not only in the area of engineering.
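    As a hedged illustration of the two-step structure described above (candidate-wise model fitting followed by evaluation of an enumeration criterion), the sketch below uses scikit-learn's GaussianMixture for the clustering step and its built-in Bayesian Information Criterion as a stand-in for the dissertation's derived criteria; the function name and candidate range are illustrative assumptions, not the author's implementation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def enumerate_clusters(X, k_min=1, k_max=10):
    """Two-step enumeration sketch: for each candidate K, fit a
    model-based clustering and score it with a BIC-style criterion."""
    best_k, best_score, best_model = None, np.inf, None
    for k in range(k_min, k_max + 1):
        gmm = GaussianMixture(n_components=k, covariance_type="full",
                              n_init=5, random_state=0).fit(X)
        score = gmm.bic(X)  # lower is better
        if score < best_score:
            best_k, best_score, best_model = k, score, gmm
    return best_k, best_model

# Usage: the selected model also yields the cluster memberships.
# k_hat, model = enumerate_clusters(X); labels = model.predict(X)
```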

    Advances in Object and Activity Detection in Remote Sensing Imagery

    The recent revolution in deep learning has enabled considerable development in the fields of object and activity detection. Visual object detection tries to find objects of target classes with precise localisation in an image and to assign each object instance a corresponding class label. At the same time, activity recognition aims to determine the actions or activities of an agent or group of agents based on sensor or video observation data. Detecting, identifying, tracking, and understanding the behaviour of objects through images and videos taken by various cameras is a very important and challenging problem. Together, object and activity recognition in imaging data captured by remote sensing platforms is a highly dynamic and challenging research topic. During the last decade, there has been significant growth in the number of publications in the field of object and activity recognition. In particular, many researchers have proposed application domains to identify objects and their specific behaviours from air- and spaceborne imagery. This Special Issue includes papers that explore novel and challenging topics for object and activity detection in remote sensing images and videos acquired by diverse platforms.

    Development of image processing and vision systems with industrial applications

    Ph.D. (Doctor of Philosophy)

    Segmentation of surgical tools from laparoscopy images

    Master's project report in Biomedical Engineering. Robotic-assisted surgeries have been replacing open surgeries with a significant impact on patient recovery time, and consequently on various aspects such as healthcare resource savings and the early resumption of the patient's work activities. This type of surgery, assisted by a robotic system, is guided by a laparoscopic camera, providing the surgeon with a view of the patient's anatomical structures. To operate this equipment, surgeons must undergo numerous hours of training, making the process exhausting and costly. In addition, manipulating surgical instruments in coordination with the laparoscopic camera is not an intuitive process, meaning errors of a subjective nature are not eliminated. The objective of this thesis is the development of an automated system capable of segmenting surgical instruments, thereby enabling constant monitoring of their positions. Various machine learning models were explored to address this issue. In a second phase, methods that could be incorporated into the base model were considered. Once a solution was found, a comparison was made between the previously selected models, the base model, and the optimized model. In a third approach, with the aim of improving the comparison metrics, alternative solutions were sought, including the generation of synthetic data. At this point, two possibilities were encountered: one based on systems that learn autonomously through competition, and the other on systems that learn to synthesize images from noise whose spectral density is successively increased. Both approaches expanded the available database, and their effectiveness was evaluated by comparing the impact of the data augmentation on the segmentation systems. The proposed system can potentially be implemented in robotic-assisted surgeries with minimal modifications.
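    The report's exact comparison metrics are not stated in this abstract; as a hedged sketch, assuming binary tool-versus-background masks and the commonly used Dice coefficient, the effect of data augmentation could be quantified as follows (names are illustrative):

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice coefficient between predicted and ground-truth binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

# Compare a model trained on real data only against one trained with
# synthetic additions, on the same held-out test masks:
# gain = dice_score(pred_augmented, mask) - dice_score(pred_baseline, mask)
```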

    Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

    This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in the concepts of planar homography, to propose regions of interest where to find objects, and recursive Bayesian filtering, to integrate observations over time. The proposal is evaluated on six virtual indoor environments, accounting for the detection of nine object classes over a total of ∼7k frames. Results show that our proposal improves the recall and the F1-score by a factor of 1.41 and 1.27, respectively, and achieves a significant reduction of the object categorization entropy (58.8%) when compared to a two-stage video object detection method used as baseline, at the cost of small time overheads (120 ms) and a precision loss (0.92).
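    As a hedged sketch of the homography idea (not the paper's implementation), the snippet below warps an axis-aligned detection box from one frame to the next with a known 3×3 planar homography using OpenCV; the warped box can then serve as a region of interest for re-detection:

```python
import numpy as np
import cv2

def propagate_box(H, box):
    """Warp a detection box (x1, y1, x2, y2) from frame t to frame t+1
    with a 3x3 planar homography H, returning the axis-aligned bounding
    box of the warped corners as a region of interest."""
    x1, y1, x2, y2 = box
    corners = np.float32([[x1, y1], [x2, y1], [x2, y2], [x1, y2]])
    warped = cv2.perspectiveTransform(corners.reshape(-1, 1, 2), H)
    xs, ys = warped[:, 0, 0], warped[:, 0, 1]
    return float(xs.min()), float(ys.min()), float(xs.max()), float(ys.max())
```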

    Translational Functional Imaging in Surgery Enabled by Deep Learning

    Many clinical applications currently rely on several imaging modalities such as Positron Emission Tomography (PET), Magnetic Resonance Imaging (MRI), Computed Tomography (CT), etc. All such modalities provide valuable patient data to the clinical staff to aid clinical decision-making and patient care. Despite the undeniable success of such modalities, most of them are limited to preoperative scans and focus on morphology analysis, e.g. tumor segmentation, radiation treatment planning, anomaly detection, etc. Even though the assessment of different functional properties such as perfusion is crucial in many surgical procedures, it remains highly challenging via simple visual inspection. Functional imaging techniques such as Spectral Imaging (SI) link the unique optical properties of different tissue types with metabolism changes, blood flow, chemical composition, etc. As such, SI is capable of providing much richer information that can improve patient treatment and care. In particular, perfusion assessment with functional imaging has become more relevant due to its involvement in the treatment and development of several diseases such as cardiovascular diseases. Current clinical practice relies on Indocyanine Green (ICG) injection to assess perfusion. Unfortunately, this method can only be used once per surgery and has been shown to trigger deadly complications in some patients (e.g. anaphylactic shock). This thesis addressed common roadblocks on the path to translating optical functional imaging modalities to clinical practice. The main challenges that were tackled are related to a) the slow recording and processing speed that SI devices suffer from, b) the errors introduced in functional parameter estimations under changing illumination conditions, c) the lack of medical data, and d) the high inter-patient tissue heterogeneity that is commonly overlooked. This framework follows a natural path to translation that starts with hardware optimization. To overcome the limitations imposed by the lack of labeled clinical data and by current slow SI devices, a domain- and task-specific band selection component was introduced. The implementation of such a component resulted in a reduction of the amount of data needed to monitor perfusion. Moreover, this method leverages large amounts of synthetic data, which, paired with unlabeled in vivo data, is capable of generating highly accurate simulations of a wide range of domains. This approach was validated in vivo in a head and neck rat model, and showed higher oxygenation contrast between normal and cancerous tissue in comparison to a baseline using all available bands. The need for translation to open surgical procedures was met by the implementation of an automatic light source estimation component. This method extracts specular reflections from low-exposure spectral images and processes them to obtain an estimate of the light source spectrum that generated such reflections. The benefits of light source estimation were demonstrated in silico, in ex vivo pig liver, and in vivo on human lips, where the oxygenation estimation error was reduced when utilizing the correct light source estimated with this method. These experiments also showed that the performance of the approach proposed in this thesis surpasses that of other baseline approaches. Video-rate functional property estimation was achieved by two main components: a regression and an Out-of-Distribution (OoD) component.
    At the core of both components is a compact SI camera that is paired with state-of-the-art deep learning models to achieve real-time functional estimations. The first of these components features a deep learning model based on a Convolutional Neural Network (CNN) architecture that was trained on highly accurate physics-based simulations of light-tissue interactions. By doing this, the challenge of the lack of in vivo labeled data was overcome. This approach was validated on the task of perfusion monitoring in pig brain and in a clinical study involving human skin. It was shown that this approach is capable of monitoring subtle perfusion changes in human skin in an arm clamping experiment. Moreover, this approach was capable of monitoring Spreading Depolarizations (SDs), deoxygenation waves, on the surface of a pig brain. Even though this method is well suited for perfusion monitoring in domains that are well represented by the physics-based simulations on which it was trained, its performance cannot be guaranteed for outlier domains. To handle outlier domains, the task of ischemia monitoring was rephrased as an OoD detection task. This new functional estimation component comprises an ensemble of Invertible Neural Networks (INNs) that only requires perfused tissue data from individual patients to detect ischemic tissue as outliers. The first ever clinical study involving a video-rate capable SI camera in laparoscopic partial nephrectomy was designed to validate this approach. This study revealed particularly high inter-patient tissue heterogeneity in the presence of pathologies (cancer). Moreover, it demonstrated that this personalized approach is now capable of monitoring ischemia at video rate with SI during laparoscopic surgery. In conclusion, this thesis addressed challenges related to slow image recording and processing during surgery. It also proposed a method for light source estimation to facilitate translation to open surgical procedures. Moreover, the methodology proposed in this thesis was validated in a wide range of domains: in silico, rat head and neck, pig liver and brain, and human skin and kidney. In particular, the first clinical trial with spectral imaging in minimally invasive surgery demonstrated that video-rate ischemia monitoring is now possible with deep learning.
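    The INN ensemble itself is beyond an abstract-level sketch; as a hedged illustration of the generic outlier-scoring step, assuming each ensemble member can return a per-pixel log-likelihood for spectra of the patient's perfused tissue, ischemia could be flagged roughly as follows (names and the quantile threshold are illustrative assumptions, not the thesis's method):

```python
import numpy as np

def ood_scores(member_loglik):
    """Average negative log-likelihood over ensemble members.
    member_loglik: array of shape (n_members, n_pixels); higher scores
    indicate spectra unlike the patient's own perfused tissue."""
    return -np.mean(member_loglik, axis=0)

def flag_ischemia(test_loglik, train_loglik, quantile=0.99):
    """Threshold test scores against the score distribution obtained on
    the same patient's perfused-tissue (training) data."""
    tau = np.quantile(ood_scores(train_loglik), quantile)
    return ood_scores(test_loglik) > tau
```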

    Localization and Mapping for Self-Driving Vehicles:A Survey

    The upsurge of autonomous vehicles in the automobile industry will lead to better driving experiences while also enabling users to solve challenging navigation problems. Reaching such capabilities will require significant technological attention and the flawless execution of various complex tasks, one of which is ensuring robust localization and mapping. Recent surveys have not provided a meaningful and comprehensive description of the current approaches in this field. Accordingly, this review is intended to provide adequate coverage of the problems affecting autonomous vehicles in this area, by examining the most recent methods for mapping and localization as well as related feature extraction and data security problems. First, a discussion of the contemporary methods of extracting relevant features from equipped sensors and their categorization as semantic, non-semantic, and deep learning methods is presented. We conclude that representativeness, low cost, and accessibility are crucial constraints in the choice of the methods to be adopted for localization and mapping tasks. Second, the survey focuses on methods to build a vehicle’s environment map, considering both the commercial and the academic solutions available. The analysis distinguishes between two types of environment, known and unknown, and develops solutions for each case. Third, the survey explores different approaches to vehicles’ localization and classifies them according to their mathematical characteristics and priorities. Each section concludes by presenting the related challenges and some future directions. The article also highlights the security problems likely to be encountered in self-driving vehicles, with an assessment of possible defense mechanisms that could prevent security attacks in vehicles. Finally, the article ends with a discussion of the potential impacts of autonomous driving, spanning energy consumption and emission reduction, sound and light pollution, integration into smart cities, infrastructure optimization, and software refinement. This thorough investigation aims to foster a comprehensive understanding of the diverse implications of autonomous driving across various domains.

    Generalized Video Anomaly Event Detection: Systematic Taxonomy and Comparison of Deep Models

    Video Anomaly Detection (VAD) serves as a pivotal technology in intelligent surveillance systems, enabling the temporal or spatial identification of anomalous events within videos. While existing reviews predominantly concentrate on conventional unsupervised methods, they often overlook the emergence of weakly-supervised and fully-unsupervised approaches. To address this gap, this survey extends the conventional scope of VAD beyond unsupervised methods, encompassing a broader spectrum termed Generalized Video Anomaly Event Detection (GVAED). By incorporating recent advancements rooted in diverse assumptions and learning frameworks, this survey introduces an intuitive taxonomy that navigates through unsupervised, weakly-supervised, supervised, and fully-unsupervised VAD methodologies, elucidating the distinctions and interconnections within these research trajectories. In addition, this survey facilitates prospective researchers by assembling a compilation of research resources, including public datasets, available codebases, programming tools, and pertinent literature. Furthermore, this survey quantitatively assesses model performance, delves into research challenges and directions, and outlines potential avenues for future exploration. Comment: Accepted by ACM Computing Surveys. For more information, please see our project page: https://github.com/fudanyliu/GVAE

    Deep Learning Methods for Detection and Tracking of Particles in Fluorescence Microscopy Images

    Studying the dynamics of sub-cellular structures such as receptors, filaments, and vesicles is a prerequisite for investigating cellular processes at the molecular level. In addition, it is important to characterize the dynamic behavior of virus structures to gain a better understanding of infection mechanisms and to develop novel drugs. To investigate the dynamics of fluorescently labeled sub-cellular and viral structures, time-lapse fluorescence microscopy is the most often used imaging technique. Due to the limited spatial resolution of microscopes caused by diffraction, these very small structures appear as bright, blurred spots, denoted as particles, in microscopy images. To draw statistically meaningful biological conclusions, a large number of such particles need to be analyzed. However, since manual analysis of fluorescent particles is very time consuming, fully automated computer-based methods are indispensable. We introduce novel deep learning methods for detection and tracking of multiple particles in fluorescence microscopy images. We propose a particle detection method based on a convolutional neural network which performs image-to-image mapping by density map regression and uses the adaptive wing loss. For particle tracking, we present a recurrent neural network that exploits past and future information in both forward and backward direction. Assignment probabilities across multiple detections as well as the probabilities for missing detections are computed jointly. To resolve tracking ambiguities using future information, several track hypotheses are propagated to later time points. In addition, we developed a novel probabilistic deep learning method for particle tracking, which is based on a recurrent neural network mimicking classical Bayesian filtering. The method includes both aleatoric and epistemic uncertainty, and provides valuable information about the reliability of the computed trajectories. Short- and long-term temporal dependencies of individual object dynamics are exploited for state prediction, and assigned detections are used to update the predicted states. Moreover, we developed a convolutional Long Short-Term Memory neural network for combined particle tracking and colocalization analysis in two-channel microscopy image sequences. The network determines colocalization probabilities, and colocalization information is exploited to improve tracking. Short- and long-term temporal dependencies of object motion as well as image intensities are taken into account to compute assignment probabilities jointly across multiple detections. We also introduce a deep learning method for probabilistic particle detection and tracking. For particle detection, temporal information is integrated to regress a density map and determine sub-pixel particle positions. For tracking, a fully Bayesian neural network is presented that mimics classical Bayesian filtering and takes into account both aleatoric and epistemic uncertainty. Uncertainty information of individual particle detections is considered. Network training for the developed deep learning-based particle tracking methods relies only on synthetic data, avoiding the need for time-consuming manual annotation. We performed an extensive evaluation of our methods based on image data of the Particle Tracking Challenge as well as on fluorescence microscopy images displaying virus proteins of HCV and HIV, chromatin structures, and cell-surface receptors. The evaluation showed that our methods outperform previous approaches.
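    A minimal sketch of how a training target for density-map regression might be constructed, assuming one Gaussian blob per annotated particle position; the thesis's exact target construction and its adaptive wing loss are not reproduced here:

```python
import numpy as np

def density_map(shape, positions, sigma=2.0):
    """Ground-truth density map for density-map regression: a Gaussian
    blob of width sigma centered on each particle position (px, py).
    Local maxima of the network's regressed map yield detections."""
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    dmap = np.zeros(shape, dtype=np.float32)
    for (px, py) in positions:
        dmap += np.exp(-((xx - px) ** 2 + (yy - py) ** 2) / (2.0 * sigma ** 2))
    return dmap
```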