
    Deep Representation Learning with Limited Data for Biomedical Image Synthesis, Segmentation, and Detection

    Biomedical imaging requires accurate expert annotation and interpretation that can aid medical staff and clinicians in automating differential diagnosis and identifying underlying health conditions. With the advent of deep learning, training on large image datasets has become the standard for reaching expert-level performance in non-invasive biomedical imaging tasks. However, when large publicly available datasets are lacking, training a deep learning model to learn intrinsic representations becomes harder. Representation learning with limited data has introduced new learning techniques, such as Generative Adversarial Networks, Semi-supervised Learning, and Self-supervised Learning, that can be applied to various biomedical applications. For example, ophthalmologists use color funduscopy (CF) and fluorescein angiography (FA) to diagnose retinal degenerative diseases. However, fluorescein angiography requires injecting a dye, which can cause adverse reactions in patients. A non-invasive technique is therefore needed that can synthesize fluorescein angiograms from fundus images. Similarly, color funduscopy and optical coherence tomography (OCT) are also utilized to semantically segment the vasculature and fluid build-up in spatial and volumetric retinal imaging, which can help with the future prognosis of diseases. Although many automated techniques have been proposed for medical image segmentation, the main drawback remains the model's precision in pixel-wise predictions. Another critical challenge in the biomedical imaging field is accurately segmenting and quantifying the dynamic behavior of calcium signals in cells. Calcium imaging is a widely utilized approach to studying subcellular calcium activity and cell function; however, large datasets have created a profound need for fast, accurate, and standardized analyses of calcium signals.
For example, image sequences of calcium signals in colonic pacemaker cells (interstitial cells of Cajal, ICC) suffer from motion artifacts and strong periodic and sensor noise, making it difficult to accurately segment and quantify calcium signal events. Moreover, it is time-consuming and tedious to annotate such a large volume of calcium image stacks or videos and extract their associated spatiotemporal maps. To address these problems, we propose various deep representation learning architectures that utilize limited labels and annotations to address the critical challenges in these biomedical applications. To this end, we detail our proposed semi-supervised, generative adversarial network-based, and transformer-based architectures for individual learning tasks such as retinal image-to-image translation, vessel and fluid segmentation from fundus and OCT images, breast micro-mass segmentation, and sub-cellular calcium event tracking from videos and spatiotemporal map quantification. We also illustrate two multi-modal multi-task learning frameworks with applications that can be extended to other domains of biomedical applications. The main idea is to incorporate each of these as individual modules into our proposed multi-modal frameworks to solve the existing challenges of 1) fluorescein angiography synthesis, 2) retinal vessel and fluid segmentation, 3) breast micro-mass segmentation, and 4) dynamic quantification of calcium imaging datasets.
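The pixel-wise precision concern raised above is typically quantified with an overlap metric. The sketch below is a generic illustration rather than the dissertation's own code: it computes the Dice coefficient between a predicted and a reference binary segmentation mask (the toy masks and the smoothing term `eps` are assumptions for the example):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice overlap between two binary masks (1.0 = perfect agreement)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Two 4x4 masks that overlap on 2 of their 3 foreground pixels each:
a = np.zeros((4, 4), dtype=bool); a[0, 0] = a[0, 1] = a[1, 1] = True
b = np.zeros((4, 4), dtype=bool); b[0, 1] = b[1, 1] = b[2, 2] = True
print(round(dice_coefficient(a, b), 3))  # → 0.667
```

A per-image Dice score like this is what "precision in pixel-wise predictions" is usually averaged over when segmentation models are compared.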

    Development and Experimental Analysis of Wireless High Accuracy Ultra-Wideband Localization Systems for Indoor Medical Applications

    This dissertation addresses several interesting and relevant problems in the field of wireless technologies applied to medical applications, specifically problems related to ultra-wideband high-accuracy localization for use in the operating room. This research is cross-disciplinary in nature and fundamentally builds upon microwave engineering, software engineering, systems engineering, and biomedical engineering. A good portion of this work has been published in peer-reviewed microwave engineering and biomedical engineering conferences and journals. Wireless technologies in medicine are discussed with a focus on ultra-wideband positioning in orthopedic surgical navigation. Characterization of the operating room as a medium for ultra-wideband signal transmission helps define system design requirements. A discussion of the first-generation positioning system provides a context for understanding the overall system architecture of the second-generation ultra-wideband positioning system outlined in this dissertation. A system-level simulation framework provides a method for rapid prototyping of ultra-wideband positioning systems which takes into account all facets of the system (analog, digital, channel, experimental setup). This provides a robust framework for optimizing overall system design in realistic propagation environments. A practical approach is taken to outline the development of the second-generation ultra-wideband positioning system, which includes an integrated tag design and real-time dynamic tracking of multiple tags. The tag and receiver designs are outlined, as are receiver-side digital signal processing, system-level design support for multi-tag tracking, and potential error sources observed in dynamic experiments, including phase center error, clock jitter and drift, and geometric position dilution of precision.
An experimental analysis of the multi-tag positioning system provides insight into overall system performance, including the main sources of error. A five-base-station experiment shows the potential of redundant base stations to improve overall dynamic accuracy. Finally, the system performance in low signal-to-noise-ratio and non-line-of-sight environments is analyzed by focusing on receiver-side digitally implemented ranging algorithms, including leading-edge detection and peak detection. These technologies are aimed at use in next-generation medical systems with many applications, including surgical navigation, wireless telemetry, medical asset tracking, and in vivo wireless sensors.
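The two ranging algorithms named above can be contrasted on synthetic data. The sketch below is a simplified, hypothetical illustration (not the dissertation's receiver code) of why leading-edge detection helps in multipath channels: a weaker direct-path pulse arrives before a stronger reflection, so peak detection locks onto the reflection while a thresholded leading edge recovers the earlier arrival. Pulse positions, widths, and the 0.3 threshold fraction are all assumptions:

```python
import numpy as np

def peak_detect(signal):
    """Time-of-arrival estimate: index of the maximum-magnitude sample."""
    return int(np.argmax(np.abs(signal)))

def leading_edge_detect(signal, threshold_frac=0.3):
    """First sample whose magnitude exceeds a fraction of the peak;
    more robust than peak detection when multipath delays the peak."""
    mag = np.abs(signal)
    return int(np.argmax(mag >= threshold_frac * mag.max()))

# Direct path at sample 40, stronger multipath reflection at sample 55:
t = np.arange(200)
pulse = (np.exp(-0.5 * ((t - 40) / 2.0) ** 2)
         + 1.5 * np.exp(-0.5 * ((t - 55) / 2.0) ** 2))
print(peak_detect(pulse))          # → 55 (biased toward the reflection)
print(leading_edge_detect(pulse))  # → 38 (near the true direct path)
```

In a time-difference-of-arrival system, that bias of a few samples translates directly into a ranging error proportional to the sample period times the speed of light.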

    REPRESENTATION LEARNING FOR ACTION RECOGNITION

    The objective of this research work is to develop discriminative representations for human actions. The motivation stems from the fact that there are many issues encountered while capturing actions in videos, such as intra-action variations (due to actors, viewpoints, and duration), inter-action similarity, background motion, and occlusion of actors. Hence, obtaining a representation which can address all the variations in the same action while maintaining discrimination from other actions is a challenging task. In the literature, actions have been represented using either low-level or high-level features. Low-level features describe the motion and appearance in small spatio-temporal volumes extracted from a video. Due to the limited space-time volume used for extracting low-level features, they are not able to account for viewpoint and actor variations or variable-length actions. On the other hand, high-level features handle variations in actors, viewpoints, and duration, but the resulting representation is often high-dimensional, which introduces the curse of dimensionality. In this thesis, we propose new representations for describing actions by combining the advantages of both low-level and high-level features. Specifically, we investigate various linear and non-linear decomposition techniques to extract meaningful attributes from both high-level and low-level features. In the first approach, the sparsity of high-level feature descriptors is leveraged to build action-specific dictionaries. Each dictionary retains only the discriminative information for a particular action and hence reduces inter-action similarity. Then, a sparsity-based classification method is proposed to classify the low-rank representation of clips obtained using these dictionaries. We show that this representation based on dictionary learning improves the classification performance across actions.
Also, a few of the actions consist of rapid body deformations that hinder the extraction of local features from body movements. Hence, we propose to use a dictionary which is trained on convolutional neural network (CNN) features of the human body in various poses to reliably identify actors from the background. In particular, we demonstrate the efficacy of sparse representation in the identification of the human body under rapid and substantial deformation. In the first two approaches, sparsity-based representation is developed to improve discriminability using class-specific dictionaries that utilize action labels. However, developing an unsupervised representation of actions is more beneficial, as it can be used both to recognize similar actions and to localize actions. We propose to exploit inter-action similarity to train a universal attribute model (UAM) in order to learn action attributes (common and distinct) implicitly across all the actions. Using maximum a posteriori (MAP) adaptation, a high-dimensional super action-vector (SAV) for each clip is extracted. As this SAV contains redundant attributes of all other actions, we use factor analysis to extract a novel low-dimensional action-vector representation for each clip. Action-vectors are shown to suppress background motion and highlight actions of interest in both trimmed and untrimmed clips, which contributes to action recognition without the help of any classifiers. It is observed during our experiments that action-vectors cannot effectively discriminate between actions which are visually similar to each other. Hence, we subject action-vectors to supervised linear embedding using linear discriminant analysis (LDA) and probabilistic LDA (PLDA) to enforce discrimination. In particular, we show that leveraging complementary information across action-vectors from different local features, followed by discriminative embedding, provides the best classification performance.
Further, we explore non-linear embedding of action-vectors using Siamese networks, especially for fine-grained action recognition. A visualization of the hidden-layer output in Siamese networks shows their ability to effectively separate visually similar actions. This leads to better classification performance than linear embedding on fine-grained action recognition. All of the above approaches are presented on large unconstrained datasets with hundreds of examples per action. However, actions in surveillance videos, like snatch thefts, are difficult to model because of the diverse variety of scenarios in which they occur and the very few labeled examples available. Hence, we propose to utilize the universal attribute model (UAM) trained on large action datasets to represent such actions. Specifically, we show that there are similarities between certain actions in the large datasets and snatch thefts, which help in extracting a representation for snatch thefts using the attributes from the UAM. This representation is shown to be effective in distinguishing snatch thefts from regular actions with high accuracy. In summary, this thesis proposes both supervised and unsupervised approaches for representing actions which provide better discrimination than existing representations. The first approach presents a dictionary-learning-based sparse representation for effective discrimination of actions. Also, we propose a sparse representation for the human body based on dictionaries in order to recognize actions with rapid body deformations. In the next approach, a low-dimensional representation called the action-vector is presented for unsupervised action recognition. Further, linear and non-linear embedding of action-vectors is proposed for addressing inter-action similarity and fine-grained action recognition, respectively. Finally, we propose a representation for locating snatch thefts among thousands of regular interactions in surveillance videos.
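A minimal stand-in for the class-specific-dictionary idea above can be sketched as follows. This is not the thesis's method: it substitutes plain least-squares residuals for true sparse coding, but it illustrates the core mechanism of classifying a clip by which class dictionary reconstructs it best. The dictionaries and features are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_classify(x, dictionaries):
    """Assign x to the class whose dictionary reconstructs it with the
    smallest least-squares residual (a simplified stand-in for
    sparsity-based classification with class-specific dictionaries)."""
    residuals = []
    for D in dictionaries:
        coeffs, *_ = np.linalg.lstsq(D, x, rcond=None)
        residuals.append(np.linalg.norm(x - D @ coeffs))
    return int(np.argmin(residuals))

# Two hypothetical action classes, each spanning a different 3-D subspace
# of a 10-D feature space:
D0 = rng.normal(size=(10, 3))
D1 = rng.normal(size=(10, 3))
x = D0 @ np.array([1.0, -2.0, 0.5])  # a clip lying in class 0's subspace
print(residual_classify(x, [D0, D1]))  # → 0
```

Replacing the least-squares solve with a sparsity-constrained solver (e.g. orthogonal matching pursuit) turns this into the classical sparse-representation classification setup.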

    Measurement of the charge asymmetry in beauty-dijet production at the LHCb experiment

    A measurement of the charge asymmetry in beauty-dijet production at LHCb is presented in this thesis. The measurement uses the 2016 dataset of proton-proton collisions gathered by the detector, corresponding to an integrated luminosity of 1.7 fb⁻¹. The charge asymmetry is measured in three bins of the invariant dijet mass, with bin edges at [50, 75, 105, 150] GeV. This represents the first measurement of the charge asymmetry in beauty-dijet production in proton-proton collisions at a centre-of-mass energy of 13 TeV. To make the measurement, a charge-tagging method is first developed and tested in order to distinguish between the presence of b- and anti-b-quarks in jets. The calibration of simulated events is then carried out to correct for mismodelling when compared to the data. Next, fits are performed to extract the yield of beauty dijets in the data sample. These yields are then corrected for detector and mistagging effects. The values of the asymmetry are then calculated. Results are compared to Standard Model predictions and are found to agree to within one standard deviation in all three bins.
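The asymmetry itself is a simple ratio of charge-tagged yields. The sketch below illustrates the quantity measured per dijet-mass bin, using purely hypothetical yields and a plain binomial uncertainty; the thesis's actual treatment, with tagging calibration and detector corrections, is more involved:

```python
import math

def charge_asymmetry(n_b, n_bbar):
    """A = (N_b - N_bbar) / (N_b + N_bbar) with a binomial uncertainty."""
    total = n_b + n_bbar
    asym = (n_b - n_bbar) / total
    err = math.sqrt((1.0 - asym ** 2) / total)
    return asym, err

# Hypothetical tag-corrected yields in one dijet-mass bin:
a, da = charge_asymmetry(5150, 4850)
print(f"A = {a:.3f} +/- {da:.3f}")  # → A = 0.030 +/- 0.010
```

The binomial form shows why the statistical precision of each bin scales as 1/√N, which is what drives the binning choice for a fixed integrated luminosity.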

    Planning, Estimation and Control for Mobile Robot Localization with Application to Long-Term Autonomy

    There may arise two kinds of challenges in the problem of mobile robot localization: (i) a robot may have an a priori map of its environment, in which case the localization problem boils down to estimating the robot pose relative to a global frame, or (ii) no a priori map information is given, in which case a robot may have to estimate a model of its environment and localize within it. In the case of a known map, simultaneous planning while localizing is a crucial ability for operating under uncertainty. We first address this problem by designing a method to dynamically replan while the localization uncertainty or environment map is updated. Extensive simulations are conducted to compare the performance of the proposed method with that of FIRM (Feedback-based Information RoadMap). However, a shortcoming of this method is its reliance on a Gaussian assumption for the probability density function (pdf) on the robot state. This assumption may be violated during autonomous operation when a robot visits parts of the environment which appear similar to others. Such situations lead to ambiguity in data association between what is seen and the robot’s map, resulting in a non-Gaussian pdf on the robot state. We address this challenge by developing a motion planning method to resolve situations where ambiguous data associations result in a multimodal hypothesis on the robot state. A receding-horizon approach is developed to plan actions that sequentially disambiguate a multimodal belief and achieve tight localization on the correct pose in finite time. In our method, disambiguation is achieved through active data association: we pick target states in the map which allow distinctive information to be observed for each belief mode and create local feedback controllers to visit the targets. Experiments are conducted for a kidnapped physical ground robot operating in an artificial maze-like environment. The hardest challenge arises when no a priori information is present.
In long-term tasks where a robot must drive for long durations before closing loops, our goal is to minimize the localization error growth rate such that: (i) accurate data associations can be made for loop closure, or (ii) in cases where loop closure is not possible, the localization error stays within some desired bounds. We analyze this problem and show that accurate heading estimation is key to limiting localization error drift. We make three contributions in this domain. First, we present a method for accurate long-term localization using absolute orientation measurements, and analyze the underlying structure of the SLAM problem and how it is affected by unbiased heading measurements. We show that consistent estimates over a 100 km trajectory are possible and that the error growth rate can be controlled with active data acquisition. Then we study the more general problem where orientation measurements may not be present, and develop a SLAM technique to separate orientation and position estimation. We show that our method's accuracy degrades gracefully compared to the standard non-linear-optimization-based SLAM approach and avoids the catastrophic failures which may occur due to a bad initial guess in non-linear optimization. Finally, we take our understanding of orientation sensing into the physical world and demonstrate a 2D SLAM technique that leverages absolute orientation sensing based on naturally occurring structural cues. We demonstrate our method using both high-fidelity simulations and a real-world experiment in a 66,000-square-foot warehouse. Empirical studies show that maps generated by our approach never suffer catastrophic failure, whereas existing scan-matching-based SLAM methods fail ≈ 50% of the time.
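The claim that heading accuracy dominates drift can be illustrated with a toy dead-reckoning simulation (not the dissertation's estimator): a constant per-step heading bias accumulates into large position error, whereas the same bias under absolute orientation sensing stays bounded. Step counts and bias magnitude are arbitrary assumptions:

```python
import math

def dead_reckon(steps, step_len, heading_bias, absolute_heading=False):
    """Integrate unit motion steps that should go straight along +x.
    Each odometry step carries a small heading bias; with absolute
    orientation sensing the bias does not accumulate across steps."""
    x, y, theta = 0.0, 0.0, 0.0
    for _ in range(steps):
        theta = heading_bias if absolute_heading else theta + heading_bias
        x += step_len * math.cos(theta)
        y += step_len * math.sin(theta)
    return x, y

# 1000 one-metre steps with a 1 mrad per-step heading bias:
for absolute in (False, True):
    x, y = dead_reckon(1000, 1.0, heading_bias=0.001, absolute_heading=absolute)
    err = math.hypot(x - 1000.0, y)
    print(f"absolute_heading={absolute}: position error = {err:.1f} m")
```

With relative (odometric) heading, the bias integrates twice into position and the error grows to hundreds of metres; with an absolute heading reference, the same bias leaves the error near one metre.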

    Study and design of a Business Model that explores the complementarity of VLEO platforms for Vessel Tracking

    Throughout this study, the application of satellites in a Very Low Earth Orbit (VLEO) is analysed as a complement to the technologies already used for vessel tracking. This study is part of the DISCOVERER project, which focuses on the research and development of VLEO technologies for application in Earth Observation (EO). Within the team, the UPC focuses on market analysis and the study of business opportunities for VLEO technologies. A value proposition is developed following the Canvas model, this being the strategy used to offer a service to a specific client. For the development of the value proposition, the study focuses on optimizing vessel tracking for maritime transport companies. A market study is first carried out to analyse how the value proposition could fit into this market. The analysis determines that the optimal methodology for complementing the platforms currently used to track vessels with an AIS (Automatic Identification System) transponder installed is data integration: the combination of the data obtained by different platforms (satellites with different technologies in different orbits, complementing both aerial and terrestrial platforms) once it is received at the ground station. For tracking ships that are exempt from carrying an AIS transponder, or those that do not want to be tracked, the optimal method would be to combine data between different platforms before it is received at the ground station (system integration).
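The data-integration step described above amounts to fusing position reports from multiple platforms after downlink. A minimal sketch, assuming a simplified report schema keyed by MMSI (the vessel identifier broadcast by AIS) and a keep-the-freshest-fix policy; the field names and fusion rule are illustrative assumptions, not the study's specification:

```python
def integrate_tracks(*sources):
    """Fuse AIS position reports from multiple platforms, keeping the
    most recent report per vessel. Each report is a dict with
    'mmsi', 'timestamp', 'lat', 'lon'."""
    latest = {}
    for source in sources:
        for report in source:
            mmsi = report["mmsi"]
            if mmsi not in latest or report["timestamp"] > latest[mmsi]["timestamp"]:
                latest[mmsi] = report
    return latest

satellite = [{"mmsi": 111, "timestamp": 100, "lat": 41.4, "lon": 2.2}]
terrestrial = [
    {"mmsi": 111, "timestamp": 160, "lat": 41.5, "lon": 2.3},  # fresher fix
    {"mmsi": 222, "timestamp": 90, "lat": 39.9, "lon": 3.1},
]
tracks = integrate_tracks(satellite, terrestrial)
print(len(tracks), tracks[111]["timestamp"])  # → 2 160
```

Fusing after downlink is what distinguishes data integration from the system-integration approach mentioned above, where platforms exchange data before reaching the ground station.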

    STEP: Satellite Test of the Equivalence Principle. Report on the phase A study

    During Phase A, the STEP Study Team identified three types of experiments that can be accommodated on the STEP satellite within the mission constraints and whose performance is orders of magnitude better than any present or planned future experiment of the same kind on the ground. The scientific objectives of the STEP mission are to: test the Equivalence Principle to one part in 10^17, six orders of magnitude better than has been achieved on the ground; search for a new interaction between quantum-mechanical spin and ordinary matter with a sensitivity to the mass-spin coupling constant g_p·g_s of 6 × 10^-34 at a range of 1 mm, which represents a seven-order-of-magnitude improvement over comparable ground-based measurements; and determine the gravitational constant G to a precision of one part in 10^6 and test the validity of the inverse-square law with the same precision, both two orders of magnitude better than has been achieved on the ground.
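The Equivalence Principle target above is conventionally expressed as an Eötvös ratio η = 2|a1 − a2| / |a1 + a2| between the free-fall accelerations of two test masses, with η = 0 if the principle holds exactly. A minimal illustration (using one part in 10^13, roughly the ground-based level, since a violation at one part in 10^17 is below double-precision resolution):

```python
def eotvos_ratio(a1, a2):
    """Eötvös parameter comparing the free-fall accelerations of two
    test masses; zero if the Equivalence Principle holds exactly."""
    return 2.0 * abs(a1 - a2) / abs(a1 + a2)

# A hypothetical violation at roughly the ground-based sensitivity level:
print(f"{eotvos_ratio(1.0, 1.0 + 1e-13):.1e}")  # → 1.0e-13
```

STEP's goal of one part in 10^17 corresponds to resolving a differential acceleration four orders of magnitude smaller than this example.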