
    Image-Based Flexible Endoscope Steering

    Manually steering the tip of a flexible endoscope through an endoluminal path relies on the physician's dexterity and experience. In this paper we present the realization of a robotic flexible endoscope steering system that uses the endoscopic images to control the tip orientation towards the direction of the lumen. Two image-based control algorithms are investigated: one based on optical flow and the other on image intensity. Both are evaluated in simulations in which the endoscope was steered through the lumen; the RMS distance to the lumen center was less than 25% of the lumen width. An experimental setup was built using a standard flexible endoscope, and the image-based control algorithms were used to actuate the wheels of the endoscope for tip steering. Experiments were conducted in an anatomical model to simulate gastroscopy. The image intensity-based algorithm was capable of accurately steering the endoscope tip through an endoluminal path from the mouth to the duodenum. Compared to manual control, the robotically steered endoscope performed 68% better in terms of keeping the lumen centered in the image.
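
    The intensity-based controller can be sketched in a few lines: assuming the lumen appears as the darkest region of the endoscopic image, the steering command is the offset from the image center to the darkness-weighted centroid. This is a simplified illustration (grayscale frame as a nested list), not the paper's exact control law:

```python
def lumen_direction(image):
    """Estimate a steering direction as the offset from the image center
    to the darkness-weighted centroid (the lumen appears darkest)."""
    h = len(image)
    w = len(image[0])
    max_val = max(max(row) for row in image)
    total = 0.0
    cx = cy = 0.0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            wgt = max_val - v           # darker pixels weigh more
            total += wgt
            cx += wgt * x
            cy += wgt * y
    if total == 0:
        return 0.0, 0.0                 # uniform image: no preferred direction
    cx /= total
    cy /= total
    # Offset from the geometric image center, in pixels
    return cx - (w - 1) / 2, cy - (h - 1) / 2
```

    A dark region on the right of the frame yields a positive horizontal offset, i.e. "steer right"; the tip actuators would then be driven proportionally to this offset.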

    Efficient spatial keyword query processing on geo-textual data


    Bridging the semantic gap in content-based image retrieval.

    To manage large image databases, Content-Based Image Retrieval (CBIR) emerged as a new research subject. CBIR involves the development of automated methods that use visual features in searching and retrieving. Unfortunately, the performance of most CBIR systems is inherently constrained by the low-level visual features, because they cannot adequately express the user's high-level concepts. This is known as the semantic gap problem. This dissertation introduces a new approach to CBIR that attempts to bridge the semantic gap. Our approach includes four components. The first one learns a multi-modal thesaurus that associates low-level visual profiles with high-level keywords. This is accomplished through image segmentation, feature extraction, and clustering of image regions. The second component uses the thesaurus to annotate images in an unsupervised way. This is accomplished through fuzzy membership functions that label new regions based on their proximity to the profiles in the thesaurus. The third component consists of an efficient and effective method for fusing the retrieval results from the multi-modal features. Our method is based on learning and adapting fuzzy membership functions to the distribution of the features' distances and assigning a degree of worthiness to each feature. The fourth component provides the user with the option to perform hybrid querying and query expansion. This allows the enrichment of a visual query with textual data extracted from the automatically labeled images in the database. The four components are integrated into a complete CBIR system that can run in three different and complementary modes. The first mode allows the user to query using an example image. The second mode allows the user to specify positive and/or negative sample regions that should or should not be included in the retrieved images.
    The third mode uses a Graphical Text Interface to allow the user to browse the database interactively using a combination of low-level features and high-level concepts. The proposed system and all of its components and modes are implemented and validated using a large data collection for accuracy, performance, and improvement over traditional CBIR techniques.
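
    The unsupervised annotation step (second component) can be illustrated as follows. The Gaussian membership function, the `sigma` width, and the keyword-to-profile dictionary layout are assumptions made for this sketch, not the dissertation's exact formulation:

```python
import math

def fuzzy_annotate(region_feature, thesaurus, sigma=1.0, threshold=0.5):
    """Label a region with every keyword whose visual profile is close
    enough in feature space: membership is a Gaussian of the distance.
    `thesaurus` maps keyword -> profile vector (hypothetical structure)."""
    labels = {}
    for keyword, profile in thesaurus.items():
        d2 = sum((a - b) ** 2 for a, b in zip(region_feature, profile))
        membership = math.exp(-d2 / (2 * sigma ** 2))
        if membership >= threshold:     # keep only confident labels
            labels[keyword] = membership
    return labels
```

    Because memberships are graded rather than hard assignments, a region can legitimately carry several keywords, which is what makes the later hybrid (visual + textual) querying possible.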

    Towards 3D Matching of Point Clouds Derived from Oblique and Nadir Airborne Imagery

    Because of the low-cost, highly efficient image collection process and the rich 3D and texture information present in the images, the combined use of 2D airborne nadir and oblique images to reconstruct 3D geometric scenes has promising commercial applications such as urban planning and first-responder support. The methodology introduced in this thesis provides a feasible path towards fully automated 3D city modeling from oblique and nadir airborne imagery. In this thesis, the difficulty of matching 2D images with large disparity is avoided by grouping the images first and applying 3D registration afterward. The procedure starts with the extraction of point clouds using a modified version of the RIT 3D Extraction Workflow. The point clouds are then refined by noise removal and surface smoothing processes. Since the point clouds extracted from different image groups use independent coordinate systems, translation, rotation, and scale differences exist between them. To resolve these differences, 3D keypoints and their features are extracted. For each pair of point clouds, an initial alignment and a more accurate registration are applied in succession. The final transform matrix contains the parameters describing the required translation, rotation, and scale. The methodology presented in the thesis has been shown to behave well on test data, and its robustness is assessed by adding artificial noise to the test data. For Pictometry oblique aerial imagery, the initial alignment provides a rough result containing a larger offset than that of the test data, because of the lower quality of the point clouds themselves, but it can be further refined through the final optimization. The accuracy of the final registration result is evaluated by comparing it to the result obtained from manual selection of matched points.
    Using the method introduced, point clouds extracted from different image groups can be combined with each other to build a more complete point cloud, or used as a complement to existing point clouds extracted from other sources. This research will both improve the state of the art of 3D city modeling and inspire new ideas in related fields.
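
    The final transform described above is a similarity (scale + rotation + translation). A planar analogue shows how such parameters can be recovered in closed form from matched keypoints; the 2-D reduction and the least-squares formulation are illustrative assumptions, not the thesis's 3-D registration pipeline:

```python
import math

def similarity_align_2d(src, dst):
    """Estimate the scale, rotation, and translation mapping `src` onto
    `dst` (paired 2-D points): a planar analogue of the alignment step."""
    n = len(src)
    mx_s = sum(p[0] for p in src) / n; my_s = sum(p[1] for p in src) / n
    mx_d = sum(p[0] for p in dst) / n; my_d = sum(p[1] for p in dst) / n
    # Centre both clouds, then recover the rotation from cross/dot sums
    sxx = sxy = var_s = 0.0
    for (xs, ys), (xd, yd) in zip(src, dst):
        xs -= mx_s; ys -= my_s; xd -= mx_d; yd -= my_d
        sxx += xs * xd + ys * yd        # "dot" accumulator
        sxy += xs * yd - ys * xd        # "cross" accumulator
        var_s += xs * xs + ys * ys
    theta = math.atan2(sxy, sxx)
    scale = math.hypot(sxx, sxy) / var_s
    # Translation maps the scaled, rotated source centroid onto the target's
    tx = mx_d - scale * (mx_s * math.cos(theta) - my_s * math.sin(theta))
    ty = my_d - scale * (mx_s * math.sin(theta) + my_s * math.cos(theta))
    return scale, theta, (tx, ty)
```

    In 3-D the rotation is recovered with an SVD (Umeyama's method) instead of a single angle, but the structure of the solution — centroids, a cross-covariance accumulation, then a closed-form rotation and residual translation — is the same.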

    Digital photo album management techniques: from one dimension to multi-dimension.

    Lu Yang. Thesis submitted in November 2004. Thesis (M.Phil.)--Chinese University of Hong Kong, 2005. Includes bibliographical references (leaves 96-103). Abstracts in English and Chinese.
    Contents: Abstract (p.i); Acknowledgement (p.iv);
    Chapter 1, Introduction (p.1): Motivation; Our Contributions; Thesis Outline;
    Chapter 2, Background Study (p.7): MPEG-7 Introduction; Image Analysis in CBIR Systems (Color Information, Color Layout, Texture Information, Shape Information, CBIR Systems); Image Processing in JPEG Frequency Domain; Photo Album Clustering;
    Chapter 3, Feature Extraction and Similarity Analysis (p.38): Feature Set in Frequency Domain (JPEG Frequency Data, Our Feature Set); Digital Photo Similarity Analysis (Energy Histogram, Photo Distance);
    Chapter 4, 1-Dimensional Photo Album Management Techniques (p.49): Photo Album Sorting; Photo Album Clustering; Photo Album Compression (Variable IBP Frames, Adaptive Search Window, Compression Flow); Experiments and Performance Evaluations;
    Chapter 5, High Dimensional Photo Clustering (p.67): Traditional Clustering Techniques (Hierarchical Clustering, Traditional K-means); Multidimensional Scaling (Introduction, Classical Scaling); Our Interactive MDS-based Clustering (Principal Coordinates from MDS, Clustering Scheme, Layout Scheme); Experiments and Results;
    Chapter 6, Conclusions (p.94); Bibliography
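
    The thesis's frequency-domain feature set builds energy histograms from JPEG DCT coefficients and compares photos by histogram distance (Chapter 3). A minimal sketch of that idea follows; the bin layout and the L1 distance are simplifying assumptions, not the thesis's exact definitions:

```python
def energy_histogram(dct_blocks, bins=8, bin_width=10.0):
    """Normalised histogram of coefficient energies across a photo's 8x8
    DCT blocks: a simplified stand-in for the frequency-domain feature.
    `dct_blocks` is a list of flattened coefficient blocks."""
    hist = [0] * bins
    for block in dct_blocks:
        for coeff in block:
            b = min(int(abs(coeff) / bin_width), bins - 1)
            hist[b] += 1
    total = sum(hist) or 1              # avoid division by zero
    return [h / total for h in hist]

def photo_distance(hist_a, hist_b):
    """L1 distance between two normalised energy histograms: small for
    visually similar photos, larger for dissimilar ones."""
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b))
```

    Working directly on JPEG frequency data avoids a full decode of every photo, which is what makes album-scale sorting and clustering on such distances cheap.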

    Apparent source levels and active communication space of whistles of free-ranging Indo-Pacific humpback dolphins (Sousa chinensis) in the Pearl River Estuary and Beibu Gulf, China

    Funding for this study was provided by the National Natural Science Foundation (NNSF) of China (Grant No. 31070347), the Ministry of Science and Technology of China (Grant No. 2011BAG07B05-3), the Knowledge Innovation Program of the Chinese Academy of Sciences (Grant No. KSCX2-EW-Z-4), and the Special Fund for Agro-scientific Research in the Public Interest of the Ministry of Agriculture of China (Grant No. 201203086) to DW; the State Oceanic Administration of China (Grant No. 201105011-3) and NNSF of China (Grant No. 31170501) to KXW; and the China Scholarship Council (Grant No. (2014)3026) to ZTW. Background. Knowledge of species-specific vocalization characteristics and their associated active communication space, the effective range over which a communication signal can be detected by a conspecific, is critical for understanding the impacts of underwater acoustic pollution, as well as other threats. Methods. We used a two-dimensional cross-shaped hydrophone array system to record the whistles of free-ranging Indo-Pacific humpback dolphins (Sousa chinensis) in shallow-water environments of the Pearl River Estuary (PRE) and Beibu Gulf (BG), China. Using hyperbolic position fixing, which exploits time differences of arrival of a signal between pairs of hydrophone receivers, we obtained source location estimates for whistles with good signal-to-noise ratio (SNR ≥ 10 dB) that were not polluted by other sounds, and back-calculated their apparent source levels (ASL). Combined with the masking levels (including simultaneous noise levels, the masking tonal threshold, and the Sousa auditory threshold) and custom-made site-specific sound propagation models, we further estimated their active communication space (ACS). Results. Humpback dolphins produced whistles with average root-mean-square ASLs of 138.5 ± 6.8 (mean ± standard deviation) and 137.2 ± 7.0 dB re 1 µPa in PRE (N = 33) and BG (N = 209), respectively.
    We found statistically significant differences in ASLs among different whistle contour types. The mean and maximum ACS of whistles were estimated to be 14.7 ± 2.6 (median ± quartile deviation) and 17.1 ± 3.5 m in PRE, and 34.2 ± 9.5 and 43.5 ± 12.2 m in BG. Using just the auditory threshold as the masking level produced mean and maximum ACSat of 24.3 ± 4.8 and 35.7 ± 4.6 m for PRE, and 60.7 ± 18.1 and 74.3 ± 25.3 m for BG. The small ACSs were due to the high ambient noise level. Significant differences in ACSs were also observed among different whistle contour types. Discussion. Besides informing the evaluation of appropriate noise exposure levels and the regulation of underwater acoustic pollution, these baseline data can also aid the passive acoustic monitoring of dolphin populations, help define the boundaries of separate groups in a more biologically meaningful way during field surveys, and guide the appropriate approach distance for local dolphin-watching boats and research boats during focal group following.
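
    Back-calculating an apparent source level from a received level amounts to adding the one-way transmission loss over the localized range. The sketch below uses a generic 15 log r spreading law, a common shallow-water compromise between spherical (20 log r) and cylindrical (10 log r) spreading; this is an illustrative assumption, not the study's site-specific propagation models:

```python
import math

def apparent_source_level(received_level_db, range_m,
                          spreading=15.0, absorption_db_per_m=0.0):
    """Back-calculate an apparent source level (dB re 1 uPa at 1 m) by
    adding the one-way transmission loss to the received level.
    TL = spreading * log10(r) + absorption * r (generic model)."""
    transmission_loss = (spreading * math.log10(range_m)
                         + absorption_db_per_m * range_m)
    return received_level_db + transmission_loss
```

    The active communication space is then the range at which the same model, run forward from the ASL, drops the signal to the masking level at the listener.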

    Visual Perception For Robotic Spatial Understanding

    Humans understand the world through vision without much effort. We perceive the structure, objects, and people in the environment and pay little direct attention to most of it, until it becomes useful. Intelligent systems, especially mobile robots, have no such biologically engineered vision mechanism to take for granted. Instead, we must devise algorithmic methods of taking raw sensor data and converting it to something useful very quickly. Vision is such a necessary part of building a robot, or any intelligent system meant to interact with the world, that it is somewhat surprising we don't have off-the-shelf libraries for this capability. Why is this? The simple answer is that the problem is extremely difficult. There has been progress, but the current state of the art is impressive and depressing at the same time. We now have neural networks that can recognize many objects in 2D images, in some cases performing better than a human. Some algorithms can also provide bounding boxes or pixel-level masks to localize the object. We have visual odometry and mapping algorithms that can build reasonably detailed maps over long distances with the right hardware and conditions. On the other hand, we have robots with many sensors and no efficient way to compute their relative extrinsic poses for integrating the data in a single frame. The same networks that produce good object segmentations and labels on a controlled benchmark still miss obvious objects in the real world and have no mechanism for learning on the fly while the robot is exploring. Finally, while we can detect pose for very specific objects, we don't yet have a mechanism that detects pose that generalizes well over categories or that can describe new objects efficiently. We contribute algorithms in four of the areas mentioned above. First, we describe a practical and effective system for calibrating many sensors on a robot with up to 3 different modalities.
    Second, we present our approach to visual odometry and mapping that exploits the unique capabilities of RGB-D sensors to efficiently build detailed representations of an environment. Third, we describe a 3-D over-segmentation technique that utilizes the models and ego-motion output of the previous step to generate temporally consistent segmentations under camera motion. Finally, we develop a synthesized dataset of chair objects with part labels and investigate the influence of parts on RGB-D based object pose recognition using a novel network architecture we call PartNet.
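
    The pairwise sensor-calibration problem mentioned first can be illustrated with the classic Kabsch/Procrustes alignment of matched 3-D observations; this generic closed-form solution is a stand-in for, not a description of, the dissertation's multi-modal calibration system:

```python
import numpy as np

def kabsch_extrinsics(pts_a, pts_b):
    """Estimate the rigid transform (R, t) mapping sensor A's frame onto
    sensor B's from matched 3-D point observations, so that
    pts_b ~= pts_a @ R.T + t (least-squares, Kabsch algorithm)."""
    A = np.asarray(pts_a, float)
    B = np.asarray(pts_b, float)
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    H = (A - ca).T @ (B - cb)                 # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cb - R @ ca
    return R, t
```

    With each sensor pair solved this way, all observations can be re-expressed in one common frame, which is the precondition for fusing the data downstream.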

    Complex Assessment of Pilot Fatigue in Terms of Physiological Parameters

    Pilot fatigue is one of the main causes of aircraft accidents attributed to human error in the flight crew. Therefore, in order to maintain the highest standards of flight safety throughout all flight phases, it is crucially important to prevent the occurrence of fatigue, or at least to detect it efficiently and alert the crew so that the fatigued member can be relieved from flying. At present, there are many studies focusing on the detection and monitoring of pilot fatigue by tracking the pilot's physiological parameters, such as cardiac activity, eye movements, and upper-limb activity. Among all the physiological measurements available, heart rate variability (HRV) analysis appears to be the most suitable method for examining pilot fatigue. Although many indices of heart rate variability analysis are used to evaluate fatigue, there is no consensus in the literature on which of those indices are the most important to use in determining pilot fatigue. Given this gap in the current state of the art, the purpose of this thesis is to ascertain the most significant parameters of heart rate variability analysis that can be directly used in determining pilot fatigue. To obtain data, 24-hour cardiac activity measurements were conducted on 16 subjects using a flight simulator at the Department of Air Transport, Faculty of Transportation Sciences, Czech Technical University in Prague. The subjects' cardiac activity was recorded in the form of an electrocardiogram (ECG) while they performed flying tasks.
    The first part of this thesis delivers a theoretical background on fatigue in the cockpit environment and explains several methods used for heart rate variability analysis of the recorded ECG signals. The following parts present the statistical analysis methods used to identify the most important parameters. The results indicate that the pVLF index of the frequency-domain and time-frequency-domain analyses and the nHF parameter of the frequency-domain analysis of HRV are the most important indices for indicating a fatigued condition of a flight crew member. Keywords: pilot fatigue, physiological parameters, cardiac activity, heart rate variability.
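
    The two indices named in the results have compact definitions: pVLF is the very-low-frequency share of total spectral power, and nHF is the high-frequency share of the combined LF+HF power. A sketch, assuming a precomputed power spectral density of the RR-interval tachogram and the standard HRV band edges:

```python
def hrv_band_indices(freqs_hz, psd):
    """Compute pVLF and nHF from a tachogram's power spectral density.
    Standard HRV bands: VLF 0.003-0.04 Hz, LF 0.04-0.15 Hz, HF 0.15-0.4 Hz.
    pVLF = VLF / total power; nHF = HF / (LF + HF); both in percent."""
    def band_power(lo, hi):
        return sum(p for f, p in zip(freqs_hz, psd) if lo <= f < hi)
    vlf = band_power(0.003, 0.04)
    lf = band_power(0.04, 0.15)
    hf = band_power(0.15, 0.4)
    total = vlf + lf + hf
    p_vlf = 100.0 * vlf / total
    n_hf = 100.0 * hf / (lf + hf)
    return p_vlf, n_hf
```

    Because nHF is normalised against LF+HF rather than total power, it isolates the (largely parasympathetic) high-frequency component from slow drifts, which is why it is reported separately from pVLF.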

    A COLLISION AVOIDANCE SYSTEM FOR AUTONOMOUS UNDERWATER VEHICLES

    The work in this thesis is concerned with the development of a novel and practical collision avoidance system for autonomous underwater vehicles (AUVs). Advanced stochastic motion planning methods, dynamics quantisation approaches, multivariable tracking controller designs, sonar data processing, and workspace representation are combined synergistically to enhance significantly the survivability of modern AUVs. The recent proliferation of autonomous AUV deployments for missions such as seafloor surveying, scientific data gathering, and mine hunting has demanded a substantial increase in vehicle autonomy. One common requirement of such missions is that the AUV navigate safely in a dynamic and unstructured environment. It is therefore vital that a robust and effective collision avoidance system be available to preserve the structural integrity of the vehicle while simultaneously increasing its autonomy. This thesis provides not only a holistic framework but also an arsenal of computational techniques for the design of a collision avoidance system for AUVs. The design of an obstacle avoidance system is addressed first. The core paradigm is the application of the Rapidly-exploring Random Tree (RRT) algorithm, and of a newly developed variant, as a motion planning tool. This technique is then merged with the Manoeuvre Automaton (MA) representation to address the inherent disadvantages of the RRT. A novel multi-node version that can also handle a time-varying final state is proposed. Clearly, the reference trajectory generated by the aforementioned embedded planner must be tracked. Hence, the feasibility of employing the linear quadratic regulator (LQR) and the nonlinear kinematic-based state-dependent Riccati equation (SDRE) controller as trajectory trackers is explored.
    The obstacle detection module, which comprises sonar processing and workspace representation submodules, is developed and tested on actual sonar data acquired in a sea trial with a prototype forward-looking sonar (AT500). The sonar processing techniques applied are fundamentally derived from an image processing perspective. Likewise, a novel occupancy grid using a nonlinear function is proposed for the workspace representation of the AUV. Results are presented that demonstrate the ability of an AUV to navigate a complex environment. To the author's knowledge, this is the first time the above newly developed methodologies have been applied to an AUV collision avoidance system; the work therefore constitutes a contribution to knowledge in this area.
    J&S MARINE LT
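
    The core RRT paradigm referred to above can be sketched briefly in the plane; the obstacle model, fixed step size, and distance-based termination used here are generic illustrative choices, not the thesis's MA-augmented, multi-node planner:

```python
import math
import random

def rrt(start, goal, is_free, bounds, step=0.5, iters=2000,
        goal_tol=0.5, seed=1):
    """Minimal 2-D Rapidly-exploring Random Tree: repeatedly grow the tree
    one `step` toward a random sample, keep collision-free nodes, and stop
    when a node lands within `goal_tol` of the goal, then walk parent
    links back to recover the path. `bounds` is (xmin, xmax, ymin, ymax)."""
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}
    for _ in range(iters):
        sample = (rng.uniform(bounds[0], bounds[1]),
                  rng.uniform(bounds[2], bounds[3]))
        # Index of the tree node nearest the random sample
        i = min(range(len(nodes)), key=lambda k: math.dist(nodes[k], sample))
        nx, ny = nodes[i]
        d = math.dist((nx, ny), sample)
        if d == 0:
            continue
        new = (nx + step * (sample[0] - nx) / d,
               ny + step * (sample[1] - ny) / d)
        if not is_free(new):                  # caller-supplied obstacle test
            continue
        nodes.append(new)
        parent[len(nodes) - 1] = i
        if math.dist(new, goal) <= goal_tol:
            path, k = [], len(nodes) - 1
            while k is not None:              # backtrack through parents
                path.append(nodes[k])
                k = parent[k]
            return path[::-1]
    return None                               # no path found within budget
```

    Merging this with a Manoeuvre Automaton, as the thesis does, replaces the straight-line `step` extension with dynamically feasible motion primitives, so every tree edge is a trajectory the vehicle can actually fly.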