62 research outputs found

    Agent and object aware tracking and mapping methods for mobile manipulators

    The age of the intelligent machine is upon us. Intelligent machines exist in our factories, our warehouses, our military, our hospitals, on our roads, and on the moon. Most of them we call robots. When placed in a controlled or known environment such as an automotive factory or a distribution warehouse, they perform their given roles with exceptional efficiency, achieving far more than is within reach of a humble human being. Despite this remarkable success, intelligent machines have yet to make a whole-hearted move into our homes. The missing link between the robots we have now and the robots that will soon come to our houses is perception. Perception, as we mean it here, refers to a level of understanding beyond the collection and aggregation of sensory data. Much of the available sensory information is noisy and unreliable: our homes contain many reflective surfaces, repeating textures on large flat surfaces, and many disruptive moving elements, including humans. These environments also change over time, with objects frequently moving within and between rooms. This idea of change in an environment is fundamental to robotic applications, as in most cases we expect robots to be effectors of such change. We can identify two particular challenges that must be solved for robots to make the jump to less structured environments: how to manage noise and disruptive elements in observational data, and how to understand the world as a set of changeable elements (objects) which move over time within a wider environment. In this thesis we look at one possible approach to solving each of these problems. For the first challenge, we use proprioception aboard a robot with an articulated arm to handle difficult and unreliable visual data caused both by the robot and by the environment.
    We use sensor data aboard the robot to improve the pose tracking of a visual system when the robot moves rapidly, with high jerk, or when observing a scene with little visual variation. For the second challenge, we build a model of the world at the level of rigid objects, and relocalise them both as they change location between different sequences and as they move. We use semantics, image keypoints, and 3D geometry to register and align objects between sequences, showing how their positions have changed between disparate observations.
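As an illustration of the kind of registration step such a system needs, the sketch below aligns an object's 3D keypoints observed in two sequences using the Kabsch algorithm. This is a generic least-squares rigid alignment, not the thesis's actual pipeline; the point set, rotation, and translation are toy data.

```python
import numpy as np

def rigid_align(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst
    (Kabsch algorithm on centred point sets)."""
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

# Toy check: recover a known rotation about the z-axis and a translation.
theta = np.pi / 6.0
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -1.0, 2.0])
pts = np.random.default_rng(0).normal(size=(20, 3))   # object "keypoints"
R_est, t_est = rigid_align(pts, pts @ R_true.T + t_true)
```

In a real system the correspondences would come from matched semantic keypoints rather than being known in advance, and a robust wrapper (e.g. RANSAC) would reject outliers.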

    High-Resolution Numerical Simulation of Turbulent Interfacial Marine Flows.

    An important aspect of designing offshore structures and seagoing vessels is an accurate prediction of the loads associated with wave impacts. In regions near the shore or during storms at sea, breaking waves are a common occurrence and the loading caused by their impact is typically more severe than in the case of regular non-breaking waves. Present methods for numerically predicting the impact forces use potential-flow methods with empirically-derived coefficients or relatively low-order methods in the computational-fluid dynamics (CFD) family. The potential-flow methods usually cannot simulate wave breaking and thus correction factors are necessary to account for slamming-like impacts that may occur due to plunging breakers. In some applications of the CFD tools, turbulence models are used to approximate the turbulent wave-breaking process in an effort to improve the prediction of the flow. The present work expands the understanding of the turbulence-interface interaction using highly-resolved numerical simulations to improve the CFD modeling capabilities in marine applications. The complex behavior of turbulence in the proximity of a deformable interface separating two incompressible phases is studied using two variants of CFD: direct numerical simulations (DNS) and large-eddy simulations (LES) that require modeling of the turbulence closure terms. Canonical flows are studied with DNS to determine the influence of the information typically not resolved by lower-order CFD methods and to establish the hierarchy of the modeling terms present in the governing equations. The relative magnitude of the convective and the interfacial subgrid terms are found to be significant and thus not negligible for a plunging-breaking wave flow. A scale-similarity-based model is proposed and implemented in the LES solver to include the effects of the unresolved flow features associated with the presence of the interface. 
    The model is found to successfully approximate the subgrid behavior in multiphase flows with sufficient spatial and temporal resolution. The multiphase LES framework is extended to the study of breaking waves impinging on an offshore platform, and the importance of subgrid modeling to an accurate prediction of forces on the structure is demonstrated.
    PhD, Naval Architecture & Marine Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/102319/1/gfilip_1.pd
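The scale-similarity idea — estimating the unresolved stress from the smallest resolved scales by applying a coarser test filter — can be sketched in one dimension. This is a generic Bardina-style estimate on a synthetic field, not the solver or closure model developed in the thesis:

```python
import numpy as np

def box_filter(f, width):
    """Periodic top-hat (box) test filter of odd width on a 1-D field."""
    return np.array([np.take(f, np.arange(i - width // 2, i + width // 2 + 1),
                             mode='wrap').mean() for i in range(len(f))])

def scale_similarity_stress(u, width=5):
    """Bardina-type scale-similarity estimate of the subgrid stress on a 1-D
    periodic field: tau ~ filter(u*u) - filter(u)*filter(u)."""
    return box_filter(u * u, width) - box_filter(u, width) ** 2

x = np.linspace(0.0, 2.0 * np.pi, 128, endpoint=False)
u = np.sin(x) + 0.3 * np.sin(8.0 * x)   # resolved field with a fine-scale part
tau = scale_similarity_stress(u)
```

Because the box average uses non-negative weights, this estimate is non-negative everywhere and is largest where the fine-scale component contributes most — the same information a multiphase LES closure tries to supply near the interface.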

    Folded RF-excited CO₂ waveguide lasers

    This thesis describes theoretical and experimental work on RF-excited CO₂ waveguide lasers and amplifiers. The mode coupling losses at a bend in a folded waveguide have been evaluated as a function of the selectable parameters to determine the low-loss folding geometries. A direct comparison is made between three types of optical arrangement used for folding in a compact, sealed-off, Z-fold CO₂ waveguide laser excited by a transverse RF discharge. In particular, the measured laser output power as a function of discharge conditions and mirror alignment is compared for plane-mirror, curved-mirror, and partial-waveguide folded resonators. The Z-fold laser output power is predicted by incorporating the known and estimated laser parameters into a Rigrod-type analysis. A simultaneous solution of the Rigrod equations predicting the laser powers for different intra-cavity gain lengths is used with the experimental data to derive the discharge and resonator parameters. Experimental results are in good agreement with the theoretical predictions, and suggest that a M% power loss per fold has been achieved with partial waveguide folding. Also, the preliminary theoretical results of a multi-mode resonator model predicting the laser output power as a function of the angular misalignment of one of the Z-fold laser folding mirrors are in qualitative agreement with the experimental determinations. Experiments related to laser efficiency and frequency stability are discussed briefly.
    These include an investigation into an automatic impedance matching scheme for dynamic optimisation of the power transfer efficiency between the RF generator and the laser head; the opto-Hertzian effect (the RF equivalent of the opto-galvanic effect) for laser frequency stabilisation; a novel parallel-resonant distributed inductance excitation technique using a multi-start solenoid; and finally, identification of hooting laser resonator modes responsible for impeding heterodyne measurements when a clean RF spectrum is required. In addition, theoretical and experimental studies of laser amplification are presented. The suitability of folded waveguide and non-waveguide structures for power amplification or pre-amplification is assessed to determine their applicability to coherent LiDAR. Optical amplification of wideband transmitter and/or receiver signals is considered a favourable way of improving the discrimination of range and velocity determinations. Finally, as a result of this work, up to 53.4 W of output power in a high-quality fundamental Gaussian beam has been obtained from a compact, sealed-off, Z-fold CO₂ waveguide laser with a 115 cm discharge length, which implies a specific power performance of 0.46 W/cm. Efficiencies (laser output power/RF input power) of up to 9.2% have also been observed.
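A Rigrod-type analysis balances saturated gain against cavity losses to predict output power. The sketch below uses a commonly quoted simplified Rigrod expression for a homogeneously broadened laser to sweep the output-coupler transmission and locate the optimum coupling; all parameter values (small-signal gain g0, gain length l, loss, saturation intensity, mode area) are invented for illustration and are not the thesis's measured values.

```python
import numpy as np

def rigrod_output(g0, l, T, loss, I_sat, area):
    """Simplified Rigrod expression for a homogeneously broadened laser:
    P_out = (I_sat * area * T / 2) * (2*g0*l / (T + loss) - 1),
    valid for small output coupling T and round-trip loss; clamps to zero
    below threshold (2*g0*l < T + loss)."""
    excess = 2.0 * g0 * l / (T + loss) - 1.0
    return max(0.0, 0.5 * I_sat * area * T * excess)

# Sweep the output-coupler transmission to locate the optimum coupling.
# g0 (1/cm), l (cm), loss, I_sat (W/cm^2), area (cm^2) are illustrative.
Ts = np.linspace(0.01, 0.5, 500)
powers = [rigrod_output(g0=0.006, l=115.0, T=T, loss=0.05,
                        I_sat=100.0, area=0.1) for T in Ts]
T_opt = float(Ts[int(np.argmax(powers))])
```

For this low-gain form the analytic optimum is T_opt = sqrt(2·g0·l·loss) − loss ≈ 0.21 with these numbers, which the sweep reproduces.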

    IMAGE AND VIDEO UNDERSTANDING WITH CONSTRAINED RESOURCES

    Recent advances in computer vision have been driven by high-capacity deep neural networks, particularly Convolutional Neural Networks (CNNs) with hundreds of layers trained in a supervised manner. However, this poses two significant challenges: (1) the increased depth of CNNs, while yielding significant improvements on competitive benchmarks, limits their deployment in real-world scenarios due to high computational cost; (2) the need to collect millions of human-labeled samples for training prevents such approaches from scaling, especially for fine-grained image understanding like semantic segmentation, where dense annotations are extremely expensive to obtain. To mitigate these issues, we focus on image and video understanding with constrained resources, in the form of computational resources and annotation resources. In particular, we present approaches that (1) investigate dynamic computation frameworks which adaptively allocate computing resources on-the-fly given a novel image/video to manage the trade-off between accuracy and computational complexity; and (2) derive robust representations with minimal human supervision by exploring context relationships or using shared information across domains. With this in mind, we first introduce BlockDrop, a conditional computation approach that learns to dynamically choose which layers of a deep network to execute during inference so as to best reduce total computation without degrading prediction accuracy. Next, we generalize the idea of conditional computation from images to videos by presenting AdaFrame, a framework that adaptively selects relevant frames on a per-input basis for fast video recognition. AdaFrame assumes access to all frames in a video, and hence can only be used in offline settings. To mitigate this issue, we introduce LiteEval, a simple yet effective coarse-to-fine framework for resource-efficient video recognition, suitable for both online and offline scenarios.
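The flavour of such conditional computation can be shown with a toy residual network in which a tiny input-dependent gate decides which blocks to run. This is a minimal sketch of the idea only, not BlockDrop itself (which trains its policy network with reinforcement learning); all weights here are random and illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
blocks = [rng.normal(scale=0.1, size=(8, 8)) for _ in range(4)]  # 4 residual blocks
gate_w = rng.normal(size=(8, 4))                                 # toy gating weights

def forward(x):
    """Run only the residual blocks that the input-dependent gate switches on."""
    on = (x @ gate_w) > 0          # one binary keep/drop decision per block
    executed = 0
    for keep, W in zip(on, blocks):
        if keep:
            x = x + np.tanh(W @ x)  # residual update
            executed += 1
    return x, executed

y, n_used = forward(rng.normal(size=8))
```

The compute cost now varies per input (`n_used` of 4 blocks), which is exactly the accuracy/computation trade-off a learned policy would optimise.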
    To derive robust feature representations with limited annotation resources, we first explore the power of spatial context as a supervisory signal for learning visual representations. In addition, we propose to learn from synthetic data rendered by modern computer graphics tools, where ground-truth labels are readily available. We propose Dual Channel-wise Alignment Networks (DCAN), a simple yet effective approach to reduce domain shift at both the pixel level and the feature level, for unsupervised scene adaptation.
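A much-simplified version of channel-wise alignment can be sketched by matching the per-channel mean and standard deviation of features from one domain to those of another. DCAN itself learns this alignment inside the network during training; the function below is only a hand-rolled statistical stand-in operating on random arrays.

```python
import numpy as np

def channel_align(feat, ref):
    """Shift and scale each channel of `feat` so its mean and std match `ref`.
    Arrays are (channels, height, width); a crude stand-in for learned
    channel-wise alignment."""
    f_mu = feat.mean(axis=(1, 2), keepdims=True)
    f_sd = feat.std(axis=(1, 2), keepdims=True) + 1e-8
    r_mu = ref.mean(axis=(1, 2), keepdims=True)
    r_sd = ref.std(axis=(1, 2), keepdims=True)
    return (feat - f_mu) / f_sd * r_sd + r_mu

rng = np.random.default_rng(1)
synthetic = rng.normal(loc=2.0, scale=3.0, size=(4, 8, 8))   # e.g. rendered-data features
real = rng.normal(loc=0.0, scale=1.0, size=(4, 8, 8))        # e.g. real-data features
aligned = channel_align(synthetic, real)
```

After alignment, each channel of the synthetic features carries the first- and second-order statistics of the real domain, which is the intuition behind reducing feature-level domain shift.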

    Michigan experimental multispectral mapping system: A description of the M7 airborne sensor and its performance

    The development and characteristics of a multispectral band scanner for an airborne mapping system are discussed. The sensor operates in the ultraviolet, visible, and infrared regions of the spectrum. Any twelve of the bands may be selected for simultaneous, optically registered recording on a 14-track analog tape recorder. Multispectral imagery recorded on magnetic tape in the aircraft can be reproduced in the laboratory on film strips for visual analysis, or optionally machine-processed in analog and/or digital computers before display. The airborne system performance is analyzed.

    Scale-space and the implicit coding of luminance in V1.

    This thesis pursues a single line of enquiry: lightness, brightness, and visual illusions. In particular, it focuses on White's effect, simultaneous brightness contrast, and low-level theories that can account for both phenomena. In the first part (Chapters 1-2), the problem space is defined, followed by a review of lightness and brightness theories from both low- and high-level vision. In the second part (Chapter 3), the only two low-level V1 models of brightness capable of accounting for both White's effect and simultaneous brightness contrast are shown to rely on the amplification of low spatial frequency information derived from large-scale RFs in order to accurately reconstruct images and account for the illusory brightness apparent in both effects. It is argued that these large-scale RFs do not exist in V1, and that the global re-weighting and re-normalisation schemes employed by these models are not constrained by the known local nature of intra-cortical connections. Hence, it is concluded that these models are not biologically plausible. In the third part (Chapter 4), the issue of recovering low spatial frequency and local mean luminance information without explicitly sampling it is considered. The problem is formally defined in the scale-space framework and solved analytically. That is, an algorithm for recovering local mean luminance (and low spatial frequencies) from the information implicit in the contrast-coding cells typically found in V1 is constructed, and is referred to as the Implicit Luminance Coding (ILC) model. It is argued that this ILC model is not biologically plausible, by virtue of its global optimisation framework being unconstrained by the known local nature of intra-cortical connections. Subsequently, a new algorithm is proposed, based on a numerical approximation to the analytical solution.
    The biologically plausible ILC algorithm is developed into a complete low-level model of brightness, which makes use of the information present in multiple scale channels. The model is shown to be capable of accounting for both White's effect and simultaneous brightness contrast by means of an interplay between two independent assimilation and contrast mechanisms. The final part (Chapter 5) is concerned with the application of the model to visual phenomena associated with lightness and brightness, including all known variants of White's effect and simultaneous brightness contrast, and some effects that are traditionally accounted for by appealing to mechanisms from high-level vision, thus facilitating the delineation of low-level from higher-level phenomena. The biologically plausible ILC model is shown to be in good accordance with this experimental data. Furthermore, qualitative accounts of the temporal evolution of the filling-in process are provided and shown to be in agreement with experiment, and novel predictions as to the temporal evolution of White's effect relative to simultaneous brightness contrast are described.
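The core claim — that local mean luminance is implicit in contrast-coded responses — can be illustrated in one dimension: zero-mean (difference-of-Gaussians) channels discard the global mean, yet every other frequency of the stimulus can be recovered by inverting the channel responses. The sketch below is a toy global Fourier-domain inversion on a synthetic luminance profile, not the ILC algorithm's local, biologically plausible scheme.

```python
import numpy as np

n = 64
x = np.arange(n)
lum = np.where((x >= 20) & (x < 40), 1.0, 0.2)   # bright patch on a dim field

def dog(s1, s2):
    """Zero-mean difference-of-Gaussians kernel on the periodic lattice."""
    d = np.minimum(x, n - x).astype(float)        # periodic distance from 0
    g = lambda s: np.exp(-d**2 / (2.0 * s * s))
    return g(s1) / g(s1).sum() - g(s2) / g(s2).sum()

channels = [dog(1, 2), dog(2, 4), dog(4, 8)]      # three "contrast-coding" scales
responses = [np.real(np.fft.ifft(np.fft.fft(k) * np.fft.fft(lum)))
             for k in channels]

# Invert: at each non-DC frequency use the best-tuned channel. The DC term
# (global mean) is genuinely lost and must be anchored by assumption.
H = np.array([np.fft.fft(k) for k in channels])
R = np.array([np.fft.fft(r) for r in responses])
rec_hat = np.zeros(n, dtype=complex)
rec_hat[0] = lum.mean() * n                       # the externally supplied anchor
for w in range(1, n):
    b = np.argmax(np.abs(H[:, w]))
    rec_hat[w] = R[b, w] / H[b, w]
recovered = np.real(np.fft.ifft(rec_hat))
```

The recovered profile matches the stimulus everywhere except the DC term, which must be anchored externally — the sense in which mean luminance is "implicit" in contrast codes.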

    Multiple graph matching and applications

    In pattern recognition, the use of graphs is, to a great extent, appropriate and advantageous.
    Usually, vertices of the graph represent local parts of an object while edges represent relations between these local parts. However, these advantages come with a severe drawback: the distance between two graphs cannot be optimally computed in polynomial time. Taking this special characteristic into account, the use of graph prototypes becomes ubiquitous. The applicability of graph prototypes is extensive, the most common applications being clustering, classification, object characterization, and graph databases, to name a few. However, the objective of a graph prototype is the same in all applications: the representation of a set of graphs. To synthesize a prototype, all elements of the set must be mutually labeled. This mutual labeling consists in identifying which nodes of which graphs represent the same information in the training set. Once this mutual labeling is done, the set can be characterized and combined to create a graph prototype. We call this initial labeling a common labeling. Up to now, all state-of-the-art algorithms to compute a common labeling lack either performance or a theoretical basis. In this thesis, we formally describe the common labeling problem and give a clear taxonomy of the types of algorithms. Six new algorithms that rely on different techniques are described to compute a suboptimal solution to the common labeling problem. The performance of the proposed algorithms is evaluated using an artificial dataset and several real datasets. In addition, the algorithms have been evaluated on several real applications, including graph databases and group-wise image registration. In most of the tests and applications evaluated, the presented algorithms show a great improvement over the state of the art.
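As a concrete instance of the problem, the sketch below computes a naive common labeling by matching every graph's nodes to a single reference graph with an exact minimum-cost assignment on node-attribute distances. This reference-based strategy is one of the simplest baselines, not one of the six algorithms proposed in the thesis; the graphs and attributes are toy data.

```python
import numpy as np
from itertools import permutations

def best_assignment(cost):
    """Exact minimum-cost assignment by brute force (fine for tiny graphs);
    cost[r, j] is the distance between reference node r and graph node j."""
    nodes = range(cost.shape[0])
    return min(permutations(nodes), key=lambda p: sum(cost[r, p[r]] for r in nodes))

def common_labeling(ref_attrs, graph_attrs_list):
    """Label every graph against one reference graph: labeling[j] = r means
    node j plays the role of reference node r."""
    labelings = []
    for attrs in graph_attrs_list:
        cost = np.linalg.norm(ref_attrs[:, None, :] - attrs[None, :, :], axis=2)
        p = best_assignment(cost)                # p[r] = node matched to ref r
        labeling = np.empty(len(p), dtype=int)
        for r, j in enumerate(p):
            labeling[j] = r
        labelings.append(labeling)
    return labelings

# Two graphs that are permuted, slightly noisy copies of the reference.
ref = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]])
perms = [np.array([2, 0, 4, 1, 3]), np.array([4, 3, 2, 1, 0])]
rng = np.random.default_rng(0)
graphs = [ref[p] + 0.01 * rng.normal(size=ref.shape) for p in perms]
labelings = common_labeling(ref, graphs)
```

The drawback of this baseline is exactly what motivates more principled algorithms: errors against a single arbitrary reference propagate to every graph, and pairwise optimal matchings need not be mutually consistent across the set.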