7 research outputs found

    Gravity and known size calibrate visual information to time parabolic trajectories

    Catching a ball in parabolic flight is a complex task in which the time and area of interception are strongly coupled, making interception possible only for a short period. Although this makes the estimation of time-to-contact (TTC) from visual information in parabolic trajectories very useful, previous attempts to explain our precision in interceptive tasks circumvent the need to estimate TTC to guide action. Obtaining TTC from optical variables alone in parabolic trajectories would imply very complex transformations from 2D retinal images to a 3D layout. Building on previous work, we propose, and show using simulations, that exploiting prior distributions of gravity and known physical size makes these transformations much simpler, enabling predictive capacities from minimal early visual information. Optical information is inherently ambiguous, so it is necessary to explain how these prior distributions generate predictions: prior information can help interpret and calibrate visual information to yield meaningful predictions of the remaining TTC. The objectives of this work are: (1) to describe the primary sources of information available to the observer in parabolic trajectories; (2) to show how prior information can be used to disambiguate the sources of visual information within a Bayesian encoding-decoding framework; (3) to show that such predictions can be robust against complex dynamic environments; and (4) to indicate future lines of research into the role of prior knowledge in calibrating visual information and prediction for action control.
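
    The simulation below is a minimal sketch of the core idea, not the authors' exact GS formulation: all numbers (ball size, launch speeds, sampling interval) are illustrative assumptions. It shows how two samples of purely optical variables (angular size and elevation angle), once calibrated by a known-size prior s and a gravity prior g, suffice to predict the remaining TTC of a parabolic trajectory.

```python
import numpy as np

# Illustrative assumptions, not the authors' exact model.
g = 9.81               # gravity prior (m/s^2)
s = 0.22               # known ball diameter prior (m)
x0, v_x = 20.0, -8.0   # initial horizontal distance and speed (m, m/s)
v_y0 = 9.0             # initial vertical speed (m/s)

def optics(t):
    """Optical variables available to an observer at the origin at time t."""
    x = x0 + v_x * t
    y = v_y0 * t - 0.5 * g * t**2
    d = np.hypot(x, y)            # line-of-sight distance (hidden from observer)
    theta = s / d                 # angular size (small-angle approximation)
    gamma = np.arctan2(y, x)      # elevation angle
    return theta, gamma

def predicted_ttc(t, dt=0.05):
    """Recover remaining TTC from two optical samples plus the g and s priors."""
    (th1, ga1), (th2, ga2) = optics(t), optics(t + dt)
    # Known size disambiguates distance: d = s / theta; then height y = d*sin(gamma).
    y1, y2 = (s / th1) * np.sin(ga1), (s / th2) * np.sin(ga2)
    v_y = (y2 - y1) / dt          # vertical speed estimate
    # Gravity prior closes the prediction: solve y2 + v_y*tau - 0.5*g*tau^2 = 0.
    return (v_y + np.sqrt(v_y**2 + 2.0 * g * y2)) / g

t_obs, dt = 0.4, 0.05
tau_hat = predicted_ttc(t_obs, dt)
tau_true = 2.0 * v_y0 / g - (t_obs + dt)   # true remaining flight time
print(f"predicted TTC: {tau_hat:.3f} s, true TTC: {tau_true:.3f} s")
```

    Without the size prior, the pair (theta, gamma) is consistent with infinitely many distance/size combinations; without the gravity prior, the vertical kinematics cannot be extrapolated. Together they collapse the 2D-to-3D ambiguity into a single prediction.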

    Orientation of tabular mafic intrusions controls convective vigour and crystallization style

    The microstructure in basaltic dykes is significantly different from that in sills and lava lakes of the same bulk composition. For a given width of intrusion (or depth of lava lake), vertical tabular bodies are coarser grained than horizontal bodies, with an invariant plagioclase shape across the intrusion. When comparing samples from sills and dykes with the same average grain size, the dyke samples contain both fewer small grains and fewer large grains than the sill samples. In contrast, the variation of median clinopyroxene-plagioclase-plagioclase dihedral angles in dykes correlates precisely with that observed in sills and is a function of the rate of diffusive heat loss. These patterns can be accounted for if the early stages of crystallization in dykes primarily involve the growth of isolated grains suspended in a well-mixed convecting magma, with the final stage (during which dihedral angles form) occurring in a crystal-rich static magma in which heat loss is primarily diffusive. In contrast, crystallization in sills occurs predominantly in marginal solidification fronts, suggesting that any convective motions are insufficient to entrain crystals from the marginal mushy layers and to keep them suspended while they grow. An exception to this general pattern is provided by members of the Mull Solitary Dykes, which propagated 100-1000 km SE from the Mull Palaeogene Igneous Centre, Scotland, through the shallow crust. These dykes, where sampled >100 km from Mull, have a microstructure indistinguishable from that of a sill of comparable thickness. We suggest that sufficient nucleation and crystallization occurred in these dykes to increase the viscosity enough to damp convection once unidirectional flow had ceased.

    Estimating motion and time to contact in 3D environments: Priors matter

    To date, an extensive amount of research has examined how humans estimate motion, or task parameters such as time-to-contact, in simple scenarios. However, most of it avoids the question of how we extract 3D information from 2D optic information. A Bayesian approach combining optic information with prior knowledge about statistical regularities of the environment would make it possible to resolve the ambiguity in translating 2D cues into 3D estimates. This dissertation analyses whether the estimation of motion and time-to-contact in complex 3D environments is compatible with a combination of visual and prior information.

    In the first study, we analyse the predictions of a Bayesian model with a preference for slow speeds for estimating the direction of an object (illustrated in the sketch after this abstract). The information available to judge motion in depth is much less precise than information about lateral motion, so when both sources are combined with a prior favouring low speeds, estimates of motion in depth are attracted proportionally more towards slow speeds than estimates of lateral motion. The perceived direction would therefore depend on stimulus speed. Our experimental results showed that the bias in perceived direction increased at higher speeds, congruent with increasingly less precise motion estimates (consistent with Weber's law).

    In the second study, we analyse the existing evidence on the use of prior knowledge of the Earth's gravitational acceleration and of object size in interaction with the surrounding environment, in particular for estimating time-to-contact in parabolic trajectories. We then simulate predictions of the GS model, which predicts time-to-contact from a combination of prior variables (gravity and ball size) and optic variables. Comparing the accuracy of its time-to-contact predictions with an alternative using optic variables alone shows that relying on gravity and ball-size priors resolves the ambiguity in the estimation of time-to-contact. Finally, we describe scenarios in which the GS model would lead to predictions with systematic errors, which we test in the following studies.

    In the third study, we created trajectories for which the GS model gives accurate predictions of time-to-contact at certain flight times but different systematic errors at any other time. We hypothesized that, if the ball's visibility were restricted to a short time window, participants would prefer to see the ball during the time windows in which the model's predictions are accurate. Our results showed that observers preferred a relatively constant ball viewing time. However, we found evidence that the direction of the errors participants made for the different trajectories tested corresponded to the direction predicted by the GS model.

    In the fourth and final study, we investigated the role of prior knowledge of the Earth's gravitational acceleration and of ball size in estimating flight time and in guiding an observer's movement towards the interception point. We placed participants in an environment in which both gravitational acceleration and ball size were randomized trial-to-trial. The observers' task was to move towards the interception point and predict the remaining flight time after a short occlusion.
    Our results provide evidence for the use of prior knowledge of gravity and ball size to estimate time-to-contact. We also find evidence that gravitational acceleration may play a role in guiding locomotion towards the interception point. In summary, this thesis contributes to answering a fundamental question in perception: how we interpret information in order to act in the world. To that end, we show evidence that humans apply their knowledge of regularities in the environment, in the form of prior knowledge of the Earth's gravitational acceleration, of ball size, or of the assumption that objects stand still in the world, when interpreting visual information.
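
    The sketch below illustrates the first study's slow-speed-prior account under assumed parameter values (the prior width, Weber fraction and depth-to-lateral noise ratio are illustrative, not fitted to the dissertation's data). Each velocity component is estimated by combining a Gaussian likelihood with a zero-mean Gaussian prior; because the depth component is noisier, it shrinks more, and the bias in perceived direction grows with speed.

```python
import numpy as np

# Illustrative assumptions, not fitted values.
sigma_prior = 2.0   # prior std dev (m/s): preference for slow speeds
weber = 0.15        # likelihood noise grows proportionally with speed (Weber's law)
k_depth = 4.0       # motion in depth is measured less precisely than lateral motion

def posterior_mean(v, sigma_like):
    """Posterior mean for Gaussian likelihood x zero-mean Gaussian prior."""
    w = sigma_prior**2 / (sigma_prior**2 + sigma_like**2)
    return w * v    # shrinkage towards zero (slow speeds)

for speed in (2.0, 4.0, 8.0):              # ball moving along a true 45-deg direction
    v_lat = v_dep = speed / np.sqrt(2)
    v_lat_hat = posterior_mean(v_lat, weber * speed)
    v_dep_hat = posterior_mean(v_dep, k_depth * weber * speed)
    direction = np.degrees(np.arctan2(v_dep_hat, v_lat_hat))
    print(f"speed {speed:4.1f} m/s -> perceived direction {direction:5.1f} deg "
          f"(true 45.0 deg)")
```

    Running this shows the perceived direction rotating further away from 45 degrees as speed increases, the qualitative pattern the first study reports.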

    Monocular depth estimation in images and sequences using occlusion cues

    When humans observe a scene, they are able to distinguish the different parts composing it. Moreover, humans can easily reconstruct the spatial position of these parts and conceive a consistent structure. The mechanisms involved in visual perception have been studied since the beginnings of neuroscience but, still today, not all of the processes composing it are known. In everyday situations, humans can make use of three different methods to estimate scene structure. The first is the so-called divergence, which makes use of both eyes: when objects lie in front of the observer at distances of up to about a hundred metres, subtle differences in the image formed in each eye can be used to determine depth. When objects are not in the field of view of both eyes, other mechanisms must be used. In these cases, both visual cues and previously learned information can be used to determine depth; even though these mechanisms are less accurate than divergence, humans can almost always infer the correct depth structure with them. As examples of visual cues, occlusion, perspective and object size provide a great deal of information about the structure of the scene. A priori information depends on each observer, but it is normally used subconsciously to detect commonly known regions such as the sky, the ground or different types of objects. In recent years, as technology became able to handle the processing burden of vision systems, considerable effort has been devoted to designing automated scene-interpretation systems. In this thesis we address the problem of depth estimation using a single point of view and only occlusion depth cues. The objective is to detect the occlusions present in the scene and combine them with a segmentation system so as to generate a relative depth-order map of the scene (a minimal sketch of this ordering step follows the abstract). We explore both static and dynamic situations: single images, individual frames within sequences, and full video sequences. For the case where a full image sequence is available, a system exploiting motion information to recover depth structure is also designed. Results are promising and competitive with respect to the state of the art, but there is still much room for improvement when compared with human depth perception.
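
    The following is an illustrative sketch of the ordering step described above, not the thesis's actual system: given segmentation region labels and pairwise occlusion cues of the form "A occludes B" (as might be inferred from T-junctions or motion occlusion boundaries), it assigns each region a relative depth rank via a topological sort. The region names and cue list are hypothetical.

```python
from collections import defaultdict, deque

def relative_depth_order(regions, occludes):
    """occludes: list of (near, far) pairs; returns {region: depth_rank}, 0 = nearest."""
    successors = defaultdict(list)
    indegree = {r: 0 for r in regions}
    for near, far in occludes:
        successors[near].append(far)
        indegree[far] += 1
    # Kahn's algorithm, layer by layer: peel off regions occluded by nothing left.
    queue = deque(r for r in regions if indegree[r] == 0)
    rank, depth = 0, {}
    while queue:
        for _ in range(len(queue)):       # all regions at the current depth layer
            r = queue.popleft()
            depth[r] = rank
            for far in successors[r]:
                indegree[far] -= 1
                if indegree[far] == 0:
                    queue.append(far)
        rank += 1
    # Regions left unranked belong to a cyclic (contradictory) set of occlusion
    # cues and would require a conflict-resolution step in a real system.
    return depth

# Toy scene: a person in front of a car, both in front of a building and the sky.
print(relative_depth_order(
    ["person", "car", "building", "sky"],
    [("person", "car"), ("car", "building"), ("building", "sky")]))
# -> {'person': 0, 'car': 1, 'building': 2, 'sky': 3}
```

    Occlusion alone yields only this ordinal structure, which is why the thesis targets relative depth-order maps rather than metric depth.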

    Autonomous real-time object detection and identification

    Sensor devices are regularly used on unmanned aerial vehicles (UAVs) as reconnaissance and intelligence-gathering systems and as support for front-line troops on operations. This platform provides a wealth of sensor data but has limited computational power available for processing. The objective of this work is to detect and identify objects in real time, with a low power footprint, so that the system can operate on a UAV. An appraisal of current computer vision methods is presented, with reference to their performance and applicability to the objectives. Experimentation with real-time methods of background subtraction and motion estimation was carried out, and the limitations of each method are described. A new, assumption-free, data-driven method for object detection and identification was developed. Its core ideas were based on models proposing that the human vision system analyses the edges of objects to detect and separate them, and perceives motion separately, a function modelled here by optical flow. The initial development, in the temporal domain, combined object and motion detection in the analysis process; this approach was found to have limitations. The second iteration used a detection component in the spatial domain that extracts texture patches based on edge contours, their profile, and their internal texture structure. Motion perception was performed separately on the texture patches using optical flow, and the motion and spatial location of the texture patches were used to define physical objects. A clustering method is applied to the rich feature set extracted by the detection method to characterise the objects. The results show that the method detects and identifies both moving and static objects, in real time, irrespective of camera motion.
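
    Below is a minimal sketch of the patch-plus-flow idea described above, an illustration rather than the thesis's actual pipeline: texture patches are taken from edge contours, dense optical flow is measured per patch, and patches whose motion departs from the dominant (camera-induced) motion are flagged as moving-object candidates. The thresholds and the median-flow camera-motion proxy are assumptions.

```python
import cv2
import numpy as np

def detect_moving_patches(prev_bgr, curr_bgr, min_area=100, thresh=2.0):
    """Return bounding boxes of edge-contour patches that move against camera motion."""
    prev = cv2.cvtColor(prev_bgr, cv2.COLOR_BGR2GRAY)
    curr = cv2.cvtColor(curr_bgr, cv2.COLOR_BGR2GRAY)

    # Texture patches from edge contours of the current frame.
    edges = cv2.Canny(curr, 100, 200)
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    boxes = [cv2.boundingRect(c) for c in contours
             if cv2.contourArea(c) >= min_area]

    # Dense optical flow between the two frames.
    flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    # Median flow over the whole frame as a crude proxy for camera motion.
    global_motion = np.median(flow.reshape(-1, 2), axis=0)

    moving = []
    for (x, y, w, h) in boxes:
        patch_motion = flow[y:y + h, x:x + w].reshape(-1, 2).mean(axis=0)
        # A patch moving relative to the dominant motion is a candidate object.
        if np.linalg.norm(patch_motion - global_motion) > thresh:
            moving.append((x, y, w, h))
    return moving
```

    Subtracting the dominant flow before thresholding is what makes a scheme like this workable on a moving platform, matching the abstract's claim of detection irrespective of camera motion.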

    Relative depth from monocular optical flow
