Modeling human intuitions about liquid flow with particle-based simulation
Humans can easily describe, imagine, and, crucially, predict a wide variety
of behaviors of liquids--splashing, squirting, gushing, sloshing, soaking,
dripping, draining, trickling, pooling, and pouring--despite tremendous
variability in their material and dynamical properties. Here we propose and
test a computational model of how people perceive and predict these liquid
dynamics, based on coarse approximate simulations of fluids as collections of
interacting particles. Our model is analogous to a "game engine in the head",
drawing on techniques for interactive simulations (as in video games) that
optimize for efficiency and natural appearance rather than physical accuracy.
In two behavioral experiments, we found that the model accurately captured
people's predictions about how liquids flow among complex solid obstacles, and
was significantly better than two alternatives based on simple heuristics and
deep neural networks. Our model was also able to explain how people's
predictions varied as a function of the liquids' properties (e.g., viscosity
and stickiness). Together, the model and empirical results extend the recent
proposal that human physical scene understanding for the dynamics of rigid,
solid objects can be supported by approximate probabilistic simulation, to the
more complex and unexplored domain of fluid dynamics.
Comment: Under review at PLOS Computational Biology
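As a rough illustration of the kind of coarse, game-engine-style particle simulation the abstract describes, the sketch below advances a small set of 2D "liquid" particles under gravity, with a crude pairwise repulsion standing in for pressure and a damped floor bounce standing in for solid obstacles. All names, constants, and the integration scheme are illustrative assumptions, not the authors' model:

```python
import numpy as np

def step(pos, vel, dt=0.02, gravity=-9.8, floor=0.0, damping=0.5, radius=0.05):
    """Advance a coarse particle 'liquid' by one time step.

    Particles fall under gravity, repel each other when closer than
    2*radius (a crude stand-in for pressure), and bounce off the floor.
    Tuned for plausible-looking motion, not physical accuracy.
    """
    # Pairwise repulsion: approximate incompressibility very coarsely.
    diff = pos[:, None, :] - pos[None, :, :]          # (N, N, 2) offsets
    dist = np.linalg.norm(diff, axis=-1) + 1e-9       # avoid divide-by-zero
    overlap = np.clip(2 * radius - dist, 0.0, None)   # nonzero only nearby
    np.fill_diagonal(overlap, 0.0)                    # ignore self-pairs
    push = (diff / dist[..., None] * overlap[..., None]).sum(axis=1)

    vel = vel + np.array([0.0, gravity]) * dt + push * 2.0
    pos = pos + vel * dt

    # Floor collision: clamp and damp, as an interactive engine would.
    below = pos[:, 1] < floor
    pos[below, 1] = floor
    vel[below, 1] *= -damping
    return pos, vel

# Drop a small block of particles and let it slosh onto the floor.
rng = np.random.default_rng(0)
pos = rng.uniform([0.0, 0.5], [0.2, 0.7], size=(50, 2))
vel = np.zeros_like(pos)
for _ in range(100):
    pos, vel = step(pos, vel)
```

Richer behaviors (viscosity, stickiness) would enter as extra per-pair forces in the same loop, which is how such models can vary predictions with material properties.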
Envisioning the qualitative effects of robot manipulation actions using simulation-based projections
Autonomous robots that are to perform complex everyday tasks such as making pancakes have to understand how the effects of an action depend on the way the action is executed. Within Artificial Intelligence, classical planning reasons about whether actions are executable, but makes the assumption that the actions will succeed (with some probability). In this work, we have designed, implemented, and analyzed a framework that allows us to envision the physical effects of robot manipulation actions. We consider envisioning to be a qualitative reasoning method that reasons about actions and their effects based on simulation-based projections. It thereby allows a robot to infer what could happen when it performs a task in a certain way. This is achieved by translating a qualitative physics problem into a parameterized simulation problem; performing a detailed physics-based simulation of a robot plan; logging the state evolution into appropriate data structures; and then translating these sub-symbolic data structures into interval-based, first-order, symbolic, qualitative representations, called timelines. The result of the envisioning is a set of detailed narratives represented by timelines, which are then used to infer answers to qualitative reasoning problems. By envisioning the outcome of actions before committing to them, a robot is able to reason about physical phenomena and can therefore prevent itself from ending up in unwanted situations. Using this approach, robots can perform manipulation tasks more efficiently, robustly, and flexibly, and they can even successfully accomplish previously unknown variations of tasks.
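The logging-to-timelines step described above (collapsing a sub-symbolic state log into interval-based qualitative representations) can be illustrated with a minimal sketch. The `to_timeline` helper, the threshold, and the qualitative symbols are hypothetical, chosen only to show the idea:

```python
def to_timeline(times, values, classify):
    """Collapse a logged state trajectory into an interval-based timeline.

    `classify` maps a raw simulator value to a qualitative symbol; runs of
    the same symbol become one (start, end, symbol) interval.
    """
    timeline = []
    start = times[0]
    current = classify(values[0])
    for t, v in zip(times[1:], values[1:]):
        symbol = classify(v)
        if symbol != current:
            timeline.append((start, t, current))   # close the previous interval
            start, current = t, symbol
    timeline.append((start, times[-1], current))   # close the final interval
    return timeline

# Hypothetical log: height of a simulated object over time.
times  = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
height = [0.0, 0.0, 0.3, 0.6, 0.0, 0.0]
qualitative = to_timeline(times, height,
                          lambda h: "airborne" if h > 0.05 else "supported")
# qualitative == [(0.0, 0.2, 'supported'), (0.2, 0.4, 'airborne'),
#                 (0.4, 0.5, 'supported')]
```

Qualitative queries ("was the object ever airborne?", "for how long?") then reduce to simple scans over these intervals rather than over raw simulator frames.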
Benchmarking computational fluid dynamics models of lava flow simulation for hazard assessment, forecasting, and risk management
Numerical simulations of lava flow emplacement are valuable for assessing lava flow hazards, forecasting active flows, designing flow mitigation measures, interpreting past eruptions, and understanding the controls on lava flow behavior. Existing lava flow models vary in simplifying assumptions, physics, dimensionality, and the degree to which they have been validated against analytical solutions, experiments, and natural observations. In order to assess existing models and guide the development of new codes, we conduct a benchmarking study of computational fluid dynamics (CFD) models for lava flow emplacement, including VolcFlow, OpenFOAM, FLOW-3D, COMSOL, and MOLASSES. We model viscous, cooling, and solidifying flows over horizontal planes, sloping surfaces, and into topographic obstacles. We compare model results to physical observations made during well-controlled analogue and molten basalt experiments, and to analytical theory when available. Overall, the models accurately simulate viscous flow with some variability in flow thickness where flows intersect obstacles. OpenFOAM, COMSOL, and FLOW-3D can each reproduce experimental measurements of cooling viscous flows, and OpenFOAM and FLOW-3D simulations with temperature-dependent rheology match results from molten basalt experiments. We assess the goodness-of-fit of the simulation results and the computational cost. Our results guide the selection of numerical simulation codes for different applications, including inferring emplacement conditions of past lava flows, modeling the temporal evolution of ongoing flows during eruption, and probabilistic assessment of lava flow hazard prior to eruption. Finally, we outline potential experiments and desired key observational data from future flows that would extend existing benchmarking data sets.
MULTIMODAL LEARNING FOR AUDIO AND VISUAL PROCESSING
The world contains vast amounts of information which can be sensed and captured in a variety of ways and formats. Virtual environments also lend themselves to endless possibilities and diversity of data. Often our experiences draw from these separate but complementary parts, which can be combined in a way that provides a comprehensive representation of events. Multimodal learning focuses on these types of combinations. By fusing multiple modalities, multimodal learning can improve results beyond individual-mode performance. However, many of today’s state-of-the-art techniques in computer vision, robotics, and machine learning rely solely or primarily on visual inputs, even when the visual data is obtained from video where corresponding audio may also be readily available to augment learning. Vision-only approaches can struggle with highly reflective, transparent, or occluded objects and scenes, where audio, used alone or in conjunction with vision, may improve task performance. To address these challenges, this thesis explores coupling multimodal information to enhance task performance through learning-based methods for audio and visual processing using real and synthetic data. Physically-based graphics pipelines can naturally be extended for audio and visual synthetic data generation. To enhance the rigid body sound synthesis pipeline for objects containing a liquid, I used an added mass operator for fluid-structure coupling as a pre-processing step. My method is fast and practical for use in interactive 3D systems where live sound synthesis is desired. By fusing audio and visual data from real and synthetic videos, we also demonstrate enhanced processing and performance for object classification, tracking, and reconstruction tasks. As has been shown in visual question answering and other related work, multiple modalities have the ability to complement one another and outperform single-modality systems.
To the best of my knowledge, I introduced the first use of audio-visual neural networks to analyze liquid pouring sequences by classifying their weight, liquid, and receiving container. Prior work often required predefined source weights or visual data. My contribution was to use the sound from a pouring sequence (a liquid being poured into a target container) to train a multimodal convolutional neural network (CNN) that fuses mel-scaled spectrograms as audio inputs with corresponding visual data based on video images. I described the first use of an audio-visual neural network for tracking tabletop-sized objects and enhancing visual object trackers. Like object detectors faced with reflective surfaces, object trackers can run into challenges when objects collide, occlude, appear similar, or come close to one another. By using the impact sounds of the objects during collision, my audio-visual object tracking (AVOT) neural network can correct trackers that drift from the objects they were assigned before collision. Reflective and textureless surfaces are not only difficult to detect and classify, they are also often poorly reconstructed, filled with depth discontinuities and holes. I proposed the first use of an audio-visual method that uses the reflections of sound to aid in geometry and audio reconstruction, referred to as "Echoreconstruction". The mobile phone prototype emits pulsed audio while recording video for RGB-based 3D reconstruction and audio-visual classification. Reflected sound and images from the video are input into our audio (EchoCNN-A) and audio-visual (EchoCNN-AV) convolutional neural networks for surface and sound source detection, depth estimation, and material classification. EchoCNN inferences from these classifications enhance 3D reconstructions of scenes containing open spaces and reflective surfaces by depth filtering, inpainting, and placement of unmixed sound sources in the scene.
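For concreteness, the mel-scaled spectrogram inputs mentioned above can be computed from a waveform roughly as follows. The FFT size, hop length, sample rate, and mel-band count are illustrative assumptions; the thesis's actual preprocessing parameters are not given here:

```python
import numpy as np

def hz_to_mel(f):
    """HTK-style mel scale: compresses high frequencies like human hearing."""
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels=40, n_fft=512, sr=16000):
    """Triangular filters mapping an FFT magnitude spectrum to mel bands."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):                 # rising edge
            fb[i, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):                # falling edge
            fb[i, k] = (right - k) / max(right - center, 1)
    return fb

def mel_spectrogram(audio, n_fft=512, hop=256, sr=16000, n_mels=40):
    """Log mel spectrogram of a mono waveform, frame by frame."""
    fb = mel_filterbank(n_mels, n_fft, sr)
    frames = []
    for start in range(0, len(audio) - n_fft + 1, hop):
        frame = audio[start:start + n_fft] * np.hanning(n_fft)
        mag = np.abs(np.fft.rfft(frame))
        frames.append(np.log(fb @ mag + 1e-6))        # log compression
    return np.array(frames).T                          # (n_mels, n_frames)

# One second of a hypothetical pouring-like noise signal.
rng = np.random.default_rng(0)
spec = mel_spectrogram(rng.standard_normal(16000))
```

The resulting 2D array can be treated as a single-channel image, which is what lets a CNN branch consume audio alongside a branch consuming video frames before fusion.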
In addition to enhancing scene reconstructions, I proposed a multimodal single- and multi-frame LSTM autoencoder for 3D reconstruction using audio-visual inputs. Our neural network produces high-quality 3D reconstructions using a voxel representation. It is the first audio-visual reconstruction neural network for 3D geometry and material representation. Contributions of this thesis include new neural network designs, new enhancements to real and synthetic audio-visual datasets, and prototypes that demonstrate audio and audio-augmented performance for sound synthesis, inference, and reconstruction.
Doctor of Philosophy
Stochastic particle advection velocimetry (SPAV): theory, simulations, and proof-of-concept experiments
Particle tracking velocimetry (PTV) is widely used to measure time-resolved,
three-dimensional velocity and pressure fields in fluid dynamics research.
Inaccurate localization and tracking of particles is a key source of error in
PTV, especially for single camera defocusing, plenoptic imaging, and digital
in-line holography (DIH) sensors. To address this issue, we developed
stochastic particle advection velocimetry (SPAV): a statistical data loss that
improves the accuracy of PTV. SPAV is based on an explicit particle advection
model that predicts particle positions over time as a function of the estimated
velocity field. The model can account for non-ideal effects like drag on
inertial particles. A statistical data loss that compares the tracked and
advected particle positions, accounting for arbitrary localization and tracking
uncertainties, is derived and approximated. We implement our approach using a
physics-informed neural network, which simultaneously minimizes the SPAV data
loss, a Navier-Stokes physics loss, and a wall boundary loss, where
appropriate. Results are reported for simulated and experimental DIH-PTV
measurements of laminar and turbulent flows. Our statistical approach
significantly improves the accuracy of PTV reconstructions compared to a
conventional data loss, resulting in an average reduction of error close to
50%. Furthermore, our framework can be readily adapted to work with other data
assimilation techniques like state observers, Kalman filters, and
adjoint-variational methods.
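A minimal sketch of the SPAV idea, comparing tracked particle positions to positions advected through an estimated velocity field, might look like the following. The forward-Euler integrator, the isotropic uncertainty `sigma`, and all names are simplifying assumptions; the paper couples this loss to a physics-informed neural network rather than evaluating it in isolation:

```python
import numpy as np

def advect(x0, velocity, t0, t1, n_steps=10):
    """Forward-Euler particle advection through an estimated velocity field."""
    x = np.array(x0, float)
    dt = (t1 - t0) / n_steps
    for k in range(n_steps):
        x = x + velocity(x, t0 + k * dt) * dt
    return x

def spav_loss(tracks, velocity, sigma=1e-3):
    """Statistical advection loss: residual between tracked positions at t1
    and positions advected from t0, weighted by a (here isotropic Gaussian)
    localization uncertainty sigma.

    tracks: list of (t0, x0, t1, x1) observations, one per particle.
    """
    residuals = [advect(x0, velocity, t0, t1) - np.asarray(x1)
                 for t0, x0, t1, x1 in tracks]
    r = np.stack(residuals)
    return float(np.mean(np.sum(r**2, axis=1)) / sigma**2)

# Toy uniform flow u = (1, 0): advected and tracked positions agree,
# so the loss is (numerically) zero; a wrong field gives a positive loss.
uniform = lambda x, t: np.array([1.0, 0.0])
tracks = [(0.0, [0.0, 0.0], 0.1, [0.1, 0.0]),
          (0.0, [0.5, 0.2], 0.1, [0.6, 0.2])]
```

In the full method this scalar would be minimized jointly with Navier-Stokes and boundary losses over the parameters of the `velocity` model; per-track covariances would replace the single `sigma`.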
Grain refinement and nucleation processes in Aluminium alloys through liquid shearing
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The industrial practice of grain refinement of aluminium alloys involves the addition of inoculant particles to initiate alpha-aluminium grains at small undercoolings. This results in a uniformly fine, equiaxed as-cast microstructure and is commonly achieved using Al-Ti-B additions. The phase responsible for the initiation of grains in aluminium melts inoculated with Al-Ti-B was determined during the 1990s; since that time, scientific understanding of grain refinement has advanced rapidly. However, one of the main problems with inoculant additions is the impurities they introduce into the melt, which may affect the desired characteristics of the product. Given this problem, other methods of refinement and their refining mechanisms have not been fully understood, and the prediction of as-cast microstructures in aluminium alloys has much scope for improvement. In this thesis:
1- Factors in establishing an equiaxed microstructure were analysed and the origin of equiaxed grains was explored. The nucleation process and the mechanisms involved were then investigated in depth, and control of the nucleation process to achieve a fine and uniform structure was set as the target.
2- Refinement of the microstructure by the introduction of shearing was evaluated, and the process of refinement in the mushy zone (semisolid state) was established as a baseline. The introduction of shearing above the liquidus was then analysed as a development, and outstanding refinement was observed with shearing above the liquidus, which had not been properly investigated elsewhere.
3- The mechanisms of refinement introduced by shearing were investigated, and the refining mechanisms below and, specifically, above the liquidus were examined systematically. As a result, an appropriate understanding of the mechanisms of nucleation and refinement above the liquidus was established.
4- Finally, through simulation, the most dominant factor in achieving a fine grain size by applying shear was identified, and the experimental results were verified by simulation.
UK Department of Trade and Industry (DTI)
Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense
Recent progress in deep learning is essentially based on a "big data for
small tasks" paradigm, under which massive amounts of data are used to train a
classifier for a single narrow task. In this paper, we call for a shift that
flips this paradigm upside down. Specifically, we propose a "small data for big
tasks" paradigm, wherein a single artificial intelligence (AI) system is
challenged to develop "common sense", enabling it to solve a wide range of
tasks with little training data. We illustrate the potential power of this new
paradigm by reviewing models of common sense that synthesize recent
breakthroughs in both machine and human vision. We identify functionality,
physics, intent, causality, and utility (FPICU) as the five core domains of
cognitive AI with humanlike common sense. When taken as a unified concept,
FPICU is concerned with the questions of "why" and "how", beyond the dominant
"what" and "where" framework for understanding vision. These factors are invisible in
terms of pixels but nevertheless drive the creation, maintenance, and
development of visual scenes. We therefore coin them the "dark matter" of
vision. Just as our universe cannot be understood by merely studying observable
matter, we argue that vision cannot be understood without studying FPICU. We
demonstrate the power of this perspective to develop cognitive AI systems with
humanlike common sense by showing how to observe and apply FPICU with little
training data to solve a wide range of challenging tasks, including tool use,
planning, utility inference, and social learning. In summary, we argue that the
next generation of AI must embrace "dark" humanlike common sense for solving
novel tasks.
Comment: For high-quality figures, please refer to http://wellyzhang.github.io/attach/dark.pd