622 research outputs found
A low cost reconfigurable soft processor for multimedia applications: design synthesis and programming model
This paper presents an FPGA implementation of a low-cost 8-bit reconfigurable processor core for media processing applications. The core is optimized to provide all basic arithmetic and logic functions required by media processing and other domains, and to be easily integrated into a 2D array. The paper investigates the feasibility of the core as a soft processing architecture for FPGA platforms. The core was synthesized on the entire Virtex FPGA family to evaluate its overall performance, scalability and portability. A special feature of the proposed architecture is its simple programming model, which allows low-level programming. Throughput results are presented for popular benchmarks coded using the programming model and a cycle-accurate simulator.
ChainQueen: A Real-Time Differentiable Physical Simulator for Soft Robotics
Physical simulators have been widely used in robot planning and control.
Among them, differentiable simulators are particularly favored, as they can be
incorporated into gradient-based optimization algorithms that are efficient in
solving inverse problems such as optimal control and motion planning.
Simulating deformable objects is, however, more challenging compared to rigid
body dynamics. The underlying physical laws of deformable objects are more
complex, and the resulting systems have orders of magnitude more degrees of
freedom and therefore they are significantly more computationally expensive to
simulate. Computing gradients with respect to physical design or controller
parameters is typically even more computationally challenging. In this paper,
we propose a real-time, differentiable hybrid Lagrangian-Eulerian physical
simulator for deformable objects, ChainQueen, based on the Moving Least Squares
Material Point Method (MLS-MPM). MLS-MPM can simulate deformable objects
including contact and can be seamlessly incorporated into inference, control
and co-design systems. We demonstrate that our simulator achieves high
precision in both forward simulation and backward gradient computation. We have
successfully employed it in a diverse set of control tasks for soft robots,
including problems with nearly 3,000 decision variables. Comment: In submission to ICRA 2019. Supplemental Video:
https://www.youtube.com/watch?v=4IWD4iGIsB4 Project Page:
https://github.com/yuanming-hu/ChainQuee
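The idea of feeding simulator gradients into a gradient-based optimizer can be illustrated with a minimal sketch. It is not MLS-MPM and not ChainQueen code: it assumes a toy 1D point mass of unit mass driven by one force value per time step, and every name in it (`simulate`, `loss_and_grad`, `optimize`) is hypothetical. The backward pass is written by hand to mirror the forward loop in reverse.

```python
def simulate(controls, dt=0.1):
    """Forward pass: unit point mass starting at rest, one force value per step."""
    x, v = 0.0, 0.0
    for u in controls:
        v += dt * u   # velocity update from the control force
        x += dt * v   # position update from the new velocity
    return x


def loss_and_grad(controls, target, dt=0.1):
    """Squared distance to the target and its gradient w.r.t. every control."""
    x_final = simulate(controls, dt)
    loss = (x_final - target) ** 2
    grad_x = 2.0 * (x_final - target)  # d(loss)/d(x_final)
    grad_v = 0.0
    grads = [0.0] * len(controls)
    # Walk the time steps backwards, reversing each forward update.
    for t in reversed(range(len(controls))):
        grad_v += dt * grad_x   # adjoint of: x += dt * v
        grads[t] = dt * grad_v  # adjoint of: v += dt * u
    return loss, grads


def optimize(target=1.0, n_steps=20, lr=1.0, iters=200):
    """Plain gradient descent on the control sequence (assumed hyperparameters)."""
    controls = [0.0] * n_steps
    for _ in range(iters):
        _, grads = loss_and_grad(controls, target)
        controls = [u - lr * g for u, g in zip(controls, grads)]
    return controls
```

Calling `simulate(optimize())` should return a final position close to the target of 1.0. A real deformable-body simulator has far more state per step, but the structure is the same: a forward loop and a reverse loop that propagates gradients back to the decision variables.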
ComPASS: a tool for distributed parallel finite volume discretizations on general unstructured polyhedral meshes
The objective of the ComPASS project is to develop a parallel multiphase Darcy flow simulator adapted to general unstructured polyhedral meshes (in a general sense, with possibly non-planar faces) and to the parallelization of advanced finite volume discretizations with various choices of the degrees of freedom, such as cell centres, vertices, or face centres. The main targeted applications are the simulation of CO2 geological storage, nuclear waste repositories and reservoirs. The CEMRACS 2012 summer school, devoted to high performance computing, was an ideal framework to start this collaborative project. This paper describes what was achieved during the four weeks of the CEMRACS project, which focused on implementing basic features of the code such as the distributed unstructured polyhedral mesh, the synchronization of the degrees of freedom, and the connection to scientific libraries including the partitioner METIS, the visualization tool PARAVIEW, and the parallel linear solver library PETSc. The parallel efficiency of this first version of the ComPASS code has been validated on a toy parabolic problem using the Vertex Approximate Gradient finite volume spatial discretization with both cell and vertex degrees of freedom, combined with an implicit Euler time integration.
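To make "a toy parabolic problem with implicit Euler time integration" concrete, here is a minimal serial sketch. It assumes the 1D heat equation u_t = kappa * u_xx on a uniform grid with zero-flux boundaries and a simple two-point-flux, cell-centred finite volume scheme, which is much simpler than the Vertex Approximate Gradient scheme named above. The function name and parameters are hypothetical, and the tridiagonal system is solved with the Thomas algorithm rather than PETSc.

```python
def implicit_euler_heat_step(u, dt, dx, kappa=1.0):
    """One backward-Euler step of u_t = kappa * u_xx with zero-flux boundaries."""
    n = len(u)
    if n < 2:
        raise ValueError("need at least two cells")
    r = kappa * dt / dx ** 2
    # Tridiagonal system (I - dt * L) u_new = u_old.
    off = -r                        # coupling to each neighbouring cell
    diag = [1.0 + 2.0 * r] * n
    diag[0] = diag[-1] = 1.0 + r    # boundary cells have a single neighbour
    # Thomas algorithm: forward elimination ...
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag[0]
    d[0] = u[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * c[i - 1]
        c[i] = off / denom
        d[i] = (u[i] - off * d[i - 1]) / denom
    # ... followed by back substitution.
    new = [0.0] * n
    new[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        new[i] = d[i] - c[i] * new[i + 1]
    return new


# A unit spike diffuses; the step stays stable even for a large dt.
profile = implicit_euler_heat_step([0.0, 0.0, 1.0, 0.0, 0.0], dt=10.0, dx=1.0)
```

Because the boundaries carry no flux, the sum of the cell values is preserved by each step, which gives a cheap sanity check for an implementation like this one.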
Application of streamline simulation to gas displacement processes
Performance evaluation of miscible and near-miscible gas injection processes is available through conventional finite difference (FD) compositional simulation. Streamline methods have also been developed in which fluid is transported along the streamlines instead of using the finite difference grid. In streamline-based simulation, a 3D flow problem is decoupled into a set of 1D problems solved along streamlines. This reduces simulation time relative to FD simulation, and suppresses the numerical dispersion errors that are present in FD simulations. Larger time steps and higher spatial resolution can be achieved in these simulations. Thus, streamline-based reservoir simulation can be orders of magnitude faster than the conventional finite difference methods. Streamline methods are traditionally only applied to incompressible flow processes. In this paper, the method is adopted and assessed for application to compressible flow processes. A detailed comparison is given between the results of conventional FD simulation and the streamline approach for gas displacement processes. Finally, some guidelines are given on how the streamline method can potentially be used to good effect for gas displacement processes
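The decoupling into 1D problems can be sketched for a single streamline. The example below assumes an incompressible tracer whose concentration c obeys dc/dt + dc/dtau = 0 along the time-of-flight coordinate tau, and it uses a first-order explicit upwind update. The function name, the grid and the scheme are illustrative assumptions, not the method of the paper, and the sketch does not address the compressible case the paper studies.

```python
def advect_along_streamline(c, inflow, dtau, dt, n_steps):
    """First-order upwind update of dc/dt + dc/dtau = 0 on one streamline."""
    cfl = dt / dtau
    if not 0.0 < cfl <= 1.0:
        raise ValueError("explicit upwind needs 0 < dt/dtau <= 1")
    c = list(c)
    for _ in range(n_steps):
        # Each cell looks at its upstream neighbour; the first cell sees the injector.
        upstream = [inflow] + c[:-1]
        c = [ci - cfl * (ci - ui) for ci, ui in zip(c, upstream)]
    return c


# Inject concentration 1.0 into an initially empty streamline of 10 cells.
profile = advect_along_streamline([0.0] * 10, inflow=1.0, dtau=0.1, dt=0.1, n_steps=4)
```

With `dt == dtau` the front moves exactly one cell per step, so after four steps the first four cells are filled. For smaller ratios this simple scheme smears the front, which is the kind of numerical dispersion that a 1D solve along time of flight makes easier to control than a 3D grid-based solve.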
A time-predictable many-core processor design for critical real-time embedded systems
Critical Real-Time Embedded Systems (CRTES) are in charge of controlling fundamental parts of embedded systems, e.g. energy-harvesting solar panels in satellites, steering and braking in cars, or flight management systems in airplanes. To do so, CRTES require strong evidence of correct functional and timing behavior. The former guarantees that the system operates correctly in response to its inputs; the latter ensures that its operations are performed within a predefined time budget.
CRTES aim at increasing the number and complexity of functions. Examples include the incorporation of "smarter" Advanced Driver Assistance System (ADAS) functionality in modern cars or advanced collision avoidance systems in Unmanned Aerial Vehicles (UAVs). All these new features, implemented in software, lead to an exponential growth in both performance requirements and software development complexity. Furthermore, there is a strong need to integrate multiple functions into the same computing platform to reduce the number of processing units, mass and space requirements, etc. Overall, there is a clear need to increase the computing power of current CRTES in order to support new sophisticated and complex functionality, and to integrate multiple systems into a single platform.
The use of multi- and many-core processor architectures is increasingly seen in the CRTES industry as the solution to cope with the performance demand and cost constraints of future CRTES. Many-cores supply higher performance by exploiting the parallelism of applications while providing better performance per watt, as their cores are kept simpler than those of complex single-core processors. Moreover, the parallelization capabilities allow scheduling multiple functions on the same processor, maximizing hardware utilization.
However, the use of multi- and many-cores in CRTES also brings a number of challenges related to providing evidence of the correct operation of the system, especially in the timing domain. Hence, despite the advantages of many-cores and the fact that they are nowadays a reality in the embedded domain (e.g. Kalray MPPA, Freescale NXP P4080, TI Keystone II), their use in CRTES still requires finding efficient ways of providing reliable evidence of the correct operation of the system.
This thesis investigates the use of many-core processors in CRTES as a means to satisfy the performance demands of future complex applications while providing the necessary timing guarantees. To do so, it advances the state of the art in exploiting the parallel capabilities of many-cores in CRTES, contributing in two different computing domains. In the hardware domain, this thesis proposes new many-core designs that enable deriving reliable and tight timing guarantees. In the software domain, it presents efficient scheduling and timing analysis techniques to exploit the parallelization capabilities of many-core architectures and to derive tight and trustworthy Worst-Case Execution Time (WCET) estimates of CRTES.
Parallel simulation of coupled flow and geomechanics in porous media
In this research we consider developing a reservoir simulator capable of simulating complex coupled poromechanical processes on massively parallel computers. A variety of problems arising from petroleum and environmental engineering inherently necessitate the understanding of interactions between fluid flow and solid mechanics. Examples in petroleum engineering include reservoir compaction, wellbore collapse, sand production, and hydraulic fracturing. In environmental engineering, surface subsidence, carbon sequestration, and waste disposal are also coupled poromechanical processes. These economically and environmentally important problems motivate the active pursuit of robust, efficient, and accurate simulation tools for coupled poromechanical problems. Three coupling approaches are currently employed in the reservoir simulation community to solve the poromechanics system, namely, the fully implicit coupling (FIM), the explicit coupling, and the iterative coupling. The choice of the coupling scheme significantly affects the efficiency of the simulator and the accuracy of the solution. We adopt the fixed-stress iterative coupling scheme to solve the coupled system due to its advantages over the other two. Unlike the explicit coupling, the fixed-stress split has been theoretically proven to converge to the FIM for the linear poroelasticity model. In addition, it is more efficient and easier to implement than the FIM. Our computational results indicate that this approach is also valid for multiphase flow. We discretize the quasi-static linear elasticity model for geomechanics in space using the continuous Galerkin (CG) finite element method (FEM) on general hexahedral grids. 
Fluid flow models are discretized by locally mass conservative schemes, specifically, the mixed finite element method (MFE) for the equation of state compositional flow on Cartesian grids and the multipoint flux mixed finite element method (MFMFE) for the single phase and two-phase flows on general hexahedral grids. While both the MFE and the MFMFE generate cell-centered stencils for pressure, the MFMFE has advantages in handling full tensor permeabilities and general geometry and boundary conditions. The MFMFE also obtains accurate fluxes at cell interfaces. These characteristics enable the simulation of more practical problems. For many reservoir simulation applications, for instance, the carbon sequestration simulation, we need to account for thermal effects on the compositional flow phase behavior and the solid structure stress evolution. We explicitly couple the poromechanics equations to a simplified energy conservation equation. A time-split scheme is used to solve heat convection and conduction successively. For the convection equation, a higher order Godunov method is employed to capture the sharp temperature front; for the conduction equation, the MFE is utilized. Simulations of coupled poromechanical or thermoporomechanical processes in field scales with high resolution usually require parallel computing capabilities. The flow models, the geomechanics model, and the thermodynamics model are modularized in the Integrated Parallel Accurate Reservoir Simulator (IPARS) which has been developed at the Center for Subsurface Modeling at the University of Texas at Austin. The IPARS framework handles structured (logically rectangular) grids and was originally designed for element-based data communication, such as the pressure data in the flow models. To parallelize the node-based geomechanics model, we enhance the capabilities of the IPARS framework for node-based data communication. 
Because the geomechanics linear system is more costly to solve than those of flow and thermodynamics models, the performance of linear solvers for the geomechanics model largely dictates the speed and scalability of the coupled simulator. We use the generalized minimal residual (GMRES) solver with the BoomerAMG preconditioner from the hypre library and the geometric multigrid (GMG) solver from the UG4 software toolbox to solve the geomechanics linear system. Additionally, the multilevel k-way mesh partitioning algorithm from METIS is used to generate high quality mesh partitioning to improve solver performance. Numerical examples of coupled poromechanics and thermoporomechanics simulations are presented to show the capabilities of the coupled simulator in solving practical problems accurately and efficiently. These examples include a real carbon sequestration field case with stress-dependent permeability, a synthetic thermoporoelastic reservoir simulation, poroelasticity simulations on highly distorted hexahedral grids, and parallel scalability tests on a massively parallel computer. Engineering Mechanic
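The difference between solving the coupled system at once and iterating between sub-solvers can be shown on a toy problem. The sketch below is a generic sequential (block Gauss-Seidel) iteration, not the fixed-stress split itself, which adds a stabilization term to the flow equation. It assumes a hypothetical linear system with one pressure unknown p and one displacement unknown u; all names and coefficients are invented for illustration and none come from IPARS.

```python
def solve_monolithic(a11, a12, a21, a22, b1, b2):
    """Solve both equations at once (the toy analogue of fully implicit coupling)."""
    det = a11 * a22 - a12 * a21
    p = (b1 * a22 - a12 * b2) / det
    u = (a11 * b2 - a21 * b1) / det
    return p, u


def solve_iteratively(a11, a12, a21, a22, b1, b2, tol=1e-12, max_iters=100):
    """Alternate between a flow solve and a mechanics solve until both settle."""
    p, u = 0.0, 0.0
    for k in range(1, max_iters + 1):
        p_new = (b1 - a12 * u) / a11       # flow solve with displacement frozen
        u_new = (b2 - a21 * p_new) / a22   # mechanics solve with the new pressure
        change = max(abs(p_new - p), abs(u_new - u))
        p, u = p_new, u_new
        if change < tol:
            return p, u, k
    raise RuntimeError("coupling iteration did not converge")


# Weakly coupled toy system: the iteration contracts by a12*a21/(a11*a22) = 1/12 per sweep.
p_ref, u_ref = solve_monolithic(4.0, 1.0, 1.0, 3.0, 1.0, 2.0)
p_it, u_it, sweeps = solve_iteratively(4.0, 1.0, 1.0, 3.0, 1.0, 2.0)
```

On this system the iterative result agrees with the monolithic one to within the tolerance. The plain iteration only converges when the coupling is weak relative to the diagonal terms, which is one reason stabilized splits such as fixed-stress are preferred in practice.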