
    Symmetry in Applied Mathematics

    Applied mathematics and symmetry work together as a powerful tool for problem reduction and solving. We present applications in probability theory and statistics (A Test Detecting the Outliers for Continuous Distributions Based on the Cumulative Distribution Function of the Data Being Tested, The Asymmetric Alpha-Power Skew-t Distribution), fractals, geometry and the like (Khovanov Homology of Three-Strand Braid Links, Volume Preserving Maps Between p-Balls, Generation of Julia and Mandelbrot Sets via Fixed Points), supersymmetry in physics, nanostructures in chemistry, taxonomy in biology and the like (A Continuous Coordinate System for the Plane by Triangular Symmetry, One-Dimensional Optimal System for 2D Rotating Ideal Gas, Minimal Energy Configurations of Finite Molecular Arrays, Noether-Like Operators and First Integrals for Generalized Systems of Lane-Emden Equations), algorithms, programs and software analysis (Algorithm for Neutrosophic Soft Sets in Stochastic Multi-Criteria Group Decision Making Based on Prospect Theory, On a Reduced Cost Higher Order Traub-Steffensen-Like Method for Nonlinear Systems, On a Class of Optimal Fourth Order Multiple Root Solvers without Using Derivatives), and specific subjects (Facility Location Problem Approach for Distributed Drones, Parametric Jensen-Shannon Statistical Complexity and Its Applications on Full-Scale Compartment Fire Data). Diverse topics are thus combined to map out the mathematical core of practical problems

    Optimización del rendimiento y la eficiencia energética en sistemas masivamente paralelos

    Heterogeneous systems are becoming increasingly relevant, due to their performance and energy efficiency capabilities, being present in all types of computing platforms, from embedded devices and servers to HPC nodes in large data centers. Their complexity means that they are usually used under the task paradigm and the host-device programming model. This strongly penalizes accelerator utilization and system energy consumption, as well as making it difficult to adapt applications. Co-execution allows all devices to cooperate in computing the same problem, consuming less time and energy. However, programmers must handle all device management, workload distribution and code portability between systems, which significantly complicates their programming. This thesis offers contributions to improve performance and energy efficiency in these massively parallel systems. The proposals address generally conflicting objectives: usability and programmability are improved, while ensuring greater system abstraction and extensibility, and at the same time performance, scalability and energy efficiency are increased. To achieve this, two runtime systems with completely different approaches are proposed.
EngineCL, focused on OpenCL and with a high-level API, provides an extensible modular system and favors maximum compatibility between all types of devices. Its versatility allows it to be adapted to environments for which it was not originally designed, including applications with time-constrained executions or molecular dynamics HPC simulators, such as the one used in an international research center. Considering industrial trends and emphasizing professional applicability, CoexecutorRuntime provides a flexible C++/SYCL-based system that adds co-execution support to oneAPI technology. This runtime brings programmers closer to the problem domain, enabling the exploitation of dynamic adaptive strategies that improve efficiency in all types of applications.

Funding: This PhD has been supported by the Spanish Ministry of Education (FPU16/03299 grant) and by the Spanish Science and Technology Commission under contracts TIN2016-76635-C2-2-R and PID2019-105660RB-C22. This work has also been partially supported by the Mont-Blanc 3: European Scalable and Power Efficient HPC Platform based on Low-Power Embedded Technology project (G.A. No. 671697) from the European Union’s Horizon 2020 Research and Innovation Programme (H2020 Programme). Some activities have also been funded by the Spanish Science and Technology Commission under contract TIN2016-81840-REDT (CAPAP-H6 network). The Integration II: Hybrid programming models work of Chapter 4 has been partially performed under the Project HPC-EUROPA3 (INFRAIA-2016-1-730897), with the support of the EC Research Innovation Action under the H2020 Programme. In particular, the author gratefully acknowledges the support of the SPMT Department of the High Performance Computing Center Stuttgart (HLRS)
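To make the co-execution idea concrete, here is a minimal, hypothetical C++/SYCL sketch (not the actual EngineCL or CoexecutorRuntime API) that statically splits one data-parallel kernel between a CPU queue and a GPU queue. It assumes both device types are available; a real co-execution runtime would additionally discover devices, adapt the split dynamically based on measured throughput, and handle errors.

```cpp
#include <sycl/sycl.hpp>
#include <cstddef>
#include <vector>

int main() {
  constexpr std::size_t n = 1 << 20;
  std::vector<float> data(n, 1.0f);
  const std::size_t split = n / 2;          // static 50/50 split; an adaptive runtime would tune this

  sycl::queue cpu_q{sycl::cpu_selector_v};  // assumes a CPU device is visible
  sycl::queue gpu_q{sycl::gpu_selector_v};  // assumes a GPU device is visible

  {
    // Give each device its own buffer over half of the data, so the two
    // halves can be processed concurrently without cross-queue dependencies.
    sycl::buffer<float> lo(data.data(), sycl::range<1>(split));
    sycl::buffer<float> hi(data.data() + split, sycl::range<1>(n - split));

    auto launch = [](sycl::queue& q, sycl::buffer<float>& buf) {
      q.submit([&](sycl::handler& h) {
        sycl::accessor acc(buf, h, sycl::read_write);
        h.parallel_for(buf.get_range(), [=](sycl::id<1> i) {
          acc[i] = acc[i] * 2.0f + 1.0f;    // toy per-element kernel
        });
      });
    };
    launch(cpu_q, lo);  // both submissions are asynchronous,
    launch(gpu_q, hi);  // so the devices co-execute the workload
    cpu_q.wait();
    gpu_q.wait();
  }  // buffer destructors copy the results back into `data`
  return 0;
}
```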

    Interactive Visualization on High-Resolution Tiled Display Walls with Network Accessible Compute- and Display-Resources

    Papers 2-7 and appendices B and C of this thesis are not available in Munin:
    2. Hagen, T-M.S., Johnsen, E.S., Stødle, D., Bjorndalen, J.M. and Anshus, O.: 'Liberating the Desktop', First International Conference on Advances in Computer-Human Interaction (2008), pp. 89-94. Available at http://dx.doi.org/10.1109/ACHI.2008.20
    3. Tor-Magne Stien Hagen, Oleg Jakobsen, Phuong Hoai Ha, and Otto J. Anshus: 'Comparing the Performance of Multiple Single-Cores versus a Single Multi-Core' (manuscript)
    4. Tor-Magne Stien Hagen, Phuong Hoai Ha, and Otto J. Anshus: 'Experimental Fault-Tolerant Synchronization for Reliable Computation on Graphics Processors' (manuscript)
    5. Tor-Magne Stien Hagen, Daniel Stødle and Otto J. Anshus: 'On-Demand High-Performance Visualization of Spatial Data on High-Resolution Tiled Display Walls', Proceedings of the International Conference on Imaging Theory and Applications and International Conference on Information Visualization Theory and Applications (2010), pp. 112-119. Available at http://dx.doi.org/10.5220/0002849601120119
    6. Bård Fjukstad, Tor-Magne Stien Hagen, Daniel Stødle, Phuong Hoai Ha, John Markus Bjørndalen and Otto Anshus: 'Interactive Weather Simulation and Visualization on a Display Wall with Many-Core Compute Nodes', Para 2010 – State of the Art in Scientific and Parallel Computing. Available at http://vefir.hi.is/para10/extab/para10-paper-60
    7. Tor-Magne Stien Hagen, Daniel Stødle, John Markus Bjørndalen, and Otto Anshus: 'A Step towards Making Local and Remote Desktop Applications Interoperable with High-Resolution Tiled Display Walls', Lecture Notes in Computer Science (2011), Volume 6723/2011, pp. 194-207. Available at http://dx.doi.org/10.1007/978-3-642-21387-8_15

    The vast volume of scientific data produced today requires tools that enable scientists to explore large amounts of data and extract meaningful information. One such tool is interactive visualization. The amount of data that can be simultaneously visualized on a computer display is proportional to the display’s resolution. While computer systems in general have seen a remarkable increase in performance over the last decades, display resolution has not evolved at the same rate. Increased resolution can be provided by tiling several displays in a grid. A system composed of multiple displays tiled in such a grid is referred to as a display wall. Display walls provide orders of magnitude more resolution than typical desktop displays, and can provide insight into problems not possible to visualize on desktop displays. However, their distributed and parallel architecture creates several challenges for designing systems that can support interactive visualization. One challenge is compatibility issues with existing software designed for personal desktop computers. Another set of challenges includes identifying characteristics of visualization systems that can: (i) maintain synchronous state and display-output when executed over multiple display nodes; (ii) scale to multiple display nodes without being limited by shared interconnect bottlenecks; (iii) utilize additional computational resources such as desktop computers, clusters and supercomputers for workload distribution; and (iv) use data from local and remote compute- and data-resources with interactive performance. This dissertation presents Network Accessible Compute (NAC) resources and Network Accessible Display (NAD) resources for interactive visualization of data on displays ranging from laptops to high-resolution tiled display walls.
A NAD is a display having functionality that enables usage over a network connection. A NAC is a computational resource that can produce content for network accessible displays. A system consisting of NACs and NADs is either push-based (NACs provide NADs with content) or pull-based (NADs request content from NACs). To attack the compatibility challenge, a push-based system was developed. The system enables several simultaneous users to mirror multiple regions from the desktop of their computers (NACs) onto nearby NADs (among others a 22 megapixel display wall) without requiring usage of separate DVI/VGA cables, permanent installation of third-party software or opening firewall ports. The system has lower performance than that of a DVI/VGA cable approach, but increases flexibility, for example by making it possible to share network accessible displays from multiple computers. At a resolution of 800 by 600 pixels, the system can mirror dynamic content between a NAC and a NAD at 38.6 frames per second (FPS). At 1600x1200 pixels, the refresh rate is 12.85 FPS. The bottleneck of the system is frame buffer capturing and encoding/decoding of pixels. These two functional parts are executed in sequence, limiting the usage of additional CPU cores. By pipelining and executing these parts on separate CPU cores, higher frame rates can be expected, up to a factor of two in the best case. To attack all presented challenges, a pull-based system, WallScope, was developed. WallScope enables interactive visualization of local and remote data sets on high-resolution tiled display walls. The WallScope architecture comprises a compute-side and a display-side. The compute-side comprises a set of static and dynamic NACs. Static NACs are considered permanent to the system once added. This type of NAC typically has strict underlying security and access policies. Examples of such NACs are clusters, grids and supercomputers. Dynamic NACs are compute resources that can register on-the-fly to become compute nodes in the system. Examples of this type of NAC are laptops and desktop computers. The display-side comprises a set of NADs and a data set containing data customized for the particular application domain of the NADs. NADs are based on a sort-first rendering approach where a visualization client is executed on each display-node. The state of these visualization clients is provided by a separate state server, enabling central control of load and refresh-rate. Based on the state received from the state server, the visualization clients request content from the data set. The data set is live in that it translates these requests into compute messages and forwards them to available NACs. Results of the computations are returned to the NADs for the final rendering. The live data set is close to the NADs, both in terms of bandwidth and latency, to enable interactive visualization. WallScope can visualize the Earth, gigapixel images, and other data available through the live data set. When visualizing the Earth on a 28-node display wall by combining the Blue Marble data set with the Landsat data set using a set of static NACs, the bottleneck of WallScope is the computation involved in combining the data sets. However, the time used to combine data sets on the NACs decreases by a factor of 23 when going from 1 to 26 compute nodes. The display-side can decode 414.2 megapixels of images per second (19 frames per second) when visualizing the Earth.
The decoding process is multi-threaded and higher frame rates are expected using multi-core CPUs. WallScope can rasterize a 350-page PDF document into 550 megapixels of image-tiles and display these image-tiles on a 28-node display wall in 74.66 seconds (PNG) and 20.66 seconds (JPG) using a single quad-core desktop computer as a dynamic NAC. This time is reduced to 4.20 seconds (PNG) and 2.40 seconds (JPG) using 28 quad-core NACs. This shows that the application output from personal desktop computers can be decoupled from the resolution of the local desktop and display for usage on high-resolution tiled display walls. It also shows that the performance can be increased by adding computational resources, giving a resulting speedup of 17.77 (PNG) and 8.59 (JPG) using 28 compute nodes. Three principles are formulated based on the concepts and systems researched and developed: (i) Establishing the end-to-end principle through customization is a principle stating that the setup and interaction between a display-side and a compute-side in a visualization context can be performed by customizing one or both sides; (ii) Personal Computer (PC) – Personal Compute Resource (PCR) duality states that a user’s computer is both a PC and a PCR, implying that desktop applications can be utilized locally using attached interaction devices and display(s), or remotely by other visualization systems for domain specific production of data based on a user’s personal desktop install; and (iii) domain specific best-effort synchronization states that for distributed visualization systems running on tiled display walls, state handling can be performed using a best-effort synchronization approach, where visualization clients will eventually get the correct state after a given period of time. Compared to state-of-the-art systems presented in the literature, the contributions of this dissertation enable utilization of a broader range of compute resources from a display wall, while at the same time providing better control over where to provide functionality and where to distribute workload between compute-nodes and display-nodes in a visualization context
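As a rough illustration of the pull-based flow described above, the following C++ sketch shows the per-frame loop of one visualization client on a display node. All type and function names (StateServer, LiveDataSet, TileRenderer and so on) are invented stand-ins for illustration and are not WallScope's actual API; the stubs only print what a real implementation would do over the network.

```cpp
#include <cstdio>
#include <utility>
#include <vector>

// Hypothetical stand-ins for WallScope components (illustration only).
struct ViewState { double lon, lat; int level; long version; };
struct Tile      { int x, y, level; };

struct StateServer {               // central control of state, load and refresh rate
  long tick = 0;
  ViewState latest() { ++tick; return {10.0, 60.0, 3, tick}; }   // stub
};
struct LiveDataSet {               // translates tile requests into compute messages for NACs
  Tile fetch(int x, int y, int level) { return {x, y, level}; }  // stub: would block on a NAC result
};
struct TileRenderer {              // final rendering on this display node
  void draw(const Tile& t) { std::printf("draw tile (%d,%d) level %d\n", t.x, t.y, t.level); }
  void present()           { std::printf("present frame\n"); }
};

int main() {
  StateServer state; LiveDataSet data; TileRenderer gl;
  std::vector<std::pair<int, int>> my_tiles = {{0, 0}, {0, 1}};  // sort-first: this node's region
  long shown = -1;
  for (int frame = 0; frame < 3; ++frame) {                      // a few frames for the sketch
    ViewState v = state.latest();   // best-effort synchronization: may briefly lag other nodes
    if (v.version == shown) continue;
    for (auto [tx, ty] : my_tiles) gl.draw(data.fetch(tx, ty, v.level));
    gl.present();
    shown = v.version;
  }
  return 0;
}
```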

    Chaotic Sensing

    We propose a sparse imaging methodology called Chaotic Sensing (ChaoS) that enables the use of limited yet deterministic linear measurements through fractal sampling. A novel fractal in the discrete Fourier transform is introduced that always results in the artefacts being turbulent in nature. These chaotic artefacts have characteristics that are image independent, facilitating their removal through dampening (via image denoising) and obtaining the maximum likelihood solution. In contrast with existing methods, such as compressed sensing, the fractal sampling is based on digital periodic lines that form the basis of discrete projected views of the image, without requiring additional transform domains. This allows the creation of finite iterative reconstruction schemes for recovering an image from its fractal sampling, an approach that is also new to discrete tomography. As a result, ChaoS supports linear measurement and optimisation strategies, while remaining capable of recovering a theoretically exact representation of the image. We apply the method to simulated and experimental limited magnetic resonance (MR) imaging data, where restrictions imposed by MR physics typically favour linear measurements for reducing acquisition time
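For intuition about the digital periodic lines mentioned above, here is a small illustrative C++ sketch that marks a single discrete periodic line y = (c + m*x) mod N in an N x N grid. The actual ChaoS fractal is a particular union of such lines in the discrete Fourier space, and its construction (described in the paper) is more involved than this toy example.

```cpp
#include <cstdio>
#include <vector>

// Mark one digital periodic line y = (c + m*x) mod N in an N x N grid.
// Illustrative only: ChaoS builds its sampling pattern from unions of such
// lines in discrete Fourier space; this sketch shows a single line.
int main() {
  const int N = 16, m = 3, c = 0;   // small grid, example slope and intercept
  std::vector<std::vector<char>> mask(N, std::vector<char>(N, 0));
  for (int x = 0; x < N; ++x)
    mask[(c + m * x) % N][x] = 1;   // wraps around: the line is periodic
  for (int y = 0; y < N; ++y) {     // print the mask, '#' = sampled location
    for (int x = 0; x < N; ++x) std::putchar(mask[y][x] ? '#' : '.');
    std::putchar('\n');
  }
  return 0;
}
```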

    Stochastic cosmology, theories of perturbations and Lifshitz gravity

    We review some works of E. M. Lifshitz connected with gravity and cosmology and also some later works connected with his ideas. The main topics of this review are the stochastic cosmology of an anisotropic universe and of an isotropic universe with the scalar field, the quasi-isotropic (gradient) expansion in cosmology, and Horava-Lifshitz gravity and cosmology. Comment: 30 pages, to appear in Physics - Uspekhi. arXiv admin note: text overlap with arXiv:0901.3775 by other authors

    The Poetry of Logical Ideas: Towards a Mathematical Genealogy of Media Art

    In this dissertation I chart a mathematical genealogy of media art, demonstrating that mathematical thought has had a significant influence on contemporary experimental moving image production. Rather than looking for direct cause and effect relationships between mathematics and the arts, I will instead examine how mathematical developments have acted as a cultural zeitgeist, an indirect, but significant, influence on the humanities and the arts. In particular, I will be narrowing the focus of this study to the influence mathematical thought has had on cinema (and by extension media art), given that mathematics lies comfortably between the humanities and sciences, and that cinema is the object par excellence of such a study, since cinema and media studies arrived at a time when the humanities and sciences were held by many to be mutually exclusive disciplines. It is also shown that many media scholars have been implicitly engaging with mathematical concepts without necessarily recognizing them as such. To demonstrate this, I examine many concepts from media studies that demonstrate or derive from mathematical concepts. For instance, Claude Shannon's mathematical model of communication is used to expand on Stuart Hall's cultural model, and the mathematical concept of the fractal is used to expand on Rosalind Krauss' argument that video is a medium that lends itself to narcissism. Given that the influence of mathematics on the humanities and the arts often occurs through a misuse or misinterpretation of mathematics, I mobilize the concept of a productive misinterpretation and argue that this type of misreading has the potential to lead to novel innovations within the humanities and the arts. In this dissertation, it is also established that there are many mathematical concepts that can be utilized by media scholars to better analyze experimental moving images. In particular, I explore the mathematical concepts of symmetry, infinity, fractals, permutations, the Axiom of Choice, and the algorithmic in relation to moving image works by Hollis Frampton, Barbara Lattanzi, Dana Plays, T. Marie, and Isiah Medina, among others. It is my desire that this study appeal to scientists with an interest in cinema and media art, and to media theorists with an interest in experimental cinema and other contemporary moving image practices

    Massively-parallel and concurrent SVM architectures

    This work presents several Support Vector Machine (SVM) architectures developed by the Author with the intent of exploiting the inherent parallel structures and potential concurrency underpinning the SVM’s mathematical operation. Two SVM training subsystem prototypes are presented - a brute-force search classification training architecture, and Artificial Neural Network (ANN)-mapped optimisation architectures for both SVM classification training and SVM regression training. This work also proposes and prototypes a set of parallelised SVM Digital Signal Processor (DSP) pipeline architectures. The parallelised SVM DSP pipeline architectures have been modelled in C and implemented in VHDL for synthesis and fitting on an Altera Stratix V FPGA. Each system presented in this work has been applied to a problem domain application appropriate to the SVM system’s architectural limitations - including the novel application of the SVM as a chaotic and non-linear system parameter-identification tool. The SVM brute-force search classification training architecture has been modelled for datasets of 2 dimensions composed of linear and non-linear problems requiring only 4 support vectors, by utilising the linear kernel and the polynomial kernel respectively. The system has been implemented in Matlab and non-exhaustively verified using the holdout method with a trivial linearly separable classification problem dataset and a trivial non-linear XOR classification problem dataset. While the architecture was a feasible design for software-based implementations targeting 2-dimensional datasets, the architectural complexity and unmanageable number of parallelisable operations introduced by increasing data-dimensionality and the number of support vectors subsequently resulted in the Author pursuing different parallelised-architecture strategies. Two distinct ANN-mapped optimisation strategies developed and proposed for SVM classification training and SVM regression training have been modelled in Matlab; the architectures have been designed such that any dimensionality dataset can be applied by configuring the appropriate dimensionality and support vector parameters. Through Monte-Carlo testing using the datasets examined in this work, the gain parameters inherent in the architectural design of the systems were found to be difficult to tune, and system convergence to acceptable sets of training support vectors was not achieved. The ANN-mapped optimisation strategies were thus deemed inappropriate for SVM training with the applied datasets without more design effort and architectural modification work. The data-set dimensionality, support vector set counts, and latency ranges of the parallelised SVM DSP pipeline architecture prototypes follow. In each case the Field Programmable Gate Array (FPGA) pipeline prototype latency unsurprisingly outclassed the corresponding C-software model execution times by at least 3 orders of magnitude. The SVM classification training DSP pipeline FPGA prototypes are compatible with data-sets spanning 2 to 8 dimensions, support vector sets of up to 16 support vectors, and have a pipeline latency range spanning from a minimum of 0.18 microseconds to a maximum of 0.28 microseconds. The SVM classification function evaluation DSP pipeline FPGA prototypes are compatible with data-sets spanning 2 to 8 dimensions, support vector sets of up to 32 support vectors, and have a pipeline latency range spanning from a minimum of 0.16 microseconds to a maximum of 0.24 microseconds. The SVM regression training DSP pipeline FPGA prototypes are compatible with data-sets spanning 2 to 8 dimensions, support vector sets of up to 16 support vectors, and have a pipeline latency range spanning from a minimum of 0.20 microseconds to a maximum of 0.30 microseconds. The SVM regression function evaluation DSP pipeline FPGA prototypes are compatible with data-sets spanning 2 to 8 dimensions, support vector sets of up to 16 support vectors, and have a pipeline latency range spanning from a minimum of 0.20 microseconds to a maximum of 0.30 microseconds. Finally, utilising LIBSVM training and the parallelised SVM DSP pipeline function evaluation architecture prototypes, SVM classification and SVM regression were successfully applied to Rajkumar’s oil and gas pipeline fault detection and failure system legacy data-set, yielding excellent results. Also utilising LIBSVM training and the parallelised SVM DSP pipeline function evaluation architecture prototypes, both SVM classification and SVM regression were applied to several chaotic systems as a feasibility study into the application of the SVM machine learning paradigm for chaotic and non-linear dynamical system parameter-identification. SVM classification was applied to the Lorenz Attractor and an ANN-based chaotic oscillator to a reasonably acceptable degree of success. SVM classification was applied to the Mackey-Glass attractor, yielding poor results. SVM regression was applied to the Lorenz Attractor and an ANN-based chaotic oscillator, yielding average but encouraging results. SVM regression was applied to the Mackey-Glass attractor, yielding poor results
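As background for the function-evaluation pipelines described above, the following minimal C++ sketch computes the standard SVM decision value f(x) = sum_i alpha_i * y_i * K(s_i, x) + b with a polynomial kernel. It only illustrates the arithmetic that the DSP pipelines parallelise (each kernel term is independent of the others); it is not the thesis's FPGA implementation, and all numbers in main() are made-up placeholders rather than trained values.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// Polynomial kernel K(u, v) = (u . v + c)^d, one of the kernels used in the thesis.
double poly_kernel(const std::vector<double>& u, const std::vector<double>& v,
                   double c = 1.0, int d = 2) {
  double dot = 0.0;
  for (std::size_t k = 0; k < u.size(); ++k) dot += u[k] * v[k];
  return std::pow(dot + c, d);
}

// SVM decision value f(x) = sum_i alpha_i * y_i * K(s_i, x) + b.
// Each term in the sum is independent, which is what makes the evaluation
// amenable to the parallel and pipelined architectures described above.
double svm_decision(const std::vector<std::vector<double>>& sv,  // support vectors
                    const std::vector<double>& alpha,            // multipliers
                    const std::vector<double>& y,                // labels (+1/-1)
                    double b,
                    const std::vector<double>& x) {              // query point
  double f = b;
  for (std::size_t i = 0; i < sv.size(); ++i)
    f += alpha[i] * y[i] * poly_kernel(sv[i], x);
  return f;
}

int main() {
  // Made-up 2-D example; real parameters would come from training (e.g. LIBSVM).
  std::vector<std::vector<double>> sv = {{0.0, 0.0}, {1.0, 1.0}, {1.0, 0.0}, {0.0, 1.0}};
  std::vector<double> alpha = {0.5, 0.5, 0.5, 0.5};
  std::vector<double> y     = {-1, -1, +1, +1};
  std::vector<double> x     = {0.9, 0.2};
  double f = svm_decision(sv, alpha, y, 0.0, x);
  std::printf("class = %+d (f = %f)\n", f >= 0.0 ? 1 : -1, f);
  return 0;
}
```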

    Modeling and simulating chemical weapon dispersal patterns in DIRSIG

    Fieldable thermal infrared hyperspectral imaging spectrometers have made it possible to design and construct new instruments for better detection of battlefield hazards such as chemical weapon clouds. Spectroscopic measurements of these clouds can be used not only for the detection and identification of specific chemical agents but also to potentially quantify the lethality of the cloud. The simulation of chemical weapon dispersal patterns in a synthetic imaging environment offers significant benefits to sensor designers. Such an environment allows designers to easily develop trade spaces to test detection and quantification algorithms without the need for expensive and dangerous field releases. This research focuses on the implementation of a generic gas dispersion model that has been integrated into the Digital Imaging and Remote Sensing Image Generation (DIRSIG) model. The gas cloud model utilizes a 3D Gaussian distribution based on the theory used to predict factory stack gas plumes. The model incorporates first-order dynamics (drift and dispersion) to drive the macro-scale cloud development and movement. The model also attempts to account for turbulence by using fractal fractional Brownian motion techniques to reproduce the micro-scale variances within the cloud. The cloud pathlength concentrations are then processed by the DIRSIG radiometry sub-model to compute the emission and transmission of the cloud body on a per-pixel basis. Example hyperspectral image cubes containing common agents and release amounts are presented. Time lapse sequences are also provided to demonstrate the evolution of the cloud over time. Finally, recommendations and limitations of the model are listed for future improvements
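For reference, here is a minimal C++ sketch of the standard Gaussian plume concentration formula, the textbook stack-plume theory the abstract alludes to. It is not DIRSIG's actual implementation, and the dispersion parameters sigma_y and sigma_z are passed in directly, whereas a full model would derive them from downwind distance and atmospheric stability class; the numbers in main() are arbitrary illustrative values.

```cpp
#include <cmath>
#include <cstdio>

// Standard Gaussian plume concentration with ground reflection:
//   C(y, z) = Q / (2*pi*u*sy*sz) * exp(-y^2/(2*sy^2))
//             * [exp(-(z-H)^2/(2*sz^2)) + exp(-(z+H)^2/(2*sz^2))]
// Q: emission rate, u: wind speed, H: effective release height,
// sy, sz: crosswind and vertical dispersion parameters at the downwind distance of interest.
double plume_concentration(double Q, double u, double H,
                           double sy, double sz, double y, double z) {
  const double pi = 3.141592653589793;
  double cross    = std::exp(-(y * y) / (2.0 * sy * sy));
  double vertical = std::exp(-std::pow(z - H, 2) / (2.0 * sz * sz)) +
                    std::exp(-std::pow(z + H, 2) / (2.0 * sz * sz));  // image source term: ground reflection
  return Q / (2.0 * pi * u * sy * sz) * cross * vertical;
}

int main() {
  // Arbitrary illustrative numbers: 1 kg/s release, 5 m/s wind, 10 m release height,
  // dispersion parameters picked by hand for some downwind distance.
  double c = plume_concentration(1.0, 5.0, 10.0, 20.0, 10.0, 0.0, 2.0);
  std::printf("concentration = %g kg/m^3\n", c);
  return 0;
}
```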

    Domänen parallele Maschinen

    A computational model is introduced, which abstracts and idealizes computers with access to fragment shaders. While the set of functions computable by this model remains the same, running times can be drastically reduced through parallelization compared to conventional models. Some of the algorithms designed for the model can be approximated using fragment shaders. With an automatic transcompilation scheme, fragment shader programs can be generated automatically from algorithms described in a high-level language
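To illustrate the model's core idea - every output pixel computed independently by the same small function - here is a minimal C++ sketch that evaluates a per-fragment kernel over a grid on the CPU. An actual fragment shader (e.g. in GLSL) would run the same function for all pixels in parallel on the GPU, which is the parallelization the thesis idealizes; the kernel here is a toy color gradient chosen only for illustration.

```cpp
#include <cstdio>
#include <vector>

struct RGB { float r, g, b; };

// The "fragment" function: computes one output pixel purely from its own
// coordinates (and, in general, from read-only inputs such as textures).
// No pixel depends on another, so the whole grid is trivially parallelizable.
RGB fragment(float u, float v) {
  return {u, v, 0.5f};                       // toy kernel: a simple color gradient
}

int main() {
  const int W = 8, H = 8;                    // tiny "framebuffer" for the sketch
  std::vector<RGB> framebuffer(W * H);
  for (int y = 0; y < H; ++y)                // on a GPU these iterations run concurrently
    for (int x = 0; x < W; ++x)
      framebuffer[y * W + x] = fragment((x + 0.5f) / W, (y + 0.5f) / H);
  const RGB& center = framebuffer[(H / 2) * W + W / 2];
  std::printf("center pixel r=%.2f g=%.2f b=%.2f\n", center.r, center.g, center.b);
  return 0;
}
```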