287 research outputs found

    Protein Structure Prediction with Parallel Algorithms Orthogonal to Parallel Platforms

    The problem of Protein Structure Prediction (PSP) is known to be computationally expensive, which calls for the application of high-performance techniques. In this project, parallel PSP algorithms found in the literature are being accelerated and ported to different parallel platforms, producing a set of algorithms that is diverse in terms of the parallel architectures and parallel programming models used. The algorithms are intended to help other research projects, and they have also been made publicly available to support the development of more elaborate prediction algorithms. We have thus far produced a set of 16 algorithms (mixing CUDA, OpenMP, MPI and/or complexity reduction optimizations); during their development, two algorithms that promote high performance were proposed, and they have been described in an article accepted at the International Conference on Computational Science (ICCS).

    Microspacecraft and Earth observation: Electrical field (ELF) measurement project

    The Utah State University space system design project for 1989 to 1990 focuses on the design of a global electrical field sensing system to be deployed in a constellation of microspacecraft. The design includes the selection of the sensor and the design of the spacecraft, the sensor support subsystems, the launch vehicle interface structure, the on-board data storage and communications subsystems, and the associated ground receiving stations. Optimization of satellite orbits and spacecraft attitude is critical to the overall mapping of the electrical field and is therefore also included in the project. The spacecraft design incorporates a deployable sensor array (5 m booms) into a spinning oblate platform. Data is taken every 0.1 seconds by the electrical field sensors and stored on board. An omni-directional antenna communicates with a ground station twice per day to downlink the stored data. Wrap-around solar cells cover the exterior of the spacecraft to generate power. Nine Pegasus launches may be used to deploy fifty such satellites to orbits with inclinations greater than 45 deg. Piggyback deployment from other launch vehicles such as the DELTA 2 is also examined.

    Hough Transform Track Reconstruction in the Cathode Strip Chambers in ATLAS

    The world's largest and highest-energy particle accelerator, the Large Hadron Collider (LHC), will collide two highly energetic proton beams in an attempt to discover a wide range of new physics. Chief among its ambitions are the discovery of the Higgs boson and supersymmetric particles. ATLAS, one of its primary particle detectors, was designed as a general-purpose detector covering a broad range of energies and physical processes. A special emphasis on accurate muon tracking has led the ATLAS collaboration to design a stand-alone Muon Spectrometer, an extremely large tracking system extending all the way around the detector. Due to its immense size and range, parts of the spectrometer were designed to withstand a high rate of radiation, sifting the muon signals from the rest of the signals (primarily neutrons and photons). The Cathode Strip Chambers (CSCs) are special multiwire proportional chambers placed in the high-η region of the Muon Spectrometer, where the flux of background particles is highest. Their purpose is to efficiently filter out the background particles, tracking only the muons traversing them with a high degree of accuracy. To do that, a special algorithm was designed using a novel modification of the Hough Transform. This thesis will detail the key elements of this algorithm and how it is used for better muon track detection and parameterization, and give a preliminary evaluation of the performance of this algorithm.
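
As a point of reference for the abstract above, here is a minimal sketch of the classic line-finding Hough transform. It is not the modified transform developed in the thesis, whose details the abstract does not give. The function name `hough_line`, the bin counts, and the 2D hit coordinates are all assumptions made for illustration. Each hit votes for every line `x*cos(theta) + y*sin(theta) = rho` that could pass through it, and the most-voted bin is taken as the track candidate.

```python
import math
from collections import Counter

def hough_line(points, n_theta=180, rho_step=1.0):
    """Classic Hough transform for one straight line (illustrative only).

    Returns (theta, rho, votes) for the most-voted accumulator bin, where
    the line is parameterised as x*cos(theta) + y*sin(theta) = rho.
    """
    votes = Counter()
    for x, y in points:
        # Every hit votes once per angle bin, in the rho bin it implies.
        for i in range(n_theta):
            theta = math.pi * i / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(i, round(rho / rho_step))] += 1
    (i, r), count = votes.most_common(1)[0]
    return math.pi * i / n_theta, r * rho_step, count

# Ten collinear "signal" hits on y = x plus two unrelated "background" hits.
hits = [(10.0 * k, 10.0 * k) for k in range(10)] + [(5.0, 80.0), (70.0, 3.0)]
theta, rho, count = hough_line(hits)
```

In this toy input the background hits do not share a bin with the collinear hits, so the peak collects exactly the ten signal votes. That is the filtering behaviour the abstract alludes to, in its simplest form.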

    Token bus LAN performance : modeling and simulation

    A simulation model based on CSIM, a process-oriented simulation language, is developed to analyze the performance of the Token Bus protocol. Performance measures such as throughput, average delay and maximum delay per packet are presented. System performance is analyzed for different loads, numbers of stations, network lengths, and physical and logical distributions of the stations, with packet length as a parameter. Previous studies were based on delay-throughput analysis and did not discuss how variation in the logical and physical distribution of the stations affects the performance of the model; the present thesis does. The load is offered to the network in the form of a stream of data packets with uniformly distributed inter-arrival times. A comparison of the Token Bus model with a CSMA/CD model shows that the physical distribution of the stations has a minimal effect on performance in the case of the Token Bus model but a considerable effect in the case of the CSMA/CD model.
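
To make the distinction between physical and logical station distribution concrete, the following sketch computes only the propagation component of one token rotation on a bus. It is not the CSIM model from the thesis. The function name `token_rotation_delay`, the station positions, and the figure of 5 ns per metre are assumptions chosen for illustration.

```python
def token_rotation_delay(positions_m, logical_order, ns_per_m=5):
    """Propagation delay (ns) for one full token pass around the logical ring.

    positions_m   -- physical position of each station along the bus, in metres
    logical_order -- station indices in the order the token visits them
    ns_per_m      -- assumed signal propagation delay per metre of cable
    """
    total = 0
    for k, station in enumerate(logical_order):
        # The token travels from each station to its logical successor,
        # wrapping from the last station back to the first.
        successor = logical_order[(k + 1) % len(logical_order)]
        total += abs(positions_m[successor] - positions_m[station]) * ns_per_m
    return total

positions = [0, 100, 200, 300]          # same physical layout in both cases
in_order = token_rotation_delay(positions, [0, 1, 2, 3])
shuffled = token_rotation_delay(positions, [0, 2, 1, 3])
```

With these inputs the in-order ring costs 3000 ns per rotation and the shuffled ring 4000 ns. A full model would add token transmission time, station latency, and queueing, which this sketch deliberately omits.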

    Improvement of Data-Intensive Applications Running on Cloud Computing Clusters

    MapReduce, designed by Google, is widely used as the most popular distributed programming model in cloud environments. Hadoop, an open-source implementation of MapReduce, is a data management framework that runs on large clusters of commodity machines to handle data-intensive applications. Many well-known enterprises, including Facebook, Twitter, and Adobe, have been using Hadoop for their data-intensive processing needs. Task stragglers in MapReduce jobs dramatically impede job execution on massive datasets in cloud computing systems. This impedance is due to the uneven distribution of input data and computation load among cluster nodes, heterogeneous data nodes, data skew in the reduce phase, resource contention, and network configurations. All of these factors may cause delay failure and the violation of job completion time. One of the key issues that can significantly affect the performance of cloud computing is computation load balancing among cluster nodes. Replica placement in the Hadoop distributed file system (HDFS) plays a significant role in data availability and the balanced utilization of clusters. Under the current replica placement policy (RPP) of HDFS, the replicas of data blocks cannot be evenly distributed across the cluster's nodes. The current HDFS must rely on a load balancing utility to balance the distribution of replicas, which results in extra overhead in time and resources. This dissertation addresses the data load balancing problem and presents an innovative replica placement policy for HDFS that can perfectly balance the data load among the cluster's nodes. The heterogeneity of cluster nodes exacerbates the issue of computational load balancing; therefore, another replica placement algorithm is proposed in this dissertation for heterogeneous cluster environments. The timing of identifying the straggler map task is very important for straggler mitigation in data-intensive cloud computing.
To mitigate straggler map tasks, the Present progress and Feedback based Speculative Execution (PFSE) algorithm is proposed in this dissertation. PFSE is a new straggler identification scheme that identifies straggler map tasks based on feedback information received from completed tasks, in addition to the progress of the currently running task. Straggler reduce tasks aggravate the violation of MapReduce job completion time. A straggler reduce task is typically the result of bad data partitioning during the reduce phase. The hash partitioner employed by Hadoop may cause intermediate data skew, which results in straggler reduce tasks. In this dissertation, a new partitioning scheme named Balanced Data Clusters Partitioner (BDCP) is proposed to mitigate straggler reduce tasks. BDCP is based on sampling of input data and feedback information about the currently processing task. BDCP can assist in straggler mitigation during the reduce phase and minimize the job completion time of MapReduce jobs. The results of extensive experiments corroborate that the algorithms and policies proposed in this dissertation can improve the performance of data-intensive applications running on cloud platforms.
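
The intermediate data skew mentioned above can be illustrated with a small sketch of hash partitioning. This is neither Hadoop code nor the BDCP scheme, whose internals the abstract does not describe. The helper names `hash_partition` and `reducer_loads` are hypothetical, and `zlib.crc32` stands in for the key hash only so that the example is deterministic.

```python
import zlib

def hash_partition(key, num_reducers):
    """Assign a key to a reducer by hashing it (stand-in for a hash partitioner)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def reducer_loads(key_counts, num_reducers):
    """Number of intermediate records each reducer receives."""
    loads = [0] * num_reducers
    for key, count in key_counts.items():
        # All records that share a key go to the same reducer, so a single
        # very frequent key lands entirely on one reducer.
        loads[hash_partition(key, num_reducers)] += count
    return loads

# One hot key with 1000 records and 99 rare keys with one record each.
key_counts = {"hot": 1000}
key_counts.update({f"rare{i}": 1 for i in range(99)})
loads = reducer_loads(key_counts, 4)
```

Whichever reducer receives the hot key ends up with at least 1000 of the 1099 records while the other three share the remainder, and that overloaded reducer is the kind of straggler reduce task the abstract describes.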

    OpenFPM: A scalable environment for particle and particle-mesh codes on parallel computers

    Scalable and efficient numerical simulations continue to gain importance, as computation is firmly established as a tool of discovery alongside theory and experiment. Meanwhile, the performance of computing hardware grows through increasingly heterogeneous hardware, enabling simulations of ever more complex models. However, efficiently implementing scalable codes on heterogeneous, distributed hardware systems becomes the bottleneck. This bottleneck can be alleviated by intermediate software layers that provide higher-level abstractions closer to the problem domain, allowing the computational scientist to focus on the simulation. Here, we present OpenFPM, an open and scalable framework that provides an abstraction layer for numerical simulations using particles and/or meshes. OpenFPM provides transparent and scalable infrastructure for shared-memory and distributed-memory implementations of particles-only and hybrid particle-mesh simulations of both discrete and continuous models, as well as non-simulation codes. This infrastructure is complemented with frequently used numerical routines, as well as interfaces to third-party libraries. This thesis will present the architecture and design of OpenFPM, detail the underlying abstractions, and benchmark the framework in applications ranging from Smoothed-Particle Hydrodynamics (SPH) to Molecular Dynamics (MD), Discrete Element Methods (DEM), Vortex Methods, stencil codes, high-dimensional Monte Carlo sampling (CMA-ES), and Reaction-Diffusion solvers, comparing it to the current state of the art and existing software frameworks.

    Visualization and inspection of the geometry of particle packings

    The aim of this dissertation is to find efficient techniques for visualizing and inspecting the geometry of particle packings. Simulations of such packings are used, for example, in materials science to predict properties of granular materials. To better understand and supervise the behavior of these simulations, it should be possible to visualize not only the particles themselves but also special areas formed by the particles that can show the progress of the simulation and the spatial distribution of hot spots. This should be possible at a frame rate that allows interaction even for large-scale packings with millions of particles. Moreover, given that the simulation is conducted on the GPU, the visualization techniques should make full use of the data in GPU memory. To improve the performance of granular materials like concrete, considerable attention has been paid to the particle size distribution, which is the main determinant of the space filling rate and therefore affects two of the most important properties of concrete: structural robustness and durability. Given the particle size distribution, the space filling rate can be determined by computer simulations, which in practice are often superior to analytical approaches because of the irregularities of particles and the wide range of the size distribution.
One of the widely adopted simulation methods is collective rearrangement, in which particles are first placed at random positions inside a container; overlaps between particles are then resolved by letting overlapping particles push away from each other to fill empty space in the container. By cleverly adjusting the size of the container as the simulation progresses, the collective rearrangement method can produce a fairly dense particle packing in the end. However, it is very hard to fine-tune or debug the whole simulation process without an interactive visualization tool. Starting from the well-established rasterization-based method for rendering spheres, this dissertation first provides new fast and pixel-accurate methods to visualize the overlaps and free spaces between spherical particles inside a container. The rasterization-based techniques perform well for smaller packings of up to roughly one million spheres but deteriorate for larger packings because of their linear run time and large memory requirements, which are hard to estimate correctly in advance. To address this problem, new methods based on ray tracing are provided along with two new kinds of bounding volume hierarchies (BVHs) to accelerate the ray tracing process: the first can reuse the existing data structure of the simulation, and the second is more memory efficient. Both BVHs use the idea of the loose octree and are the first of their kind to consider the size of primitives for interactive ray tracing with frequently updated acceleration structures. Moreover, the visualization techniques provided in this dissertation can also be adapted to calculate properties such as the volumes of specific areas. All these visualization techniques are then extended to non-spherical particles, where a non-spherical particle is approximated by a rigid system of spheres so that the existing sphere-based simulation can be reused.
To this end, a new GPU-based method is presented to efficiently fill a non-spherical particle with polydisperse, possibly overlapping spheres, so that a particle can be filled with fewer spheres without sacrificing the space filling rate. This eases both simulation and visualization. Based on the approaches presented in this dissertation, more sophisticated algorithms can be developed to visualize large-scale non-spherical particle mixtures more efficiently. In addition, the hardware ray tracing of more recent graphics cards could be exploited in place of the software ray tracing used in this dissertation. The new techniques can also become the basis for interactively visualizing other particle-based simulations in which special areas such as free space or overlaps between particles are of interest.
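
The overlap-resolution step of collective rearrangement described in this abstract can be sketched in two dimensions as follows. This is a simplified CPU illustration, not the GPU simulation from the dissertation. The function name `resolve_overlaps` and the rule of splitting each overlap equally between the two particles are assumptions, and the container resizing step is omitted.

```python
import math

def resolve_overlaps(centers, radii):
    """One relaxation pass: push every overlapping pair of circles apart.

    Each circle in an overlapping pair is moved by half of the overlap
    along the line joining the two centers. Displacements from several
    overlapping neighbours accumulate.
    """
    moved = [list(c) for c in centers]
    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            dx = centers[j][0] - centers[i][0]
            dy = centers[j][1] - centers[i][1]
            dist = math.hypot(dx, dy)
            overlap = radii[i] + radii[j] - dist
            if overlap > 0 and dist > 0:
                ux, uy = dx / dist, dy / dist   # unit vector from i to j
                moved[i][0] -= 0.5 * overlap * ux
                moved[i][1] -= 0.5 * overlap * uy
                moved[j][0] += 0.5 * overlap * ux
                moved[j][1] += 0.5 * overlap * uy
    return [tuple(m) for m in moved]

# Two unit circles whose centers are one unit apart overlap by one unit.
new_centers = resolve_overlaps([(0.0, 0.0), (1.0, 0.0)], [1.0, 1.0])
```

For this pair a single pass moves the centers to (-0.5, 0.0) and (1.5, 0.0), so the circles just touch. In a dense packing one pass generally creates new overlaps elsewhere, which is why the method iterates and why visual inspection of the remaining overlaps is useful.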

    Leveraging Resources on Anonymous Mobile Edge Nodes

    Smart devices have become an essential component in the life of mankind. The quick rise of smartphones, IoT devices, and wearable devices enabled applications that were not possible a few years ago, e.g., health monitoring and online banking. Meanwhile, smart sensing laid the infrastructure for smart homes and smart cities. The intrusive nature of smart devices granted access to huge amounts of raw data. Researchers seized the moment with complex algorithms and data models to process the data over the cloud and extract as much information as possible. However, the pace and amount of data generation, together with the networking protocols that transmit data to cloud servers, fell short of touching more than 20% of what was generated at the edge of the network. On the other hand, smart devices carry a large set of resources, e.g., CPU, memory, and camera, that sit idle most of the time. Studies showed that for much of the time resources are either idle, e.g., while the user is sleeping or eating, or underutilized, e.g., inertial sensors during phone calls. These findings point to a problem: large data sets need processing while idle resources sit in close proximity. In this dissertation, we propose harvesting underutilized edge resources and using them to process the huge amounts of data that are generated, and currently wasted, by applications running at the edge of the network. We propose flipping the concept of cloud computing: instead of sending massive amounts of data for processing over the cloud, we distribute lightweight applications to process data on users' smart devices. We envision this approach enhancing the network's bandwidth, granting access to larger datasets, providing low-latency responses, and, more importantly, involving the user's up-to-date contextual information in processing. However, such benefits come with a set of challenges: How to locate suitable resources? How to match resources with data providers? How to inform resources what to do, and when?
How to orchestrate applications' execution on multiple devices? And how to communicate between devices on the edge? Communication between devices at the edge has different parameters in terms of device mobility, topology, and data rate. Standard protocols, e.g., Wi-Fi or Bluetooth, were not designed for edge computing and hence do not offer a perfect match. Edge computing requires a lightweight protocol that provides quick device discovery, a decent data rate, and multicasting to devices in the proximity. Bluetooth enjoys wide acceptance within the IoT community; however, its low data rate and unicast communication limit its use on the edge. Although it is the most suitable communication protocol for edge computing, Bluetooth, unlike other protocols, has closed source code that blocks its lower layers from all forms of research study, enhancement, and customization. Hence, we offer an open source version of Bluetooth and then customize it for edge computing applications. In this dissertation, we propose Leveraging Resources on Anonymous Mobile Edge Nodes (LAMEN), a three-tier framework in which edge devices are clustered by proximity. When there is an application to execute, LAMEN clusters discover and allocate resources, share the application's executable with those resources, and estimate incentives for each participating resource. In a cluster, a single head node, i.e., the mediator, is responsible for resource discovery and allocation. Mediators orchestrate cluster resources and present them as a virtually large homogeneous resource. For example, two devices, one offering a camera and the other a speaker, are presented outside the cluster as a single device with both a camera and a speaker; this can be extended to any combination of resources. The mediator then handles the distribution of applications within a cluster as needed. We also provide a communication protocol that is customizable to the edge environment and the application's needs.
Pushing lightweight applications that end devices can execute over their locally generated data has the following benefits: first, it avoids sharing user data with a cloud server, which is a privacy concern for many users; second, it introduces mediators as a local cloud controller closer to the edge; third, it hides the user's identity behind mediators; and finally, it enhances bandwidth utilization by keeping raw data at the edge and transmitting only processed information. Our evaluation shows optimized resource lookup and application assignment schemes, in addition to scalability in handling networks with a large number of devices. To overcome the communication challenges, we provide an open source communication protocol that we customize for edge computing applications; however, it can be used beyond the scope of LAMEN. Finally, we present three applications to show how LAMEN enables various application domains on the edge of the network. In summary, we propose a framework to orchestrate underutilized resources at the edge of the network towards processing data that are generated in their proximity. Using the approaches explained later in the dissertation, we show how LAMEN enhances the performance of applications and enables a new set of applications that were not previously feasible.

    Physiological system modelling

    Computer graphics has a major impact on our day-to-day life. It is used in diverse areas such as displaying the results of engineering and scientific computations and visualization, producing television commercials and feature films, simulation and analysis of real-world problems, computer-aided design, and graphical user interfaces that increase the communication bandwidth between humans and machines. Scientific visualization is a well-established method for the analysis of data originating from scientific computations, simulations or measurements. This report presents the development and implementation of the 3Dgen software, which was developed by the author using OpenGL and the C language. 3Dgen was used to visualize three-dimensional cylindrical models such as pipes and, to a limited extent, for virtual endoscopy. Using the developed software, a model was created from centreline data entered by the user or taken from the output of another program and stored in a plain text file. The model was constructed by drawing surface polygons between two adjacent centreline points. The software allows the user to view the internal and external surfaces of the model. The software was designed so that it runs on more than one operating system with minimal installation procedures. Since the software is very small, it can be stored on a 1.44 megabyte floppy diskette. Depending on the processing speed of the PC, the software can generate models of any length and size. Compared to other packages, 3Dgen has minimal input procedures and was able to generate models with smooth bends. It has both modelling and virtual exploration features. For models with sharp bends, the software generates an overshoot.
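
The construction described above, surface polygons drawn between adjacent centreline points, can be sketched as a simple tube mesh generator. This is not the 3Dgen code, which was written in C with OpenGL. The function name `tube_mesh`, the fixed circular cross-section, and the way each ring is oriented are assumptions made for illustration. The sketch places a ring of vertices around every centreline point and joins neighbouring rings with quadrilaterals.

```python
import math

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def tube_mesh(centreline, radius, sides=8):
    """Return (vertices, quads) for a tube around a 3D centreline.

    Assumes at least two distinct points and a gently bending centreline;
    ring orientation is chosen independently per point, so strongly
    curved paths may show twisting between rings.
    """
    verts, quads = [], []
    last = len(centreline) - 1
    for i, p in enumerate(centreline):
        # Tangent estimated from the neighbouring centreline points.
        a, b = centreline[max(i - 1, 0)], centreline[min(i + 1, last)]
        t = _unit(tuple(b[k] - a[k] for k in range(3)))
        helper = (1.0, 0.0, 0.0) if abs(t[0]) < 0.9 else (0.0, 1.0, 0.0)
        u = _unit(_cross(t, helper))   # first axis of the ring plane
        v = _cross(t, u)               # second axis, perpendicular to t and u
        for s in range(sides):
            ang = 2.0 * math.pi * s / sides
            verts.append(tuple(
                p[k] + radius * (math.cos(ang) * u[k] + math.sin(ang) * v[k])
                for k in range(3)))
    for i in range(last):
        for s in range(sides):
            n = (s + 1) % sides
            # One quad between ring i and ring i + 1.
            quads.append((i * sides + s, i * sides + n,
                          (i + 1) * sides + n, (i + 1) * sides + s))
    return verts, quads

verts, quads = tube_mesh([(0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 2.0)], 0.5)
```

For the straight three-point centreline in the usage line, this yields 24 vertices and 16 quads, with every vertex at distance 0.5 from the z axis. Rendering the quads from inside or outside the tube corresponds to the internal and external views mentioned in the abstract.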