Evaluation of Single-Chip, Real-Time Tomographic Data Processing on FPGA - SoC Devices
A novel approach to tomographic data processing has been developed and evaluated using the Jagiellonian PET (J-PET) scanner as an example. We propose a system that removes the need for a powerful processing facility, local to the scanner, capable of reconstructing images on the fly. Instead, we introduce a Field Programmable Gate Array (FPGA) System-on-Chip (SoC) platform connected directly to the data streams coming from the scanner, which performs event building, filtering, coincidence search, and Region-Of-Response (ROR) reconstruction in the programmable logic, and visualization on the integrated processors. The platform significantly reduces data volume by converting raw data to a list-mode representation while generating visualization on the fly.
Comment: IEEE Transactions on Medical Imaging, 17 May 201
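The coincidence search the abstract mentions can be illustrated in software. A minimal sketch (plain Python rather than FPGA logic, with hypothetical event tuples and a hypothetical coincidence window) might pair hits whose timestamps fall within the window:

```python
# Illustrative software coincidence search: pair detector hits whose
# timestamps fall within a coincidence window. The paper implements the
# equivalent logic in FPGA fabric; values here are hypothetical.

COINCIDENCE_WINDOW = 5.0  # ns, assumed value for illustration

def find_coincidences(hits, window=COINCIDENCE_WINDOW):
    """hits: list of (timestamp_ns, detector_id) tuples.
    Returns pairs of hits on different detectors within `window`."""
    hits = sorted(hits)                      # sort by timestamp
    pairs = []
    for i, (t1, d1) in enumerate(hits):
        for t2, d2 in hits[i + 1:]:
            if t2 - t1 > window:             # later hits are even farther
                break
            if d1 != d2:                     # require different detectors
                pairs.append(((t1, d1), (t2, d2)))
    return pairs

events = [(0.0, 'A'), (2.1, 'B'), (40.0, 'A'), (43.0, 'C'), (100.0, 'B')]
print(find_coincidences(events))   # → [((0.0, 'A'), (2.1, 'B')), ((40.0, 'A'), (43.0, 'C'))]
```

Sorting first lets the inner loop stop at the first hit outside the window, which keeps the search close to linear for sparse data.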
Parallel VLSI architecture emulation and the organization of APSA/MPP
The Applicative Programming System Architecture (APSA) combines an applicative language interpreter with a novel parallel computer architecture that is well suited for Very Large Scale Integration (VLSI) implementation. The Massively Parallel Processor (MPP) can simulate VLSI circuits by allocating one processing element in its square array to an area on a square VLSI chip. As long as there are not too many long data paths, the MPP can simulate a VLSI clock cycle very rapidly. The APSA circuit contains a binary tree with a few long paths and many short ones. A skewed H-tree layout allows every processing element to simulate a leaf cell and up to four tree nodes, with no loss in parallelism. Emulation of a key APSA algorithm on the MPP resulted in performance 16,000 times faster than a VAX. This speed will make it possible for the APSA language interpreter to run fast enough to support research in parallel list processing algorithms.
STS-41 Space Shuttle mission report
The STS-41 Space Shuttle Program Mission Report contains a summary of the vehicle subsystem activities on this thirty-sixth flight of the Space Shuttle and the eleventh flight of the Orbiter vehicle, Discovery (OV-103). In addition to the Discovery vehicle, the flight vehicle consisted of an External Tank (ET) (designated as ET-39/LWT-32), three Space Shuttle main engines (SSME's) (serial numbers 2011, 2031, and 2107), and two Solid Rocket Boosters (SRB's), designated as BI-040. The primary objective of the STS-41 mission was to successfully deploy the Ulysses/inertial upper stage (IUS)/payload assist module (PAM-S) spacecraft. The secondary objectives were to perform all operations necessary to support the requirements of the Shuttle Backscatter Ultraviolet (SSBUV) Spectrometer, Solid Surface Combustion Experiment (SSCE), Space Life Sciences Training Program Chromosome and Plant Cell Division in Space (CHROMEX), Voice Command System (VCS), Physiological Systems Experiment (PSE), Radiation Monitoring Experiment - 3 (RME-3), Investigations into Polymer Membrane Processing (IPMP), Air Force Maui Optical Calibration Test (AMOS), and Intelsat Solar Array Coupon (ISAC) payloads. The sequence of events for this mission is shown in tabular form. Summarized are the significant problems that occurred in the Orbiter subsystems during the mission. The official problem tracking list is presented. In addition, each Orbiter problem is cited in the subsystem discussion.
Methods of forming an expert assessment of the criteria of an information system for managing projects and programs
The article presents a method for determining and ranking the significance of the criteria of an information system for managing projects and programs (hereinafter, PMIS) based on the concept of subjective probability with the help of expert assessments. The method of expert assessments is implemented by processing the opinions of experienced specialists on the possible values of losses and (or) the probability of their occurrence. It is also used in non-formalizable problem situations, when the lack of a sufficient array of information, or its unreliability, does not allow the use of purely formal mathematical methods. When analyzing the PMIS choice, expert assessments can be used, firstly, to form a subjective assessment of one or another PMIS, with the subsequent use of this information in order to quantify it using statistical methods. Secondly, they can be used for a qualitative assessment of the PMIS choice in terms of determining the rank significance and priority of criteria in an ordered list of PMIS criteria.
The proposed methodology consists of the following main stages:
1) development of a list of assessed PMIS criteria and formation of a list of experts;
2) conducting a survey of experts in order to obtain a set of individual expert assessments according to the PMIS criteria;
3) calculation of the average assessment criteria of the PMIS;
4) checking the consistency of expert opinions on the rank significance of the assessed PMIS criteria based on the Kendall coefficient of concordance;
5) summing up the results of expert assessment of the PMIS criteria.
The practical aspects of the expert assessment are considered: calculation tables, the method of filling them, and the processing and analysis of the results. The method of expert assessment of the PMIS criteria was further developed, thanks to which a set of effective and functional criteria was determined, which will be taken into account when developing technical requirements for this system.
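Step 4 of the methodology above checks expert agreement with Kendall's coefficient of concordance. A minimal sketch of that calculation (the criteria and the expert rankings below are hypothetical, not from the article):

```python
# Kendall's coefficient of concordance W over m experts ranking n criteria.
# W = 12*S / (m^2 * (n^3 - n)), where S is the sum of squared deviations
# of the per-criterion rank sums from their mean. W near 1 means the
# experts agree; W near 0 means their rankings are essentially random.

def kendall_w(rankings):
    """rankings: one rank list per expert, ranks 1..n over n criteria."""
    m = len(rankings)            # number of experts
    n = len(rankings[0])         # number of criteria
    totals = [sum(r[j] for r in rankings) for j in range(n)]  # rank sums
    mean = sum(totals) / n
    s = sum((t - mean) ** 2 for t in totals)   # squared deviations
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Three hypothetical experts ranking four PMIS criteria (1 = most significant)
experts = [
    [1, 2, 3, 4],
    [1, 3, 2, 4],
    [2, 1, 3, 4],
]
print(round(kendall_w(experts), 3))   # → 0.778
```

A value this high would normally be read as sufficient consistency to proceed to step 5 and average the assessments; the formula above assumes no tied ranks.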
Method of up-front load balancing for local memory parallel processors
In a parallel processing computer system with multiple processing units and shared memory, a method is disclosed for uniformly balancing the aggregate computational load in, and utilizing minimal memory by, a network having identical computations to be executed at each connection therein. Read-only and read-write memory are subdivided into a plurality of process sets, which function like artificial processing units. Said plurality of process sets is iteratively merged and reduced to the number of processing units without exceeding the balance load. Said merger is based upon the value of a partition threshold, which is a measure of the memory utilization. The turnaround time and memory savings of the instant method are functions of the number of processing units available and the number of partitions into which the memory is subdivided. Typical results of the preferred embodiment yielded memory savings of from sixty to seventy-five percent.
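The merge phase described above can be illustrated with a simple sketch: many small "process sets" are combined until only as many sets remain as there are physical processors. The greedy smallest-two merge below is an illustration under assumed inputs, not the patented threshold-based procedure:

```python
# Illustrative merge phase: repeatedly combine the two lightest process
# sets until the set count equals the number of physical processors.
# This greedy rule is a stand-in for the patent's partition-threshold test.

import heapq

def merge_process_sets(loads, num_processors):
    """loads: computational load of each initial process set.
    Returns the per-processor aggregate loads after merging."""
    heap = list(loads)
    heapq.heapify(heap)
    while len(heap) > num_processors:
        a = heapq.heappop(heap)          # two lightest sets...
        b = heapq.heappop(heap)
        heapq.heappush(heap, a + b)      # ...are merged into one
    return sorted(heap)

print(merge_process_sets([5, 3, 8, 2, 7, 4, 6, 1], 3))   # → [9, 12, 15]
```

Merging the lightest sets first keeps the final aggregates close to the ideal of total load divided by processor count (here 36/3 = 12).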
Parallel data compression
Data compression schemes remove data redundancy in communicated and stored data and increase the effective capacities of communication and storage devices. Parallel algorithms and implementations for textual data compression are surveyed. Related concepts from parallel computation and information theory are briefly discussed. Static and dynamic methods for codeword construction and transmission on various models of parallel computation are described. Included are parallel methods which boost system speed by coding data concurrently, and approaches which employ multiple compression techniques to improve compression ratios. Theoretical and empirical comparisons are reported and areas for future research are suggested.
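One of the approaches the survey covers, coding data concurrently, can be sketched by splitting the input into blocks and compressing each block in parallel. The sketch below uses zlib and a thread pool purely for illustration; the surveyed methods target dedicated parallel hardware:

```python
# Block-parallel textual compression sketch: split the input into blocks,
# compress each block concurrently, and decompress by concatenation.
# zlib and ThreadPoolExecutor stand in for the surveyed parallel models.

import zlib
from concurrent.futures import ThreadPoolExecutor

def parallel_compress(data: bytes, block_size: int = 1 << 16):
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    with ThreadPoolExecutor() as pool:
        return list(pool.map(zlib.compress, blocks))   # one codestream per block

def parallel_decompress(blocks):
    return b''.join(zlib.decompress(b) for b in blocks)

text = b'abracadabra ' * 10_000
compressed = parallel_compress(text)
assert parallel_decompress(compressed) == text
print(len(text), sum(len(b) for b in compressed))
```

Block independence is what buys the speedup, at a small cost in compression ratio, since redundancy spanning block boundaries goes unexploited.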
Breadth First Search Vectorization on the Intel Xeon Phi
Breadth First Search (BFS) is a building block for graph algorithms and has
recently been used for large scale analysis of information in a variety of
applications including social networks, graph databases and web searching. Due
to its importance, a number of different parallel programming models and
architectures have been exploited to optimize the BFS. However, due to the
irregular memory access patterns and the unstructured nature of the large
graphs, its efficient parallelization is a challenge. The Xeon Phi is a
massively parallel architecture available as an off-the-shelf accelerator,
which includes a powerful 512 bit vector unit with optimized scatter and gather
functions. Given its potential benefits, work related to graph traversing on
this architecture is an active area of research.
We present a set of experiments in which we explore architectural features of the Xeon Phi and how best to exploit them in a top-down BFS algorithm; the techniques, however, can also be applied to the current state-of-the-art hybrid (top-down plus bottom-up) algorithms.
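For reference, the level-synchronous top-down BFS that the paper vectorizes can be sketched as follows; plain Python stands in for the OpenMP-plus-intrinsics implementation, and the graph shown is a made-up example:

```python
# Level-synchronous top-down BFS: each iteration expands the current
# frontier and assigns a level to every previously unvisited neighbour.
# The paper vectorizes the inner neighbour loop with 512-bit intrinsics.

def bfs_top_down(adj, source):
    """adj: adjacency list {vertex: [neighbours]}. Returns BFS levels."""
    level = {source: 0}
    frontier = [source]
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        for u in frontier:                  # expand every frontier vertex
            for v in adj[u]:
                if v not in level:          # first visit sets the level
                    level[v] = depth
                    next_frontier.append(v)
        frontier = next_frontier
    return level

graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
print(bfs_top_down(graph, 0))   # → {0: 0, 1: 1, 2: 1, 3: 2, 4: 3}
```

The irregular, data-dependent access pattern of the inner loop is exactly what makes scatter/gather support and careful data alignment matter on the Xeon Phi.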
We focus on the exploitation of the vector unit by developing an improved, highly vectorized OpenMP parallel algorithm using vector intrinsics, and by examining the use of data alignment and prefetching. In addition, we investigate the impact of hyperthreading and thread affinity on performance, a topic that appears under-researched in the literature. As a result, we achieve what we believe is the fastest published top-down BFS algorithm on the version of the Xeon Phi used in our experiments. The vectorized top-down BFS source code presented in this paper is available on request as free-to-use software.
Self-organizing lists on the Xnet
The first parallel designs for implementing self-organizing lists on the Xnet interconnection network are presented. Self-organizing lists permute the order of list entries after an entry is accessed, according to some update heuristic. The heuristic attempts to place frequently requested entries closer to the front of the list. This paper outlines Xnet systems for self-organizing lists under the move-to-front and transpose update heuristics. Our novel designs can be used to achieve high-speed lossless text compression.
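The two update heuristics named above are simple to state in sequential form; the sketch below illustrates them (the Xnet designs in the paper parallelize these permutations, which this plain-Python version does not attempt):

```python
# The two self-organizing-list update heuristics: move-to-front promotes
# an accessed entry to the head of the list; transpose swaps it with its
# immediate predecessor. Sequential illustrations only.

def access_move_to_front(lst, key):
    i = lst.index(key)
    lst.insert(0, lst.pop(i))        # promote accessed entry to the front

def access_transpose(lst, key):
    i = lst.index(key)
    if i > 0:
        lst[i - 1], lst[i] = lst[i], lst[i - 1]  # swap with predecessor

a = ['a', 'b', 'c', 'd']
access_move_to_front(a, 'c')
print(a)                             # → ['c', 'a', 'b', 'd']

b = ['a', 'b', 'c', 'd']
access_transpose(b, 'c')
print(b)                             # → ['a', 'c', 'b', 'd']
```

Move-to-front adapts quickly to shifting access patterns, while transpose is more conservative; move-to-front is also the basis of the MTF transform used in lossless text compression, which is the application the paper targets.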
GraphH: High Performance Big Graph Analytics in Small Clusters
It is common for real-world applications to analyze big graphs using
distributed graph processing systems. Popular in-memory systems require an
enormous amount of resources to handle big graphs. While several out-of-core
approaches have been proposed for processing big graphs on disk, the high disk
I/O overhead could significantly reduce performance. In this paper, we propose
GraphH to enable high-performance big graph analytics in small clusters.
Specifically, we design a two-stage graph partition scheme to evenly divide the
input graph into partitions, and propose a GAB (Gather-Apply-Broadcast)
computation model to make each worker process a partition in memory at a time.
We use an edge cache mechanism to reduce the disk I/O overhead, and design a
hybrid strategy to improve the communication performance. GraphH can
efficiently process big graphs in small clusters or even a single commodity
server. Extensive evaluations have shown that GraphH can be up to 7.8x faster than popular in-memory systems, such as Pregel+ and PowerGraph, when processing generic graphs, and more than 100x faster than recently proposed out-of-core systems, such as GraphD and Chaos, when processing big graphs.
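The GAB (Gather-Apply-Broadcast) model described above can be illustrated with one superstep of a PageRank-style update; the partitioning and values below are made-up, and the per-worker scheduling and edge cache are omitted:

```python
# Hedged sketch of one GAB (Gather-Apply-Broadcast) superstep: each
# worker gathers neighbour contributions from its edge partition, an
# apply step combines them into new vertex values, and the broadcast
# makes the new values visible to all workers. PageRank-style update
# shown for illustration; GraphH's disk/caching machinery is omitted.

def gab_superstep(partitions, values, out_degree, damping=0.85):
    n = len(values)
    gathered = {v: 0.0 for v in values}
    for edges in partitions:                 # gather: one partition per worker
        for src, dst in edges:
            gathered[dst] += values[src] / out_degree[src]
    # apply: combine gathered sums into new vertex values
    new_values = {v: (1 - damping) / n + damping * g
                  for v, g in gathered.items()}
    return new_values                        # broadcast: shared with all workers

edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
partitions = [edges[:2], edges[2:]]          # two evenly divided partitions
deg = {0: 2, 1: 1, 2: 1}
values = {0: 1 / 3, 1: 1 / 3, 2: 1 / 3}
print(gab_superstep(partitions, values, deg))
```

Because each worker only ever holds one partition's edges in memory at a time, the working set stays small, which is the property that lets GraphH run big graphs on small clusters.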
Processing Posting Lists Using OpenCL
One of the main requirements of internet search engines is the ability to retrieve relevant results with fast response times. Yioop is an open source search engine designed and developed in PHP by Dr. Chris Pollett. The goal of this project is to explore the possibilities of enhancing the performance of Yioop by substituting resource-intensive existing PHP functions with C-based native PHP extensions and the parallel data processing technology OpenCL. OpenCL leverages the Graphical Processing Unit (GPU) of a computer system for performance improvements.
Some of the critical functions in search engines are resource-intensive in terms of processing power, memory, and I/O usage. The processing times vary based on the complexity and magnitude of the data involved. This project involves different phases, such as identifying critical resource-intensive functions, initially replacing such methods with PHP extensions, and eventually experimenting with OpenCL code. We also ran performance tests to measure the reduction in processing times. From our results, we concluded that PHP extensions and OpenCL processing resulted in performance improvements.
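A core posting-list operation in any search engine is intersecting the sorted document-ID lists of two query terms; it is the kind of tight loop a project like this would offload to native extensions or the GPU. The Python below only shows the logic, with made-up document IDs:

```python
# Posting-list intersection sketch: find documents containing both query
# terms by merging two ascending lists of document IDs. In a search
# engine this loop is a natural candidate for native/GPU offloading.

def intersect_postings(p1, p2):
    """p1, p2: ascending lists of document IDs."""
    i = j = 0
    result = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])             # document matches both terms
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1                           # advance the smaller pointer
        else:
            j += 1
    return result

print(intersect_postings([1, 4, 7, 9, 12], [2, 4, 9, 13]))   # → [4, 9]
```

The two-pointer merge runs in time linear in the combined list lengths, and its data-parallel variants map well onto OpenCL work-items.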