
    GPU-Accelerated Web Application for Metocean Descriptive Statistics

    In today’s real-world applications, many researchers and developers use the Graphics Processing Unit (GPU) to accelerate non-graphical applications. Modern GPUs, which are massively parallel general-purpose processors, offer big advantages for big data analytics in terms of power efficiency, compute density, and scalability. In the oil and gas industry, metocean data is being generated, collected, and analyzed at an unprecedented scale. Metocean data, which comprises observed measurements of currents, waves, sea level, and meteorological conditions, is regularly collected by major oil and gas companies. This data is usually collected by specialist companies and distributed to paying parties, who deploy their scientists and engineers to analyze and forecast based on it. The analysis of metocean data provides crucial information needed for operational or design work that has health, safety and environment (HSE) and economic consequences. Therefore, this paper proposes a GPU-accelerated web application for metocean descriptive statistics to improve on the current CPU-based implementation. The application utilizes the GPU to compute descriptive statistics over metocean data. The GPU implementation of metocean descriptive statistics is expected to provide better raw performance, better cost-performance ratios, and better energy-performance ratios. The main objectives of this project are to develop a GPU-accelerated application for metocean descriptive statistics, together with a web-based application that links to it, and to demonstrate the capabilities of the GPU in performing non-graphical calculations.
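    The descriptive statistics referred to are standard summary measures. As a hedged illustration only (the abstract does not show the paper's actual kernels, data layout, or function names, so everything below is a hypothetical sketch), a minimal CPU reference in Python might look like this; a GPU version would replace the two sums with parallel reduction kernels:

    ```python
    # Hypothetical CPU reference for metocean descriptive statistics.
    # The paper's real implementation is GPU-based; this sketch only
    # shows the statistics being computed, not how they are offloaded.
    import math

    def descriptive_stats(samples):
        """Return basic summary statistics for a list of metocean readings."""
        n = len(samples)
        mean = sum(samples) / n
        # Population variance: on a GPU the sum and sum-of-squared-deviations
        # would each be computed with a parallel reduction.
        var = sum((x - mean) ** 2 for x in samples) / n
        return {
            "count": n,
            "mean": mean,
            "std": math.sqrt(var),
            "min": min(samples),
            "max": max(samples),
        }

    # Example input: significant wave heights (metres) from a hypothetical buoy.
    waves = [1.2, 1.5, 0.9, 2.1, 1.8, 1.4]
    stats = descriptive_stats(waves)
    ```

    Each statistic here is a reduction over the full dataset, which is exactly the access pattern that maps well onto massively parallel hardware.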

    Efficient similarity computations on parallel machines using data shaping

    Similarity computation is a fundamental operation on all forms of data. Big Data is typically characterized by attributes such as volume, velocity, variety, and veracity. In general, Big Data variety appears in structured, semi-structured, or unstructured forms, and the volume of Big Data in general, and of semi-structured data in particular, is increasing at a phenomenal rate. The Big Data phenomenon poses a new set of challenges for similarity computation problems occurring in semi-structured data. Technology and processor-architecture trends suggest very strongly that future processors will have tens of thousands of cores (hardware threads). Another crucial trend is that the ratio of on-chip and off-chip memory to core count is decreasing. State-of-the-art parallel computing platforms such as General-Purpose Graphics Processing Units (GPGPUs) and MICs are promising for high-performance as well as high-throughput computing. However, processing the semi-structured component of Big Data efficiently on parallel computing systems (e.g., GPUs) is challenging, because most emerging platforms (e.g., GPUs) are organized as highly structured Single Instruction Multiple Thread/Data machines, in which many cores (streaming processors) operate in lock-step, or because they require a high degree of task-level parallelism. We argue that effective and efficient solutions to key similarity computation problems must operate in synergy with the underlying computing hardware. Moreover, semi-structured input data needs to be shaped, or reorganized, to exploit the enormous computing power of state-of-the-art highly threaded architectures such as GPUs. For example, shaping input data (via encoding) with minimal data dependence can facilitate flexible and concurrent computation on high-throughput accelerators and co-processors such as GPUs and MICs.
    We consider various instances of traditional and emerging problems occurring at the intersection of semi-structured data and data analytics. Preprocessing is an operation common to the initial stages of data-processing pipelines; it typically involves operations such as data extraction and data selection. In the context of semi-structured data, twig filtering is used to identify (and extract) data of interest. Duplicate detection and record linkage are useful in preprocessing tasks such as data cleaning and data fusion, and also in data mining, in order to find similar tree objects. Likewise, tree edit distance is a fundamental metric for tree problems, and similarity computation between trees is another key problem in the context of Big Data. This dissertation makes a case for platform-centric data shaping as a potent mechanism to tackle the data- and architecture-borne issues in processing semi-structured data on GPUs and GPU-like parallel architectures. We propose several data shaping techniques for tree matching problems occurring in semi-structured data, and we experiment with real-world datasets. The experimental results reveal that the proposed platform-centric data shaping approach is effective for computing similarities between tree objects using GPGPUs. The proposed techniques yield performance gains of up to three orders of magnitude, depending on the problem and platform.
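    The idea of shaping semi-structured input can be illustrated with a small, hypothetical example (the dissertation's own encodings are not shown in this abstract and are surely more elaborate): a labelled tree is flattened into parallel arrays of labels and parent indices, a pointer-free layout with minimal data dependence that GPU threads can scan concurrently.

    ```python
    # Hypothetical illustration of data shaping: encode a labelled tree as
    # flat, index-based arrays so that per-node work involves no pointer
    # chasing and can be distributed across GPU threads.

    def flatten_tree(node, labels=None, parents=None, parent_idx=-1):
        """Pre-order flatten of a tree given as (label, [children])."""
        if labels is None:
            labels, parents = [], []
        label, children = node
        my_idx = len(labels)        # index this node will occupy
        labels.append(label)
        parents.append(parent_idx)  # -1 marks the root
        for child in children:
            flatten_tree(child, labels, parents, my_idx)
        return labels, parents

    # A small XML-like twig: a <book> with <title> and <author> children.
    tree = ("book", [("title", []), ("author", [("name", [])])])
    labels, parents = flatten_tree(tree)
    # labels  -> ['book', 'title', 'author', 'name']
    # parents -> [-1, 0, 0, 2]
    ```

    Once in this form, operations such as label matching or ancestor checks become array lookups with regular access patterns, which is what lock-step SIMT execution favours.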

    BigKernel -- High Performance CPU-GPU Communication Pipelining for Big Data-Style Applications

    No full text