
    Data Mining on Accident Data

    The task was to analyze accident data, in particular to predict an accident score using data-mining methods. To this end, the incremental data-mining process model was run through three times. Each pass yielded new insights, so the final one was the most informative. Tab. 14 gives an overview of the best results; recognition rates should be as high as possible and costs as low as possible. Costs are only comparable relative to one another within the same approach. For comparable figures, the numbers for approaches 1 and 3 refer to cross-validation, those for approach 2 to the complete data set. Approach 3 uses the newly formed ScoreB with 6 instead of 9 classes. The recognition rates rising and the costs falling from approach to approach are clearly visible. Despite the high final recognition rates, one must not overlook what they actually mean: the models can only distinguish, at various granularities, severe accidents from light ones. Because light accidents are so numerous, recognizing them well already yields a high overall recognition rate. More important, however, are the severe accidents involving danger to life and limb, which are considerably harder to predict. The fewer gradations the score had, the better the high scores were predicted. With only two classes, property damage versus personal injury, the recognition rate for severe accidents was at best 59.4%, with a false-alarm rate of 21.4%. This result was achieved with a combination of Random Forest, ID3, and DecisionStump. Random Forest proved to be the best individual classification algorithm.

    A Benchmark Suite for Evaluating Parallel Programming Models : Introduction and Preliminary Results

    The transition to multi-core processors forces software developers to explicitly exploit thread-level parallelism to increase performance. The associated programmability problem has led to the introduction of a plethora of parallel programming models that aim at simplifying software development by raising the abstraction level. Since industry has not settled on a single model, however, multiple significantly different approaches exist. This work presents a benchmark suite which can be used to classify and compare such parallel programming models and, therefore, aids in selecting the appropriate programming model for a given task. After a detailed explanation of the suite's design, preliminary results for two programming models, Pthreads and OmpSs/SMPSs, are presented and analyzed, leading to an outline of further extensions of the suite.
    EC/FP7/248647/EU/ENabling technologies for a programmable many-CORE/ENCOR

    Programming parallel embedded and consumer applications in OpenMP superscalar

    In this paper, we evaluate the performance and usability of the parallel programming model OpenMP Superscalar (OmpSs), apply it to 10 different benchmarks, and compare its performance with corresponding POSIX threads implementations.

    Using OpenMP superscalar for parallelization of embedded and consumer applications

    In the past years, research and industry have introduced several parallel programming models to simplify the development of parallel applications. A popular class among these models is that of task-based programming models, which proclaim ease of use, portability, and high performance. A novel model in this class, OpenMP Superscalar, combines advanced features such as automated runtime dependency resolution with simple pragma-based programming for C/C++. OpenMP Superscalar has proven to be effective in leveraging parallelism in HPC workloads. Embedded and consumer applications, however, are currently still mainly parallelized using traditional thread-based programming models. In this work, we investigate how effective OpenMP Superscalar is for embedded and consumer applications in terms of usability and performance. To determine the usability of OmpSs, we show in detail how to implement complex parallelization strategies such as ones used in parallel H.264 decoding. To evaluate the performance, we created a collection of ten embedded and consumer benchmarks parallelized in both OmpSs and Pthreads.
    EC/FP7/248647/EU/ENabling technologies for a programmable many-CORE/ENCOR

    On latency in GPU throughput microarchitectures

    Modern GPUs provide massive processing power (arithmetic throughput) as well as memory throughput. Presently, while it appears to be well understood how performance can be improved by increasing throughput, it is less clear what the effects of micro-architectural latencies are on the performance of throughput-oriented GPU architectures. In fact, little is publicly known about the values, behavior, and performance impact of microarchitecture latency components in modern GPUs. This work attempts to fill that gap by analyzing both the idle (static) as well as loaded (dynamic) latency behavior of GPU microarchitectural components. Our results show that GPUs are not as effective in latency hiding as commonly thought and, based on that, we argue that latency should also be a GPU design consideration besides throughput.

    Spatio-temporal SIMT and scalarization for improving GPU efficiency

    Temporal SIMT (TSIMT) has been suggested as an alternative to conventional (spatial) SIMT for improving GPU performance on branch-intensive code. Although TSIMT has been briefly mentioned before, it was not evaluated. We present a complete design and evaluation of TSIMT GPUs, along with the inclusion of scalarization and a combination of temporal and spatial SIMT, named Spatiotemporal SIMT (STSIMT). Simulations show that TSIMT alone results in a performance reduction, but a combination of scalarization and STSIMT yields a mean performance enhancement of 19.6% and improves the energy-delay product by 26.2% compared to SIMT.
    EC/FP7/288653/EU/Low-Power Parallel Computing on GPUs/LPGP

    How a single chip causes massive power bills : GPUSimPow: A GPGPU power simulator

    Modern GPUs are true powerhouses in every sense of the word: While they offer general-purpose (GPGPU) compute performance an order of magnitude higher than that of conventional CPUs, they have also been rapidly approaching the infamous "power wall", as a single chip sometimes consumes more than 300W. Thus, the design space of GPGPU microarchitecture has been extended by another dimension: power. While GPU researchers have previously relied on cycle-accurate simulators for estimating performance during design cycles, there are no simulation tools that include power as well. To mitigate this issue, we introduce the GPUSimPow power estimation framework for GPGPUs consisting of both analytical and empirical models for regular and irregular hardware components. To validate this framework, we build a custom measurement setup to obtain power numbers from real graphics cards. An evaluation on a set of well-known benchmarks reveals an average relative error of 11.7% between simulated and hardware power for GT240 and an average relative error of 10.8% for GTX580. The simulator has been made available to the public [1].
    EC/FP7/288653/EU/Low-Power Parallel Computing on GPUs/LPGP

    GPGPU workload characteristics and performance analysis

    GPUs are much more power-efficient devices compared to CPUs, but due to several performance bottlenecks, the performance per watt of GPUs is often much lower than what could be achieved theoretically. To sustain and continue high performance computing growth, new architectural and application techniques are required to create power-efficient computing systems. To find such techniques, however, it is necessary to study the power consumption at a detailed level and understand the bottlenecks which cause low performance. Therefore, in this paper, we study GPU power consumption at component level and investigate the bottlenecks that cause low performance and low energy efficiency. We divide the low performance kernels into low occupancy and full occupancy categories. For the low occupancy category, we study if increasing the occupancy helps in increasing performance and energy efficiency. For the full occupancy category, we investigate if these kernels are limited by memory bandwidth, coalescing efficiency, or SIMD utilization.
    EC/FP7/288653/EU/Low-Power Parallel Computing on GPUs/LPGP

    CD171- and GD2-specific CAR-T cells potently target retinoblastoma cells in preclinical in vitro testing

    BACKGROUND: Chimeric antigen receptor (CAR)-based T cell therapy is in early clinical trials to target the neuroectodermal tumor, neuroblastoma. No preclinical or clinical efficacy data are available for retinoblastoma to date. Whereas unilateral intraocular retinoblastoma is cured by enucleation of the eye, infiltration of the optic nerve indicates potential diffuse scattering and tumor spread leading to a major therapeutic challenge. CAR-T cell therapy could improve the currently limited therapeutic strategies for metastasized retinoblastoma by simultaneously killing both primary tumor and metastasizing malignant cells and by reducing chemotherapy-related late effects. METHODS: CD171 and GD2 expression was flow cytometrically analyzed in 11 retinoblastoma cell lines. CD171 expression and T cell infiltration (CD3+) was immunohistochemically assessed in retrospectively collected primary retinoblastomas. The efficacy of CAR-T cells targeting the CD171 and GD2 tumor-associated antigens was preclinically tested against three antigen-expressing retinoblastoma cell lines. CAR-T cell activation and exhaustion were assessed by cytokine release assays and flow cytometric detection of cell surface markers, and killing ability was assessed in cytotoxic assays. CAR constructs harboring different extracellular spacer lengths (short/long) and intracellular co-stimulatory domains (CD28/4-1BB) were compared to select the most potent constructs. RESULTS: All retinoblastoma cell lines investigated expressed CD171 and GD2. CD171 was expressed in 15/30 primary retinoblastomas. Retinoblastoma cell encounter strongly activated both CD171-specific and GD2-specific CAR-T cells. Targeting either CD171 or GD2 effectively killed all retinoblastoma cell lines examined. Similar activation and killing ability for either target was achieved by all CAR constructs irrespective of the length of the extracellular spacers and the co-stimulatory domain. 
Cell lines differentially lost tumor antigen expression upon CAR-T cell encounter, with CD171 being completely lost by all tested cell lines and GD2 further down-regulated in cell lines expressing low GD2 levels before CAR-T cell challenge. Alternating the CAR-T cell target in sequential challenges enhanced retinoblastoma cell killing. CONCLUSION: Both CD171 and GD2 are effective targets on human retinoblastoma cell lines, and CAR-T cell therapy is highly effective against retinoblastoma in vitro. Targeting of two different antigens by sequential CAR-T cell applications enhanced tumor cell killing and preempted tumor antigen loss in preclinical testing.