96 research outputs found
High throughput image compression and decompression on GPUs
This work investigates possibilities to create a high-throughput, GPU-friendly, intra-only, wavelet-based video compression algorithm optimized for visually lossless applications. Addressing the key observation that JPEG 2000's entropy coder is a bottleneck and might be overly complex for a high-bit-rate scenario, various algorithmic alterations are proposed. First, JPEG 2000's Selective Arithmetic Coding mode is realized on the GPU, but the gains in terms of increased throughput are shown to be limited. Instead, two independent alterations not compliant with the standard are proposed, which (1) give up the concept of intra-bit-plane truncation points and process each bit-plane in a single pass (single-pass mode), and (2) introduce a true raw-coding mode that is fully parallelizable and does not require any context modeling. Next, an alternative block coder from the literature, the Bitplane Coder with Parallel Coefficient Processing (BPC-PaCo), is evaluated. Since it trades signal adaptiveness for increased parallelism, it is shown here how a stationary probability model averaged from a set of test sequences yields competitive compression efficiency.
A combination of BPC-PaCo with the single-pass mode is proposed and shown to increase the speedup with respect to the original JPEG 2000 entropy coder from 2.15x (BPC-PaCo with two passes) to 2.6x (proposed BPC-PaCo with single-pass mode) at the marginal cost of increasing the PSNR penalty by 0.3 dB to at most 1 dB. Furthermore, a parallel algorithm is presented that determines the optimal code-block bit-stream truncation points (given an available bit-rate budget) and builds the entire code stream on the GPU, reducing the amount of data that has to be transferred back into host memory to a minimum. A theoretical runtime model is formulated that allows the runtime of a kernel on one GPU to be predicted from benchmarking results obtained on another GPU. Lastly, the first JPEG XS GPU decoder is presented and evaluated. JPEG XS was designed to be a low-complexity codec and was the first codec whose call for proposals explicitly demanded GPU-friendliness. At bit rates above 1 bpp, the decoder is around 2x faster than the original JPEG 2000 and 1.5x faster than JPEG 2000 with the fastest evaluated entropy coder (BPC-PaCo with single-pass mode). With a GeForce GTX 1080, a decoding throughput of around 200 fps is achieved for a UHD 4:4:4 sequence.
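To make the raw-coding idea concrete, the following is a minimal, illustrative CUDA sketch, not the thesis' actual implementation or bitstream layout: because a true raw mode emits a fixed number of bits per coefficient and consults no neighbourhood context, every sample's output position is known in advance and the coder becomes trivially sample-parallel. All names (`raw_code_kernel`, `coeff`, `sign`, `raw_planes`) are hypothetical.

```cuda
#include <cstdint>

// Illustrative only: each thread packs the sign bit and the `raw_planes` least
// significant magnitude bit-planes of one coefficient. No context modeling and
// no inter-thread communication is needed; a real encoder would additionally
// pack the per-sample words into a dense byte stream.
__global__ void raw_code_kernel(const uint16_t* coeff, const uint8_t* sign,
                                uint32_t* out, int num_samples, int raw_planes)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= num_samples) return;

    uint32_t bits = sign[i];                        // 1 sign bit first
    for (int p = raw_planes - 1; p >= 0; --p)       // then the raw bit-planes
        bits = (bits << 1) | ((coeff[i] >> p) & 1u);

    // Fixed bit budget per sample: the output location depends only on `i`,
    // so no prefix sum over other samples is required.
    out[i] = bits;
}
```

The contrast with JPEG 2000's context-adaptive coding passes is that there the number of emitted bits depends on previously coded neighbours, which is what limits sample-level parallelism in the first place.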
Sample-Parallel Execution of EBCOT in Fast Mode
JPEG 2000's most computationally expensive building block is the Embedded Block Coder with Optimized Truncation (EBCOT). This paper evaluates how encoders targeting a parallel architecture such as a GPU can increase their throughput in use cases where very high data rates are used. The compression efficiency in the less significant bit-planes is then often poor, and it is beneficial to enable the Selective Arithmetic Coding Bypass style (fast mode) in order to trade a small loss in compression efficiency for a reduction of the computational complexity. More importantly, this style exposes a more finely grained parallelism that can be exploited to execute the raw coding passes, including bit-stuffing, in a sample-parallel fashion. For a latency- or memory-critical application that encodes one frame at a time, EBCOT's tier-1 is sped up between 1.1x and 2.4x compared to an optimized GPU-based implementation. When a low GPU occupancy has already been addressed by encoding multiple frames in parallel, the throughput can still be improved by 5% for high-entropy images and 27% for low-entropy images. Best results are obtained when enabling the fast mode after the fourth significant bit-plane. For most of the test images the compression rate is within 1% of the original.
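As a rough illustration of what "enabling the fast mode after the fourth significant bit-plane" means for pass selection, the sketch below follows the usual selective-bypass convention, in which cleanup passes stay arithmetically (MQ) coded while the significance-propagation and magnitude-refinement passes of the lower bit-planes are raw-coded. The helper name and parameters are hypothetical.

```cuda
// Hypothetical helper: decide whether a coding pass is raw-coded (fast mode)
// or MQ-coded. `plane_from_msb` is 0 for the most significant non-empty
// bit-plane of the code block.
enum PassType { SIG_PROPAGATION, MAG_REFINEMENT, CLEANUP };

bool pass_uses_raw_coding(PassType pass, int plane_from_msb,
                          int mq_coded_planes = 4)  // "after the fourth plane"
{
    if (pass == CLEANUP)
        return false;                         // cleanup passes remain MQ-coded
    return plane_from_msb >= mq_coded_planes; // lower planes: bypass/raw
}
```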
Evaluation of GPU/CPU Co-Processing Models for JPEG 2000 Packetization
With the bottom-line goal of increasing the throughput of a GPU-accelerated JPEG 2000 encoder, this paper evaluates whether the post-compression rate control and packetization routines should be carried out on the CPU or on the GPU. Three co-processing models that differ in how the workload is split between the CPU and the GPU are introduced. Both routines are discussed and algorithms for executing them in parallel are presented. Experimental results for compressing a detail-rich UHD sequence to 4 bits/sample indicate speed-ups of 200x for the rate control and 100x for the packetization compared to the single-threaded implementation in the commercial Kakadu library. These two routines executed on the CPU take 4x as long as all remaining coding steps on the GPU and therefore present a bottleneck. Even if the CPU bottleneck could be avoided with multi-threading, it is still beneficial to execute all coding steps on the GPU, as this minimizes the required device-to-host transfer and thereby speeds up the critical path from 17.2 fps to 19.5 fps for 4 bits/sample and to 22.4 fps for 0.16 bits/sample.
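For context, the sketch below shows the sequential form of the post-compression rate-control step that the paper parallelizes: every code block exposes candidate truncation points with (convexified) decreasing distortion-rate slopes, each block independently keeps all passes whose slope meets a global threshold, and the threshold is bisected until the byte budget is met. Structure and names are illustrative, not taken from the paper.

```cuda
#include <cstddef>
#include <vector>

struct CodeBlock {
    std::vector<double> slope;  // R-D slope of each candidate truncation point
    std::vector<size_t> bytes;  // cumulative bytes up to that truncation point
};

// Total bytes contributed by all code blocks for a given slope threshold.
size_t bytes_for_threshold(const std::vector<CodeBlock>& blocks, double lambda)
{
    size_t total = 0;
    for (const CodeBlock& cb : blocks) {
        size_t chosen = 0;                       // 0 = block contributes nothing
        for (size_t t = 0; t < cb.slope.size(); ++t)
            if (cb.slope[t] >= lambda) chosen = cb.bytes[t];
        total += chosen;
    }
    return total;
}

// Bisect the slope threshold until the rate budget is met as closely as possible.
double find_threshold(const std::vector<CodeBlock>& blocks, size_t budget)
{
    double lo = 0.0, hi = 1e12;
    for (int iter = 0; iter < 50; ++iter) {
        double mid = 0.5 * (lo + hi);
        if (bytes_for_threshold(blocks, mid) > budget) lo = mid; else hi = mid;
    }
    return hi;  // smallest feasible threshold found, i.e. highest rate within budget
}
```

Keeping both this step and the packetization on the GPU means only the final, truncated code stream has to be copied back to host memory, which is the device-to-host saving behind the critical-path speedup quoted above.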
Comparison of Code-Pass-Skipping Strategies for Accelerating a JPEG 2000 Decoder
Code-Pass-Skipping allows a JPEG 2000 decoder to be accelerated by sacrificing output precision. This paper presents an evaluation of how the speed gain can be maximized and the quality loss minimized. In particular, the scenario of rendering a 24-bit preview of a Digital Cinema Package (DCP) with the maximum permitted bitrate is examined. A comparison shows that a newly proposed strategy outperforms the reference implementation from Kakadu Software v6 by up to 1 dB. Furthermore, it is shown what speed gain can be achieved for a given acceptable quality loss.
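As a purely illustrative example of the underlying trade-off, and not the specific strategy proposed in the paper: since each bit-plane of a code block contributes up to three coding passes, capping the number of decoded passes per block effectively drops the least significant bit-planes and exchanges reconstruction precision for decoding speed. The helper below is hypothetical.

```cuda
// Hypothetical helper: skip the coding passes of the `planes_to_skip` least
// significant bit-planes of a code block (up to three passes per plane).
int passes_to_decode(int passes_in_block, int planes_to_skip)
{
    int kept = passes_in_block - 3 * planes_to_skip;
    return kept > 0 ? kept : 0;  // a block may end up contributing nothing
}
```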
Current Oncological Treatment of Patients with Pancreatic Cancer in Germany: Results from a National Survey on behalf of the Arbeitsgemeinschaft Internistische Onkologie and the Chirurgische Arbeitsgemeinschaft Onkologie of the German Cancer Society
Background: No data have previously been available regarding the current treatment of patients with pancreatic cancer (PC) in German hospitals and medical practices. Methods: Between February 2007 and March 2008 we conducted a national survey [on behalf of the Arbeitsgemeinschaft Internistische Onkologie (AIO) and the Chirurgische Arbeitsgemeinschaft Onkologie (CAO)] regarding the current surgical and oncological treatment of PC in Germany. Standardized questionnaires were sent via mailing lists to members of the AIO and CAO (n = 1,130). The data were analyzed using SPSS software (version 16.0). Pre-defined subgroup analysis was performed by grouping the results of each question with regard to the professional site of the responding physician and to the number of patients treated in their institution per year. Results: 181 (16%) of the oncological questionnaires were sent back. For 61% of the participating centers, a histological confirmation of PC diagnosis is obligatory. 21% of physicians offer neoadjuvant therapy to patients with potentially resectable PC. In the adjuvant treatment after curative-intent surgery, gemcitabine (Gem) is regarded as standard of care by 71% after R0 resection and 62% after R1 resection. For patients with locally advanced PC, 52% of the participating centers recommend systemic chemotherapy, while 17% prefer combined primary chemoradiotherapy. Most centers (59%) base their choice of combination regimens for metastatic disease on the performance status of their patients. In patients with a good status, 28% apply single-agent Gem, 3% use Gem + capecitabine, 12% Gem + erlotinib, 16% Gem + oxaliplatin, and 8% Gem + cisplatin. Only 28% of the surveyed physicians offer second-line treatment to the majority of their patients with advanced PC. Conclusion: Not every PC patient in Germany is treated according to the present S3 guidelines. Diagnosis and treatment of PC in Germany still need to be improved. Copyright (C) 2009 S. Karger AG, Basel.
Oral capecitabine in gemcitabine-pretreated patients with advanced pancreatic cancer
Objective: To date, no standard regimen for salvage chemotherapy after gemcitabine (Gem) failure has been defined for patients with advanced pancreatic cancer (PC). Oral capecitabine (Cap) has shown promising activity in first-line chemotherapy trials in PC patients. Methods: Within a prospective single-center study, Cap was offered to patients who had already received at least 1 previous treatment regimen containing full-dose Gem (as a single agent, as part of a combination chemotherapy regimen or sequentially within a chemoradiotherapy protocol). Cap was administered orally at a dose of 1,250 mg/m² twice daily for 14 days followed by 7 days of rest. Study endpoints were objective tumor response rate by imaging criteria (according to RECIST), carbohydrate antigen 19-9 (CA19-9) tumor marker response, time to progression, overall survival and toxicity. Results: A median of 3 treatment cycles (range 1-36) was given to 39 patients. After a median follow-up of 6.6 months, 27 patients were evaluable for response: no complete or partial responses were observed, but 15 patients (39%) had stable disease. A CA19-9 reduction of >20% after 2 cycles of Cap was documented in 6 patients (15%). Median time to progression was 2.3 months (range 0.5-45.1) and median overall survival (since start of Cap treatment) was 7.6 months (range 0.7-45.1). Predominant grade 2 and 3 toxicities (per patient analysis) were hand-foot syndrome 28% (13% grade 3); anemia 23%; leg edema 15%; diarrhea 13%; nausea/vomiting 10%, and leukocytopenia 10%. Conclusion: Single-agent Cap is a safe treatment option for Gem-pretreated patients with advanced PC. Further evaluation of Cap in controlled clinical trials of Gem-pretreated patients with advanced PC is recommended. Copyright (C) 2008 S. Karger AG, Basel.
Explaining and Evaluating Deep Tissue Classification by Visualizing Activations of Most Relevant Intermediate Layers
Deep learning-based tissue classification may support pathologists in analyzing digitized whole slide images. However, in such critical tasks, only approaches that can be validated by medical experts prior to deployment are suitable. We present an approach that contributes to making automated tissue classification more transparent. We go beyond the widely used visualizations of a convolutional neural network's last layers by identifying the most relevant intermediate layers using Grad-CAM. A visual evaluation by a pathologist shows that these layers assign relevance where important morphological structures are present in the case of correct class decisions. We introduce a tool that medical experts can easily use for such validation purposes, for any convolutional neural network and any layer. Visual explanations for intermediate layers provide insights into a neural network's decision for histopathological tissue classification. Future research must also consider the context of the input data.
Banks' risk assessment of Swedish SMEs
Building on the literatures on asymmetric information and risk taking, this paper applies conjoint experiments to investigate lending officers' probabilities of supporting credit to established or existing SMEs. Using a sample of 114 Swedish lending officers, we test hypotheses concerning how information on the borrower's ability to repay the loan, alignment of risk preferences, and risk sharing affect their willingness to grant credit. Results suggest that features that reduce the risk to the bank and shift the risk to the borrower have the largest impact. The paper highlights the interaction between factors that influence the credit decision. Implications for SMEs, banks and research are discussed.