Type-Directed Program Synthesis and Constraint Generation for Library Portability
Fast numerical libraries have been a cornerstone of scientific computing for
decades, but this comes at a price. Programs may be tied to vendor-specific
software ecosystems resulting in polluted, non-portable code. As we enter an
era of heterogeneous computing, there is an explosion in the number of
accelerator libraries required to harness specialized hardware. We need a
system that allows developers to exploit ever-changing accelerator libraries,
without over-specializing their code.
As we cannot know the behavior of future libraries ahead of time, this paper
develops a scheme that assists developers in matching their code to new
libraries, without requiring the source code for these libraries.
Furthermore, it can recover equivalent code from programs that use existing
libraries and automatically port them to new interfaces. It first uses program
synthesis to determine the meaning of a library, then maps the synthesized
description into generalized constraints which are used to search the program
for replacement opportunities to present to the developer.
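The first step, working out what an opaque library routine computes from its observable behaviour, can be illustrated with a toy sketch. It assumes a hypothetical black-box function `vendor_routine` and a small hand-written candidate set named `CANDIDATES`; the synthesizer described in the paper is type-directed and far more general, so this only shows the input/output testing idea.

```python
import random

# Hypothetical candidate programs (assumption: a real synthesizer would
# generate candidates from the routine's type signature, not a fixed table).
CANDIDATES = {
    "sum": lambda xs, ys: sum(xs),
    "dot": lambda xs, ys: sum(x * y for x, y in zip(xs, ys)),
    "max": lambda xs, ys: max(xs),
}


def vendor_routine(xs, ys):
    """Stand-in for a closed-source library call (hypothetical)."""
    acc = 0
    for i in range(len(xs)):
        acc += xs[i] * ys[i]
    return acc


def infer_meaning(black_box, trials=50, seed=0):
    """Return names of candidates that agree with black_box on random inputs."""
    rng = random.Random(seed)
    surviving = set(CANDIDATES)
    for _ in range(trials):
        n = rng.randint(1, 8)
        xs = [rng.randint(-9, 9) for _ in range(n)]
        ys = [rng.randint(-9, 9) for _ in range(n)]
        expected = black_box(xs, ys)
        # Discard every candidate that disagrees on this input.
        surviving = {name for name in surviving
                     if CANDIDATES[name](xs, ys) == expected}
    return sorted(surviving)


matches = infer_meaning(vendor_routine)
```

Integer inputs are used so that equality checks are exact; a floating-point routine would need a tolerance instead.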
We applied this approach to existing large applications from the scientific
computing and deep learning domains. Using our approach, we show speedups
ranging from 1.1× to over 10× on end-to-end performance when
using accelerator libraries.
Comment: Accepted to PACT 201
From constraint programming to heterogeneous parallelism
The scaling limitations of multi-core processor development have led to a diversification of the processor cores used within individual computers. Heterogeneous computing has become widespread, involving the cooperation of several structurally different processor cores. Central processor (CPU) cores are most frequently complemented with graphics processors (GPUs), which despite their name are suitable for many highly parallel computations besides computer graphics. Furthermore, deep learning accelerators are rapidly gaining relevance.
Many applications could profit from heterogeneous computing but are held back by the surrounding software ecosystems. Heterogeneous systems are a challenge for compilers in particular, which usually target only the increasingly marginalised homogeneous CPU cores. Therefore, heterogeneous acceleration is primarily accessible via libraries and domain-specific languages (DSLs), requiring application rewrites and resulting in vendor lock-in.
This thesis presents a compiler method for automatically targeting heterogeneous hardware from existing sequential C/C++ source code. A new constraint programming method enables the declarative specification and automatic detection of computational idioms within compiler intermediate representation code. Examples of computational idioms are stencils, reductions, and linear algebra. Computational idioms denote algorithmic structures that commonly occur in performance-critical loops. Consequently, well-designed accelerator DSLs and libraries support computational idioms with their programming models and function interfaces. The detection of computational idioms in their middle end enables compilers to incorporate DSL and library backends for code generation. These backends leverage domain knowledge for the efficient utilisation of heterogeneous hardware.
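As a rough illustration of what the computational idioms named above look like as sequential loops, here is a minimal sketch. It is written in Python for brevity although the thesis targets C/C++, and the function names are hypothetical, chosen only for this example.

```python
def reduction(xs):
    """Reduction idiom: fold a sequence into a single accumulator."""
    acc = 0.0
    for x in xs:
        acc += x
    return acc


def histogram(keys, bins):
    """Histogram idiom: a reduction with indirect accumulator updates."""
    counts = [0] * bins
    for k in keys:
        counts[k] += 1
    return counts


def stencil_1d(xs):
    """Stencil idiom: each interior output reads a fixed neighbourhood."""
    out = list(xs)
    for i in range(1, len(xs) - 1):
        out[i] = (xs[i - 1] + xs[i] + xs[i + 1]) / 3.0
    return out
```

Each loop has a regular structure (a carried accumulator, an indexed update, a fixed access pattern), which is what makes such code recognisable to a compiler and replaceable by a library or DSL call.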
The constraint programming methodology is first derived on an abstract model and then implemented as an extension to LLVM. Two constraint programming languages are designed to target this implementation: the Compiler Analysis Description Language (CAnDL), and the extended Idiom Detection Language (IDL). These languages are evaluated on a range of different compiler problems, culminating in a complete heterogeneous acceleration pipeline integrated with the Clang C/C++ compiler. This pipeline was evaluated on the established benchmark collections NPB and Parboil. The approach was applicable to 10 of the benchmark programs, resulting in significant speedups from 1.26× on “histo” to 275× on “sgemm” when starting from sequential baseline versions.
In summary, this thesis shows that the automatic recognition of computational idioms during compilation enables the heterogeneous acceleration of sequential C/C++ programs. Moreover, the declarative specification of computational idioms is derived in novel declarative programming languages, and it is demonstrated that constraint programming on Static Single Assignment intermediate code is a suitable method for their automatic detection.
Automatic matching of legacy code to heterogeneous APIs: An idiomatic approach
Heterogeneous accelerators often disappoint. They provide
the prospect of great performance, but only deliver it when
using vendor-specific optimized libraries or domain-specific
languages. This requires considerable legacy code modifications,
hindering the adoption of heterogeneous computing.
This paper develops a novel approach to automatically
detect opportunities for accelerator exploitation. We focus
on calculations that are well supported by established APIs:
sparse and dense linear algebra, stencil codes and generalized
reductions and histograms. We call them idioms and use a
custom constraint-based Idiom Description Language (IDL)
to discover them within user code. Detected idioms are then
mapped to BLAS libraries, cuSPARSE and clSPARSE and two
DSLs: Halide and Lift.
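The idea of describing an idiom as constraints and searching intermediate code for solutions can be sketched on a toy example. The instruction table `IR` below is a hypothetical, heavily simplified SSA-style representation, not LLVM IR, and the Python function is not IDL syntax; it only mimics the kind of conditions such a description might state for a simple sum reduction.

```python
from itertools import product

# Toy SSA-style instructions: name -> (opcode, operand names).
# Hypothetical IR, loosely modelled on a loop computing acc += a[i].
IR = {
    "i":        ("phi", ["zero", "i.next"]),
    "acc":      ("phi", ["zero", "acc.next"]),
    "elem":     ("load", ["a", "i"]),
    "acc.next": ("add", ["acc", "elem"]),
    "i.next":   ("add", ["i", "one"]),
}


def find_reductions(ir):
    """Find (P, A, L) such that P is a phi, A is an add, A feeds P,
    P feeds A, and A's other operand L is produced by a load."""
    hits = []
    for p, a in product(ir, repeat=2):
        p_op, p_args = ir[p]
        a_op, a_args = ir[a]
        if p_op != "phi" or a_op != "add":
            continue
        if a not in p_args or p not in a_args:
            continue
        others = [x for x in a_args if x != p]
        # Names missing from the table (constants, arguments) are not loads.
        loads = [x for x in others if ir.get(x, ("", []))[0] == "load"]
        hits.extend((p, a, ld) for ld in loads)
    return hits


reductions = find_reductions(IR)
```

The loop counter `i` also forms a phi/add cycle, but it is rejected because its increment is a constant rather than a loaded value; adding or removing such conditions is how a constraint description is made stricter or more general.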
We implemented the approach in LLVM and evaluated
it on the NAS and Parboil sequential C/C++ benchmarks,
where we detect 60 idiom instances. In those cases where
idioms are a significant part of the sequential execution time,
we generate code that achieves 1.26× to over 20× speedup
on integrated and external GPUs.
Tick-borne encephalitis virus (TBEV) prevalence in field-collected ticks (Ixodes ricinus) and phylogenetic, structural and virulence analysis in a TBE high-risk endemic area in southwestern Germany
Background: Tick-borne encephalitis (TBE) is the most common viral CNS infection, with incidences much higher than all other virus infections combined in many risk areas of central and eastern Europe. The Odenwald Hill region (OWH) in southwestern Germany is classified as a TBE risk region, and frequent cases as well as more severe infections have been reported within the past decade. The objective of the present study was to survey the prevalence of tick-borne encephalitis virus (TBEV) in Ixodes ricinus and to associate TBEV genetic findings with TBE infections in the OWH.
Methods: Ticks were collected by the flagging method, supported by a crowdsourcing project that enlisted the interested public as collectors, in order to cover completely and sample randomly a 3532 km² area of the OWH TBE risk region. Prevalence of TBEV in I. ricinus was analysed by reverse transcription quantitative real-time PCR. Phylogeographic analysis was performed to classify OWH TBEV isolates within a European network of known TBEV strains. Mutational sequence analysis, including 3D modelling of envelope protein pE, was performed and, based on a clinical database, a spatial association of TBE case frequency and severity was undertaken.
Results: Using the crowdsourcing approach, we could analyse a total of 17,893 ticks. The prevalence of TBEV in I. ricinus in the OWH varied, depending on the district analysed, from 0% to 0.12% (mean 0.04%). The calculated minimum infection rate (MIR) was one order of magnitude higher. All TBEV isolates belonged to the European subtype. Sequence analysis revealed a discontinuous segregation pattern of OWH isolates with two putative different lineages and a spatial association of two isolates with increased TBE case numbers as well as exceptionally severe to fatal infection courses.
Conclusions: TBEV prevalence within the OWH risk regions is comparatively low, which is probably due to our methodological approach and may more likely reflect prevalence of natural TBEV foci. As for other European regions, TBEV genetics show a discontinuous phylogeny indicating, among others, an association with bird migration. Mutations within the pE gene are associated with more frequent, severe and fatal TBE infections in the OWH risk region.
Automatically Harnessing Sparse Acceleration
Sparse linear algebra is central to many scientific programs, yet compilers
fail to optimize it well. High-performance libraries are available, but
adoption costs are significant. Moreover, libraries tie programs into
vendor-specific software and hardware ecosystems, creating non-portable code.
In this paper, we develop a new approach based on our specification Language
for implementers of Linear Algebra Computations (LiLAC). Rather than requiring
the application developer to (re)write every program for a given library, the
burden is shifted to a one-off description by the library implementer. The
LiLAC-enabled compiler uses this to insert appropriate library routines without
source code changes.
LiLAC provides automatic data marshaling, maintaining state between calls and
minimizing data transfers. Appropriate places for library insertion are
detected in compiler intermediate representation, independent of source
languages.
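For a concrete picture of the kind of computation involved, the following is a minimal sketch of a sparse matrix-vector product over the compressed sparse row (CSR) format, written as the plain loop nest an application might contain. It is not LiLAC syntax, and the name `spmv_csr` is hypothetical; the abstract does not say which sparse formats are covered, so CSR is an assumption made for the example.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(row_ptr) - 1):
        # Non-zeros of this row live in values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += values[k] * x[col_idx[k]]
    return y


# A = [[2, 0, 0],
#      [0, 0, 3],
#      [1, 4, 0]]
row_ptr = [0, 1, 2, 4]
col_idx = [0, 2, 0, 1]
values = [2.0, 3.0, 1.0, 4.0]
y = spmv_csr(row_ptr, col_idx, values, [1.0, 2.0, 3.0])
```

The indirect access `x[col_idx[k]]` inside a loop bounded by `row_ptr` is the characteristic shape of such kernels, and it is this structure, rather than any source-language syntax, that a detection pass working on intermediate representation would look for.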
We evaluated on large-scale scientific applications written in FORTRAN;
standard C/C++ and FORTRAN benchmarks; and C++ graph analytics kernels. Across
heterogeneous platforms, applications and data sets we show speedups of
1.1× to over 10× without user intervention.
Comment: Accepted to CC 202
TelomereHunter – in silico estimation of telomere content and composition from cancer genomes
Background: Establishment of telomere maintenance mechanisms is a universal step in tumor development to achieve replicative immortality. These processes leave molecular footprints in cancer genomes in the form of altered telomere content and aberrations in telomere composition. To retrieve these telomere characteristics from high-throughput sequencing data the available computational approaches need to be extended and optimized to fully exploit the information provided by large scale cancer genome data sets.
Results: We here present TelomereHunter, a software tool for the detailed characterization of telomere maintenance mechanism footprints in the genome. The tool is implemented for the analysis of large cancer genome cohorts and provides a variety of diagnostic diagrams as well as machine-readable output for subsequent analysis. A novel key feature is the extraction of singleton telomere variant repeats, which improves the identification and subclassification of the alternative lengthening of telomeres phenotype. We find that whole genome sequencing-derived telomere content estimates strongly correlate with telomere qPCR measurements (r = 0.94). For the first time, we determine the correlation of in silico telomere content quantification from whole genome sequencing and whole genome bisulfite sequencing data derived from the same tumor sample (r = 0.78). An analogous comparison of whole exome sequencing data and whole genome sequencing data measured a slightly lower correlation (r = 0.79). However, this is considerably improved by normalization with matched controls (r = 0.91).
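To make the notion of a singleton telomere variant repeat more tangible, here is a minimal sketch that counts variant hexamers isolated between runs of the canonical human telomeric repeat TTAGGG. The flank length `FLANK`, the `NNNGGG` shape of the variant, and the function name `singleton_tvrs` are assumptions made for illustration; the abstract does not give the tool's actual definition, and this is not its implementation.

```python
import re
from collections import Counter

CANONICAL = "TTAGGG"  # canonical human telomeric repeat
FLANK = 3             # assumed number of canonical repeats required per side

# A hexamer ending in GGG, flanked on both sides by canonical repeats.
# The right flank is a lookahead so that it is not consumed and can also
# serve as the left flank of the next match.
SINGLETON = re.compile(
    "(?:%s){%d}([ACGT]{3}GGG)(?=(?:%s){%d})"
    % (CANONICAL, FLANK, CANONICAL, FLANK)
)


def singleton_tvrs(read):
    """Count non-canonical hexamers isolated between canonical repeats."""
    hits = (m.group(1) for m in SINGLETON.finditer(read))
    return Counter(h for h in hits if h != CANONICAL)


read = CANONICAL * 3 + "TTCGGG" + CANONICAL * 3 + "TTTGGG" + CANONICAL * 3
counts = singleton_tvrs(read)
```

Applied per sequencing read and summed over a sample, counts of this kind could then be compared between a tumor and its matched control.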
Conclusions: TelomereHunter provides new functionality for the analysis of the footprints of telomere maintenance mechanisms in cancer genomes. Besides whole genome sequencing, whole exome sequencing and whole genome bisulfite sequencing are suited for in silico telomere content quantification, especially if matched control samples are available. The software runs under a GPL license and is available at https://www.dkfz.de/en/applied-bioinformatics/telomerehunter/telomerehunter.html
Genomic footprints of activated telomere maintenance mechanisms in cancer
Cancers require telomere maintenance mechanisms for unlimited replicative potential. They achieve this through TERT activation or alternative telomere lengthening associated with ATRX or DAXX loss. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we dissect whole-genome sequencing data of over 2500 matched tumor-control samples from 36 different tumor types to characterize the genomic footprints of these mechanisms. While the telomere content of tumors with ATRX or DAXX mutations (ATRX/DAXXtrunc) is increased, tumors with TERT modifications show a moderate decrease of telomere content. One quarter of all tumor samples contain somatic integrations of telomeric sequences into non-telomeric DNA. This fraction is increased to 80% prevalence in ATRX/DAXXtrunc tumors, which carry an aberrant telomere variant repeat (TVR) distribution as another genomic marker. The latter feature includes enrichment or depletion of the previously undescribed singleton TVRs TTCGGG and TTTGGG, respectively. Our systematic analysis provides new insight into the recurrent genomic alterations associated with telomere maintenance mechanisms in cancer.