2,368 research outputs found

    Developing performance-portable molecular dynamics kernels in OpenCL

    This paper investigates the development of a molecular dynamics code that is highly portable between architectures. Using OpenCL, we develop an implementation of Sandia’s miniMD benchmark that achieves good levels of performance across a wide range of hardware: CPUs, discrete GPUs and integrated GPUs. We demonstrate that the performance bottlenecks of miniMD’s short-range force calculation kernel are the same across these architectures, and detail a number of platform-agnostic optimisations that improve its performance by at least 2x on all hardware considered. Our complete code is shown to be 1.7x faster than the original miniMD, and at most 2x slower than implementations individually hand-tuned for a specific architecture.

    Coagulant recovery from water treatment residuals: a review of applicable technologies

    Conventional water treatment consumes large quantities of coagulant and produces even greater volumes of sludge. Coagulant recovery (CR) presents an opportunity to reduce both the sludge quantities and the costs they incur, by regenerating and purifying coagulant before reuse. Recovery and purification must satisfy stringent potable regulations for harmful contaminants, while remaining competitive with commercial coagulants. These challenges have restricted uptake and led research towards lower-gain, lower-risk alternatives. This review documents the context in which CR must be considered, before comparing the relative efficacies and bottlenecks of potential technologies, expediting identification of the major knowledge gaps and future research requirements.

    Parallelising wavefront applications on general-purpose GPU devices

    Pipelined wavefront applications form a large portion of the high performance scientific computing workloads at supercomputing centres. This paper investigates the viability of graphics processing units (GPUs) for the acceleration of these codes, using NVIDIA's Compute Unified Device Architecture (CUDA). We identify the optimisations suitable for this new architecture and quantify the characteristics of those wavefront codes that are likely to experience speedups.

    Experiences with porting and modelling wavefront algorithms on many-core architectures

    We are currently investigating the viability of many-core architectures for the acceleration of wavefront applications and this report focuses on graphics processing units (GPUs) in particular. To this end, we have implemented NASA’s LU benchmark – a real-world, production-grade application – on GPUs employing NVIDIA’s Compute Unified Device Architecture (CUDA). This GPU implementation of the benchmark has been used to investigate the performance of a selection of GPUs, ranging from workstation-grade commodity GPUs to the HPC “Tesla” and “Fermi” GPUs. We have also compared the performance of the GPU solution at scale to that of traditional high performance computing (HPC) clusters based on a range of multi-core CPUs from a number of major vendors, including Intel (Nehalem), AMD (Opteron) and IBM (PowerPC). In previous work we have developed a predictive “plug-and-play” performance model of this class of application running on such clusters, in which CPUs communicate via the Message Passing Interface (MPI). By extending this model to also capture the performance behaviour of GPUs, we are able to: (1) comment on the effects that architectural changes will have on the performance of single-GPU solutions, and (2) make projections regarding the performance of multi-GPU solutions at larger scale.

    WMTrace : a lightweight memory allocation tracker and analysis framework

    The diverging gap between processor and memory performance has been a well-discussed aspect of computer architecture literature for some years. The use of multi-core processor designs has, however, brought new problems to the design of memory architectures - increased core density without matched improvement in memory capacity is reducing the available memory per parallel process. Multiple cores accessing memory simultaneously degrades performance as a result of resource contention for memory channels and physical DIMMs. These issues combine to ensure that memory remains an on-going challenge in the design of parallel algorithms which scale. In this paper we present WMTrace, a lightweight tool to trace and analyse memory allocation events in parallel applications. This tool is able to dynamically link to pre-existing application binaries requiring no source code modification or recompilation. A post-execution analysis stage enables in-depth analysis of traces to be performed allowing memory allocations to be analysed by time, size or function. The second half of this paper features a case study in which we apply WMTrace to five parallel scientific applications and benchmarks, demonstrating its effectiveness at recording high-water mark memory consumption as well as memory use per-function over time. An in-depth analysis is provided for an unstructured mesh benchmark which reveals significant memory allocation imbalance across its participating processes.

    THM and HAA formation from NOM in raw and treated surface waters

    The disinfection by-product (DBP) formation potential (FP) of natural organic matter (NOM) in surface water sources has been studied with reference to the key water quality determinants (WQDs) of UV absorption (UV254), colour, and dissolved organic carbon (DOC) concentration. The data set used encompassed raw and treated water sampled over a 30-month period from 30 water treatment works (WTWs) across Scotland, all employing conventional clarification. Both trihalomethane (THM) and haloacetic acid (HAA) FPs were considered. In addition to the standard bulk WQDs, the DOC content was fractionated and analysed for the hydrophobic (HPO) and hydrophilic (HPI) fractions. Results were quantified in terms of the yield (dDBPFP/dWQD) and the linear regression coefficient R² of the yield trend. The NOM in the raw waters was found to comprise 30–84% (average 66%) of the more reactive HPO material, with this proportion falling to 18–63% (average 50%) in the treated water. Results suggested UV254 to be as good an indicator of DBPFP as DOC or HPO for the raw waters, with R² values ranging from 0.79 to 0.82 for THMs and from 0.71 to 0.73 for HAAs for these three determinants. For treated waters the corresponding values were significantly lower at 0.52–0.67 and 0.46–0.47 respectively, reflecting the lower HPO concentration and thus UV254 absorption and commensurately reduced precision due to the limit of detection of the analytical instrument. It is concluded that fractionation offers little benefit in attempting to discern or predict chlorinated carbonaceous DBP yield for the waters across the geographical region studied. UV254 offered an adequate estimate of DBPFP based on a mean yield of ∼2600 and ∼2800 μg per cm⁻¹ absorbance for THMFP for the raw and treated waters respectively and ∼3800 and ∼2900 μg per cm⁻¹ for HAAFP, albeit with reduced precision for the treated waters.

    Pilot-scale spiral wound membrane assessment for THM precursor rejection from upland waters

    The outcomes of a pilot-scale study of the rejection of trihalomethane (THM) precursors by commercial ultrafiltration/nanofiltration (UF/NF) spiral-wound membrane elements are presented based on a single surface water source in Scotland. The study revealed the expected trend of increased flux and permeability with increasing pore size for the UF membranes; the NF membranes provided similar fluxes despite the lower nominal pore size. The dissolved organic carbon (DOC) passage decreased with decreasing molecular weight cut-off, with less than one-third of the passage recorded for the NF membranes compared with the UF ones. The yield (weight % total THMs per DOC) varied between 2.5% and 8% across all membranes tested, in reasonable agreement with the literature, with the aromatic polyamide membrane providing both the lowest yield and lowest DOC passage. The proportion of the hydrophobic (HPO) fraction removed was found to increase with decreasing membrane selectivity (increasing pore size), and THM generation correlated closely (R² = 0.98) with the permeate HPO fractional concentration.

    Automated lay-up of composite blades

    "Automated Lay-Up of Composite Blades" describes the Author’s contribution to a joint research project between Dowty Aerospace Propellers and the University of Durham into the automated lay-up of complex, three-dimensional carbon fibre composite propfan blade preforms. The emphasis of the highly applied Project, now continuing at Brunel University, has been to develop an operational research demonstrator cell. The existing manual lay-up techniques employed by Dowty have been reviewed and a new methodology devised which can be far more easily automated. To implement the new methodology, a specialized lay-up station has been developed along with a practical prototype vacuum gripper technology capable of manipulating the range of large, complex, flexible and easily distorted plies required for propfan preform manufacture. Both the gripper technology and the Lay-Up Station have been successfully tested, the latter in an industrial environment to manufacture "real life" propfan blades.

    Acidified and ultrafiltered recovered coagulants from water treatment works sludge for removal of phosphorus from wastewater

    This study used a range of treated water treatment works sludge options for the removal of phosphorus (P) from primary wastewater. These options included the application of ultrafiltration for recovery of the coagulant from the sludge. The treatment performance and whole life cost (WLC) of the various recovered coagulant (RC) configurations have been considered in relation to fresh ferric sulphate (FFS). Pre-treatment of the sludge with acid followed by removal of organic and particulate contaminants using a 2 kD ultrafiltration membrane resulted in a reusable coagulant that closely matched the performance of FFS. Unacidified RC showed 53% of the phosphorus removal efficiency of FFS, at a dose of 20 mg/L as Fe and a contact time of 90 min. A longer contact time of 8 h improved performance to 85% of FFS. P removal at the shorter contact time improved to 88% relative to FFS by pre-acidifying the sludge to pH 2, using an acid molar ratio of 5.2:1 mol H+:Fe. Analysis of the removal of P showed that rapid phosphate precipitation accounted for >65% of removal with FFS. However, for the acidified RC a slower adsorption mechanism dominated; this was accelerated at a lower pH. A cost-benefit analysis showed that relative to dosing FFS and disposing of waterworks sludge to land, the 20-year WLC was halved by transporting acidified or unacidified sludge up to 80 km for reuse in wastewater treatment. A maximum inter-site distance was determined to be 240 km above the current disposal route at current prices. Further savings could be made if longer contact times were available to allow greater P removal with unacidified RC.

    Coagulant recovery and reuse for drinking water treatment

    Coagulant recovery and reuse from waterworks sludge has the potential to significantly reduce waste disposal and chemicals usage for water treatment. Drinking water regulations demand purification of recovered coagulants before they can be safely reused, due to the risk of disinfection by-product precursors being recovered from waterworks sludge alongside coagulant metals. While several full-scale separation technologies have proven effective for coagulant purification, none have matched virgin coagulant treatment performance. This study examines the individual and successive separation performance of several novel and existing ferric coagulant recovery purification technologies to attain virgin coagulant purity levels. The newly suggested approach of alkali extraction of dissolved organic compounds (DOC) from waterworks sludge prior to acidic solubilisation of ferric coagulants provided the same 14:1 selectivity ratio (874 mg/L Fe vs. 61 mg/L DOC) as the more established size separation using ultrafiltration (1285 mg/L Fe vs. 91 mg/L DOC). Cation exchange Donnan membranes were also examined: while highly selective (2555 mg/L Fe vs. 29 mg/L DOC, 88:1 selectivity), the low pH of the recovered ferric solution impaired subsequent treatment performance. The application of powdered activated carbon (PAC) to ultrafiltration or alkali pre-treated sludge, dosed at 80 mg/mg DOC, reduced recovered ferric DOC contamination to <1 mg/L but in practice, this option would incur significant costs. The treatment performance of the purified recovered coagulants was compared to that of virgin reagent with reference to key water quality parameters. Several PAC-polished recovered coagulants provided the same or improved DOC and turbidity removal as virgin coagulant, as well as demonstrating the potential to reduce disinfection by-products and regulated metals to levels comparable to those attained from virgin material.
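
    The selectivity ratios quoted in the abstract above appear to be plain Fe:DOC concentration ratios. The sketch below is an illustration under that assumption, not the authors' method; it recomputes them from the reported concentrations, and the function name and dictionary keys are hypothetical.

```python
def selectivity(fe_mg_per_l: float, doc_mg_per_l: float) -> float:
    """Return the Fe:DOC concentration ratio (assumed definition of selectivity)."""
    if doc_mg_per_l <= 0:
        raise ValueError("DOC concentration must be positive")
    return fe_mg_per_l / doc_mg_per_l


# (Fe, DOC) concentrations in mg/L as quoted in the abstract
reported = {
    "alkali extraction": (874, 61),
    "ultrafiltration": (1285, 91),
    "Donnan membrane": (2555, 29),
}

ratios = {name: selectivity(fe, doc) for name, (fe, doc) in reported.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}:1")
```

    Rounded to whole numbers, these give the 14:1, 14:1 and 88:1 figures stated in the abstract.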