
    Quick and practical run-time evaluation of multiple program optimizations

    This article aims to make iterative optimization practical and usable by speeding up the evaluation of a large range of optimizations. Instead of using a full run to evaluate a single program optimization, we take advantage of periods of stable performance, called phases. For that purpose, we propose a low-overhead phase detection scheme geared toward fast optimization space pruning, using code instrumentation and versioning implemented in a production compiler. Our approach is driven by simplicity and practicality. We show that a simple phase detection scheme can be sufficient for optimization space pruning. We also show it is possible to search for complex optimizations at run-time without resorting to sophisticated dynamic compilation frameworks. Beyond iterative optimization, our approach also enables one to quickly design self-tuned applications. Considering 5 representative SpecFP2000 benchmarks, our approach speeds up iterative search for the best program optimizations by a factor of 32 to 962. Phase prediction is 99.4% accurate on average, with an overhead of only 2.6%. The resulting self-tuned implementations bring an average speed-up of 1.4.
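
    As a rough illustration of the idea, the sketch below detects a stable phase by timing an instrumented code section until recent measurements agree, then times each candidate optimization inside that phase. The function names, callables, and thresholds are hypothetical stand-ins for the paper's compiler-level instrumentation and versioning, not its actual implementation.

```python
# Minimal sketch of run-time phase detection for optimization evaluation.
# All names are illustrative; the paper's instrumentation lives inside a
# production compiler, not in Python.
import time

def run_instrumented_interval(version):
    """Stand-in for one instrumented code section executed with a given
    optimized code version; returns its wall-clock time."""
    start = time.perf_counter()
    version()  # execute the candidate version once
    return time.perf_counter() - start

def evaluate_versions(versions, baseline, tolerance=0.05, window=3):
    """Wait until the last `window` baseline timings agree within
    `tolerance` (a stable phase), then time each candidate inside it."""
    history = []
    while True:
        history.append(run_instrumented_interval(baseline))
        recent = history[-window:]
        if len(recent) == window and max(recent) - min(recent) <= tolerance * min(recent):
            break  # performance is stable: we are inside a phase
    base = min(recent)
    # One timed interval per version suffices inside a stable phase,
    # instead of one full program run per candidate optimization.
    return {v.__name__: base / run_instrumented_interval(v) for v in versions}
```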

    Proceedings Virtual Imaging Trials in Medicine 2024

    This submission comprises the proceedings of the 1st Virtual Imaging Trials in Medicine conference, organized by Duke University on April 22-24, 2024. The listed authors serve as the program directors for this conference. The VITM conference is a pioneering summit uniting experts from academia, industry, and government in the fields of medical imaging and therapy to explore the transformative potential of in silico virtual trials and digital twins in revolutionizing healthcare. The proceedings are categorized by the respective days of the conference: Monday presentations, Tuesday presentations, and Wednesday presentations, followed by the abstracts for the posters presented on Monday and Tuesday.

    Understanding and Optimizing Flash-based Key-value Systems in Data Centers

    Flash-based key-value systems are widely deployed in today’s data centers to provide high-speed data processing services. These systems deploy flash-friendly data structures, such as slabs and Log-Structured Merge (LSM) trees, on flash-based Solid State Drives (SSDs) and provide efficient solutions in caching and storage scenarios. As data centers evolve rapidly, plenty of challenges and opportunities for future optimizations arise. In this dissertation, we focus on understanding and optimizing flash-based key-value systems from the perspective of workloads, software, and hardware. We first propose an online compression scheme, called SlimCache, which exploits the unique characteristics of key-value workloads to virtually enlarge the cache space, increase the hit ratio, and improve cache performance. Furthermore, to appropriately configure increasingly complex modern key-value data systems, which can have more than 50 parameters in addition to hardware and system settings, we quantitatively study and compare five multi-objective optimization methods for auto-tuning the performance of an LSM-tree based key-value store in terms of throughput, 99th-percentile tail latency, convergence time, real-time system throughput, and the iteration process. Last but not least, we conduct an in-depth, comprehensive measurement study of flash-optimized key-value stores on recently emerging 3D XPoint SSDs. We reveal several unexpected bottlenecks in the current key-value store design and present three exemplary case studies to showcase the efficacy of removing these bottlenecks with simple methods on 3D XPoint SSDs. Our experimental results show that our proposed solutions significantly outperform traditional methods. Our study also provides system implications for auto-tuning key-value systems on flash-based SSDs and optimizing them on revolutionary 3D XPoint based SSDs.
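
    To make the auto-tuning setting concrete, here is a minimal sketch of one multi-objective tuning loop over a handful of LSM-tree parameters. The parameter names are modeled on common RocksDB-style options, `run_benchmark` is a hypothetical hook that deploys a configuration and measures it, and plain random search with a Pareto filter stands in for the five optimizers the dissertation actually compares.

```python
# Illustrative multi-objective tuning loop for an LSM-tree key-value store.
import random

# A small, assumed slice of the >50-parameter configuration space.
SPACE = {
    "write_buffer_size_mb": [16, 32, 64, 128, 256],
    "max_background_jobs": [2, 4, 8, 16],
    "level0_file_num_compaction_trigger": [2, 4, 8],
    "block_cache_size_mb": [64, 256, 1024],
}

def sample_config():
    return {k: random.choice(v) for k, v in SPACE.items()}

def pareto_front(results):
    """Keep configs not dominated in (higher throughput, lower p99)."""
    front = []
    for cfg, (thr, p99) in results:
        dominated = any(t >= thr and l <= p99 and (t, l) != (thr, p99)
                        for _, (t, l) in results)
        if not dominated:
            front.append((cfg, (thr, p99)))
    return front

def tune(run_benchmark, iterations=50):
    """run_benchmark(config) -> (throughput_ops, p99_latency_ms)."""
    results = [(cfg, run_benchmark(cfg))
               for cfg in (sample_config() for _ in range(iterations))]
    return pareto_front(results)
```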

    A Parametric Approach for Efficient Speech Storage, Flexible Synthesis and Voice Conversion

    During the past decades, many areas of speech processing have benefited from the vast increases in the available memory sizes and processing power. For example, speech recognizers can be trained with enormous speech databases, and high-quality speech synthesizers can generate new speech sentences by concatenating speech units retrieved from a large inventory of speech data. However, even in today's world of ever-increasing memory sizes and computational resources, there are still many embedded application scenarios for speech processing techniques where the memory capacities and the processor speeds are very limited. Thus, there is still a clear demand for solutions that can operate with limited resources, e.g., on low-end mobile devices. This thesis introduces a new segmental parametric speech codec referred to as the VLBR codec. The novel proprietary sinusoidal speech codec designed for efficient speech storage is capable of achieving relatively good speech quality at compression ratios beyond those offered by standardized speech coding solutions, i.e., at bitrates of approximately 1 kbps and below. The efficiency of the proposed coding approach is based on model simplifications, mode-based segmental processing, and the method of adaptive downsampling and quantization. The coding efficiency is further improved using a novel flexible multi-mode matrix quantizer structure and enhanced dynamic codebook reordering. The compression is also facilitated using a new perceptual irrelevancy removal method. The VLBR codec is also applied to text-to-speech synthesis. In particular, the codec is utilized for the compression of unit selection databases and for the parametric concatenation of speech units. It is also shown that the efficiency of the database compression can be further enhanced using speaker-specific retraining of the codec. Moreover, the computational load is significantly decreased using a new compression-motivated scheme for very fast and memory-efficient calculation of concatenation costs, based on techniques and implementations used in the VLBR codec. Finally, the VLBR codec and the related speech synthesis techniques are complemented with voice conversion methods that allow modifying the perceived speaker identity, which in turn enables, e.g., cost-efficient creation of new text-to-speech voices. The VLBR-based voice conversion system combines compression with the popular Gaussian mixture model based conversion approach. Furthermore, a novel method is proposed for converting the prosodic aspects of speech. The performance of the VLBR-based voice conversion system is also enhanced using a new approach for mode selection and through explicit control of the degree of voicing. The solutions proposed in the thesis together form a complete system that can be utilized in different ways and configurations. The VLBR codec itself can be utilized, e.g., for efficient compression of audio books, and the speech synthesis related methods can be used for reducing the footprint and the computational load of concatenative text-to-speech synthesizers to levels required in some embedded applications. The VLBR-based voice conversion techniques can be used to complement the codec both in storage applications and in connection with speech synthesis. It is also possible to utilize only the voice conversion functionality, e.g., in games or other entertainment applications.
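
    As a point of reference for the conversion approach, the following sketch implements the classic GMM-based spectral mapping that systems like this build on: each source frame is mapped through a mixture of per-component linear regressions. The trained parameters (mixture weights, means, covariances, cross-covariances) are assumed inputs; this is the textbook mapping, not the proprietary VLBR pipeline itself.

```python
# Classic GMM-based spectral conversion of one feature frame.
import numpy as np
from scipy.stats import multivariate_normal

def convert_frame(x, w, mu_x, mu_y, Sigma_xx, Sigma_yx):
    """Map a source frame x toward the target speaker:
    y = sum_m p(m|x) * (mu_y[m] + Sigma_yx[m] Sigma_xx[m]^-1 (x - mu_x[m]))
    where p(m|x) is the posterior over the M mixture components."""
    M = len(w)
    resp = np.array([w[m] * multivariate_normal.pdf(x, mu_x[m], Sigma_xx[m])
                     for m in range(M)])
    resp /= resp.sum()  # posterior p(m | x)
    y = np.zeros_like(mu_y[0])
    for m in range(M):
        # Per-component linear regression toward the target space.
        y += resp[m] * (mu_y[m] + Sigma_yx[m] @ np.linalg.solve(Sigma_xx[m], x - mu_x[m]))
    return y
```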

    Effective memory management for mobile environments

    Smartphones, tablets, and other mobile devices exhibit vastly different constraints compared to regular or classic computing environments like desktops, laptops, or servers. Mobile devices run dozens of so-called “apps” hosted by independent virtual machines (VMs). All these VMs run concurrently, and each VM deploys purely local heuristics to organize resources like memory, performance, and power. Such a design causes conflicts across all layers of the software stack, calling for the evaluation of VMs and of optimization techniques specific to mobile frameworks. In this dissertation, we study the design of managed runtime systems for mobile platforms. More specifically, we deepen the understanding of interactions between garbage collection (GC) and system layers. We develop tools to monitor the memory behavior of Android-based apps and to characterize GC performance, leading to the development of new techniques for memory management that address energy constraints, time performance, and responsiveness. We implement a GC-aware frequency scaling governor for Android devices. We also explore the tradeoffs of power and performance in vivo for a range of realistic GC variants, with established benchmarks and real applications running on Android virtual machines. We control for variation due to dynamic voltage and frequency scaling (DVFS), just-in-time (JIT) compilation, and across established dimensions of heap memory size and concurrency. Finally, we provision GC as a global service that collects statistics from all running VMs and then makes an informed decision that optimizes across all of them (and not just locally), and across all layers of the stack. Our evaluation illustrates the power of such a central coordination service and garbage collection mechanism in improving memory utilization, throughput, and adaptability to user activities. In fact, our techniques aim at a sweet spot, where total on-chip energy is reduced (20–30%) with minimal impact on throughput and responsiveness (5–10%). The simplicity and efficacy of our approach reach well beyond the usual optimization techniques.
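
    A conceptual sketch of the GC-aware frequency-scaling idea is shown below: boost the CPU frequency while a collection runs so pauses finish sooner, then drop back to save energy. The sysfs path is the standard Linux cpufreq `userspace` interface; the GC begin/end hooks are assumptions about what a runtime such as ART would need to export, not an existing API.

```python
# Conceptual GC-aware frequency governor (requires root and the
# "userspace" cpufreq governor to be active for scaling_setspeed).
CPUFREQ = "/sys/devices/system/cpu/cpu{}/cpufreq/scaling_setspeed"

def set_freq_khz(cpu, khz):
    with open(CPUFREQ.format(cpu), "w") as f:
        f.write(str(khz))

class GCAwareGovernor:
    def __init__(self, cpus, low_khz=800_000, high_khz=2_000_000):
        self.cpus, self.low, self.high = cpus, low_khz, high_khz

    def on_gc_begin(self):
        # Hypothetical hook invoked by the runtime when a collection starts:
        # run the pause at high frequency to shorten it.
        for c in self.cpus:
            set_freq_khz(c, self.high)

    def on_gc_end(self):
        # Return to a low-power operating point for mutator execution.
        for c in self.cpus:
            set_freq_khz(c, self.low)
```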

    Boron neutron capture therapy treatment planning improvements

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Nuclear Engineering, 1998. Includes bibliographical references. The Boron Neutron Capture Therapy (BNCT) treatment planning process used by the Harvard/MIT team for their clinical Phase I trials is very time consuming. If BNCT proves to be a successful treatment, this process must be made more efficient. Since the Monte Carlo treatment planning calculations were the most time-consuming aspect of the treatment planning process, requiring more than thirty-six hours for scoping calculations of three to five beams and final calculations for two beams, they were targeted for improvement. Three approaches were used to reduce the calculation times. A statistical uncertainty analysis was performed on dose rates and showed that fewer particles could not be used while still meeting the uncertainty requirements in the region of interest. Unused features were removed and assumptions specific to the Harvard/MIT BNCT treatment planning calculations were hard-wired into MCNP by Los Alamos personnel, resulting in a thirty percent decrease in runtimes. MCNP was also installed in parallel on the treatment planning computers, allowing a speedup of roughly the number of computers linked together in parallel. After these enhancements were made, the final executable, MCNPBNCT, was tested by comparing its calculated dose rates against those of the previously used executable, MCNPNEHD. Since the dose rates were in close agreement, MCNPBNCT was adopted. The final runtime improvement, linking two 200 MHz Pentium Pro computers for a single-beam scoping run, reduced the wall-clock runtime from two hours thirty minutes to fifty-nine minutes. It is anticipated that the addition of ten 900 MHz CPUs will further reduce this calculation to three minutes, giving the medical physicist or radiation oncologist the freedom to use an iterative approach to try different radiation beam orientations to optimize treatment. Additional aspects of the treatment planning process were improved. The previously unrecognized phenomenon of peak dose movement during irradiation, and its potential for overdosing the subject, was identified, and a method of predicting its occurrence was developed. The calculated dose rates were also used to create dose-volume histograms and volume-averaged doses. These data suggest an alternative method for categorizing subjects, rather than by peak tissue dose. by John Timothy Goorley. S.M.
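
    The quoted runtimes follow a simple model: Monte Carlo uncertainty shrinks as one over the square root of the particle count (so fewer particles directly cost accuracy), and independent particle histories parallelize almost linearly across processors. A back-of-the-envelope check, where the parallel efficiency and clock-speed scaling factors are assumptions rather than measured values:

```python
# Rough model behind the quoted BNCT planning runtimes.
import math

def relative_error(n_particles, k=1.0):
    # Monte Carlo statistical uncertainty ~ 1/sqrt(N):
    # halving the error requires four times the particles.
    return k / math.sqrt(n_particles)

def parallel_runtime(serial_min, n_cpus, speed_ratio=1.0, efficiency=0.95):
    # Independent histories split almost linearly across CPUs.
    return serial_min / (n_cpus * speed_ratio * efficiency)

serial = 150.0                                   # 2 h 30 min, one 200 MHz CPU
print(parallel_runtime(serial, 2))               # ~79 min modeled; 59 min was measured
print(parallel_runtime(serial, 10, 900 / 200))   # ~3.5 min, matching the 3-minute projection
```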

    Unified Role Assignment Framework For Wireless Sensor Networks

    Wireless sensor networks are made possible by continuing improvements in embedded sensor, VLSI, and wireless radio technologies. Currently, one of the important challenges in sensor networks is the design of a systematic network management framework that allows localized and collaborative resource control uniformly across all application services, such as sensing, monitoring, tracking, data aggregation, and routing. Research in wireless sensor networks is currently oriented toward a cross-layer network abstraction that supports appropriate fine- or coarse-grained resource controls for energy efficiency. In that regard, we have designed a unified role-based service paradigm for wireless sensor networks. We pursue this by first developing a Role-Based Hierarchical Self-Organization (RBHSO) protocol that organizes a connected dominating set (CDS) of nodes called dominators. This is done by hierarchically selecting nodes that possess cumulatively high energy, connectivity, and sensing capabilities in their local neighborhood. The RBHSO protocol then assigns specific tasks, such as sensing, coordination, and routing, to appropriate dominators that end up playing a certain role in the network. Roles, though abstract and implicit, expose role-specific resource controls by way of role assignment and scheduling. Based on this concept, we have designed a Unified Role-Assignment Framework (URAF) to model application services as roles played by local in-network sensor nodes, with sensor capabilities used as rules for role identification. The URAF abstracts domain-specific role attributes by three models: the role energy model, the role execution time model, and the role service utility model. The framework then generalizes resource management for services by providing abstractions for controlling the composition of a service in terms of roles, its assignment, reassignment, and scheduling. To the best of our knowledge, a generic role-based framework that provides a simple and unified network management solution for wireless sensor networks has not been proposed previously.
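
    To make the dominator-selection step concrete, here is a toy local election in the spirit of RBHSO: each node compares a cumulative capability score (energy, connectivity, sensing) against its one-hop neighbors and declares itself a dominator if it wins. The weights and flat data structures are illustrative assumptions, not the protocol's actual message exchange or hierarchy construction.

```python
# Toy dominator election by cumulative capability score.

def score(node, w_energy=0.5, w_conn=0.3, w_sense=0.2):
    # Weighted sum of the three capabilities RBHSO ranks on;
    # the weights here are arbitrary placeholders.
    return (w_energy * node["energy"]
            + w_conn * node["degree"]
            + w_sense * node["sensors"])

def elect_dominators(nodes, neighbors):
    """nodes: id -> attribute dict; neighbors: id -> set of adjacent ids.
    A node becomes a dominator if it has the best score in its closed
    neighborhood, so every node is covered by some dominator (a seed
    for the connected dominating set)."""
    dominators = set()
    for nid in nodes:
        hood = neighbors[nid] | {nid}
        # Ties broken deterministically by node id.
        if max(hood, key=lambda m: (score(nodes[m]), m)) == nid:
            dominators.add(nid)
    return dominators
```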

    Integration of dynamic table translations into dynamic trajectory radiotherapy and mixed photon-electron beam radiotherapy

    Radiotherapy aims at delivering a lethal dose of radiation to tumor cells while sparing the surrounding healthy tissue and organs. Highly specialized devices, such as C-arm linear accelerators (linacs), have been developed for external beam radiotherapy, delivering high-energy photon and electron beams. Over the last decades, several improvements in photon beam radiotherapy, such as the introduction of the photon multileaf collimator (pMLC), enabled intensity-modulated radiotherapy (IMRT), resulting in improved target conformality compared to 3D conformal techniques. Volumetric modulated arc therapy (VMAT) improves the delivery efficiency while maintaining the dosimetric plan quality of IMRT by using dynamic gantry rotation during beam-on. In addition to the dynamic gantry rotation, the table and the collimator can also rotate dynamically during beam-on; this is used in a technique called dynamic trajectory radiotherapy (DTRT). Furthermore, the table can also translate dynamically in three directions, enabling non-isocentric DTRT. However, the potential of dynamic table translations for radiotherapy on a C-arm linac is unexplored. Thus, in this thesis, treatment techniques including dynamic table translations are developed, and potential use cases are shown. A treatment planning process (TPP) for non-isocentric DTRT is developed to create treatment plans with photon beams including dynamic gantry, collimator, and table rotation and dynamic table translation. The intensity modulation optimization of the TPP is based on a hybrid column generation and simulated annealing direct aperture optimization algorithm. The TPP is used to create non-isocentric DTRT plans, and several potential use cases for non-isocentric DTRT are demonstrated: while maintaining treatment plan quality, the delivery efficiency is improved by using non-isocentric DTRT instead of multi-isocentric IMRT for craniospinal irradiation. Extending the source-to-target distance in DTRT plans reduces the risk of collision between the gantry and the patient or table and enables additional beam directions, which could be exploited to improve the dosimetric treatment plan quality compared to isocentric DTRT. In contrast to photon beam radiotherapy, electron treatments are still applied using patient-specific cut-outs placed in an applicator. By using the pMLC for electron beam collimation instead of the cut-outs, efficient electron beam treatments are possible. Further, the use of the pMLC facilitates mixed photon-electron beam radiotherapy (MBRT). An MBRT technique using pMLC-collimated electron arcs instead of electron beams with a static gantry angle is developed, resulting in improved delivery efficiency while maintaining the dosimetric plan quality of MBRT plans using electron beams with a static gantry angle. One of the challenges of DTRT on C-arm linacs is accurately predicting potential collisions between the gantry, the patient, and the table during treatment planning. Thus, a collision prediction tool is developed, which is able to predict possible collision interlocks; the tool was successfully validated against measurements. The created treatment plans for non-isocentric DTRT and MBRT were shown to be accurately deliverable on a C-arm linac, and for several treatment plans, the dosimetric accuracy was successfully validated using film measurements. In conclusion, this thesis demonstrates the benefits of dynamic table translations in photon and electron beam radiotherapy. With the demonstrated benefits of improved dosimetric treatment plan quality and delivery efficiency and reduced collision risk, dynamic table translations may further facilitate the use of MBRT and DTRT treatment techniques in clinics in the future.
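
    A toy version of the collision-prediction idea is sketched below: gantry and table are reduced to simple geometric primitives and clearance is checked at each control point of a planned trajectory. All dimensions and the two-primitive geometry are placeholders; a real tool models the machine head, table, and patient surface in far finer detail.

```python
# Toy gantry/table clearance check along a planned trajectory.
import math

def gantry_head_position(gantry_deg, radius_cm=45.0):
    """Gantry head location in the transverse plane (x lateral, y up),
    rotating on a circle around the isocenter."""
    a = math.radians(gantry_deg)
    return (radius_cm * math.sin(a), radius_cm * math.cos(a))

def collides(gantry_deg, table_pos_cm, table_halfwidth_cm=25.0,
             table_top_cm=-10.0, clearance_cm=5.0):
    """table_pos_cm: (lateral, vertical) dynamic table translation."""
    gx, gy = gantry_head_position(gantry_deg)
    tx, ty = table_pos_cm
    near_lateral = abs(gx - tx) < table_halfwidth_cm + clearance_cm
    below_top = gy < ty + table_top_cm + clearance_cm
    return near_lateral and below_top

def check_trajectory(control_points):
    """control_points: list of (gantry_deg, (lat_cm, vert_cm)) samples.
    Returns the indices where a collision interlock would trigger."""
    return [i for i, (g, t) in enumerate(control_points) if collides(g, t)]
```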