
    An Evaluation of Software Release-Consistent Protocols

    This paper presents an evaluation of three software implementations of release consistency. Release-consistent protocols allow data communication to be aggregated and multiple writers to modify a single page simultaneously. We evaluated an eager invalidate protocol that enforces consistency when synchronization variables are released, a lazy invalidate protocol that enforces consistency when synchronization variables are acquired, and a lazy hybrid protocol that selectively uses update to reduce access misses. Our evaluation is based on implementations running on DECstation-5000/240s connected by an ATM LAN, and on an execution-driven simulator that allows us to vary network parameters. Our results show that the lazy protocols consistently outperform the eager protocol for all but one application, and that the lazy hybrid performs best overall. However, the relative performance of the implementations is highly dependent on the relative speeds of the network, processor, and communication software. Lower bandwidths and high per-byte software communication costs favor the lazy invalidate protocol, while high bandwidths and low per-byte costs favor the hybrid. Performance of the eager protocol approaches that of the lazy protocols only when communication becomes essentially free.
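    To make the timing contrast concrete, here is a minimal Python sketch of eager versus lazy invalidation (an illustration under simplified assumptions, not the paper's protocols: pages, locks, and the network are simulated, and the hybrid's update mechanism is omitted; all names are invented for this sketch).

```python
# Toy model of invalidation timing in a release-consistent software DSM.
# Everything (pages, sharers, synchronization) is simulated; the real
# protocols also track intervals, diffs, and multiple writers per page.

class Node:
    def __init__(self, name, pages):
        self.name = name
        self.valid = set(pages)   # pages with a valid local copy
        self.dirty = set()        # pages written since the last release

    def write(self, page):
        self.dirty.add(page)
        self.valid.add(page)

    def release_eager(self, sharers):
        # Eager: push invalidations to every sharer at release time.
        for node in sharers:
            node.valid -= self.dirty
        self.dirty = set()

    def release_lazy(self):
        # Lazy: just record write notices; nobody is invalidated yet.
        notices, self.dirty = self.dirty, set()
        return notices

    def acquire_lazy(self, notices):
        # Lazy: the next acquirer invalidates its own copies, only now.
        self.valid -= notices

# B acquires the lock after A's release; C never synchronizes with A.
a, b, c = Node("A", {1, 2}), Node("B", {1, 2}), Node("C", {1, 2})
a.write(1)
b.acquire_lazy(a.release_lazy())
assert 1 not in b.valid and 1 in c.valid  # C's copy survives under lazy
```

    Under the lazy scheme, node C keeps its copy of page 1 because it never acquires the lock, which is precisely the communication the lazy protocols avoid.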

    CAS-DSM: A Compiler Assisted Software Distributed Shared Memory

    Traditional software Distributed Shared Memory (DSM) systems rely on virtual memory management mechanisms to detect accesses to shared memory locations and to maintain their consistency. The resulting involvement of the OS kernel, and the significant overhead associated with it, can be avoided by careful compile-time analysis and code instrumentation. In this paper, we propose such a Compiler Assisted Software support approach (CAS-DSM). In the CAS-DSM implementation, the involvement of the OS kernel is avoided by instrumenting the application code at the source level. The overhead caused by the execution of the instrumented code is reduced through several aggressive compile-time optimizations. Finally, we also address the issue of reducing certain overheads in polling-based implementations of receiving asynchronous messages. We used SUIF, a public-domain compiler tool, to implement the compile-time analysis, instrumentation, and optimizations, and we modified CVM, a publicly available software DSM, to support the instrumentation inserted by the compiler. A detailed performance evaluation of CAS-DSM is reported using a set of Splash/Splash2 parallel application benchmarks on a distributed-memory IBM SP-2 machine. CAS-DSM achieved moderate to good performance improvements for most of the applications compared to the original CVM implementation. Reducing the overheads in the polling-based implementation improves the performance of CAS-DSM significantly, resulting in an overall improvement of 12–52% over the original CVM implementation.
    Peer reviewed. Full text: http://deepblue.lib.umich.edu/bitstream/2027.42/44573/1/10766_2004_Article_482234.pd
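    As a rough illustration of the core idea (helper names invented; this is not CAS-DSM's actual runtime interface), compiler-inserted checks replace virtual-memory faults for detecting shared accesses, so no kernel entry is needed:

```python
# Hypothetical sketch of compiler-inserted access checks replacing
# page-fault detection in a software DSM. dsm_read, dsm_write, and
# fetch_from_home are invented names for illustration.

cache = {}           # address -> locally valid shared value
write_notices = []   # modified addresses, propagated at synchronization

def fetch_from_home(addr):
    return 0         # stand-in for a network fetch from the home node

def dsm_read(addr):
    # Emitted by the compiler in place of a plain load of shared data:
    # the miss check runs entirely in user space, with no kernel entry.
    if addr not in cache:
        cache[addr] = fetch_from_home(addr)
    return cache[addr]

def dsm_write(addr, value):
    # Emitted in place of a plain store; notices are flushed later,
    # for example at a lock release.
    cache[addr] = value
    write_notices.append(addr)

# Original statement (x shared):   y = x + 1
# Instrumented equivalent:
y = dsm_read(0x1000) + 1
dsm_write(0x1004, y)
```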

    Efficiently and Transparently Maintaining High SIMD Occupancy in the Presence of Wavefront Irregularity

    Demand is increasing for high throughput processing of irregular streaming applications; examples of such applications from scientific and engineering domains include biological sequence alignment, network packet filtering, automated face detection, and big graph algorithms. With wide SIMD, lightweight threads, and low-cost thread-context switching, wide-SIMD architectures such as GPUs allow considerable flexibility in the way application work is assigned to threads. However, irregular applications are challenging to map efficiently onto wide SIMD because data-dependent filtering or replication of items creates an unpredictable data wavefront of items ready for further processing. Straightforward implementations of irregular applications on a wide-SIMD architecture are prone to load imbalance and reduced occupancy, while more sophisticated implementations require advanced use of parallel GPU operations to redistribute work efficiently among threads. This dissertation will present strategies for addressing the performance challenges of wavefront- irregular applications on wide-SIMD architectures. These strategies are embodied in a developer framework called Mercator that (1) allows developers to map irregular applications onto GPUs ac- cording to the streaming paradigm while abstracting from low-level data movement and (2) includes generalized techniques for transparently overcoming the obstacles to high throughput presented by wavefront-irregular applications on a GPU. Mercator forms the centerpiece of this dissertation, and we present its motivation, performance model, implementation, and extensions in this work
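    The following Python stand-in sketches the occupancy idea at the heart of such frameworks (purely illustrative; Mercator itself targets CUDA GPUs, and all names here are invented): outputs of a data-dependent filter are compacted into a dense queue so the next stage always processes full-width batches.

```python
# Illustrative model of occupancy-preserving queueing between stages of an
# irregular streaming pipeline. WIDTH and both stages are invented; a real
# GPU framework would do the compaction with parallel primitives on-device.

WIDTH = 8  # stand-in for the SIMD width

def filter_stage(batch):
    # Data-dependent filtering: an unpredictable subset survives.
    return [x for x in batch if x % 3 != 0]

def square_stage(batch):
    return [x * x for x in batch]

def run(stage, queue):
    # Because survivors were compacted into a dense queue, every batch
    # except possibly the last is full width: no idle lanes.
    out = []
    for i in range(0, len(queue), WIDTH):
        out.extend(stage(queue[i:i + WIDTH]))
    return out

survivors = run(filter_stage, list(range(64)))  # irregular wavefront
results = run(square_stage, survivors)          # re-densified downstream
```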

    Cilk: efficient multithreaded computing

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998. Includes bibliographical references (p. 170-179). By Keith H. Randall.

    Compiler and Runtime Optimizations for Fine-Grained Distributed Shared Memory Systems

    Bal, H.E. [Promotor]

    A heuristic-based approach to code-smell detection

    Encapsulation and data hiding are central tenets of the object-oriented paradigm. Deciding what data and behaviour to form into a class, and where to draw the line between its public and private details, can make the difference between a class that is an understandable, flexible and reusable abstraction and one which is not. This decision is a difficult one and may easily result in poor encapsulation, which can then have serious implications for a number of system qualities. It is often hard to identify such encapsulation problems within large software systems until they cause a maintenance problem (which is usually too late), and attempting to perform such analysis manually can be tedious and error prone. Two of the common encapsulation problems that can arise as a consequence of this decomposition process are data classes and god classes. Typically, the two problems occur together – data classes are lacking in functionality that has typically been sucked into an over-complicated and domineering god class. This paper describes the architecture of a tool, developed as a plug-in for the Eclipse IDE, which automatically detects data and god classes. The technique has been evaluated in a controlled study on two large open source systems, comparing the tool's results to similar work by Marinescu, who employs a metrics-based approach to detecting such features. The study provides some valuable insights into the strengths and weaknesses of the two approaches.
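    As a flavor of what such heuristics can look like (thresholds invented for illustration; these are not the paper's heuristics or Marinescu's detection-strategy metrics), a detector needs only a few per-class counts:

```python
# Deliberately simple, hypothetical code-smell detector. The thresholds
# below are invented for illustration only.

def classify(n_public_fields, n_methods, lines_of_code):
    """Classify one class from a few easily collected counts."""
    if n_methods <= 2 and n_public_fields >= 5:
        return "data class"   # exposed state with almost no behaviour
    if n_methods >= 30 or lines_of_code >= 800:
        return "god class"    # one class hoarding the system's behaviour
    return "unremarkable"

print(classify(n_public_fields=8, n_methods=1, lines_of_code=120))    # data class
print(classify(n_public_fields=3, n_methods=45, lines_of_code=2400))  # god class
```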

    Contingent Valuation of Environmental Goods: A Comprehensive Critique

    Contingent valuation is a survey-based procedure that attempts to estimate how much households are willing to pay for specific programs that improve the environment or prevent environmental degradation. For decades, the method has been the center of debate regarding its reliability: does it really measure the value that people place on environmental changes? Bringing together leading voices in the field, this timely book tells a unified story about the interrelated features of contingent valuation and how those features affect its reliability. Through empirical analysis and review of past studies, the authors identify important deficiencies in the procedure, raising questions about the technique’s continued use.

    An investigation into metaphoric competence in the L2: A linguistic approach

    Within the field of L2 metaphoric competence (MC) research, Low’s (1988) and Littlemore and Low’s (2006a, 2006b) metaphor-related skills and (sub)competences have existed for 29 and 11 years respectively, but have never been elicited or used to develop tests. Consequently, the extent to which they are underpinned by more fundamental (sub)constructs is unclear. With a few exceptions (e.g., Littlemore, 2001), L2 MC tests to date have been limited in scope (e.g., Aleshtar & Dowlatabadi, 2014; Azuma, 2005; Hashemian & Nezhad, 2007; Zhao, Yu, & Yang, 2014). Available research shows that L2 MC correlates with L2 vocabulary knowledge and proficiency (Aleshtar & Dowlatabadi, 2014; Zhao et al., 2014), but only negligibly with time spent in an L2 immersion setting (Azuma, 2005). However, the ability of these measures to predict L2 MC is unknown, as is how the receptive/productive correlation strength changes as L2 proficiency increases. In response to these gaps, a large battery of L2 MC tests aimed at eliciting Low’s (1988) and Littlemore and Low’s (2006a, 2006b) constructs was developed and administered to 112 NNSs of English (L1 Chinese) and 31 English NSs, along with vocabulary knowledge tests and (NNSs only) general proficiency tests. Data cleaning revealed inherent operationalisation problems. Exploratory Factor Analysis revealed four metaphor-related factors, with MANOVA and independent-samples t-tests showing statistically significant differences between NNSs and NSs for only one of these: English Grammatical Metaphoric Competence. Multiple regression revealed that the Oxford Online Placement Test best predicted L2 receptive MC, whereas L2 vocabulary depth measured by the Word Associates Test (Read, 1998) best predicted L2 productive MC. Time spent living in the UK had no predictive power, and the receptive/productive correlation weakened with increased L2 proficiency. Implications for theory, test development, the transferability of models and predictors (e.g., to NNSs with other L1s), and EFL teaching are discussed.

    A Study of Client-based Caching for Parallel I/O

    The trend in parallel computing toward large-scale cluster computers running thousands of cooperating processes per application has led to an I/O bottleneck that has only grown more severe as the number of processing cores per CPU has increased. Current parallel file systems provide high-bandwidth file access for large contiguous file region accesses; however, applications repeatedly accessing small file regions on unaligned file region boundaries continue to experience poor I/O throughput due to the high overhead associated with accessing parallel file system data. In this dissertation we demonstrate how client-side file data caching can improve parallel file system throughput for applications performing frequent small and unaligned file I/O. We explore the impacts of cache page size and cache capacity using the popular FLASH I/O benchmark, and we explore a novel cache sharing approach that leverages the trend toward multi-core processors. We also explore a technique we call progressive page caching that represents cached data using dynamic data structures rather than fixed-size pages of file data. Finally, we explore a cache aggregation scheme that leverages the high-level file I/O interfaces provided by the PVFS file system to provide further performance enhancements. In summary, our results indicate that a correctly configured middleware-based file data cache can dramatically improve the performance of I/O workloads dominated by small unaligned file accesses. Further, we demonstrate that a well-designed cache can offer stable performance even when the selected cache page granularity is not well matched to the provided workload. Finally, we show that high-level file system interfaces can significantly accelerate application performance, and that interfaces beyond those currently envisioned by the MPI-IO standard could provide further performance benefits.
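    A toy sketch of the basic mechanism (all names, the page size, and the dict-backed store are invented; this is not the dissertation's code): small unaligned writes are absorbed by cached pages and reach the file system only as aligned, page-sized write-backs.

```python
# Minimal client-side write-back page cache for small, unaligned file I/O.
# PAGE and the dict standing in for the parallel FS are illustrative only.

PAGE = 4096

class ClientPageCache:
    def __init__(self, backing):
        self.backing = backing   # page_number -> bytes of length PAGE
        self.pages = {}          # cached pages, as mutable copies
        self.dirty = set()       # page numbers needing write-back

    def _page(self, pno):
        if pno not in self.pages:  # miss: one aligned read from the FS
            self.pages[pno] = bytearray(self.backing.get(pno, bytes(PAGE)))
        return self.pages[pno]

    def write(self, offset, data):
        # An unaligned write touches a few cached pages and causes no
        # file-system traffic until flush().
        while data:
            pno, off = divmod(offset, PAGE)
            n = min(PAGE - off, len(data))
            self._page(pno)[off:off + n] = data[:n]
            self.dirty.add(pno)
            offset, data = offset + n, data[n:]

    def flush(self):
        # Write-back happens in aligned, page-sized units.
        for pno in sorted(self.dirty):
            self.backing[pno] = bytes(self.pages[pno])
        self.dirty.clear()

store = {}
cache = ClientPageCache(store)
cache.write(4090, b"hello world")   # straddles a page boundary
cache.flush()
assert store[0][4090:] == b"hello " and store[1][:5] == b"world"
```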

    Leveraging Digital Technologies for Management of Peripartum Depression to Mitigate Health Disparities

    Health disparities are adverse, preventable differences in health outcomes that affect disadvantaged populations. Examples of health disparities can be seen in the condition of peripartum depression (PPD), a mood disorder affecting approximately 10-15% of peripartum women: for example, Hispanic and African-American women are less likely to start or continue PPD treatment. Digital health technologies have emerged as practical solutions for PPD care and self-management. However, existing digital solutions fail to incorporate behavior theory and the distinctive information needs that arise from women’s personal, social, and clinical profiles. Bridging this gap, I adapt Digilego, an integrative digital health development framework consisting of (a) mixed-methods user needs analysis, (b) behavior and health literacy theory mapping, and (c) content and feature engineering specifications for future programmatic development, to address health disparities. This enhanced framework is then used to design and develop a digital platform (MomMind) for PPD prevention among women in their peripartum period. The platform contains a digital journal, a social forum, a library of PPD patient education materials, and a repository of PPD self-monitoring surveys. In line with the existing Digilego digital health framework, throughout my iterative process of design and development I gather design insights from my target population (n=19) and their health providers (n=9) using qualitative research methods (e.g., interviews) and secondary analysis of peer interactions in two PPD online forums (n=55,301 posts from 9,364 users spanning the years 2008-2022). These multimodal needs-gathering efforts allowed me to (a) compile women’s information and technology needs and (b) use them as a guide for MomMind intervention development and evaluation. One key strength of MomMind is its grounding in theory-driven behavior change techniques (e.g., shaping knowledge) and patient engagement features (e.g., electronic questionnaires), as facilitated by Digilego. I also extend Digilego by incorporating literacy domains (e.g., health literacy) and cognitive processes (e.g., understanding) from the eHealth literacy framework into my content engineering approach. After an in-house usability assessment, I conducted a pilot acceptability evaluation of MomMind using cross-sectional acceptability surveys and PPD health literacy surveys administered pre- and post-use of MomMind. Interviews were also conducted to assess participants’ personal opinions and feedback. The study sample included n=30 peripartum women, of whom 16 (53.3%) were Hispanic and 17 (56.7%) were in low-income ranges. A total of 29/30 (96.6%) participants approved of MomMind, 28/30 (93.3%) deemed it a good fit, and 29/30 (96.67%) deemed it easy to use. Participants showed statistically significant improvement (