565 research outputs found

    Improving Performance and Flexibility of Fabric-Attached Memory Systems

    As demands for memory-intensive applications continue to grow, the memory capacity of each computing node is expected to grow at a similar pace. In high-performance computing (HPC) systems, the memory capacity per compute node is determined by the most demanding application that would likely run on such a system, and hence the average capacity per node in future HPC systems is expected to grow significantly. However, diverse applications run on HPC systems with different memory requirements, and memory utilization can fluctuate widely from one application to another. Since memory modules are private to their computing node, a large percentage of the overall memory capacity will likely be underutilized, especially when there are many jobs with small memory footprints. Thus, as HPC systems move towards the exascale era, better utilization of memory is strongly desired. Moreover, as new memory technologies come on the market, the flexibility of upgrading memory and system updates becomes a major concern since memory modules are tightly coupled with the computing nodes. To address these issues, vendors are exploring fabric-attached memory (FAM) systems. In this type of system, resources are decoupled and are maintained independently. Such a design has driven technology providers to develop new protocols, such as cache-coherent interconnects and memory semantic fabrics, to connect various discrete resources and help users leverage advances in memory technologies to satisfy growing memory and storage demands. Using these new protocols, FAM can be directly attached to a system interconnect and be easily integrated with a variety of processing elements (PEs). Moreover, systems that support FAM can be smoothly upgraded and allow multiple PEs to share the FAM memory pools using well-defined protocols. 
The sharing of FAM between PEs allows efficient data sharing, improves memory utilization, reduces cost by allowing flexible integration of different PEs and memory modules from several vendors, and makes it easier to upgrade the system. However, adopting FAM in HPC systems brings new challenges. Since memory is disaggregated and is accessed through fabric networks, memory access latency (efficiency) is a crucial concern. In addition, quality of service, security from neighbor nodes, coherency, and address translation overhead to access FAM are some of the problems that require rethinking for FAM systems. To this end, we study and discuss various challenges that need to be addressed in FAM systems. First, we develop a simulation environment to mimic and analyze FAM systems. Further, we showcase our work in addressing the challenges to improve the performance and increase the feasibility of such systems: enforcing quality of service, providing page migration support, and enhancing security from malicious neighbor nodes

    High Performance and Secure Execution Environments for Emerging Architectures

    Energy efficiency and performance have been the driving forces of system architectures and designers in the last century. Given the diversity of workloads and the significant performance and power improvements when running workloads on customized processing elements, system vendors are drifting towards new system architectures (e.g., FAM or HMM). Such architectures are being developed with the purpose of improving the system's performance, allowing easier data sharing, and reducing the overall power consumption. Additionally, current computing systems suffer from a very wide attack surface, mainly because such systems comprise tens to hundreds of sub-systems that could be manufactured by different vendors. Vulnerabilities, backdoors, and potentially hardware trojans injected anywhere in the system form a serious risk for confidentiality and integrity of data in computing systems. Thus, adding security features is becoming an essential requirement in modern systems. To achieve these performance improvements and power consumption reductions, emerging NVMs stand as a very appealing option to be the main memory building block or a part of it. However, integrating NVMs in the memory system can lead to several challenges. First, if the NVM is used as the sole memory, incorporating security measures can exacerbate the NVM's limited write endurance and reduce its lifetime. Second, integrating the NVM as a part of the main memory, as in DRAM-NVM hybrid memory systems, can lead to higher performance overheads for persistent applications. Third, integrating the NVM as a memory extension, as in fabric-attached memory architectures, can cause high contention over the security metadata cache. Additionally, in FAM architectures, memory sharing can lead to security metadata coherence problems. In this dissertation, we study these problems and propose novel solutions to enable secure and efficient integration of NVMs in the emerging architectures

    Political implications of bilingual cognition

    Though more than two-thirds of all children around the world grow up in bilingual environments (Crystal 1997) and more than half of the world's population speak more than one language in everyday life (Grosjean 2010), political science continues to operate under a Chomskian scheme in which language is characterized as "a parsimonious symbol system, [or] a type of mental algebra" (Caldwell-Harris 2014, 2). Simply put, the prevailing assumptions are that language is transparent and trivial, and that no special evaluation of its impact independent of the plain meaning is required. Taken further, this paradigm implies that phrases from different languages will be understood exactly the same way, as long as they are faithful translational equivalents. However, recent research in cognitive psychology has demonstrated otherwise. Affect research experiments have shown that the emotional impact of negative and taboo stimuli is significantly blunted when a bilingual receives them in a second language (Caldwell-Harris and Ayçiçeği-Dinn 2009, Eilola and Havelka 2010, Hsu, Jacobs and Conrad 2015). Concurrently, researchers studying implicit social cognition have found that whether an ethnic group was viewed favorably by bilingual respondents depended on the language in which they were prompted to express an opinion (Danzinger and Ward 2010, Ogunnaike, Dunham, and Banaji 2010). More recently, decision research has found that working language affected how bilinguals made moral judgments (Costa et al. 2014a) and perceived causality (Geipel, Hadjichristidis, and Surian 2015a). These findings suggest that language may carry enormous ramifications for the study of political behavior. Decreased affective response in second languages may mean that bilinguals respond less intensely to messages broadcast in their second language. 
Political advertisements, perhaps especially attack ads, may lose their efficacy if their keywords and symbols do not provoke a bilingual citizen as intended, or if the bilingual forgets them too quickly. Citizens who must evaluate candidates, issue platforms, or even ballot initiatives in their second language may come to very different conclusions and vote choices than they would in their native language. In bargaining, second language negotiations may enable parties working in a second language to be more objective and willing to take risks in achieving a consensus, as they may be less likely to be provoked by topics that are controversial or hold deep emotional resonance. From a methodological perspective, if a respondent’s response in his second language differs from in his native language, then the investigator will fail in his goal of accurately capturing the opinions and attitudes of the target populations, especially those of minority constituents. Researchers who should be alert to possible effects of language for multilingual respondents must recognize the challenges that lie ahead, either of discerning which attitudes and opinions are "true" ones, or of being able to measure language-specific attitudes. This dissertation investigates whether a native speaker and a non-native speaker process and react to a language in the same manner. I incorporate recent findings from psychology to explain why speaking native and non-native languages may prompt different modes of cognition, and subsequently, result in observable differences in attitudes and behaviors. I use data drawn from an original survey experiment that I designed and conducted in the People's Republic of China (PRC) in 2013 to draw larger conclusions about the potential impact of bilingualism in the political realm. 
This survey experiment, the first in political science to vary the working language as an experimental condition, asked university students at Capital Normal University to respond to a battery of questions including both commonly used and original attitudinal and behavioral questions. The balance of this dissertation comprises three chapters, each organized thematically around findings from a different cluster of psychology research into bilingualism. The first substantive chapter provides a basic primer on the terminology used in bilingualism research and investigates the L1 affective advantage, in which one's native language (L1) usually evokes greater affect than one's second language (L2). The L1 affective advantage influences how bilinguals view choices and outcomes in a hypothetical situation, which in turn affects their decision-making. I extend this research further to examine how bilinguals assess the fairness of money sharing proposals in an Ultimatum Game. The second substantive chapter examines language in regard to the encoding specificity principle, which Tulving and Thomson (1973) defined as the improvement in recall when conditions at the time of encoding match those at the time of retrieval. Its corollary in bilingualism, the language specificity effect, asserts that memories are "more likely to be activated by the language in which the original events took place" (Pavlenko 2012, 410). I examine whether self-reported patterns of political discussion and media exposure changed as a function of the working language, and their methodological implications for political behavior research. The last substantive chapter examines cultural frame switching, defined by Hong et al. (2000) as the process through which a bilingual accesses networks of knowledge that are associated with different cultures. 
Cross-cultural psychologists have demonstrated that changing the working language can change a bilingual's self-perception and influence how he views and relates to different groups of people (Hong et al. 2000; Ross, Xun, and Wilson 2002). I investigate whether monocultural bilinguals, a group commonly assumed not to be capable of displaying cultural frame switching, profess different core values, such as prioritizing group harmony over individual rights and deference toward authority, or different political judgments when the working language changes

    Towards Scalable OLTP Over Fast Networks

    Online Transaction Processing (OLTP) underpins real-time data processing in many mission-critical applications, from banking to e-commerce. These applications typically issue short-duration, latency-sensitive transactions that demand immediate processing. High-volume applications, such as Alibaba's e-commerce platform, achieve peak transaction rates as high as 70 million transactions per second, exceeding the capacity of a single machine. Instead, distributed OLTP database management systems (DBMS) are deployed across multiple powerful machines. Historically, such distributed OLTP DBMSs have been primarily designed to avoid network communication, a paradigm largely unchanged since the 1980s. However, fast networks challenge the conventional belief that network communication is the main bottleneck. In particular, emerging network technologies, like Remote Direct Memory Access (RDMA), radically alter how data can be accessed over a network. RDMA's primitives allow direct access to the memory of a remote machine within an order of magnitude of local memory access. This development invalidates the notion that network communication is the primary bottleneck. Given that traditional distributed database systems have been designed with the premise that the network is slow, they cannot efficiently exploit these fast network primitives, which requires us to reconsider how we design distributed OLTP systems. This thesis focuses on the challenges RDMA presents and its implications on the design of distributed OLTP systems. First, we examine distributed architectures to understand data access patterns and scalability in modern OLTP systems. Drawing on these insights, we advocate a distributed storage engine optimized for high-speed networks. The storage engine serves as the foundation of a database, ensuring efficient data access through three central components: indexes, synchronization primitives, and buffer management (caching). 
With the introduction of RDMA, the landscape of data access has undergone a significant transformation. This requires a comprehensive redesign of the storage engine components to exploit the potential of RDMA and similar high-speed network technologies. Thus, as the second contribution, we design RDMA-optimized tree-based indexes — especially applicable for disaggregated databases to access remote data efficiently. We then turn our attention to the unique challenges of RDMA. One-sided RDMA, one of the network primitives introduced by RDMA, presents a performance advantage in enabling remote memory access while bypassing the remote CPU and the operating system. This allows the remote CPU to process transactions uninterrupted, with no requirement to be on hand for network communication. However, that way, specialized one-sided RDMA synchronization primitives are required since traditional CPU-driven primitives are bypassed. We found that existing RDMA one-sided synchronization schemes are unscalable or, even worse, fail to synchronize correctly, leading to hard-to-detect data corruption. As our third contribution, we address this issue by offering guidelines to build scalable and correct one-sided RDMA synchronization primitives. Finally, recognizing that maintaining all data in memory becomes economically unattractive, we propose a distributed buffer manager design that efficiently utilizes cost-effective NVMe flash storage. By leveraging low-latency RDMA messages, our buffer manager provides a transparent memory abstraction, accessing the aggregated DRAM and NVMe storage across nodes. Central to our approach is a distributed caching protocol that dynamically caches data. With this approach, our system can outperform RDMA-enabled in-memory distributed databases while managing larger-than-memory datasets efficiently

    Deconstructing “Deviance” and “Disorder” as Systems of Domination: Chicago Public Schools as a Case Study of the Effects of Zero Tolerance Discipline Policies on Educational Outcomes in US Schools

    The rise of “zero tolerance” discipline practices in US primary and secondary schools has become increasingly well documented by the media and empirical studies. Despite the extensive scholarship that has emerged from these conversations, many of these analyses are limited in their scope and do not connect the phenomena of zero tolerance in schools to the diverse, shifting forces at play within American politics and policy today. As such, the goal of this work is to synthesize ideas about zero tolerance across disciplines by integrating historical thought, philosophical frameworks of punishment, shifting policy goals within the US education system, the sociological constructions of “deviance” and “disorder” in the context of the US criminal justice system, and empirical data directly from a school district to develop particular policy recommendations accordingly. The primary research question of this analysis is: What are the effects of zero tolerance discipline policies on educational outcomes? To answer this question, Chicago Public Schools will be employed as a case study from which lessons for the nation at large will be drawn. Ultimately, this analysis reveals the ways in which zero tolerance policies stem from much deeper forces at play between dominant and marginal groups, and what comes to be defined as “deviance” in relation to a socially constructed system of “order.”

    Justice, Resistance and Solidarity: Race and Policing in England and Wales

    This edition of Perspectives focuses on racism and policing in Britain. It brings together academics, practitioners and activists to examine, and offer their outlook on, the state of policing and its effects on black and minority ethnic communities in 2015 Britain

    Energy-Efficient Workload Placement with Bounded Slowdown in Disaggregated Datacenters

    Disaggregated Data Center (DDC) is a modern datacenter architecture that decouples hardware resources from monolithic servers into pools of resources that can be dynamically composed to match diverse workload requirements. While disaggregation improves resource utilization, it could increase workload slowdown due to the latency of accessing disaggregated resources over the datacenter network. To this end, we consider CPU and memory disaggregation and conduct measurements to experimentally profile several popular datacenter workloads in order to characterize the impact of disaggregation on workload execution slowdown. We then develop a workload placement algorithm, called Iterative Rounding-based Placement (IRoP), that, given a set of workloads, determines where to place each workload (i.e., on which CPU) and how much local and remote memory is allocated to it. The key insight in designing IRoP is that the impact of remote memory latency on slowdown can be substantially masked by assigning workloads to higher-performing CPUs, albeit at the cost of higher power consumption. As such, IRoP aims to find a workload placement that minimizes the DDC power consumption while respecting a bounded slowdown for each workload. We provide extensive simulation results to demonstrate the flexibility of IRoP in providing a wide range of trade-offs between power consumption and workload slowdown. We also compare IRoP with several existing baselines. Our results indicate that IRoP can reduce power consumption and slowdown in the considered scenarios by up to 8% and 12%, respectively

    Accelerating Network Functions using Reconfigurable Hardware. Design and Validation of High Throughput and Low Latency Network Functions at the Access Edge

    Providing Internet access to billions of people worldwide is one of the main technical challenges in the current decade. The Internet access edge connects each residential and mobile subscriber to this network and ensures a certain Quality of Service (QoS). However, the implementation of access edge functionality challenges Internet service providers: First, a good QoS must be provided to the subscribers, for example, high throughput and low latency. Second, the quick rollout of new technologies and functionality demands flexible configuration and programming possibilities of the network components; for example, the support of novel, use-case-specific network protocols. The functionality scope of an Internet access edge requires the use of programming concepts, such as Network Functions Virtualization (NFV). The drawback of NFV-based network functions is a significantly lowered resource efficiency due to the execution as software, commonly resulting in a lowered QoS compared to rigid hardware solutions. The usage of programmable hardware accelerators, named NFV offloading, helps to improve the QoS and flexibility of network function implementations. In this thesis, we design network functions on programmable hardware to improve the QoS and flexibility. First, we introduce the host bypassing concept for improved integration of hardware accelerators in computer systems, for example, in 5G radio access networks. This novel concept bypasses the system’s main memory and enables direct connectivity between the accelerator and network interface card. Our evaluations show an improved throughput and significantly lowered latency jitter for the presented approach. Second, we analyze different programmable hardware technologies for hardware-accelerated Internet subscriber handling, including three P4-programmable platforms and FPGAs. Our results demonstrate that all approaches have excellent performance and are suitable for Internet access creation. 
We present a fully-fledged User Plane Function (UPF) designed upon these concepts and test it in an end-to-end 5G standalone network as part of this contribution. Third, we analyze and demonstrate the usability of Active Queue Management (AQM) algorithms on programmable hardware as an expansion to the access edge. We show the feasibility of the CoDel AQM algorithm and discuss the challenges and constraints to be considered when limited hardware is used. The results show significant improvements in the QoS when the AQM algorithm is deployed on hardware. Last, we focus on network function benchmarking, which is crucial for understanding the behavior of implementations and their optimization, e.g., Internet access creation. For this, we introduce the load generation and measurement framework P4STA, benefiting from flexible software-based load generation and hardware-assisted measuring. Utilizing programmable network switches, we achieve a nanosecond time accuracy while generating test loads up to the available Ethernet link speed

    Habits-of-mind and practices of high-functioning public baccalaureate and comprehensive universities

    This study provides a better understanding of the practices and habits of thought of two high-functioning public institutions. Both schools, New England College and Midwest State University, have received consistently high rankings from commercial ratings publications like U.S. News and World Report, and consistently high and often improving scores on the National Survey of Student Engagement (NSSE). Both schools studied had consistent success despite the economic challenge of their baccalaureate focus and the rapidly changing higher education marketplace. Both Midwest State University and New England College underwent significant change in mission and culture. Despite the disruption inherent in a significant mission change, both schools have, within that change, created practices and habits-of-mind that allowed them to react in a positive and responsive manner to challenges as they present themselves. This study examined the overarching question: How do campus faculty and administrative leaders in high-functioning baccalaureate and comprehensive institutions understand their role and practices, and how do they contribute to the success of their institutions? Data collection consisted of a series of interviews with administrative and faculty leaders and a review of documents at the two case institutions. There were a total of 11 participants between the universities, and each was involved in a series of three interviews. During data analysis, some common themes were revealed between the two institutions. There was, however, a theme unique to each of the case institutions. The themes shared by both Midwest State University and New England College were: teaching, faculty engagement, leadership, interdisciplinary/general education and being student centric. The theme unique to New England College was honesty; and the theme unique to Midwest State was assessment. 
This dissertation also provides recommendations to future campus leaders, administration and faculty at public baccalaureate and public comprehensive universities. Some recommendations may be of use to leaders at other kinds of institutions of higher education. Finally, the dissertation suggests additional paths for future research noting existing gaps in the literature