
    Cache affinity optimization techniques for scaling software transactional memory systems on multi-CMP architectures

    Software transactional memory (STM) enhances both ease of use and concurrency, and is considered one of the next-generation paradigms for parallel programming. Application programs may contain hotspots where data conflicts are intense and seriously degrade performance, so advanced STM systems employ dynamic concurrency control techniques to curb the conflict rate by properly throttling the rate at which transactions are spawned. High-end computers may have two or more multicore processors, so that data sharing among cores goes through a non-uniform cache memory hierarchy. This poses challenges to concurrency control designs, as improper metadata placement and sharing will introduce scalability issues into the system. Poor thread-to-core mappings that induce excessive cache invalidation are also detrimental to overall performance. In this paper, we share our experience in designing and implementing a new dynamic concurrency controller for TinySTM, which helps keep the system concurrency at a near-optimal level. By decoupling unfavourable metadata sharing, our controller design avoids costly inter-processor communication. It also features an affinity-aware thread migration technique that fine-tunes thread placements by observing inter-thread transactional conflicts. We evaluate our implementation using the STAMP benchmark suite and show that the controller can bring around 21% average speedup over the baseline execution. © 2015 IEEE.
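    The abstract's key mechanism is a controller that throttles the rate of spawning transactions as the conflict rate rises. As a rough, hypothetical illustration of this style of dynamic concurrency control (not the paper's controller), the C++ sketch below adjusts how many threads may start transactions based on the observed commit/abort ratio; the class name, thresholds, and sampling window are assumptions made for the example.

```cpp
// Minimal sketch of a hill-climbing concurrency throttle for an STM runtime.
// Not the paper's controller: names, thresholds, and the adaptation policy
// are illustrative assumptions only.
#include <atomic>
#include <cstddef>

class ConcurrencyThrottle {
public:
    explicit ConcurrencyThrottle(std::size_t max_threads)
        : max_threads_(max_threads), allowed_(max_threads) {}

    // Called by the runtime after each transaction commits or aborts.
    void record(bool committed) {
        (committed ? commits_ : aborts_).fetch_add(1, std::memory_order_relaxed);
        if (commits_.load(std::memory_order_relaxed) +
            aborts_.load(std::memory_order_relaxed) >= sample_window_)
            adapt();   // racy concurrent calls are tolerated in this sketch
    }

    // Worker threads consult this before spawning a new transaction.
    bool may_run(std::size_t thread_id) const {
        return thread_id < allowed_.load(std::memory_order_relaxed);
    }

private:
    void adapt() {
        std::size_t c = commits_.exchange(0, std::memory_order_relaxed);
        std::size_t a = aborts_.exchange(0, std::memory_order_relaxed);
        double conflict_rate = a ? double(a) / double(a + c) : 0.0;
        std::size_t cur = allowed_.load(std::memory_order_relaxed);
        // High conflict rate: throttle down; low conflict rate: scale back up.
        if (conflict_rate > 0.30 && cur > 1)
            allowed_.store(cur - 1, std::memory_order_relaxed);
        else if (conflict_rate < 0.10 && cur < max_threads_)
            allowed_.store(cur + 1, std::memory_order_relaxed);
    }

    const std::size_t max_threads_;
    std::atomic<std::size_t> allowed_;
    std::atomic<std::size_t> commits_{0}, aborts_{0};
    static constexpr std::size_t sample_window_ = 1000;
};
```

    A controller like the one described in the abstract would additionally place this metadata to avoid cross-processor sharing and use inter-thread conflict observations to drive affinity-aware thread migration; those aspects are omitted here.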

    Towards Exascale Scientific Metadata Management

    Advances in technology and computing hardware are enabling scientists from all areas of science to produce massive amounts of data using large-scale simulations or observational facilities. In this era of data deluge, effective coordination between the data production and analysis phases hinges on the availability of metadata that describe the scientific datasets. Existing workflow engines have been capturing a limited form of metadata to provide provenance information about the identity and lineage of the data. However, much of the data produced by simulations, experiments, and analyses still needs to be annotated manually, in an ad hoc manner, by domain scientists. Systematic and transparent acquisition of rich metadata becomes a crucial prerequisite to sustain and accelerate the pace of scientific innovation. Yet, a ubiquitous and domain-agnostic metadata management infrastructure that can meet the demands of extreme-scale science is notable by its absence. To address this gap in scientific data management research and practice, we present our vision for an integrated approach that (1) automatically captures and manipulates information-rich metadata while the data is being produced or analyzed and (2) stores metadata within each dataset to permeate metadata-oblivious processes and to query metadata through established and standardized data access interfaces. We motivate the need for the proposed integrated approach using applications from plasma physics, climate modeling, and neuroscience, and then discuss research challenges and possible solutions.

    Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

    Big data systems development is full of challenges in view of the variety of application areas and domains that this technology promises to serve. Typically, the fundamental design decisions involved in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies into an optimized solution for a specific real-world problem, big data systems are no exception. As far as the storage aspect of any big data system is concerned, the primary facet is the storage infrastructure, and NoSQL seems to be the right technology to fulfill its requirements. However, every big data application has different data characteristics, and thus its data fits a different data model. This paper presents a feature and use-case analysis and comparison of the four main data models, namely document-oriented, key-value, graph, and wide-column. Moreover, a feature analysis of 80 NoSQL solutions is provided, elaborating on the criteria and points that a developer must consider while making a choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings the second facet of big data storage, big data file formats, into the picture. The second half of the paper compares the advantages, shortcomings, and possible use cases of available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage, and their challenges and future prospects are also discussed.
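    To make the data-model comparison concrete, the hypothetical C++ snippet below contrasts how the same record might look under two of the four models discussed: as an opaque value in a key-value store versus as a structured document. The record fields are invented for illustration.

```cpp
// Hypothetical illustration of one user record under two NoSQL data models.
#include <iostream>
#include <map>
#include <string>

int main() {
    // Key-value model: an opaque value addressed by a single key; the store
    // knows nothing about the value's internal structure.
    std::map<std::string, std::string> kv_store;
    kv_store["user:42"] = R"({"name":"Ada","city":"London","orders":[17,23]})";

    // Document-oriented model: the store understands the nested structure,
    // so individual fields (e.g. city) can be indexed and queried directly.
    std::string document = R"({
        "_id":    "42",
        "name":   "Ada",
        "city":   "London",
        "orders": [ {"id": 17, "total": 9.5}, {"id": 23, "total": 4.0} ]
    })";

    std::cout << kv_store["user:42"] << "\n" << document << "\n";
}
```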

    Annotated Bibliography: Anticipation


    Near Data Processing for Efficient and Trusted Systems

    We live in a world which constantly produces data at a rate that only increases with time. Conventional processor architectures fail to process this abundant data in an efficient manner, as they expend significant energy in instruction processing and in moving data over deep memory hierarchies. Furthermore, to process large amounts of data in a cost-effective manner, there is increased demand for remote computation. While cloud service providers have come up with innovative solutions to cater to this increased demand, the security concerns users feel for their data remain a strong impediment to wide-scale adoption. An exciting technique in our repertoire to deal with these challenges is near-data processing. Near-data processing (NDP) is a data-centric paradigm which moves computation to where data resides. This dissertation exploits NDP both to process the data deluge we face efficiently and to design low-overhead secure hardware. To this end, we first propose Compute Caches, a novel NDP technique. Simple augmentations to the underlying SRAM design enable caches to perform commonly used operations. In-place computation in caches not only avoids excessive data movement over the memory hierarchy, but also significantly reduces instruction processing energy, as independent sub-units inside caches perform computation in parallel. Compute Caches significantly improve performance and reduce the energy expended for a suite of data-intensive applications. Second, this dissertation identifies security advantages of NDP. While the memory bus side channel has received much attention, a low-overhead hardware design which defends against it remains elusive. We observe that smart memory, memory with compute capability, can dramatically simplify this problem. To exploit this observation, we propose InvisiMem, which uses the logic layer in smart memory to implement cryptographic primitives that address the memory bus side channel efficiently. Our solutions obviate the need for expensive constructs like Oblivious RAM (ORAM) and Merkle trees, and have one to two orders of magnitude lower overheads for performance, space, energy, and memory bandwidth compared to prior solutions. This dissertation also addresses a related vulnerability, the page fault side channel, in which the operating system (OS) induces page faults to learn an application's address trace and deduces application secrets from it. To tackle it, we propose Sanctuary, which obfuscates the page fault channel while allowing the OS to manage memory as a resource. To do so, we design a novel construct, Oblivious Page Management (OPAM), which is derived from ORAM but customized for the page management context. We employ near-memory page moves to reduce OPAM overhead and also propose a novel memory partition to reduce the OPAM transactions required. For a suite of cloud applications which process sensitive data, we show that the page fault channel can be tackled at reasonable overheads.
    PhD thesis, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/144139/1/shaizeen_1.pd

    Formal Verification of a MESI-based Cache Implementation

    Cache coherency is crucial to multi-core systems with a shared-memory programming model. Coherency protocols have been formally verified at the architectural level with relative ease. However, several subtle issues creep into the hardware realization of a cache in a multi-processor environment. The assumption, made in the abstract model, that state transitions are atomic is invalid for the HDL implementation. Each transition is composed of many concurrent multi-core operations. As a result, even with a blocking bus, several transient states come into existence. Most modern processors optimize communication with a split-transaction bus; this results in further transient states and race conditions. Therefore, the design and verification of cache coherency is increasingly complex and challenging. Simulation techniques are insufficient to ensure memory consistency and the absence of deadlock, livelock, and starvation. At best, it is tediously complex and time-consuming to reach confidence in functionality with simulation alone. Formal methods are ideally suited to identify the numerous race conditions and subtle failures. In this study, we perform formal property verification on the RTL of a multi-core level-1 cache design based on the snooping MESI protocol. We demonstrate a full formal proof of the coherence module in JasperGold, using complexity reduction techniques through parameterization. We verify that the assumptions needed to constrain the inputs of the stand-alone cache coherence module are satisfied as valid assertions in the instantiation environment. We compare results obtained from formal property verification against a state-of-the-art UVM environment. We highlight the benefits of a synergistic collaboration between simulation and formal techniques. We present formal analysis as a generic toolkit with numerous usage models in the digital design process.
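    For orientation, the C++ sketch below gives a heavily simplified, architectural-level MESI transition function of the kind the verified module implements. It assumes atomic state transitions, which is exactly the assumption the abstract notes breaks down in the HDL implementation, and it omits the transient states and split-transaction races; the event names are illustrative.

```cpp
// Architectural-level MESI next-state function for a single cache line.
// Transient states and bus races (the hard part, per the abstract) are omitted.
#include <cstdio>

enum class State { M, E, S, I };

enum class Event {
    ProcRead, ProcWrite,          // requests from the local core
    BusRead, BusReadX, BusUpgr    // snooped requests from other cores
};

// Next state, assuming atomic transitions.
State next(State s, Event e, bool shared_line) {
    switch (e) {
    case Event::ProcRead:
        return (s == State::I) ? (shared_line ? State::S : State::E) : s;
    case Event::ProcWrite:
        return State::M;                  // I/S would first issue BusReadX/BusUpgr
    case Event::BusRead:
        return (s == State::M || s == State::E) ? State::S : s;
    case Event::BusReadX:
    case Event::BusUpgr:
        return State::I;                  // another core wants exclusive access
    }
    return s;
}

int main() {
    State s = State::I;
    s = next(s, Event::ProcRead, /*shared_line=*/false);  // I -> E
    s = next(s, Event::ProcWrite, false);                  // E -> M
    s = next(s, Event::BusRead, true);                     // M -> S (after write-back)
    std::printf("final state: %d\n", static_cast<int>(s));
}
```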

    Communicating through sound in museum exhibitions: unravelling a field of practice

    The twentieth century was the stage for several phenomena which paved the way for museums to start exhibiting sound and to nurture a vivid and increasing interest in its potentialities. The burgeoning of sound recording technologies stands as a milestone in this respect. These technologies have allowed sound to become a physical object and, hence, new understandings and conceptualizations to emerge. In the wake of these developments, the way in which museum curators look at sound has undergone a huge reconfiguration. The fact that both new museology and museum practice have been turning their attention and focus to the visitor has similarly accelerated curators' interest in sound as a means to build museum exhibitions. One of the latest and most striking instances in this process has been the role of ethnomusicology and sound studies in demonstrating the cultural, social, political, economic and ethical significance of sound, thereby stimulating museums' interest in dealing with sound as a mode to build both individual subjectivities and communities in museum settings. The development of audio technologies and of digital and multisensorial technologies (Virtual Reality, Augmented Reality and Mixed Reality) also plays a part in this process. These have the merit of providing ways to deal with the elusiveness of sound when exhibited in museum galleries and of facilitating interactions underpinned by rationales such as experience, embodiment, and emplacement. During at least the last ten years, there has been a boost in the development of sound-based multimodal museum practices. These practices, nonetheless, have yet to be mapped, and their representational and experiential (emotional and sensorial) opportunities have yet to be closely analysed. My thesis strives to start closing this gap by taking two analytical steps. First, based on the analysis of 69 sound-based multimodal museum exhibitions staged in Europe and in the United States of America, I provide a five-use framework categorizing sound-based multimodal museum practices into sound as a "lecturing" mode, sound as an artefact, sound as "ambiance"/soundtrack, sound as art, and sound as a mode for crowd-curation. Second, through the case study of the sound art piece The Visitors, I unravel the communicative potential of sound for museums. In detail, the analysis stresses how sound and space commingle to articulate individual subjectivities and a sense of "togetherness." The scope of the thesis is clearly multidisciplinary, encompassing ethnomusicology, sound studies, museum studies, and social semiotics. Overall, I seek to contribute to the development and establishment of the study of sound in museums as a cohesive research field. I moreover seek to foster a shift in sensory formation from a visual epistemology to one that merges the visual and the auditory.

    High-level synthesis of fine-grained weakly consistent C concurrency

    High-level synthesis (HLS) is the process of automatically compiling high-level programs into a netlist (a collection of gates). Given an input program, HLS tools exploit its inherent parallelism and pipelining opportunities to generate efficient customised hardware. C-based programs are the most popular input for HLS tools, but these tools have historically synthesised only sequential C programs. As the appeal of software concurrency rises, HLS tools are beginning to synthesise concurrent C programs, such as C/C++ pthreads and OpenCL. Although supporting software concurrency leads to better hardware parallelism, shared-memory synchronisation is typically serialised via locks to ensure correct memory behaviour. Locks are safety resources that ensure exclusive access to shared memory, eliminating data races and providing synchronisation guarantees for programmers. As an alternative to lock-based synchronisation, the C memory model also defines the possibility of lock-free synchronisation via fine-grained atomic operations ('atomics'). However, most HLS tools either do not support atomics at all or implement atomics using locks. Instead, we treat the synthesis of atomics as a scheduling problem. We show that we can augment the intra-thread memory constraints during memory scheduling of concurrent programs to support atomics. On average, hardware generated by our method is 7.5x faster than the state of the art for our set of experiments. Our method of synthesising atomics enables several unique possibilities. Chiefly, we are capable of supporting weakly consistent ('weak') atomics, which necessitate fewer ordering constraints compared to sequentially consistent (SC) atomics. However, implementing weak atomics is complex and error-prone, and hence we formally verify our methods via automated model checking to ensure that our generated hardware is correct. Furthermore, since the C memory model defines memory behaviour globally, we can analyse the entire program globally to generate its memory constraints. Additionally, we can also support loop pipelining by extending our methods to generate inter-iteration memory constraints. On average, weak atomics, global analysis, and loop pipelining improve performance by 1.6x, 3.4x, and 1.4x respectively, for our set of experiments. Finally, we present a case study of a real-world example, an HLS-based Google PageRank algorithm, whose performance improves by 4.4x via lock-free streaming and work-stealing.
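    To make the weak-versus-SC distinction concrete, the C++ sketch below shows the standard release/acquire message-passing idiom from the C/C++ memory model: the weaker orderings impose fewer constraints than the default sequentially consistent atomics, which is the slack a memory scheduler can exploit. This is a generic illustration, not code from the thesis.

```cpp
// Contrasting sequentially consistent and weaker atomics for a two-thread
// message-passing idiom; variable names are illustrative.
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;
std::atomic<bool> ready{false};

void producer() {
    payload = 42;                                   // plain write
    // Release ordering is enough: it forbids the payload write from being
    // reordered after the flag write, without the single global total order
    // that memory_order_seq_cst (the default) would additionally impose.
    ready.store(true, std::memory_order_release);
}

void consumer() {
    // Acquire pairs with the release above, so once the flag is observed,
    // the payload write is guaranteed to be visible.
    while (!ready.load(std::memory_order_acquire)) { /* spin */ }
    assert(payload == 42);
}

int main() {
    std::thread t1(producer), t2(consumer);
    t1.join();
    t2.join();
}
```

    The release/acquire pair only constrains the relative order of the payload and flag accesses, leaving other memory operations free to be reordered around them; under SC atomics every such access would also join one global total order.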

    Topic extraction in words networks


    Development of a parallel database environment
