122 research outputs found

    A Survey on Compiler Autotuning using Machine Learning

    Since the mid-1990s, researchers have been trying to use machine-learning-based approaches to solve a number of different compiler optimization problems. These techniques primarily enhance the quality of the obtained results and, more importantly, make it feasible to tackle two main compiler optimization problems: optimization selection (choosing which optimizations to apply) and phase-ordering (choosing the order of applying optimizations). The compiler optimization space continues to grow due to the advancement of applications, the increasing number of compiler optimizations, and new target architectures. Generic optimization passes in compilers cannot fully leverage newly introduced optimizations and, therefore, cannot keep up with the pace of increasing options. This survey summarizes and classifies the recent advances in using machine learning for the compiler optimization field, particularly on the two major problems of (1) selecting the best optimizations and (2) the phase-ordering of optimizations. The survey highlights the approaches taken so far, the obtained results, the fine-grained classification among different approaches and, finally, the influential papers of the field. Comment: version 5.0 (updated September 2018); preprint version of our accepted journal article at ACM CSUR 2018 (42 pages). This survey will be updated quarterly here (send me your newly published papers to be added in the subsequent version). History: received November 2016; revised August 2017; revised February 2018; accepted March 2018.

    A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

    Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology. This would help evaluate their applicability for solving similar problems. This taxonomy also provides a "gap analysis" of this area through which researchers can potentially identify new issues for investigation. Finally, we hope that the proposed taxonomy and mapping also help to provide an easy way for new practitioners to understand this complex area of research. Comment: 46 pages, 16 figures, Technical Report

    Vcluster: A Portable Virtual Computing Library For Cluster Computing

    Message passing has been the dominant parallel programming model in cluster computing, and libraries like Message Passing Interface (MPI) and Parallel Virtual Machine (PVM) have proven their novelty and efficiency through numerous applications in diverse areas. However, as clusters of Symmetric Multi-Processor (SMP) and heterogeneous machines become popular, conventional message passing models must be adapted accordingly to support this new kind of cluster efficiently. In addition, the Java programming language, with features such as an object-oriented architecture, platform-independent bytecode, and native support for multithreading, is an alternative language for cluster computing. This research presents a new parallel programming model and a library called VCluster that implements this model on top of a Java Virtual Machine (JVM). The programming model is based on virtual migrating threads to support clusters of heterogeneous SMP machines efficiently. VCluster is implemented in 100% Java, utilizing the portability of Java to address the problems of heterogeneous machines. VCluster virtualizes computational and communication resources such as threads, computation states, and communication channels across multiple separate JVMs, which makes a mobile thread possible. Equipped with virtual migrating threads, it is feasible to balance the load of computing resources dynamically. Several large-scale parallel applications have been developed using VCluster to compare the performance and usage of VCluster with other libraries. The results of the experiments show that VCluster makes it easier to develop multithreaded parallel applications compared to conventional libraries like MPI. At the same time, the performance of VCluster is comparable to MPICH, a widely used MPI library, combined with popular threading libraries like POSIX Threads and OpenMP.
In the next phase of our work, we implemented thread groups and thread migration to demonstrate the feasibility of dynamic load balancing in VCluster. We carried out experiments to show that the load can be dynamically balanced in VCluster, resulting in better performance. Thread groups also make it possible to implement collective communication functions between threads, which have proved useful in process-based libraries.

    Idaho National Laboratory Cultural Resource Management Plan


    A Survey of Large Language Models

    Language is essentially a complex, intricate system of human expressions governed by grammatical rules. It poses a significant challenge to develop capable AI algorithms for comprehending and grasping a language. As a major approach, language modeling has been widely studied for language understanding and generation in the past two decades, evolving from statistical language models to neural language models. Recently, pre-trained language models (PLMs) have been proposed by pre-training Transformer models over large-scale corpora, showing strong capabilities in solving various NLP tasks. Since researchers have found that model scaling can lead to performance improvement, they further study the scaling effect by increasing the model size to an even larger size. Interestingly, when the parameter scale exceeds a certain level, these enlarged language models not only achieve a significant performance improvement but also show some special abilities that are not present in small-scale language models. To discriminate the difference in parameter scale, the research community has coined the term large language models (LLMs) for the PLMs of significant size. Recently, the research on LLMs has been largely advanced by both academia and industry, and a remarkable advance is the launch of ChatGPT, which has attracted widespread attention from society. The technical evolution of LLMs has been making an important impact on the entire AI community, which would revolutionize the way we develop and use AI algorithms. In this survey, we review the recent advances of LLMs by introducing the background, key findings, and mainstream techniques. In particular, we focus on four major aspects of LLMs, namely pre-training, adaptation tuning, utilization, and capacity evaluation. Besides, we also summarize the available resources for developing LLMs and discuss the remaining issues for future directions. Comment: ongoing work; 51 pages

    Mycosphaerella leaf disease on eucalypts in Western Australia - The diversity and impact

    Eucalyptus plantation forestry in Western Australia (WA) is a relatively young industry, and by the end of 2008 the total plantation estate (softwood and hardwood) was over 950 000 ha. The predominant plantation species is Eucalyptus globulus, native to south-eastern Australia. In WA, the most serious foliar disease of eucalypt plantations is Mycosphaerella Leaf Disease (MLD). However, little systematic sampling for MLD has been carried out in WA to determine its impact on plantations and yields, the species involved, or whether those species are introduced. The overall aim of this thesis was to investigate MLD in south-western Australia with a particular focus on the species diversity, taxonomy and the impact on the early growth of E. globulus. The increase in the number of Mycosphaerella and Teratosphaeria species associated with MLD in E. globulus plantations in WA in the past decade has raised concern about the possible movement of pathogens between the native forests and plantations and vice versa. A survey of necrotic leaf spots collected from plantation and endemic eucalypts from WA and Queensland was conducted. Overall, ten new Eucalyptus host records for Mycosphaerella/Teratosphaeria species were isolated from WA and five from Queensland. Significantly, M. nubilosa was isolated from E. grandis x resinifera and E. urophylla x globulus in WA. This is the first time M. nubilosa has been isolated from Eucalyptus hosts within the series Resinifera (see Chapter 2). An assessment of the number of fungi that may be contributing to MLD in E. globulus plantations in WA was undertaken (Chapter 3), together with the changes in the number of species and their incidence since the first surveys were conducted. Four new records of Mycosphaerella were identified in this study: M. ellipsoidea, P. fori, M. suttoniae and M. tasmaniensis. Mycosphaerella ellipsoidea and P. fori are first records for Australia, and M. suttoniae and M. tasmaniensis are first records for WA.
The current work shows an increase in the number of Mycosphaerella species associated with plantation eucalypts in WA and Australia. With the exception of M. cryptica, none of these species were known in WA prior to the commencement of large-scale E. globulus plantations and, again with the exception of M. cryptica, none have a known impact on the major native eucalypts in the region. The ITS region of the type material of T. parva, M. grandis and M. gregaria was sequenced from culture and herbarium specimens and compared to existing sequences from GenBank (Chapter 4). This was the first study to examine and sequence the type material of M. grandis, T. parva and M. gregaria. As the sequences of the ITS region of M. grandis and T. parva were identical, it was concluded that M. grandis be reduced to synonymy with T. parva. Mycosphaerella aurantia, M. buckinghamiae and M. africana also match the type sequence of M. gregaria. Therefore, these should all be synonymised with M. gregaria. Also, this study was the first to describe ITS sequence variation within the same Mycosphaerella isolate. The aim of Chapter 5 was to identify the infection pathway at the leaf surface using scanning electron microscopy and to determine the pathogenicity of M. marksii on E. globulus. The use of glycerol as a surfactant and its effect on ascospore viability was also assessed. This study was unable to confirm pathogenicity of M. marksii on E. globulus seedlings under laboratory conditions. Nevertheless, M. marksii ascospores were able to germinate and enter E. globulus stomata 3–6 days after initial infection. Species-specific primers were successfully designed and tested for three Mycosphaerella species that occur on E. globulus in WA (Chapter 6). Meteorological conditions, rather than MLD, appeared to determine the defoliation of juvenile foliage, as levels of MLD remained relatively low throughout the trial period.
The MLD levels increased throughout spring as warm, wet conditions favoured the development of disease, especially on the flush of new juvenile foliage. Also, new foliage emerged after late summer rainfall. As disease pressure mounted, the trees responded through defoliation. As temperatures increased and the juvenile foliage aged, there is likely to have been an increase in the defoliation of leaves. Therefore, by mid-summer defoliation levels reached a similar level to disease and insect damage. Following leaf defoliation and the emergence of new juvenile and adult leaves, the relative amount of disease on the trees decreased. This is because most of the disease was present on the older juvenile foliage, which was shed. Field observations can be a reliable indication of disease progression. Although field observations at a branch level exaggerated levels of MLD when there was a higher level of foliage, there was still a similar trend in the amount of disease when compared to the ASSESS program. Some experience in disease monitoring would likely yield a more accurate assessment of MLD. It is interesting to note that the assessors, including the author, tended to overestimate disease when MLD was at a higher level. Infection studies of Uwebraunia dekkeri were conducted to confirm how this species enters E. globulus leaves and to determine its pathogenicity (Chapter 7). This study demonstrated that conidia of U. dekkeri could infect E. globulus leaves and that it is not a hyperparasite of M. cryptica or M. nubilosa. Conidiogenesis was both percurrent and sympodial, and the phenomenon of anastomosis was observed for the first time on the leaf surface. The impact that MLD has on wood volume has previously not been investigated in WA (Chapter 8). Through the application of pesticides and fungicides in the early stages of establishment at two plantations near Albany, tree volumes were significantly increased.
However, the increase in wood volume would be offset by the pesticide and application costs. This study demonstrated that monitoring for pests and disease would be more effective than spraying of chemical treatments for the first three years. The regular use of chemical treatments is expensive to maintain and is proving to be environmentally unacceptable to some communities. This study also showed that spraying for low levels of MLD had little effect on disease incidence and/or volume increase in E. globulus plantations in WA. The most important factors for a healthy plantation appear to be site selection, preparation and tree genetics. This study was the first to investigate the impact of MLD on the growth of Eucalyptus globulus plantations in WA. As part of this study, the biology, taxonomy and pathogenicity of the main species present in WA were investigated. The key findings were: i) the number, abundance and distribution of Mycosphaerella/Teratosphaeria species in WA is not static and plantations should be continually monitored for the presence of new potentially threatening species; ii) spraying for MLD, although effective in reducing the prevalence and impact on growth, was not economically viable; and iii) intragenomic variation of the ribosomal genome may explain sequence variation observed in single spore isolates of Mycosphaerella/Teratosphaeria and this has taxonomic implications. Further work should identify the impact the new records are having on the plantation estate and also whether these species have the potential to spread into the neighbouring endemic forests. This study has provided a broader understanding of MLD in WA and the development of tools that could be used for further study.