446 research outputs found

    Microbial metal resistance and metabolism across dynamic landscapes: high-throughput environmental microbiology.

    Multidimensional gradients of inorganic compounds influence microbial activity in diverse pristine and anthropogenically perturbed environments. Here, we suggest that high-throughput cultivation and genetics can be systematically applied to generate quantitative models linking gene function, microbial community activity, and geochemical parameters. Metal resistance determinants represent a uniquely universal set of parameters around which to study and evaluate microbial fitness because they represent a record of the environment in which all microbial life evolved. By cultivating microbial isolates and enrichments in laboratory gradients of inorganic ions, we can generate quantitative predictions of limits on microbial range in the environment, obtain more accurate gene annotations, and identify useful strategies for predicting and engineering the trajectory of natural ecosystems.

    Using Agent-Based Modelling to Investigate Intervention Algorithms to Reduce Polarisation in Online Social Networks

    Across much of the western world, political polarisation is on the rise. This has the effect of hindering political discourse, stifling open discussion, and in extreme cases has led to violence. The process of polarising and radicalising vulnerable individuals has migrated to social media websites, which have been implicated in several high-profile terror attacks. Within this thesis we model and investigate various algorithms to prevent the spread of polarisation and extremist ideology by employing agent-based modelling techniques from the field of opinion dynamics. The contributions of our work include the following aspects. Firstly, we have developed a unified framework for opinion dynamics, allowing us to experiment easily on a number of different existing models and bringing together sometimes disparate innovations from across the field into one system. Secondly, this unified framework has been implemented in a modular simulator able to perfectly replicate results from purpose-built, stand-alone simulators for two widely used models, namely Relative Agreement and CODA, and then released to the public as the first general-purpose opinion dynamics simulator. Thirdly, we have developed two new intervention algorithms, along with a new metric for measuring the effectiveness of an intervention strategy, which aim to reduce the spread of polarisation across a network with low computational cost. These methods are compared to existing centrality-based methods upon a random network. The experimental results show our proposed approaches outperform centrality measures. We find that our algorithms are able to prevent up to 40% of non-extremist agents becoming extreme by removing only 10% of the network’s edges. Fourthly, we have investigated the efficacy of these intervention algorithms on polarisation under different scenarios (e.g. variable costs, different network structures). The experimental validation shows that the proposed approach is robust and performs favourably compared with existing methods, such as centrality-based methods, especially on the second type of network. Finally, we have developed a broadcast-based communication system for agents, designed to mimic the one-way broadcast nature of a public social media post, such as one on Twitter, in contrast to the existing model, which emulates a two-way private conversation. The experimental result shows a lessening of the impact of our interventions, demonstrating the need for further investigation of such communication methods.

    Cold Fusion: Training Seq2Seq Models Together with Language Models

    Sequence-to-sequence (Seq2Seq) models with attention have excelled at tasks which involve generating natural language sentences, such as machine translation, image captioning and speech recognition. Performance has further been improved by leveraging unlabeled data, often in the form of a language model. In this work, we present the Cold Fusion method, which leverages a pre-trained language model during training, and show its effectiveness on the speech recognition task. We show that Seq2Seq models with Cold Fusion are able to better utilize language information, enjoying i) faster convergence and better generalization, and ii) almost complete transfer to a new domain while using less than 10% of the labeled training data.

    Forest Management in Coastal Pine Forests: An Investigation of Prescribed Fire Behavior, Detrital Chemical Composition, and Potential Water Quality Impacts

    Prescribed fire, thinning, and mastication are common forest management practices implemented in southern pine forests. These practices affect ecosystem properties differently depending upon the intensity at which they are implemented. One ecosystem property of interest is the chemical composition of forest detritus, commonly referred to as the litter and duff. This material is largely responsible for the replenishment of organic resources into soils. It may also be a primary contributor to surface water quality. In this study we were given an opportunity to evaluate two long-term forest management strategies at two sites along the South Carolina coastal plain to determine their effects on forest detrital chemical composition and potential water quality: 1) frequent prescribed fire (annual and biennial) and 2) a combination of periodic prescribed fire (every 3-4 years) and singular implementations of tree thinning and understory mastication. Based upon our analyses, we confirmed that the prescribed fires implemented on these sites display the characteristics of low-intensity, low-severity surface fires. As such, fuel quantities decreased as a result of forest management at both sites. At one of our sites, the Tom Yawkey Wildlife Center in Georgetown, South Carolina, the chemical functional groups of forest detritus were not greatly altered by fire. Specific compounds within these groups may have been affected by fire, but returned to or fell below long-term unburned levels within one year post-fire. On our other site, the Santee Experimental Forest, it appears that long-term forest management has altered overstory species composition and subsequently detrital chemical composition. At both sites, potential organic pollutants were reduced by the forest management practices. This reduction may be beneficial in terms of water treatment and human health. These results add to the long list of benefits noted in the literature for active forest management, particularly the benefits of prescribed fire.

    Scaling Deep Learning on GPU and Knights Landing clusters

    The speed of deep neural network training has become a big bottleneck of deep learning research and development. For example, training GoogleNet on the ImageNet dataset with one Nvidia K20 GPU needs 21 days. To speed up the training process, current deep learning systems rely heavily on hardware accelerators. However, these accelerators have limited on-chip memory compared with CPUs. To handle large datasets, they need to fetch data from either CPU memory or remote processors. We use both self-hosted Intel Knights Landing (KNL) clusters and multi-GPU clusters as our target platforms. From an algorithm aspect, current distributed machine learning systems are mainly designed for cloud systems. These methods are asynchronous because of the slow network and high fault-tolerance requirement on cloud systems. We focus on Elastic Averaging SGD (EASGD) to design algorithms for HPC clusters. The original EASGD uses a round-robin method for communication and updating. The communication is ordered by the machine rank ID, which is inefficient on HPC clusters. First, we redesign four efficient algorithms for HPC systems to improve EASGD's poor scaling on clusters. Async EASGD, Async MEASGD, and Hogwild EASGD are faster than their existing counterparts (Async SGD, Async MSGD, and Hogwild SGD, resp.) in all the comparisons. Finally, we design Sync EASGD, which ties for the best performance among all the methods while being deterministic. In addition to the algorithmic improvements, we use some system-algorithm codesign techniques to scale up the algorithms. By reducing the percentage of communication from 87% to 14%, our Sync EASGD achieves 5.3x speedup over original EASGD on the same platform. We get 91.5% weak scaling efficiency on 4253 KNL cores, which is higher than the state-of-the-art implementation.
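As a rough illustration of the elastic-averaging idea that EASGD builds on, the sketch below runs a synchronous variant on a toy quadratic objective. It uses the commonly stated EASGD update (each worker takes a gradient step plus an elastic pull toward a shared center, and the center moves toward the workers). It is not the Sync EASGD implementation from the abstract above, and `sync_easgd`, `noisy_quadratic_grad`, `TARGET`, and all hyperparameter values are hypothetical choices made for the example.

```python
import random


def sync_easgd(grad, x0, workers=4, eta=0.1, rho=0.5, steps=500):
    """Toy synchronous elastic-averaging SGD on plain Python lists.

    grad: function mapping a parameter list to a (stochastic) gradient list.
    eta:  learning rate; rho: strength of the elastic pull toward the center.
    """
    center = list(x0)
    local = [list(x0) for _ in range(workers)]
    for _ in range(steps):
        # Every worker reads the same snapshot of the center in this round.
        snapshot = list(center)
        for x in local:
            g = grad(x)  # gradient at the worker's current parameters
            for k in range(len(x)):
                diff = x[k] - snapshot[k]
                # Center drifts toward the worker ...
                center[k] += eta * rho * diff
                # ... while the worker descends and is pulled toward the center.
                x[k] -= eta * (g[k] + rho * diff)
    return center, local


# Hypothetical objective: f(x) = 0.5 * ||x - TARGET||^2 with small gradient noise.
TARGET = [1.0, -2.0]
_rng = random.Random(0)


def noisy_quadratic_grad(x):
    return [xi - ti + _rng.gauss(0.0, 0.01) for xi, ti in zip(x, TARGET)]


center, local = sync_easgd(noisy_quadratic_grad, [0.0, 0.0])
```

In this toy setting the center variable ends up close to `TARGET`. The workers run sequentially here, so the sketch says nothing about the communication cost or scaling behavior discussed in the abstract.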

    Training Big Random Forests with Little Resources

    Without access to large compute clusters, building random forests on large datasets is still a challenging problem. This is, in particular, the case if fully-grown trees are desired. We propose a simple yet effective framework that allows the efficient construction of ensembles of huge trees for hundreds of millions or even billions of training instances using a cheap desktop computer with commodity hardware. The basic idea is to consider a multi-level construction scheme, which builds top trees for small random subsets of the available data and which subsequently distributes all training instances to the top trees' leaves for further processing. While being conceptually simple, the overall efficiency crucially depends on the particular implementation of the different phases. The practical merits of our approach are demonstrated using dense datasets with hundreds of millions of training instances. Comment: 9 pages, 9 figures
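The multi-level scheme described in this abstract (a top tree built on a small random subset, all instances routed to its leaves, then further growth per leaf) can be sketched for a single tree as follows. This is a minimal in-memory illustration under simplifying assumptions, not the authors' implementation: the median-threshold split rule, the dictionary-based tree layout, and every function name (`grow`, `find_leaf`, `multilevel_tree`, `predict`) are hypothetical.

```python
import random
from collections import Counter


def gini(rows):
    n = len(rows)
    counts = Counter(label for _, label in rows)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(rows):
    """Try one median threshold per feature; return the lowest-impurity split."""
    best = None
    for f in range(len(rows[0][0])):
        values = sorted(x[f] for x, _ in rows)
        thr = values[len(values) // 2]
        left = [r for r in rows if r[0][f] < thr]
        right = [r for r in rows if r[0][f] >= thr]
        if not left:  # threshold does not separate anything on this feature
            continue
        score = len(left) * gini(left) + len(right) * gini(right)
        if best is None or score < best[0]:
            best = (score, f, thr, left, right)
    return best


def grow(rows, max_depth=None, depth=0):
    """Grow a tree until nodes are pure, unsplittable, or max_depth is hit."""
    labels = Counter(label for _, label in rows)
    split = None
    if len(labels) > 1 and depth != max_depth:
        split = best_split(rows)
    if split is None:
        return {"label": labels.most_common(1)[0][0], "rows": []}
    _, f, thr, left, right = split
    return {"feature": f, "thr": thr,
            "left": grow(left, max_depth, depth + 1),
            "right": grow(right, max_depth, depth + 1)}


def find_leaf(node, x):
    while "feature" in node:
        node = node["left"] if x[node["feature"]] < node["thr"] else node["right"]
    return node


def leaves(node):
    if "feature" in node:
        yield from leaves(node["left"])
        yield from leaves(node["right"])
    else:
        yield node


def multilevel_tree(data, subset_size, top_depth, rng):
    # Phase 1: shallow top tree from a small random subset.
    top = grow(rng.sample(data, subset_size), max_depth=top_depth)
    # Phase 2: distribute ALL training instances to the top tree's leaves.
    for row in data:
        find_leaf(top, row[0])["rows"].append(row)
    # Phase 3: replace each top leaf by a subtree grown on its own bucket.
    for leaf in list(leaves(top)):
        bucket = leaf["rows"]
        if bucket:
            leaf.clear()
            leaf.update(grow(bucket))
    return top


def predict(tree, x):
    return find_leaf(tree, x)["label"]
```

A forest would repeat `multilevel_tree` with different random subsets and combine the predictions. The sketch keeps every leaf bucket in memory, so it does not show the implementation details that the abstract says determine overall efficiency.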