LuxR-family 'solos': bachelor sensors/regulators of signalling molecules.
N-Acylhomoserine lactone (AHL) quorum-sensing (QS) signalling is the best-understood chemical language in proteobacteria. In the last 15 years, a large amount of research in several bacterial species has revealed in detail the genetic, molecular and biochemical mechanisms underlying AHL signalling. These studies have revealed the role played by protein pairs consisting of an AHL synthase belonging to the LuxI family and a cognate LuxR-family AHL sensor–regulator. Proteobacteria, however, commonly possess a QS LuxR-family protein for which there is no obvious cognate LuxI synthase; these proteins are found in bacteria which possess a complete AHL QS system(s) as well as in bacteria that do not. Scientists are beginning to address the roles played by these proteins, and it is emerging that they could allow bacteria to respond to endogenous and exogenous signals produced by their neighbours. AHL QS research thus far has mainly focused on a cell-density response involving laboratory monoculture studies. Recent findings on the role played by the unpaired LuxR-family proteins highlight the need to address bacterial behaviour and response to signals in mixed communities. Here we review recent progress with respect to these LuxR proteins, which we propose to call LuxR 'solos' since they act on their own without the need for a cognate signal generator. Initial investigations have revealed that LuxR solos have diverse roles in bacterial interspecies and interkingdom communication.
Scalable Distributed DNN Training using TensorFlow and CUDA-Aware MPI: Characterization, Designs, and Performance Evaluation
TensorFlow has been the most widely adopted Machine/Deep Learning framework.
However, little exists in the literature that provides a thorough understanding
of the capabilities which TensorFlow offers for the distributed training of
large ML/DL models that need computation and communication at scale. Most
commonly used distributed training approaches for TF can be categorized as
follows: 1) Google Remote Procedure Call (gRPC), 2) gRPC+X: X=(InfiniBand
Verbs, Message Passing Interface, and GPUDirect RDMA), and 3) No-gRPC: Baidu
Allreduce with MPI, Horovod with MPI, and Horovod with NVIDIA NCCL. In this
paper, we provide an in-depth performance characterization and analysis of
these distributed training approaches on various GPU clusters including the Piz
Daint system (No. 6 on the Top500 list). We perform experiments to gain novel insights along
the following vectors: 1) Application-level scalability of DNN training, 2)
Effect of Batch Size on scaling efficiency, 3) Impact of the MPI library used
for no-gRPC approaches, and 4) Type and size of DNN architectures. Based on
these experiments, we present two key insights: 1) Overall, No-gRPC designs
achieve better performance compared to gRPC-based approaches for most
configurations, and 2) The performance of No-gRPC is heavily influenced by the
gradient aggregation using Allreduce. Finally, we propose a truly CUDA-Aware
MPI Allreduce design that exploits CUDA kernels and pointer caching to perform
large reductions efficiently. Our proposed designs offer 5-17X better
performance than NCCL2 for small and medium messages, and reduce latency by
29% for large messages. The proposed optimizations help Horovod-MPI to achieve
approximately 90% scaling efficiency for ResNet-50 training on 64 GPUs.
Further, Horovod-MPI achieves 1.8X and 3.2X higher throughput than the native
gRPC method for ResNet-50 and MobileNet, respectively, on the Piz Daint
cluster. Comment: 10 pages, 9 figures, submitted to IEEE IPDPS 2019 for peer-review
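
As a rough illustration of two quantities the abstract above relies on, the sketch below computes scaling efficiency (measured throughput divided by ideal linear throughput) and performs a toy, in-process gradient average of the kind an Allreduce produces. It is not the paper's implementation and uses no MPI, NCCL or Horovod calls; the function names and all throughput numbers are hypothetical and chosen only so that the result lands at 90%.

```python
from typing import List


def scaling_efficiency(throughput_n: float, throughput_1: float, n_workers: int) -> float:
    """Fraction of ideal linear speedup achieved when using n_workers."""
    if n_workers <= 0 or throughput_1 <= 0:
        raise ValueError("n_workers and throughput_1 must be positive")
    return throughput_n / (n_workers * throughput_1)


def allreduce_average(worker_grads: List[List[float]]) -> List[float]:
    """Toy stand-in for Allreduce: element-wise mean over all workers' gradients.

    A real Allreduce runs across processes or GPUs; this only shows the arithmetic.
    """
    n = len(worker_grads)
    return [sum(values) / n for values in zip(*worker_grads)]


# Hypothetical throughputs in images/second, not taken from the paper.
single_gpu = 100.0
sixty_four_gpus = 5760.0
print(f"{scaling_efficiency(sixty_four_gpus, single_gpu, 64):.0%}")

# Two hypothetical workers, each holding a three-element gradient.
print(allreduce_average([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]))
```

Because every worker must receive the same averaged gradient before the next step, the cost of this aggregation grows in importance as the worker count rises, which is consistent with the abstract's observation that Allreduce dominates No-gRPC performance.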
Flover: A Temporal Fusion Framework for Efficient Autoregressive Model Parallel Inference
Autoregressive models, despite their commendable performance in a myriad of
generative tasks, face challenges stemming from their inherently sequential
structure. Inference on these models, by design, harnesses a temporal
dependency, where the current token's probability distribution is conditioned
on preceding tokens. This inherent characteristic severely impedes
computational efficiency during inference, as a typical inference request can
require thousands of tokens, and generating each token requires
loading the entire model weights, making inference memory-bound. The large
overhead becomes pronounced in real deployments where requests arrive randomly,
necessitating various generation lengths. Existing solutions, such as dynamic
batching and concurrent instances, introduce significant response delays and
bandwidth contention, falling short of achieving optimal latency and
throughput. To address these shortcomings, we propose Flover -- a temporal
fusion framework for efficiently inferring multiple requests in parallel. We
deconstruct the general generation pipeline into pre-processing and token
generation, and equip the framework with a dedicated work scheduler for fusing
the generation process temporally across all requests. By orchestrating the
token-level parallelism, Flover exhibits optimal hardware efficiency and
significantly conserves system resources. By further employing a fast buffer
reordering algorithm that allows memory eviction of finished tasks, it brings
over 11x inference speedup on GPT and 16x on LLAMA compared to the cutting-edge
solutions provided by NVIDIA FasterTransformer. Crucially, by leveraging the
advanced tensor parallel technique, Flover proves efficacious across diverse
computational landscapes, from single-GPU setups to distributed scenarios,
thereby offering robust performance optimization that adapts to variable use
cases. Comment: In Proceedings of the 30th IEEE International Conference on High
Performance Computing, Data, and Analytics (HiPC
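
The token-level scheduling idea described in the abstract above can be sketched as a small simulation. This is only an illustration under stated assumptions, not Flover's actual scheduler: the names `Request`, `fake_next_token` and `run_fused` are hypothetical, the "model" is a placeholder that returns a label, and the buffer compaction is an ordinary list rebuild. Requests arriving at different steps join a single shared generation loop, each step produces one token for every active request, and finished requests are evicted so the active buffer stays dense.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Request:
    req_id: int
    arrival_step: int          # step at which the request enters the system
    length: int                # number of tokens to generate (assumed >= 1)
    tokens: List[str] = field(default_factory=list)


def fake_next_token(req: Request) -> str:
    """Placeholder for a model forward pass (hypothetical)."""
    return f"r{req.req_id}t{len(req.tokens)}"


def run_fused(requests: List[Request]) -> Dict[int, int]:
    """Return the step at which each request finished, keyed by req_id."""
    pending = sorted(requests, key=lambda r: r.arrival_step)
    active: List[Request] = []
    finished: Dict[int, int] = {}
    step = 0
    while pending or active:
        # Admit newly arrived requests into the shared loop.
        while pending and pending[0].arrival_step <= step:
            active.append(pending.pop(0))
        # One fused step: every active request advances by one token.
        for req in active:
            req.tokens.append(fake_next_token(req))
        # Evict finished requests and compact the active buffer.
        still_running = []
        for req in active:
            if len(req.tokens) >= req.length:
                finished[req.req_id] = step
            else:
                still_running.append(req)
        active = still_running
        step += 1
    return finished


demo = [Request(0, 0, 3), Request(1, 1, 2), Request(2, 5, 1)]
print(run_fused(demo))
```

In this toy run the second request joins one step after the first and both advance together, rather than the late arrival waiting for a new batch to form.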
Negative Regulation of Violacein Biosynthesis in Chromobacterium violaceum
In Chromobacterium violaceum, the purple pigment violacein is under positive regulation by the N-acylhomoserine lactone CviI/R quorum sensing system and negative regulation by an uncharacterized putative repressor. In this study we report that the biosynthesis of violacein is negatively controlled by a novel repressor protein, VioS. The violacein operon is regulated negatively by VioS and positively by the CviI/R system in both C. violaceum and in a heterologous Escherichia coli genetic background. VioS does not regulate the CviI/R system and, apart from violacein, VioS and quorum sensing regulate other phenotypes antagonistically. Quorum sensing regulated phenotypes in C. violaceum are therefore further regulated, providing an additional level of control.
The Case for Co-Designing Model Architectures with Hardware
While GPUs are responsible for training the vast majority of state-of-the-art
deep learning models, the implications of their architecture are often
overlooked when designing new deep learning (DL) models. As a consequence,
modifying a DL model to be more amenable to the target hardware can
significantly improve the runtime performance of DL training and inference. In
this paper, we provide a set of guidelines for users to maximize the runtime
performance of their transformer models. These guidelines have been created by
carefully considering the impact of various model hyperparameters controlling
model shape on the efficiency of the underlying computation kernels executed on
the GPU. We find the throughput of models with efficient model shapes is up to
39% higher while preserving accuracy compared to models with a similar number
of parameters but with unoptimized shapes.
Functional metagenomic analysis of quorum sensing signaling in a nitrifying community.
Quorum sensing (QS) can shape microbial community interactions, composition, and function. In wastewater treatment systems, acylated homoserine lactone (AHL)-based QS has been correlated with the conversion of floccular biomass into microbial granules, as well as EPS production and the nitrogen removal process. However, the role of QS in such complex communities is still not fully understood, including the QS-proficient taxa and the functional QS genes involved. To address these questions, we performed a metagenomic screen for AHL genes in an activated sludge microbial community from the Ulu Pandan wastewater treatment plant (WWTP) in Singapore, followed by functional validation of luxI activity using AHL biosensors and LC-MS/MS profiling. We identified 13 luxI and 30 luxR homologs from the activated sludge metagenome. Of those genes, two represented a cognate pair of luxIR genes belonging to a Nitrospira spp., and those genes were demonstrated to be functionally active. The LuxI homolog synthesized AHLs that were consistent with the dominant AHLs in the activated sludge system. Furthermore, the LuxR homolog was shown to bind to and induce expression of the luxI promoter, suggesting this represents an autoinduction feedback system, characteristic of QS circuits. Additionally, a second, active promoter was upstream of a gene encoding a protein with a GGDEF/EAL domain, commonly associated with modulating the intracellular concentration of the secondary messenger, c-di-GMP. Thus, the metagenomic approach used here was demonstrated to effectively identify functional QS genes and suggests that Nitrospira spp. may be QS-active in the activated sludge community.
Carbon starvation of Pseudomonas aeruginosa biofilms selects for dispersal insensitive mutants.
BACKGROUND: Biofilms disperse in response to specific environmental cues, such as reduced oxygen concentration, changes in nutrient concentration and exposure to nitric oxide. Interestingly, biofilms do not completely disperse under these conditions, which is generally attributed to physiological heterogeneity of the biofilm. However, our results suggest that genetic heterogeneity also plays an important role in the non-dispersing population of P. aeruginosa in biofilms after nutrient starvation. RESULTS: In this study, 12.2% of the biofilm failed to disperse after 4 d of continuous starvation-induced dispersal. Cells were recovered from the dispersal phase as well as the remaining biofilm. For 96 h starved biofilms, rugose small colony variants (RSCV) were found to be present in the biofilm, but were not observed in the dispersal effluent. In contrast, wild type and small colony variants (SCV) were found in high numbers in the dispersal phase. Genome sequencing of these variants showed that most had single nucleotide mutations in genes associated with biofilm formation, e.g. in wspF, pilT, fha1 and aguR. Complementation of those mutations restored starvation-induced dispersal from the biofilms. Because c-di-GMP is linked to biofilm formation and dispersal, we introduced a c-di-GMP reporter into the wild-type P. aeruginosa and monitored green fluorescent protein (GFP) expression before and after starvation-induced dispersal. Post dispersal, the microcolonies were smaller and significantly brighter in GFP intensity, suggesting the relative concentration of c-di-GMP per cell within the microcolonies was also increased. Furthermore, only the RSCV showed increased c-di-GMP, while wild type and SCV were no different from the parental strain. CONCLUSIONS: This suggests that while starvation can induce dispersal from the biofilm, it also results in strong selection for mutants that overproduce c-di-GMP and that fail to disperse in response to the dispersal cue, starvation
The afc antifungal activity cluster, which is under tight regulatory control of ShvR, is essential for transition from intracellular persistence of Burkholderia cenocepacia to acute pro-inflammatory infection.
The opportunistic pathogen Burkholderia cenocepacia is particularly life-threatening for cystic fibrosis (CF) patients. Chronic lung infections with these bacteria can rapidly develop into fatal pulmonary necrosis and septicaemia. We have recently shown that macrophages are a critical site for replication of B. cenocepacia K56-2 and the induction of fatal pro-inflammatory responses using a zebrafish infection model. Here, we show that ShvR, a LysR-type transcriptional regulator that is important for biofilm formation, rough colony morphotype and inflammation in a rat lung infection model, is also required for the induction of fatal pro-inflammatory responses in zebrafish larvae. ShvR was not essential, however, for bacterial survival and replication in macrophages. Temporal, rhamnose-induced restoration of shvR expression in the shvR mutant during intramacrophage stages unequivocally demonstrated a key role for ShvR in transition from intracellular persistence to acute fatal pro-inflammatory disease. ShvR has been previously shown to tightly control the expression of the adjacent afc gene cluster, which specifies the synthesis of a lipopeptide with antifungal activity. Mutation of afcE, encoding an acyl-CoA dehydrogenase, has been shown to give similar phenotypes as the shvR mutant. We found that, like shvR, afcE is also critical for the switch from intracellular persistence to fatal infection in zebrafish. The closely related B. cenocepacia H111 has been shown to be less virulent than K56-2 in several infection models, including Galleria mellonella and rats. Interestingly, constitutive expression of shvR in H111 increased virulence in zebrafish larvae to almost K56-2 levels in a manner that absolutely required afc. These data confirm a critical role for afc in acute virulence caused by B. cenocepacia that depends on strain-specific regulatory control by ShvR. 
We propose that ShvR and AFC are important virulence factors of the more virulent Bcc species, either through pro-inflammatory effects of the lipopeptide AFC, or through AFC-dependent membrane properties