Tolerating Correlated Failures in Massively Parallel Stream Processing Engines
Fault-tolerance techniques for stream processing engines can be categorized
into passive and active approaches. A typical passive approach periodically
checkpoints a processing task's runtime states and can recover a failed task by
restoring its runtime state using its latest checkpoint. On the other hand, an
active approach usually employs backup nodes to run replicated tasks. Upon
failure, the active replica can take over the processing of the failed task
with minimal latency. However, both approaches have their own inadequacies in
Massively Parallel Stream Processing Engines (MPSPE). The passive approach
incurs a long recovery latency, especially when a number of correlated nodes
fail simultaneously, while the active approach requires extra replication
resources. In this paper, we propose a new fault-tolerance framework, which is
Passive and Partially Active (PPA). In a PPA scheme, the passive approach is
applied to all tasks while only a selected set of tasks will be actively
replicated. The number of actively replicated tasks depends on the available
resources. If tasks without active replicas fail, tentative outputs will be
generated before the completion of the recovery process. We also propose
effective and efficient algorithms to optimize a partially active replication
plan to maximize the quality of tentative outputs. We implemented PPA on top of
Storm, an open-source MPSPE, and conducted extensive experiments using both real
and synthetic datasets to verify the effectiveness of our approach.
Auto-tuning Distributed Stream Processing Systems using Reinforcement Learning
Fine-tuning distributed systems is considered a craft, relying
on intuition and experience. This becomes even more challenging when the
systems need to react in near real time, as streaming engines have to do to
maintain pre-agreed service quality metrics. In this article, we present an
automated approach that builds on a combination of supervised and reinforcement
learning methods to recommend the most appropriate lever configurations based
on previous load. With this, streaming engines can be automatically tuned
without requiring a human to determine the right way and proper time to deploy
them. This opens the door to new configurations that are not being applied
today since the complexity of managing these systems has surpassed the
abilities of human experts. We show how reinforcement learning systems can find
substantially better configurations in less time than their human counterparts
and adapt to changing workloads.
Microbial processes and bacterial populations associated to anaerobic treatment of sulfate-rich wastewater
A pilot-scale (1.2 m³) anaerobic sequencing batch biofilm reactor (ASBBR) containing mineral coal for biomass attachment was fed with sulfate-rich wastewater at increasing sulfate concentrations. Ethanol was used as the main organic source. The tested COD/sulfate ratios were 1.8 and 1.5 for sulfate loading rates of 0.65–1.90 kg SO₄²⁻/cycle (48-h cycle), or 1.0 in the trial with 3.0 g SO₄²⁻ l⁻¹. Sulfate removal efficiencies observed in all trials were as high as 99%. Molecular inventories indicated a shift in the microbial composition and a decrease in species diversity as the sulfate concentration increased. Beta-proteobacteria species affiliated with Aminomonas spp. and Thermanaerovibrio spp. predominated at 1.0 g SO₄²⁻ l⁻¹. At higher sulfate concentrations the predominant bacterial group was Delta-proteobacteria, mainly Desulfovibrio spp. and Desulfomicrobium spp. at 2.0 g SO₄²⁻ l⁻¹, whereas Desulfurella spp. and Coprothermobacter spp. predominated at 3.0 g SO₄²⁻ l⁻¹. These organisms have been commonly associated with sulfate reduction producing acetate, sulfide and sulfur. Methanogenic archaea (Methanosaeta spp.) were found at 1.0 and 2.0 g SO₄²⁻ l⁻¹. Additionally, a simplified mathematical model was used to infer the metabolic pathways of the biomass involved in sulfate reduction.