424 research outputs found

    Leveraging Coding Techniques for Speeding up Distributed Computing

    Large-scale clusters leveraging distributed computing frameworks such as MapReduce routinely process data on the order of petabytes or more. The sheer size of the data precludes processing it on a single computer. The philosophy in these methods is to partition the overall job into smaller tasks that are executed on different servers; this is called the map phase. This is followed by a data shuffling phase where appropriate data is exchanged between the servers. The final, so-called reduce phase completes the computation. One potential approach, explored in prior work for reducing the overall execution time, is to operate on a natural tradeoff between computation and communication. Specifically, the idea is to run redundant copies of map tasks that are placed on judiciously chosen servers. The shuffle phase exploits the location of the nodes and utilizes coded transmission. The main drawback of this approach is that it requires the original job to be split into a number of map tasks that grows exponentially in the system parameters. This is problematic, as we demonstrate that splitting jobs too finely can in fact adversely affect the overall execution time. In this work we show that one can simultaneously obtain low communication loads while ensuring that jobs do not need to be split too finely. Our approach uncovers a deep relationship between this problem and a class of combinatorial structures called resolvable designs. Appropriate interpretation of resolvable designs can allow for the development of coded distributed computing schemes where the splitting levels are exponentially lower than in prior work. We present experimental results obtained on Amazon EC2 clusters for a widely known distributed algorithm, namely TeraSort. We obtain over 4.69× improvement in speedup over the baseline approach and more than 2.6× over the current state of the art.
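The computation/communication tradeoff mentioned in this abstract can be illustrated with a toy coded shuffle. The sketch below is not the resolvable-design scheme of the paper; it is a minimal three-server example of the general idea from prior work, assuming each file is mapped on two servers and that all intermediate values are byte strings of the same even length. The function names and the placement of files are hypothetical choices made for illustration.

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def halves(value: bytes):
    """Split a value into its first and second half."""
    mid = len(value) // 2
    return value[:mid], value[mid:]


def coded_shuffle(v):
    """Toy coded shuffle for 3 servers and 3 files (assumed placement).

    Server 1 maps files {1, 2}, server 2 maps {2, 3}, server 3 maps {1, 3}.
    v[(q, n)] is the intermediate value of reduce function q on file n.
    Server q runs reduce function q, so the only missing values are
    v[(1, 3)], v[(2, 1)] and v[(3, 2)].
    Returns the values recovered by each server and the bytes broadcast.
    """
    a1, a2 = halves(v[(1, 3)])  # missing at server 1
    b1, b2 = halves(v[(2, 1)])  # missing at server 2
    c1, c2 = halves(v[(3, 2)])  # missing at server 3

    # Each server broadcasts one XOR of two half-values it computed locally.
    t1 = xor(b1, c1)  # server 1 maps files 1 and 2
    t2 = xor(c2, a1)  # server 2 maps files 2 and 3
    t3 = xor(a2, b2)  # server 3 maps files 1 and 3

    # Each receiver cancels the half it already knows from its own map tasks.
    recovered = {
        (1, 3): xor(t2, c2) + xor(t3, b2),  # decoded at server 1
        (2, 1): xor(t1, c1) + xor(t3, a2),  # decoded at server 2
        (3, 2): xor(t1, b1) + xor(t2, a1),  # decoded at server 3
    }
    return recovered, len(t1) + len(t2) + len(t3)


if __name__ == "__main__":
    values = {(1, 3): b"AAaa", (2, 1): b"BBbb", (3, 2): b"CCcc"}
    recovered, sent = coded_shuffle(values)
    print(recovered == values, sent)
```

In this toy setting the three broadcasts carry half a value each, whereas sending the three missing values uncoded would carry three full values; the price is that every file is mapped twice.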

    Real Analysis in Functional Equations

    In this article, we will showcase some analytical concepts that can be used to tackle Functional Equations (FE) in the positive real numbers domain. Such concepts and related techniques have occasionally appeared in recent High School Math Olympiads, and they are often accompanied neatly by other known techniques. In each section, we develop a theoretical background; next, we briefly mention methods that employ the theory and conclude the article by providing unsolved problems that the reader can try independently

    Genome-Wide Proteomics and Quantitative Analyses on Halophilic Archaea

    The aerobic, haloalkaliphilic archaeon Natronomonas pharaonis is able to survive in salt-saturated lakes of pH 11. With genome-wide shotgun proteomics, 886 soluble proteins (929 proteins in total) of the theoretical Natronomonas pharaonis soluble proteome consisting of 2187 proteins have been confidently identified by MS/MS. By comparing the identified proteins of Natronomonas pharaonis with homologues of other organisms, both extreme diversity between halophiles and occasional extraordinary sequence conservation to proteins from unrelated species were observed, substantiating genetic exchange between organisms that are evolutionarily nearly unrelated to cope with several extreme conditions. Alternative and largely overlapping open reading frames (called overprinting) could not be identified in the genome of either Natronomonas pharaonis or Halobacterium salinarum, leading to the conclusion that in halophiles, not more than one protein can be produced from the same genomic sequence stretch. In the second part of this work, analyses on both the transcriptional and translational level have been performed on the halophilic archaeon Halobacterium salinarum, to gain insights into its lifestyle changes leading to cell response when challenged by heat shock. To this end, quantitative proteomic data obtained from two approaches that differ in the labeling method (ICPL; SILAC), the fractionation of the protein or peptide mixtures (2DE; 1DE-LC), the mass spectrometric analysis (MALDI-TOF/TOF; ESI Q-TOF), and the choice of the growth medium (complex; synthetic) were integrated with data from whole-genome DNA microarrays, real-time quantitative PCR (RTqPCR), and Northern analyses. A number of genes congruently displayed substantial induction after heat shock on both the transcript and protein level, as in the case of the thermosome, two AAA-type ATPases, a Dps-like ferritin protein (DpsA), a hsp5-type molecular chaperone, and the transcription initiation factor tfbB.
    In contrast, the dnaK operon (hsp70) did not exhibit any significant upregulation in either of the approaches. Some genes encoding enzymes of the TCA cycle, of pathways flowing into the latter, and of pathways leading to pyrimidine synthesis were only translationally induced. Finally, differential transcriptional induction of the transcription initiation factors tfbB and tfbA, determined by RTqPCR, led to the conclusion that they may regulate genes by reciprocal action. The multiple proteomics and transcriptomics methods complement one another, on the one hand covering a larger area and on the other confirming some unexpected findings.

    Numerical analysis of coalescence-induced jumping droplets on superhydrophobic surfaces

    Bio-inspired superhydrophobic surfaces are used in numerous technological applications due to their self-cleaning ability. One of the several mechanisms reported in the literature as responsible for self-cleaning is the coalescence-induced jumping of droplets from such surfaces. The phenomenon is observed for scales below the capillary length, where gravity is negligible. Primary applications of this technology are heat exchangers and any other devices that involve surfaces for which anti-icing and water-repellency properties are desired. This thesis comprises two publications that involve high-fidelity numerical investigations of fundamental features of the jumping droplets phenomenon and focuses on two important aspects. The first is a study of the coalescence and jumping of microdroplets (R < 10 µm). The differences in the jumping process (for example, the reduction of the merged droplet's jumping velocity) are pointed out as a function of the initial size of the droplets. Through an analysis of the energy budget, several degrees of dissipation are found, which are attributed to a competition between viscosity and the strong capillarity at the interface. The second publication focuses on the interaction of the merged droplet with a superhydrophobic surface with hysteresis. It is found that such a case has a reduced jumping velocity compared to a no-hysteresis one. Using a dynamic contact angle model helps capture the receding contact angle and provides a more accurate estimation of the overall process. In this work, a combined Immersed Boundary and Volume-of-Fluid method with different contact angle models and a Navier-slip boundary condition is used. The numerical framework has been extensively validated.

    Erasure coding for distributed matrix multiplication for matrices with bounded entries

    Distributed matrix multiplication is widely used in several scientific domains. It is well recognized that computation times on distributed clusters are often dominated by the slowest workers (called stragglers). Recent work has demonstrated that straggler mitigation can be viewed as a problem of designing erasure codes. For matrices $\mathbf A$ and $\mathbf B$, the technique essentially maps the computation of $\mathbf A^T \mathbf B$ into the multiplication of smaller (coded) submatrices. The stragglers are treated as erasures in this process. The computation can be completed as long as a certain number of workers (called the recovery threshold) complete their assigned tasks. We present a novel coding strategy for this problem when the absolute values of the matrix entries are sufficiently small. We demonstrate a tradeoff between the assumed absolute value bounds on the matrix entries and the recovery threshold. At one extreme, we are optimal with respect to the recovery threshold and on the other extreme, we match the threshold of prior work. Experimental results on cloud-based clusters validate the benefits of our method
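To make the notions of erasure and recovery threshold in this abstract concrete, here is a minimal sketch of a simple sum-coded scheme for computing A^T B with three workers, any two of which suffice. This is not the bounded-entry strategy the paper proposes; it is a textbook-style illustration, and the helper names, the column-wise split of A, and the assumption that A has an even number of columns are all choices made for this example.

```python
def transpose_times(a, b):
    """Return a^T b for matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[k][i] * b[k][j] for k in range(n))
             for j in range(len(b[0]))]
            for i in range(len(a[0]))]


def add(x, y):
    """Elementwise sum of two equally shaped matrices."""
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def sub(x, y):
    """Elementwise difference of two equally shaped matrices."""
    return [[p - q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def encode(a):
    """Split A column-wise into A1, A2 and add the coded block A1 + A2."""
    half = len(a[0]) // 2  # assumes an even number of columns
    a1 = [row[:half] for row in a]
    a2 = [row[half:] for row in a]
    return [a1, a2, add(a1, a2)]


def worker_outputs(a, b):
    """Worker w multiplies its (possibly coded) block of A with B."""
    return {w: transpose_times(block, b) for w, block in enumerate(encode(a))}


def decode(finished):
    """Recover A^T B from the outputs of any two of the three workers."""
    if 0 in finished and 1 in finished:
        top, bottom = finished[0], finished[1]
    elif 0 in finished and 2 in finished:
        top = finished[0]
        bottom = sub(finished[2], top)      # (A1 + A2)^T B - A1^T B
    elif 1 in finished and 2 in finished:
        bottom = finished[1]
        top = sub(finished[2], bottom)      # (A1 + A2)^T B - A2^T B
    else:
        raise ValueError("need results from at least two workers")
    return top + bottom  # stack A1^T B above A2^T B


if __name__ == "__main__":
    A = [[1, 2], [3, 4], [5, 6]]
    B = [[1, 0], [0, 1], [1, 1]]
    results = worker_outputs(A, B)
    del results[1]  # treat worker 1 as a straggler (an erasure)
    print(decode(results) == transpose_times(A, B))
```

In this sketch the recovery threshold is 2: each worker handles half of A, and the job finishes once any two workers report, whichever one straggles.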

    CAMR: Coded Aggregated MapReduce

    Many big data algorithms executed on MapReduce-like systems have a shuffle phase that often dominates the overall job execution time. Recent work has demonstrated schemes where the communication load in the shuffle phase can be traded off for the computation load in the map phase. In this work, we focus on a class of distributed algorithms, broadly used in deep learning, where intermediate computations of the same task can be combined. Even though prior techniques reduce the communication load significantly, they require a number of jobs that grows exponentially in the system parameters. This limitation is crucial and may diminish the load gains as the algorithm scales. We propose a new scheme which achieves the same load as the state-of-the-art while ensuring that the number of jobs as well as the number of subfiles that the data set needs to be split into remain small

    Robotic Hiatal Hernia Repair

    Robotic surgery has revolutionized medicine during the last 16 years by transforming classic operating theaters into computer-mediated working stations. Numerous procedures have proved feasible and safe on the various, continuously evolving robotic platforms. From the early beginnings of this revolution, challenging operations such as those concerning the gastroesophageal junction, especially in super-obese patients or during redo operations, proved to have certain benefits when performed robotically, both for patients and for surgeons.