Adaptive Out-Orientations with Applications
We give simple algorithms for maintaining edge-orientations of a
fully-dynamic graph, such that the out-degree of each vertex is bounded. On one
hand, we show how to orient the edges such that the out-degree of each vertex
is proportional to the arboricity of the graph, in a worst-case update
time of . On the other hand, motivated by applications
in dynamic maximal matching, we obtain a different trade-off, namely the
improved worst case update time of for the problem of
maintaining an edge-orientation with at most out-edges per
vertex. Since our algorithms have update times with worst-case guarantees, the
number of changes to the solution (i.e. the recourse) is naturally limited.
Our algorithms make choices based entirely on local information, which makes
them automatically adaptive to the current arboricity of the graph. In other
words, they are arboricity-oblivious while remaining arboricity-sensitive. This
both simplifies and improves upon previous work, requiring fewer assumptions or
offering better asymptotic guarantees.
As a consequence, one obtains an algorithm with improved efficiency for
maintaining a approximation of the maximum subgraph density,
and an algorithm for dynamic maximal matching whose worst-case update time is
guaranteed to be upper bounded by , where
is the arboricity at the time of the update.
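The local-information idea described above can be illustrated with a toy sketch. This is not the paper's algorithm: the class name, the fixed out-degree threshold, and the greedy flipping rule are all hypothetical simplifications; the point is only that every decision reads nothing but the out-degrees of the two endpoints.

```python
from collections import defaultdict

class OutOrientation:
    """Toy fully-dynamic edge orientation using only local information.

    Illustrative sketch only: each new edge is oriented away from the
    endpoint with the smaller current out-degree, and out-edges are
    greedily flipped when a vertex exceeds a threshold. The paper's
    algorithms are more refined and carry worst-case guarantees.
    """

    def __init__(self, max_out: int = 4):
        self.max_out = max_out
        self.out = defaultdict(set)  # out[u] = {v : edge oriented u -> v}

    def out_degree(self, u):
        return len(self.out[u])

    def insert(self, u, v):
        # Local rule: point the edge away from the less loaded endpoint.
        if self.out_degree(u) <= self.out_degree(v):
            self.out[u].add(v)
            self._rebalance(u)
        else:
            self.out[v].add(u)
            self._rebalance(v)

    def _rebalance(self, u):
        # If u has too many out-edges, flip one toward a lighter neighbor.
        while self.out_degree(u) > self.max_out:
            v = min(self.out[u], key=self.out_degree)
            if self.out_degree(v) + 1 > self.out_degree(u) - 1:
                break  # flipping would not reduce the maximum; stop
            self.out[u].discard(v)
            self.out[v].add(u)

    def delete(self, u, v):
        # Remove the edge in whichever direction it is currently oriented.
        self.out[u].discard(v)
        self.out[v].discard(u)
```

Because the rule never consults a global quantity such as the arboricity, it is "arboricity-oblivious" in the abstract's sense, while the resulting out-degrees still track the local density of the graph.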
Optimal Sketching Bounds for Sparse Linear Regression
We study oblivious sketching for -sparse linear regression under various
loss functions such as an norm, or from a broad class of hinge-like
loss functions, which includes the logistic and ReLU losses. We show that for
sparse norm regression, there is a distribution over oblivious
sketches with rows, which is tight up to a
constant factor. This extends to loss with an additional additive
term in the upper bound. This
establishes a surprising separation from the related sparse recovery problem,
which is an important special case of sparse regression. For this problem,
under the norm, we observe an upper bound of rows, showing that sparse recovery is
strictly easier to sketch than sparse regression. For sparse regression under
hinge-like loss functions including sparse logistic and sparse ReLU regression,
we give the first known sketching bounds that achieve rows showing that
rows suffice, where
is a natural complexity parameter needed to obtain relative error bounds for
these loss functions. We again show that this dimension is tight, up to lower
order terms and the dependence on . Finally, we show that similar
sketching bounds can be achieved for LASSO regression, a popular convex
relaxation of sparse regression, where one aims to minimize
over . We show that sketching
dimension suffices and that the dependence
on and is tight.
Comment: AISTATS 202
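The sketch-and-solve paradigm underlying such bounds can be illustrated with a plain dense Gaussian sketch for ordinary least squares. This is a generic oblivious sketch for intuition only, not the specialized constructions for sparse regression or hinge-like losses analyzed in the paper, and the row count `m` here is an unoptimized assumption.

```python
import numpy as np

def sketch_and_solve(A, b, m, rng):
    """Oblivious sketch-and-solve for least squares (illustration only).

    S is drawn without looking at A or b (hence 'oblivious'); the small
    m x d problem (S @ A) x ~ S @ b is then solved exactly.
    """
    S = rng.standard_normal((m, A.shape[0])) / np.sqrt(m)
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x
```

With enough rows, the residual of the sketched solution is close to the optimal residual with high probability; the paper's contribution is pinning down exactly how few rows suffice once sparsity constraints or non-quadratic losses enter the picture.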
Actin Dynamics Regulate Multiple Endosomal Steps during Kaposi's Sarcoma-Associated Herpesvirus Entry and Trafficking in Endothelial Cells
The role of actin dynamics in clathrin-mediated endocytosis in mammalian cells is unclear. In this study, we define the role of the actin cytoskeleton in Kaposi's sarcoma-associated herpesvirus (KSHV) entry and trafficking in endothelial cells using an immunofluorescence-based assay to visualize viral capsids and the associated cellular components. In contrast to infectivity or reporter assays, this method does not rely on the expression of any viral and reporter genes, but instead directly tracks the accumulation of individual viral particles at the nuclear membrane as an indicator of successful viral entry and trafficking in cells. Inhibitors of endosomal acidification reduced both the percentage of nuclei with viral particles and the total number of viral particles docking at the perinuclear region, indicating endocytosis, rather than plasma membrane fusion, as the primary route for KSHV entry into endothelial cells. Accordingly, a viral envelope protein was only detected on internalized KSHV particles at the early but not late stage of infection. Inhibitors of clathrin- but not caveolae/lipid raft-mediated endocytosis blocked KSHV entry, indicating that clathrin-mediated endocytosis is the major route of KSHV entry into endothelial cells. KSHV particles were colocalized not only with markers of early and recycling endosomes, and lysosomes, but also with actin filaments at the early time points of infection. Consistent with these observations, transferrin, which enters cells by clathrin-mediated endocytosis, was found to be associated with actin filaments together with early and recycling endosomes, and to a lesser degree, with late endosomes and lysosomes. KSHV infection induced dynamic actin cytoskeleton rearrangements. Disruption of the actin cytoskeleton and inhibition of regulators of actin nucleation such as Rho GTPases and the Arp2/3 complex profoundly blocked KSHV entry and trafficking.
Together, these results indicate an important role for actin dynamics in the internalization and endosomal sorting/trafficking of KSHV and in clathrin-mediated endocytosis in endothelial cells.
Utilization of nonclairvoyant online schedules
This paper addresses the analysis of nondelay, nonpreemptive, nonclairvoyant online schedules for independent jobs on m identical machines. In our online model, all jobs are submitted over time. We show that the commonly used makespan criterion is not well suited to describe utilization for this online problem. Therefore, we directly address utilization and determine the maximum deviation from the optimal utilization for the given scheduling problem. © 2006 Elsevier B.V. All rights reserved
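Graham-style list scheduling is the canonical nondelay, nonpreemptive policy for this model; a minimal simulation that reports utilization might look like the sketch below. It is an illustration under simplified assumptions (FCFS dispatch, utilization defined as total work over m times the makespan), not the paper's analysis; note the scheduler never reads a job's processing time when choosing, so it is nonclairvoyant.

```python
import heapq

def nondelay_schedule(jobs, m):
    """Nondelay, nonpreemptive list scheduling on m identical machines.

    jobs: list of (release_time, processing_time). Dispatch is FCFS by
    release time; each job starts on the earliest-free machine as soon as
    possible. Returns (makespan, utilization). Illustrative sketch only.
    """
    jobs = sorted(jobs, key=lambda j: j[0])  # by release time only (FCFS)
    machines = [0.0] * m                     # next free time per machine
    heapq.heapify(machines)
    total_work, makespan = 0.0, 0.0
    for r, p in jobs:
        free = heapq.heappop(machines)
        start = max(free, r)                 # nondelay: start as soon as possible
        end = start + p
        heapq.heappush(machines, end)
        total_work += p
        makespan = max(makespan, end)
    utilization = total_work / (m * makespan) if makespan > 0 else 0.0
    return makespan, utilization
```

The paper's observation is that comparing makespans can misrepresent how well such a schedule uses the machines, which motivates bounding the deviation from optimal utilization directly.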
Development of Scheduling Strategies with Genetic Fuzzy Systems
This paper presents a methodology for automatically generating online scheduling strategies for a complex objective defined by a machine provider. To this end, we assume independent parallel jobs and multiple identical machines. The scheduling algorithm is based on a rule system. This rule system classifies all possible scheduling states and assigns a corresponding scheduling strategy. Each state is described by several parameters. The rule system is established in two different ways. In the first approach, an iterative method is applied, that assigns a standard scheduling strategy to all situation classes. Here, the situation classes are fixed and cannot be modified. Afterwards, for each situation class, the best strategy is extracted individually. In the second approach, a Symbiotic Evolution varies the parameter of Gaussian membership functions to establish the different situation classes and also assigns the appropriate scheduling strategies. Finally, both rule systems will be compared by using real workload traces and different possible complex objective functions
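The second approach's core mechanism, classifying a scheduling state with Gaussian membership functions and dispatching to the best-matching rule's strategy, can be sketched as follows. The feature set, rule parameters, and strategy names here are hypothetical; in the paper the centers and widths are evolved by the Symbiotic Evolution rather than fixed by hand.

```python
import math

def gaussian_membership(x, center, width):
    """Degree to which feature value x belongs to a fuzzy set."""
    return math.exp(-((x - center) ** 2) / (2 * width ** 2))

def classify_state(features, rules):
    """Pick the scheduling strategy of the best-matching fuzzy rule.

    Each rule is (per-feature (center, width) parameters, strategy name).
    Rule activation is the product of per-feature memberships, a common
    t-norm choice. Illustrative sketch only.
    """
    best, best_act = None, -1.0
    for params, strategy in rules:
        act = 1.0
        for x, (c, w) in zip(features, params):
            act *= gaussian_membership(x, c, w)
        if act > best_act:
            best, best_act = strategy, act
    return best
```

Evolving the (center, width) pairs then amounts to searching over soft partitions of the state space, with each partition cell bound to a concrete scheduling strategy.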
On grid performance evaluation using synthetic workloads
Grid computing is becoming a common platform for solving large-scale computing tasks. However, a number of major technical issues, including the lack of adequate performance evaluation approaches, hinder grid computing's further development. The requirements for such evaluation are manifold: adequate approaches must combine appropriate performance metrics, realistic workload models, and flexible tools for workload generation, submission, and analysis. In this paper we present an approach to tackle this complex problem. First, we introduce a set of grid performance objectives based on traditional and grid-specific performance metrics. Second, we synthesize the requirements for realistic grid workload modeling, e.g. co-allocation, data and network management, and failure modeling. Third, we show how GrenchMark, an existing framework for generating, running, and analyzing grid workloads, can be extended to implement the proposed modeling techniques. Our approach aims to be an initial and necessary step towards a common performance evaluation framework for grid environments.
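A minimal synthetic workload generator conveys the flavor of the modeling task. The distributions below (Poisson arrivals, log-uniform runtimes, power-of-two parallelism) are common textbook assumptions, not the paper's models, and GrenchMark itself supports far richer features such as co-allocation and failure modeling.

```python
import random

def synthetic_workload(n, arrival_rate, seed=42):
    """Generate a toy synthetic grid workload (illustration only).

    Returns n jobs with Poisson arrivals, log-uniform runtimes, and
    power-of-two processor counts. All distribution choices here are
    hypothetical placeholders for a real workload model.
    """
    rng = random.Random(seed)
    t, jobs = 0.0, []
    for _ in range(n):
        t += rng.expovariate(arrival_rate)   # exponential interarrival times
        runtime = 10 ** rng.uniform(1, 4)    # 10 s .. 10000 s, log-uniform
        cpus = 2 ** rng.randint(0, 6)        # 1 .. 64 processors
        jobs.append({"submit": t, "runtime": runtime, "cpus": cpus})
    return jobs
```

Replaying such a trace against a scheduler, and then varying the model's parameters, is exactly the generation-submission-analysis loop the paper argues a grid evaluation framework must support.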
FlowFlex: Malleable Scheduling for Flows of MapReduce Jobs
We introduce FlowFlex, a highly generic and effective scheduler for flows of MapReduce jobs connected by precedence constraints. Such a flow can result, for example, from a single user-level Pig, Hive or Jaql query. Each flow is associated with an arbitrary function describing the cost incurred in completing the flow at a particular time. The overall objective is to minimize either the total cost (minisum) or the maximum cost (minimax) of the flows. Our contributions are both theoretical and practical. Theoretically, we advance the state of the art in malleable parallel scheduling with precedence constraints. We employ resource augmentation analysis to provide bicriteria approximation algorithms for both minisum and minimax objective functions. As corollaries, we obtain approximation algorithms for total weighted completion time (and thus average completion time and average stretch), and for maximum weighted completion time (and thus makespan and maximum stretch). Practically, the average-case performance of the FlowFlex scheduler is excellent, significantly better than other approaches. Specifically, we demonstrate via extensive experiments the overall performance of FlowFlex relative to optimal and also relative to other, standard MapReduce scheduling schemes. All told, FlowFlex dramatically extends the capabilities of the earlier Flex scheduler for singleton MapReduce jobs while simultaneously providing a solid theoretical foundation for both.
Commitment and Slack for Online Load Maximization
We consider a basic admission control problem in which jobs with deadlines arrive online and our goal is to maximize the total volume of executed job processing times. We assume that the deadlines have a slack of at least ϵ, that is, each deadline d satisfies d ≥ (1+ϵ)·p + r with processing time p and release date r. In addition, we require the admission policy to support immediate commitment: upon a job's submission, we must immediately and irrevocably decide whether and where to schedule the job. Our main contribution is a deterministic algorithm with a nearly optimal competitive ratio for load maximization on multiple machines in the non-preemptive model. Previous results either held only for a single machine, did not support commitment, or required job preemption and migration.
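The commitment requirement can be made concrete with a naive greedy baseline: on arrival, pick the machine that frees up first and accept the job if and only if it can still finish by its deadline there, committing irrevocably either way. This sketch is not the paper's nearly optimal algorithm (which must sometimes reject feasible jobs to preserve slack for the future); it only illustrates the model.

```python
def admit(jobs, m, eps):
    """Greedy admission with immediate commitment (illustration only).

    jobs: list of (release r, processing p), arriving in release order,
    each with deadline d = (1 + eps) * p + r as in the slack assumption.
    Returns the total accepted processing volume. Decisions are made once,
    on arrival, and never revisited (non-preemptive, no migration).
    """
    free = [0.0] * m                     # next free time per machine
    volume = 0.0
    for r, p in jobs:
        d = (1.0 + eps) * p + r
        i = min(range(m), key=lambda j: free[j])
        start = max(free[i], r)
        if start + p <= d:               # feasible here: commit irrevocably
            free[i] = start + p
            volume += p
        # else: reject on arrival; commitment forbids reconsidering later
    return volume
```

Running this on a burst of identical jobs shows the difficulty: once the machines are committed, later arrivals with tight slack must be rejected, which is exactly the adversarial pressure a competitive algorithm has to manage.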