Exploring the Impact of Serverless Computing on Peer To Peer Training Machine Learning
The increasing demand for computational power in big data and machine
learning has driven the development of distributed training methodologies.
Among these, peer-to-peer (P2P) networks provide advantages such as enhanced
scalability and fault tolerance. However, they also encounter challenges
related to resource consumption, costs, and communication overhead as the
number of participating peers grows. In this paper, we introduce a novel
architecture that combines serverless computing with P2P networks for
distributed training and present a method for efficient parallel gradient
computation under resource constraints.
Our findings show a significant enhancement in gradient computation time,
with up to a 97.34% improvement compared to conventional P2P distributed
training methods. As for costs, our examination confirmed that the serverless
architecture could incur higher expenses, reaching up to 5.4 times more than
instance-based architectures. It is essential to consider that these higher
costs are associated with marked improvements in computation time, particularly
under resource-constrained scenarios. Despite the cost-time trade-off, the
serverless approach still holds promise due to its pay-as-you-go model.
Utilizing dynamic resource allocation, it enables faster training times and
optimized resource utilization, making it a promising candidate for a wide
range of machine learning applications.
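The abstract above does not say how its parallel gradient computation is implemented, so the following is only a minimal sketch of the general idea: split a batch into shards, compute each shard's gradient concurrently, and combine the results. The thread pool stands in for serverless function invocations, and the names `shard_gradient` and `parallel_gradient`, the one-parameter least-squares model, and the toy data are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def shard_gradient(w, shard):
    """Sum of per-example gradients of 0.5 * (w * x - y) ** 2 over one shard."""
    return sum((w * x - y) * x for x, y in shard)


def parallel_gradient(w, data, num_workers):
    """Split `data` into `num_workers` shards, compute the shard gradients
    concurrently, and return the mean gradient over all examples."""
    shards = [data[i::num_workers] for i in range(num_workers)]
    # Each worker plays the role of one (hypothetical) serverless function call.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_sums = list(pool.map(lambda s: shard_gradient(w, s), shards))
    return sum(partial_sums) / len(data)


# Toy data drawn from y = 2x, so gradient descent should recover w close to 2.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w = 0.0
for _ in range(50):
    w -= 0.05 * parallel_gradient(w, data, num_workers=2)
```

Because the shard sums are added before dividing by the total number of examples, the result matches the full-batch gradient regardless of how many workers are used.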
On the Intersection of Communication and Machine Learning
The intersection of communication and machine learning is attracting increasing interest from both communities. On the one hand, modern communication systems produce large amounts of data and impose high performance requirements, which challenges the classic study philosophy based on analytical derivation and encourages researchers to explore data-driven methods, such as machine learning, for problems of high complexity and large scale. On the other hand, distributed machine learning makes communication cost one of the basic considerations in the design of machine learning algorithms and systems. In this thesis, we first explore the application of machine learning to a classic problem in wireless networks, resource allocation, for heterogeneous millimeter wave networks in highly dynamic environments. We address the practical concerns by providing an efficient online and distributed framework. In the second part, we propose sampling-based, communication-efficient distributed learning algorithms. We exploit the trade-off between local computation and total communication cost and propose algorithms with good theoretical bounds. In more detail, this thesis makes the following contributions. We introduce a reinforcement learning framework to solve resource allocation problems in heterogeneous millimeter wave networks. The large state/action space is decomposed according to the topology of the network and solved by an efficient distributed message passing algorithm. We further speed up the inference process with an online updating process. We propose a distributed coreset-based boosting framework. An efficient coreset construction algorithm is proposed based on the prior knowledge provided by clustering. The coreset is then integrated with boosting, with an improved convergence rate. We extend the proposed boosting framework to the distributed setting, where the communication cost is reduced by the good approximation of the coreset. We propose a selective sampling framework to construct a subset of samples that can effectively represent the model space. Based on the prior distribution of the model space or a large number of samples from the model space, we derive a computationally efficient method to construct such a subset by minimizing the error of classifying a classifier
Resource sharing in network slicing and human-machine interactions
In this thesis we explore two novel resource allocation models. The first addresses challenges associated with dynamic sharing of network resources by multiple tenants/services via network slicing. The second focuses on a data-driven approach to the optimization of resource allocation in interactive human-machine processes. In our first thrust we investigate how to allocate shared storage, computation, and/or connectivity resources distributed amongst multiple tenants/virtual service providers which have dynamic loads. It is expected that the next generation of wireless networks will be shared by an increasing number of data-intensive mobile applications (e.g., autonomous cars, IoT, interactive 360° video streaming) and tenants/service providers. A key functional requirement for such infrastructure is enabling efficient sharing of heterogeneous resources among tenants/service providers supporting spatially varying and dynamic user demands, both to enable deployment and performance management for diverse service providers and/or tenants and as a means to increase utilization and reduce the CAPEX/OPEX associated with deploying possible new infrastructure. To that end, we propose a novel dynamic resource sharing policy, namely Share Constrained Proportional Fair (SCPF), which allocates a predefined ‘share’ of a pool of (distributed) resources to each slice. We provide a characterization of the achievable performance gains over General Processor Sharing (GPS) and Static Slicing (SS), i.e., fixed allocation of resources to slices. We also characterize the associated share dimensioning problem, asking when a particular set of load profiles and QoS requirements is feasible, as well as what an appropriate pricing strategy should be. We further consider a possible slice-based admission control scheme where slices engage in an underlying game to maximize their carried loads subject to performance requirements.
In order to accommodate settings where one would wish to provision different types of resources which are coupled through user demands, we generalize SCPF to a more general resource allocation criterion, namely Share Constrained Slicing (SCS), which extends the traditional α-fairness criterion by striking a balance among inter- and intra-slice fairness vs. overall efficiency. We show that SCS has several desirable properties including slice-level protection, envy-freeness, and load-driven elasticity. In practice, mobile users' dynamics could make the cost of implementing SCS high, so we also study the feasibility of using a dynamically weighted max-min fair policy as a surrogate resource allocation scheme. For a setting with stochastic loads and elastic user requirements, we model the user dynamics under SCS as a queuing network and establish the stability condition. Finally, and perhaps surprisingly, we show via extensive simulation that while SCS (and/or the surrogate weighted max-min allocation) provides inter-slice protection, it can also achieve improved job delay and/or perceived throughput, as compared to other weighted max-min based allocation schemes whose intra-slice weight allocation is not share-constrained, e.g., traditional max-min and/or discriminatory processor sharing. In our second thrust we study how to optimize resource allocation in the context of human-machine interactions. Examples of such processes could include systems aimed at assisting humans in interactive learning, workload allocation, or web-search advertising. We devise an innovative framework to enable the optimization of a reward over an interactive process in a data-driven manner.
This is a challenging problem for several reasons: (1) humans' behavior is not easily modeled and may reflect biases and memory and be sensitive to sequencing, all of which should/could be inferred from data; (2) because these interactions are typically sequential and transient, inferring such complex models for human behavior is difficult; (3) furthermore, in order to collect data on human-machine interactions one must choose a machine policy, which in turn may bias inferences on human behavior. In this thesis we approach the problem of jointly estimating human behavior and optimizing machine policies via the Alternating Entropy-Reward Ascent (AREA) algorithm. We characterize AREA in terms of its space and time complexity and convergence. We also provide an initial validation based on synthetic data generated by an established noisy nonlinear model for human decision-making.
Electrical and Computer Engineering
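The abstract above names a dynamically weighted max-min fair policy as a surrogate for SCS but does not say how the weights are chosen. As a rough illustration of the underlying allocation rule only, here is a sketch of generic weighted max-min fairness on a single resource, using progressive filling. The function name `weighted_max_min`, its arguments, and the example numbers are assumptions, and the weights are taken as given positive values.

```python
def weighted_max_min(capacity, demands, weights):
    """Weighted max-min fair split of `capacity` among users.

    Users whose demand fits within their weighted share are fully served;
    the capacity they leave unused is re-split among the remaining users
    in proportion to their (positive) weights.
    """
    alloc = [0.0] * len(demands)
    active = set(range(len(demands)))
    remaining = float(capacity)
    while active and remaining > 1e-12:
        total_weight = sum(weights[i] for i in active)
        # Tentative weighted share of the remaining capacity per unserved user.
        share = {i: remaining * weights[i] / total_weight for i in active}
        satisfied = {i for i in active if demands[i] <= share[i]}
        if not satisfied:
            # Every remaining user wants more than its share: split and stop.
            for i in active:
                alloc[i] = share[i]
            break
        # Cap satisfied users at their demand and free up the surplus.
        for i in satisfied:
            alloc[i] = float(demands[i])
            remaining -= demands[i]
        active -= satisfied
    return alloc


# User 0 needs less than its share; users 1 and 2 split the rest 1:2.
allocation = weighted_max_min(10, demands=[2, 8, 8], weights=[1, 1, 2])
```

In this example user 0 is capped at its demand of 2, and the remaining 8 units are divided between users 1 and 2 in proportion to their weights.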
A Hierarchical Framework of Cloud Resource Allocation and Power Management Using Deep Reinforcement Learning
Automatic decision-making approaches, such as reinforcement learning (RL),
have been applied to (partially) solve the resource allocation problem
adaptively in the cloud computing system. However, a complete cloud resource
allocation framework exhibits high dimensions in state and action spaces, which
prohibit the usefulness of traditional RL techniques. In addition, high power
consumption has become one of the critical concerns in design and control of
cloud computing systems, which degrades system reliability and increases
cooling cost. An effective dynamic power management (DPM) policy should
minimize power consumption while maintaining performance degradation within an
acceptable level. Thus, a joint virtual machine (VM) resource allocation and
power management framework is critical to the overall cloud computing system.
Moreover, a novel solution framework is necessary to address the even higher
dimensions in state and action spaces. In this paper, we propose a novel
hierarchical framework for solving the overall resource allocation and power
management problem in cloud computing systems. The proposed hierarchical
framework comprises a global tier for VM resource allocation to the servers and
a local tier for distributed power management of local servers. The emerging
deep reinforcement learning (DRL) technique, which can deal with complicated
control problems with large state space, is adopted to solve the global tier
problem. Furthermore, an autoencoder and a novel weight sharing structure are
adopted to handle the high-dimensional state space and accelerate the
convergence speed. On the other hand, the local tier of distributed server
power management comprises an LSTM-based workload predictor and a model-free
RL-based power manager, operating in a distributed manner.
Comment: accepted by 37th IEEE International Conference on Distributed
Computing (ICDCS 2017)
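The abstract above describes its local-tier power manager only as model-free RL and does not give the states, actions, or rewards. To make the term concrete, here is a sketch of tabular Q-learning, one common model-free method, on a toy server model. The states, actions, reward values, and the function name `q_update` are all hypothetical and are not taken from the paper.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) toward
    reward + gamma * max over a' of Q(next_state, a')."""
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


# Hypothetical toy model: a server is "idle" or "busy" and may "sleep" or "stay_on".
q = {s: {"sleep": 0.0, "stay_on": 0.0} for s in ("idle", "busy")}

# Assumed rewards: sleeping while idle saves power (+1),
# sleeping while busy hurts latency (-5).
q_update(q, "idle", "sleep", reward=1.0, next_state="idle")
q_update(q, "busy", "sleep", reward=-5.0, next_state="busy")

# Greedy policy read off the table after these two updates.
policy = {s: max(q[s], key=q[s].get) for s in q}
```

No transition or workload model appears anywhere in the update, which is what "model-free" refers to: the table is learned purely from observed rewards and next states.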