72 research outputs found
Which can Accelerate Distributed Machine Learning Faster: Hybrid Optical/Electrical or Optical Reconfigurable DCN?
We run various distributed machine learning (DML) architectures in a hybrid optical/electrical DCN and in an optical DCN based on Hyper-FleX-LION. Experimental results show that Hyper-FleX-LION achieves faster DML acceleration, improving the acceleration ratio by up to 22.3%.
Real Acceleration of Communication Process in Distributed Algorithms with Compression
Modern applied optimization problems become more and more complex every day.
Due to this fact, distributed algorithms that can speed up the process of
solving an optimization problem through parallelization are of great
importance. The main bottleneck of distributed algorithms is communications,
which can slow down the method dramatically. One way to solve this issue is to
use compression of transmitted information. In the current literature on
theoretical distributed optimization, it is generally accepted that the more
we compress information, the more we reduce communication time. But in reality,
the communication time depends not only on the size of the transmitted
information, but also, for example, on the message startup time. In this paper,
we study distributed optimization algorithms under the assumption of a more
complex and closer-to-reality dependence of transmission time on compression.
In particular, we describe the real speedup achieved by compression, analyze
how much it makes sense to compress information, and present an adaptive way to
select the power of compression depending on unknown or changing parameters of
the communication process.
Comment: 11 pages
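The dependence described above is essentially a latency-plus-bandwidth model: each message pays a fixed startup cost regardless of its size. A minimal sketch, with an illustrative startup time `alpha` and inverse bandwidth `beta` (hypothetical constants, not the paper's parameters):

```python
# Sketch of the communication-time model discussed above: total time is not
# simply proportional to message size; a per-message startup (latency) term
# alpha is paid regardless of compression. Constants are illustrative.

def comm_time(size_bytes, alpha=1e-3, beta=1e-9):
    """Time to send one message: startup latency + size / bandwidth."""
    return alpha + beta * size_bytes

def speedup(size_bytes, compression_ratio, alpha=1e-3, beta=1e-9):
    """Real speedup of compressed vs. uncompressed transmission."""
    return comm_time(size_bytes, alpha, beta) / comm_time(
        size_bytes / compression_ratio, alpha, beta)

# For a large message, compression helps almost proportionally...
large = speedup(10**9, compression_ratio=10)
# ...but for a small message the startup term dominates and the gain vanishes.
small = speedup(10**3, compression_ratio=10)
print(round(large, 2), round(small, 4))
```

This illustrates the paper's point: once messages are small, further compression buys almost nothing, so the optimal compression power depends on the communication parameters.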
Analysis of the K-Medoids Algorithm in a Clustering System for Capture Fisheries Production in Aceh Utara Regency
The absence of a data management system for capture fisheries production in Aceh Utara Regency has made it difficult for the regency government to cluster capture fisheries production data. This study aims to apply the k-medoids algorithm in a web-based system that clusters capture fisheries production data into three clusters. The study uses 2019-2020 capture fisheries production data for Aceh Utara Regency obtained from the regency's Marine and Fisheries Office (Dinas Kelautan dan Perikanan). Across 10 test runs, k-medoids required an average of 2.5 iterations, with a maximum of 4 iterations and a minimum of 2. In the resulting clusters, albacore falls into the medium production-potential cluster; barracuda, frigate tuna, bigeye tuna, and southern bluefin tuna fall into the low cluster; and Indian mackerel, black pomfret, and silver pomfret fall into the high cluster. The resulting clusters can help the Aceh Utara Regency government make policy decisions aimed at increasing capture fisheries production in the regency.
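For reference, the clustering method used in the study above can be sketched as a minimal PAM-style k-medoids loop. The toy 1-D "production volume" data and the naive initialization below are illustrative, not the Aceh Utara dataset or the study's implementation:

```python
# A minimal k-medoids sketch: alternate between assigning points to the
# nearest medoid and re-picking, per cluster, the member point with the
# smallest total distance to the rest of the cluster.

def k_medoids(points, k, max_iter=10):
    # naive spread-out initialization over the input order (illustrative)
    medoids = points[::max(1, len(points) // k)][:k]
    for _ in range(max_iter):
        clusters = {m: [] for m in medoids}
        for p in points:                       # assignment step
            nearest = min(medoids, key=lambda m: abs(p - m))
            clusters[nearest].append(p)
        new_medoids = []
        for members in clusters.values():      # update step: medoid is the
            best = min(members,                # member minimizing total
                       key=lambda c: sum(abs(c - x) for x in members))
            new_medoids.append(best)           # intra-cluster distance
        if set(new_medoids) == set(medoids):
            break                              # converged: medoids unchanged
        medoids = new_medoids
    return sorted(medoids)

data = [1, 2, 3, 20, 21, 22, 100, 101, 102]
print(k_medoids(data, k=3))                    # one medoid per group
```

Unlike k-means, each cluster center is an actual data point, which is why the study can report cluster membership per fish species directly.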
The impact of data selection strategies on distributed model performance
Distributed Machine Learning, in which data and learning tasks are scattered across a cluster of computers, is one of the field's answers to the challenges posed by Big Data. Still, in an era in which data abound, decisions must be made regarding which specific data to use when training the model, either because the amount of available data is simply too large, or because the training time or the complexity of the model must be kept low. Typical approaches include, for example, selection based on data freshness. However, old data are not necessarily outdated and might still contain relevant patterns. Likewise, relying only on recent data may significantly decrease data diversity and representativity, and thus decrease model quality. The goal of this paper is to compare different heuristics for selecting data in a distributed Machine Learning scenario. Specifically, we ascertain whether selecting data based on their characteristics (meta-features), and optimizing for maximum diversity, improves model quality while, eventually, allowing model complexity to be reduced. This will enable the development of more informed data selection strategies in distributed settings, in which the criteria include not only the location of the data or the state of each node in the cluster, but also intrinsic and relevant characteristics of the data.
This work has been supported by national funds through FCT – Fundação para a Ciência e Tecnologia through projects UIDB/04728/2020, EXPL/CCI/COM/0706/2021 and CPCA-IAC/AV/475278/202
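One way to operationalize diversity-maximizing selection over meta-features is greedy farthest-point selection: repeatedly pick the sample whose meta-feature vector is farthest from everything selected so far. This is a sketch of one plausible heuristic, not the paper's exact method, and the meta-feature vectors are hypothetical:

```python
# Greedy max-min (farthest-point) selection over meta-feature vectors:
# each pick maximizes the distance to its nearest already-selected sample,
# spreading the selection across the meta-feature space.

import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_diverse(samples, budget):
    selected = [samples[0]]                    # seed with an arbitrary sample
    while len(selected) < budget:
        # candidate score: distance to the nearest already-selected sample
        best = max((s for s in samples if s not in selected),
                   key=lambda s: min(euclidean(s, t) for t in selected))
        selected.append(best)
    return selected

# two tight groups plus one outlier; selection spans all three regions
meta_features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 9.0)]
print(select_diverse(meta_features, budget=3))
```

Note how near-duplicate samples (e.g. `(0.0, 0.0)` and `(0.1, 0.0)`) are not both selected, which is the intended contrast with freshness-only selection.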
TRBoost: A Generic Gradient Boosting Machine based on Trust-region Method
Gradient Boosting Machines (GBMs) are derived from Taylor expansion in functional space and have achieved state-of-the-art results on a variety of problems. However, GBMs face a dilemma in balancing performance and generality. Gradient-descent-based GBMs employ the first-order Taylor expansion, which makes them appropriate for all loss functions, while Newton's-method-based GBMs use positive Hessian information to achieve better performance at the expense of generality. In this paper, a generic Gradient Boosting Machine called Trust-region Boosting (TRBoost) is presented to strike this balance. In each iteration, we apply a constrained quadratic model to approximate the objective and solve it with a trust-region algorithm to obtain a new learner. TRBoost does not require the Hessian to be positive definite, which generalizes GBMs to arbitrary loss functions while retaining the good performance of second-order algorithms. Several numerical experiments confirm that TRBoost is not only as general as first-order GBMs but also achieves results competitive with second-order GBMs.
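The constrained quadratic model can be illustrated in one dimension: minimize q(w) = g·w + ½h·w² subject to |w| ≤ Δ, which remains well-posed even when h is not positive. This is a simplified sketch in the spirit of TRBoost, not the authors' full algorithm:

```python
# 1-D trust-region step: minimize q(w) = g*w + 0.5*h*w**2 over |w| <= delta.
# Unlike Newton-style GBM updates, h is NOT required to be positive, so the
# step stays bounded even under negative curvature.

def trust_region_step(g, h, delta):
    if h > 0 and abs(g / h) <= delta:
        return -g / h                     # unconstrained Newton step fits
    if g == 0:
        return delta if h < 0 else 0.0    # concave: boundary; flat: stay
    return -delta if g > 0 else delta     # boundary step opposite the gradient

# Positive curvature, interior solution (reduces to the Newton step):
print(trust_region_step(g=2.0, h=4.0, delta=1.0))   # -0.5
# Negative curvature: a raw Newton step would move the wrong way or blow up,
# while the trust region returns a bounded descent step:
print(trust_region_step(g=2.0, h=-1.0, delta=1.0))  # -1.0
```

This is the mechanism behind the claimed balance: when the Hessian is well-behaved the step matches second-order methods, and otherwise it degrades gracefully to a bounded first-order-style step.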
Distributed optimization on directed graphs based on inexact ADMM with partial participation
We consider the problem of minimizing the sum of cost functions pertaining to
agents over a network whose topology is captured by a directed graph (i.e.,
asymmetric communication). We cast the problem into the ADMM setting, via a
consensus constraint, for which both primal subproblems are solved inexactly.
Specifically, the computationally demanding local minimization step is replaced
by a single gradient step, while the averaging step is approximated in a
distributed fashion. Furthermore, partial participation is allowed in the
implementation of the algorithm. Under standard assumptions on strong convexity
and Lipschitz continuous gradients, we establish linear convergence and
characterize the rate in terms of the connectivity of the graph and the
conditioning of the problem. Our line of analysis provides a sharper
convergence rate compared to Push-DIGing. Numerical experiments corroborate the
merits of the proposed solution in terms of superior rate as well as
computation and communication savings over baselines.
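The inexact primal update described above can be sketched on a toy consensus problem. For simplicity this sketch uses exact averaging (the paper approximates that step in a distributed fashion over a directed graph), and the local costs f_i(x) = ½(x − a_i)² are hypothetical stand-ins:

```python
# Consensus ADMM where the costly local argmin is replaced by a single
# gradient step on the augmented Lagrangian. Each agent holds
# f_i(x) = 0.5*(x - a_i)**2, so the consensus optimum is mean(a) = 4.0.

a = [1.0, 4.0, 7.0]                  # local data, one value per agent
n = len(a)
x = [0.0] * n                        # local primal variables
y = [0.0] * n                        # dual variables for the constraint x_i = z
z = 0.0                              # consensus variable
rho, eta = 1.0, 0.3                  # penalty and gradient step size

for _ in range(200):
    # inexact primal update: ONE gradient step instead of a full local argmin
    x = [xi - eta * ((xi - ai) + yi + rho * (xi - z))
         for xi, ai, yi in zip(x, a, y)]
    z = sum(xi + yi / rho for xi, yi in zip(x, y)) / n   # averaging step
    y = [yi + rho * (xi - z) for xi, yi in zip(x, y)]    # dual ascent

print(round(z, 4))                   # converges to mean(a) = 4.0
```

The point of the inexact scheme is visible in the primal update: each agent's per-iteration cost drops to one gradient evaluation, at the price of more (but cheaper) iterations.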
A Survey From Distributed Machine Learning to Distributed Deep Learning
Artificial intelligence has achieved significant success in handling complex
tasks in recent years. This success is due to advances in machine learning
algorithms and hardware acceleration. In order to obtain more accurate results
and solve more complex problems, algorithms must be trained with more data.
Processing this huge amount of data can be time-consuming and can require a great deal of computation. A solution is to distribute the data and the algorithm across several machines, which is known as distributed machine learning. There has been considerable effort put into distributed machine learning algorithms, and different methods have been proposed so far. In this article, we present a comprehensive summary of the current state of the art in the field through a review of these algorithms. We divide the algorithms into classification and clustering (traditional machine learning), deep learning, and deep reinforcement learning groups. Distributed deep learning has gained more attention in recent years, and most studies focus on these algorithms; as a result, most of the articles we discuss here belong to this category. Based on our investigation of the algorithms, we highlight limitations that should be addressed in future research.
Communication Efficient Distributed Newton Method with Fast Convergence Rates
We propose a communication- and computation-efficient second-order method for distributed optimization. Each iteration of our method only requires O(d) communication complexity, where d is the problem dimension. We also provide a theoretical analysis showing that the proposed method has a convergence rate similar to classical second-order optimization algorithms. Concretely, our method can find ε-second-order stationary points of nonconvex problems in O(√L · ε^(−3/2)) iterations, where L is the Lipschitz constant of the Hessian. Moreover, it enjoys local superlinear convergence under the strong-convexity assumption. Experiments on both convex and nonconvex problems show that our proposed method performs significantly better than baselines.
Comment: Accepted in SIGKDD 202
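A back-of-the-envelope sketch of why per-iteration communication on the order of the dimension matters for second-order methods: naively shipping each worker's d×d Hessian costs O(d²) floats per round, versus O(d) for a gradient-sized message. The sizes below are illustrative:

```python
# Per-round message sizes for a d-dimensional problem: a gradient-sized
# message is O(d), a full Hessian is O(d^2). Dimension is illustrative.

d = 10_000                                     # problem dimension
bytes_per_float = 8                            # 64-bit floats

gradient_msg = d * bytes_per_float             # O(d): 80 KB per round
hessian_msg = d * d * bytes_per_float          # O(d^2): 800 MB per round

print(gradient_msg, hessian_msg, hessian_msg // gradient_msg)
```

The d-fold gap is exactly what makes gradient-sized communication per iteration the target for distributed Newton-type methods like the one above.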
Federated K-Means Clustering via Dual Decomposition-based Distributed Optimization
The use of distributed optimization in machine learning can be motivated
either by the resulting preservation of privacy or the increase in
computational efficiency. On the one hand, training data might be stored across
multiple devices. Training a global model within a network where each node only
has access to its confidential data requires the use of distributed algorithms.
Even if the data is not confidential, sharing it might be prohibitive due to
bandwidth limitations. On the other hand, the ever-increasing amount of
available data leads to large-scale machine learning problems. By splitting the
training process across multiple nodes its efficiency can be significantly
increased. This paper aims to demonstrate how dual decomposition can be applied
for distributed training of k-means clustering problems. After an overview
of distributed and federated machine learning, the mixed-integer quadratically
constrained programming-based formulation of the k-means clustering
training problem is presented. The training can be performed in a distributed
manner by splitting the data across different nodes and linking these nodes
through consensus constraints. Finally, the performance of the subgradient
method, the bundle trust method, and the quasi-Newton dual ascent algorithm are
evaluated on a set of benchmark problems. While the mixed-integer
programming-based formulation of the clustering problems suffers from weak
integer relaxations, the presented approach can potentially be used to enable
an efficient solution in the future, both in central and distributed settings.
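The first of the three dual solvers evaluated above, the subgradient method, can be sketched on a toy convex stand-in for the clustering problem: two nodes with scalar quadratic objectives linked by a consensus constraint. The objectives below are hypothetical, chosen so the dual iteration has a closed form:

```python
# Dual decomposition with (sub)gradient ascent: nodes hold
# f1(x) = 0.5*(x-1)**2 and f2(x) = 0.5*(x-5)**2, coupled by the consensus
# constraint x1 = x2 with multiplier lam. Each node minimizes its local
# Lagrangian independently; only the scalar lam is exchanged.

lam, alpha = 0.0, 0.3                # multiplier and dual step size
for _ in range(50):
    x1 = 1.0 - lam                   # argmin_x f1(x) + lam * x (closed form)
    x2 = 5.0 + lam                   # argmin_x f2(x) - lam * x (closed form)
    lam += alpha * (x1 - x2)         # dual step on the constraint violation

print(round(x1, 4), round(x2, 4))    # both approach the consensus value 3.0
```

The communication pattern mirrors the paper's setting: nodes never share their data, only the dual variable tied to the consensus constraint.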