Design of generalized fractional order gradient descent method
This paper addresses the convergence problem of the emerging fractional
order gradient descent method and proposes three solutions to overcome it.
The general fractional gradient method cannot converge to the true extreme
point of the target function, which critically hampers the application of
this method. Because of the long memory characteristic of the fractional
derivative, the fixed memory principle is a natural first choice. Apart from
truncating the memory length, two new methods are developed to achieve
convergence: one truncates the infinite series, and the other modifies the
constant fractional order. Finally, six illustrative examples demonstrate
the effectiveness and practicability of the proposed methods.
Comment: 8 pages, 16 figures
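As a rough illustration of why the truncated variants recover the true extremum, here is a minimal single-variable sketch of a fractional-order gradient step with a one-term series truncation and short (one-step) memory; the step rule, the order alpha = 0.9, and the learning rate are illustrative assumptions, not the paper's exact design:

```python
import math

def fractional_gd(grad, x0, x_prev, alpha=0.9, lr=0.1, steps=200):
    """Sketch of a fractional-order gradient step with a one-term
    series truncation and one-step memory: the fractional derivative is
    approximated by grad(x) * |x - x_prev|**(1 - alpha) / Gamma(2 - alpha).
    alpha, lr, and the one-term truncation are illustrative choices."""
    x = x0
    for _ in range(steps):
        step = grad(x) * abs(x - x_prev) ** (1 - alpha) / math.gamma(2 - alpha)
        x_prev, x = x, x - lr * step
    return x

# minimize f(x) = (x - 3)^2; the memory-truncated scheme converges to
# the true extreme point x = 3 rather than a biased point
x_star = fractional_gd(lambda x: 2 * (x - 3), x0=0.0, x_prev=-0.5)
```

Because the memory term shrinks as the iterates settle, the scheme behaves like ordinary gradient descent near the extremum, which is the intuition behind the convergence fixes.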
Support Vector Machine optimization with fractional gradient descent for data classification
Data classification faces several problems, one of which is that large amounts of data increase computing time. SVM is a reliable classifier for linear or non-linear data, but for large-scale data it runs into computational time constraints. Fractional gradient descent is an unconstrained optimization algorithm for training support vector machine classifiers, whose training problem is convex. Compared to the classic integer-order model, a model built with fractional calculus has a significant advantage in accelerating computing time. This research investigates the current state of this new optimization method based on fractional derivatives and how it can be implemented in the classifier algorithm. The SVM classifier with fractional gradient descent optimization reaches a convergence point in approximately 50 fewer iterations than SVM-SGD. The model-update steps are smaller in the fractional case because the multiplier value is less than 1, i.e., a fraction. The SVM-Fractional SGD algorithm is shown to be an effective method for rainfall forecast decisions.
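The fraction-sized update the abstract describes can be sketched as a linear SVM trained by subgradient SGD in which every correction is shrunk by a multiplier alpha < 1; the damping rule, the toy data, and all hyperparameters are illustrative assumptions, not the paper's exact fractional derivative:

```python
def svm_fractional_sgd(data, alpha=0.8, lr=0.05, lam=0.01, epochs=50):
    """Sketch of a linear SVM trained by subgradient SGD where every
    correction is shrunk by a fractional multiplier alpha < 1.  The
    damping rule stands in for the fractional-derivative update and is
    an illustrative assumption, as are all hyperparameters."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            margin = y * (w[0] * x[0] + w[1] * x[1] + b)
            if margin < 1.0:   # hinge active: point inside the margin
                gw = [lam * w[i] - y * x[i] for i in range(2)]
                gb = -y
            else:              # hinge inactive: only regularization
                gw = [lam * w[i] for i in range(2)]
                gb = 0.0
            # fractional damping: every correction is a fraction of
            # the plain SGD step
            w = [w[i] - lr * alpha * gw[i] for i in range(2)]
            b -= lr * alpha * gb
    return w, b

# toy separable data: +1 in the upper-right, -1 in the lower-left
data = [((2.0, 2.0), 1), ((3.0, 1.0), 1), ((2.5, 2.5), 1),
        ((-2.0, -1.0), -1), ((-1.0, -3.0), -1), ((-2.5, -2.0), -1)]
w, b = svm_fractional_sgd(data)
```

The damped step makes each model correction smaller than the corresponding plain SGD step, mirroring the "multiplier less than 1" behavior described in the abstract.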
Performance Analysis of Fractional Learning Algorithms
Fractional learning algorithms have recently been trending in signal
processing and adaptive filtering. However, it is unclear whether their
proclaimed superiority over conventional algorithms is well-grounded or a
myth, as their performance has never been extensively analyzed. In this
article, a rigorous analysis of fractional variants of the least mean
squares and steepest descent algorithms is performed. Some critical
schematic kinks in fractional learning algorithms are identified; their
origins and consequences for the performance of the learning algorithms are
discussed, and swift, ready-witted remedies are proposed. Apposite numerical
experiments are conducted to discuss the convergence and efficiency of the
fractional learning algorithms in stochastic environments.
Comment: 29 pages, 6 figures
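One way to see both the appeal and the pitfalls such an analysis targets is a minimal fractional LMS sketch for system identification. The extra |w|^(1-nu) term below is one common form of the fractional update; taking the absolute value sidesteps the complex weight values that a signed w^(1-nu) would produce, one of the "kinks" pointed out for signed variants. Step sizes, nu, and the plant are illustrative assumptions:

```python
import math
import random

def fractional_lms(x, d, mu=0.02, mu_f=0.02, nu=0.5, taps=2):
    """Sketch of a fractional LMS adaptive filter: alongside the usual
    mu*e*u correction, each weight gets an extra correction scaled by
    |w|**(1 - nu) / Gamma(2 - nu), a common form of the fractional
    update.  All step sizes and nu are illustrative choices."""
    w = [0.0] * taps
    g = math.gamma(2 - nu)
    for n in range(taps - 1, len(x)):
        u = [x[n - i] for i in range(taps)]                  # regressor
        e = d[n] - sum(wi * ui for wi, ui in zip(w, u))      # a-priori error
        w = [wi + mu * e * ui + mu_f * e * ui * abs(wi) ** (1 - nu) / g
             for wi, ui in zip(w, u)]
    return w

# identify a 2-tap plant h from a noiseless input/output record
random.seed(1)
x = [random.uniform(-1.0, 1.0) for _ in range(3000)]
h = [0.6, -0.3]
d = [0.0] + [h[0] * x[n] + h[1] * x[n - 1] for n in range(1, len(x))]
w = fractional_lms(x, d)
```

Since the fractional term is still proportional to the error, it acts as a weight-dependent step size here; the filter converges to the same plant coefficients as conventional LMS in this noiseless setting, which is why careful analysis is needed to substantiate claims of superiority.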
Scalable optimization algorithms for recommender systems
Recommender systems have gained significant popularity and are widely used in many e-commerce applications. Predicting user preferences is a key step to providing high quality recommendations. In practice, however, suggestions made to users must not only consider user preferences in isolation; a good recommendation engine also needs to account for certain constraints. For instance, an online video rental service that suggests multimedia items (e.g., DVDs) to its customers should consider the availability of DVDs in stock to reduce customer waiting times for accepted recommendations. Moreover, every user should receive a small but sufficient number of suggestions that the user is likely to be interested in.
This thesis aims to develop and implement scalable optimization algorithms that can be used (but are not restricted) to generate recommendations satisfying certain objectives and constraints like the ones above. State-of-the-art approaches lack efficiency and/or scalability in coping with large real-world instances, which may involve millions of users and items. First, we study large-scale matrix completion in the context of collaborative filtering in recommender systems. For such problems, we propose a set of novel shared-nothing algorithms which are designed to run on a small cluster of commodity nodes and outperform alternative approaches in terms of efficiency, scalability, and memory footprint. Next, we view our recommendation task as a generalized matching problem, and propose the first distributed solution for solving such problems at scale. Our algorithm is designed to run on a small cluster of commodity nodes (or in a MapReduce environment) and has strong approximation guarantees. Our matching algorithm relies on linear programming. To this end, we present an efficient distributed approximation algorithm for mixed packing-covering linear programs, a simple but expressive subclass of linear programs. Our approximation algorithm requires a poly-logarithmic number of passes over the input, is simple, and well-suited for parallel processing on GPUs, in shared-memory architectures, as well as on a small cluster of commodity nodes.
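The core kernel that such shared-nothing matrix-completion algorithms distribute can be sketched on a single node as plain SGD matrix factorization over the observed cells; the toy ratings and all hyperparameters below are illustrative assumptions, not the thesis's algorithms or settings:

```python
import random

def mf_sgd(ratings, n_users, n_items, rank=2, lr=0.02, lam=0.001, epochs=500):
    """Single-node sketch of the SGD matrix-factorization kernel that
    shared-nothing matrix-completion algorithms parallelize: each
    observed rating is approximated by the dot product of a user-factor
    and an item-factor row, updated one observation at a time.
    Hyperparameters are illustrative, not the thesis's settings."""
    random.seed(0)
    P = [[random.uniform(0.1, 0.9) for _ in range(rank)] for _ in range(n_users)]
    Q = [[random.uniform(0.1, 0.9) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][k] * Q[i][k] for k in range(rank))
            for k in range(rank):
                pu, qi = P[u][k], Q[i][k]
                P[u][k] += lr * (err * qi - lam * pu)   # user-factor step
                Q[i][k] += lr * (err * pu - lam * qi)   # item-factor step
    return P, Q

# toy observed (user, item, rating) triples from a low-rank matrix
ratings = [(0, 0, 5.0), (0, 1, 4.0), (0, 2, 3.0),
           (1, 0, 4.0), (1, 1, 3.2), (2, 0, 3.0), (2, 2, 1.8)]
P, Q = mf_sgd(ratings, n_users=3, n_items=3)
```

Distributed variants partition users and items across nodes so that disjoint blocks of (P, Q) can be updated in parallel without shared state, which is what makes the shared-nothing design scale.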
Near-Optimal Straggler Mitigation for Distributed Gradient Methods
Modern learning algorithms use gradient descent updates to train inferential
models that best explain data. Scaling these approaches to massive data sizes
requires proper distributed gradient descent schemes in which distributed
worker nodes compute partial gradients based on their partial, local data
sets and send the results to a master node, where all the computations are
aggregated into a full gradient and the learning model is updated. However, a
major performance bottleneck that arises is that some of the worker nodes may
run slow. These nodes, a.k.a. stragglers, can significantly slow down
computation, as the slowest node may dictate the overall computation time. We
propose a distributed computing scheme, called the Batched Coupon's Collector
(BCC), to alleviate the effect of stragglers in gradient methods. We prove
that our BCC scheme is robust to a near-optimal number of random stragglers.
We also empirically demonstrate that our proposed BCC scheme reduces the
run-time by up to 85.4% on Amazon EC2 clusters when compared with other
straggler mitigation strategies. Finally, we generalize the proposed BCC
scheme to minimize the completion time when implementing gradient
descent-based algorithms over heterogeneous worker nodes.
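The BCC idea can be sketched as follows: cut the data into batches, hand each worker one batch, and let the master aggregate a full gradient as soon as the finished workers jointly cover every batch, instead of waiting for the slowest worker. BCC draws the batch assignments uniformly at random; the fixed round-robin assignment and the delay model below are illustrative assumptions chosen for reproducibility:

```python
def bcc_completion_time(assignment, delays, n_batches):
    """Sketch of the Batched Coupon's Collector (BCC) completion time:
    worker w holds batch assignment[w] and finishes after delays[w].
    The master is done once the finished workers cover all batches, so
    stragglers holding already-covered batches need not be awaited.
    BCC assigns batches uniformly at random; a fixed assignment and a
    toy delay model are used here for reproducibility."""
    order = sorted(range(len(delays)), key=lambda w: delays[w])
    covered = set()
    for w in order:                     # workers finish in delay order
        covered.add(assignment[w])
        if len(covered) == n_batches:   # every batch now has a result
            return delays[w]
    return float("inf")                 # some batch never finished

# 20 workers, 4 batches, two extreme stragglers (delay 100)
delays = [1.0 + 0.1 * w for w in range(18)] + [100.0, 100.0]
assignment = [w % 4 for w in range(20)]
t_bcc = bcc_completion_time(assignment, delays, n_batches=4)
t_all = max(delays)   # a scheme that waits for every worker
```

In this toy run the master finishes as soon as the four fastest workers (one per batch) report back, while a wait-for-all scheme is held hostage by the two stragglers, illustrating the mitigation the abstract describes.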