19 research outputs found

    Exascale Ready Work-Optimal Matrix Inversion

    In this diploma thesis I develop a new algorithm, OPT, for matrix inversion. I prove properties of the parallel running time and the work of OPT. OPT combines Strassen's algorithm for matrix inversion with Newton approximation and is based on a matrix multiplication subroutine. OPT is a work-optimal algorithm, i.e. it requires at most a constant factor more work than any other (work-optimal) algorithm. In addition, OPT needs only polylogarithmic time on at most O(n^3) processors, where the number of processors is determined by the multiplication routine. It thereby combines these two advantages of Strassen's algorithm and Newton approximation. I prove a new bound on the numerical stability of Strassen's algorithm combined with Newton approximation. As part of the thesis I implemented OPT, Strassen's inversion algorithm, and Newton approximation, together with a matrix container class, in a flexible test program. I describe the design of the implementation and the use and difficulties of BLAS for the matrix multiplication routine. In the experimental part I compare the running time and numerical stability of OPT with the routine from the Intel Math Kernel Library (MKL). The constant factors in the work of OPT turn out to be no more than twice those of the MKL routine. As predicted, OPT scales very well. Even on a computer with only eight cores it is already considerably faster than the MKL routine. With respect to numerical stability, OPT and Strassen's algorithm do not live up to the latter's bad reputation. Instead, they produce results comparable to those of the MKL routine. I discover an unexpected instability of Newton approximation, which causes it to produce worse results than all other algorithms in the implementation. I present some further experiments on this instability
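
The Newton approximation mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's exact formulation, so the code below assumes the textbook Newton iteration X_{k+1} = X_k (2I - A X_k) with the common starting guess X_0 = A^T / (||A||_1 ||A||_inf). The function names `matmul` and `newton_inverse` are hypothetical and are not taken from the thesis implementation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def newton_inverse(a, iterations=30):
    """Approximate the inverse of a square nonsingular matrix by Newton iteration.

    Assumption: textbook iteration X <- X (2I - A X); not the thesis's OPT algorithm.
    """
    n = len(a)
    # Maximum absolute column sum (1-norm) and row sum (infinity-norm).
    norm_1 = max(sum(abs(a[i][j]) for i in range(n)) for j in range(n))
    norm_inf = max(sum(abs(v) for v in row) for row in a)
    # Starting guess X0 = A^T / (||A||_1 * ||A||_inf).
    scale = norm_1 * norm_inf
    x = [[a[j][i] / scale for j in range(n)] for i in range(n)]
    for _ in range(iterations):
        ax = matmul(a, x)
        # Build 2I - A X, then update X <- X (2I - A X).
        correction = [[(2.0 if i == j else 0.0) - ax[i][j] for j in range(n)]
                      for i in range(n)]
        x = matmul(x, correction)
    return x


approx = newton_inverse([[4.0, 1.0], [2.0, 3.0]])
```

Each step costs two matrix multiplications, which is why the quality of the multiplication subroutine matters so much for an approach of this kind.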

    Straggler Robust Distributed Matrix Inverse Approximation

    Inverting large full-rank matrices is a cumbersome operation that appears in many processes and applications across numerical analysis, linear algebra, optimization, machine learning, and engineering. It raises issues of numerical stability and complexity, as well as a high expected computation time. We address the latter issue by proposing an algorithm that uses a black-box least squares optimization solver as a subroutine to estimate the inverse (and pseudoinverse) of real nonsingular matrices, column by column. This also allows the algorithm to be run in a distributed manner, so the estimate can be obtained much faster and can be made robust to stragglers. Furthermore, we assume a centralized network with no message passing between the computing nodes, and do not require a prior matrix factorization, e.g. LU, SVD, or QR decomposition. Comment: 4 pages, 1 figure, conferenc
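
The column-by-column idea in this abstract can be sketched as follows: column i of the inverse is the solution of the least squares problem min ||A x - e_i||^2, where e_i is the i-th standard basis vector. The abstract treats the solver as a black box, so the plain gradient descent used here is only an assumed stand-in, and the names `solve_least_squares` and `estimate_inverse` are hypothetical. The sketch runs the columns sequentially and does not model stragglers or the distributed setting.

```python
def solve_least_squares(a, b, steps=500):
    """Minimise (1/2) * ||A x - b||^2 by gradient descent (stand-in for a black-box solver)."""
    n = len(a[0])
    # Step size 1 / ||A||_F^2, which never exceeds 1 / sigma_max(A)^2 and is therefore safe.
    step = 1.0 / sum(v * v for row in a for v in row)
    x = [0.0] * n
    for _ in range(steps):
        residual = [sum(r * xj for r, xj in zip(row, x)) - bi
                    for row, bi in zip(a, b)]
        # Gradient of the objective is A^T (A x - b).
        gradient = [sum(a[i][j] * residual[i] for i in range(len(a)))
                    for j in range(n)]
        x = [xj - step * g for xj, g in zip(x, gradient)]
    return x


def estimate_inverse(a, steps=500):
    """Estimate the inverse of a square matrix one column at a time."""
    n = len(a)
    columns = []
    for i in range(n):
        e_i = [1.0 if k == i else 0.0 for k in range(n)]
        # Each column is an independent problem, so it could be given to a separate worker.
        columns.append(solve_least_squares(a, e_i, steps))
    # columns[j] holds column j of the estimate; convert to a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(n)]


estimate = estimate_inverse([[4.0, 1.0], [2.0, 3.0]])
```

Because the column problems share no state, no message passing between workers is needed, which matches the centralized setting the abstract assumes.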

    A new processing approach for reducing computational complexity in cloud-RAN mobile networks

    Cloud computing is considered one of the key drivers for the next generation of mobile networks (e.g. 5G). This is combined with the dramatic expansion of mobile networks, involving millions (or even billions) of subscribers and a growing number of current and future mobile applications (e.g. IoT). The Cloud Radio Access Network (C-RAN) architecture has been proposed as a novel concept that uses cloud computing as an efficient computing resource to meet the requirements of future cellular networks. However, the computational complexity of obtaining the channel state information in the fully centralized C-RAN increases as the network is scaled up, because the channel information matrices grow. To tackle this problem of complexity and latency, the MapReduce framework and fast matrix algorithms are proposed. This paper presents two levels of complexity reduction in the process of estimating the channel information in cellular networks. The results illustrate that complexity can be reduced from O(N^3) to O((N/k)^3), where N is the total number of RRHs and k is the number of RRHs per group, by dividing the processing of RRHs into parallel groups and using the MapReduce parallel algorithm to process them. The second approach reduces the computational complexity from O((N/k)^3) to O((N/k)^2.807) using fast matrix inversion algorithms. The reduction in complexity and latency leads to a significant improvement in both the estimation time and the scalability of C-RAN networks
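
To give a feel for the three complexity expressions quoted in this abstract, the sketch below simply evaluates N^3, (N/k)^3, and (N/k)^2.807 for given values, ignoring all constant factors. The function name `complexity_estimates` and the values N = 1000, k = 10 are hypothetical and do not come from the paper; the exponent 2.807 is approximately log2(7), the exponent of Strassen's algorithm.

```python
def complexity_estimates(n, k, fast_exponent=2.807):
    """Evaluate the abstract's three complexity expressions, ignoring constants."""
    ratio = n / k
    return {
        "full": n ** 3,                          # O(N^3)
        "grouped": ratio ** 3,                   # O((N/k)^3)
        "grouped_fast": ratio ** fast_exponent,  # O((N/k)^2.807)
    }


# Hypothetical values chosen only for illustration.
estimates = complexity_estimates(1000, 10)
```

With these illustrative values the grouping step accounts for a factor of k^3 = 1000, while the switch to the lower exponent gives a further, smaller reduction.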

    Fast algorithms for the Sylvester equation AX − XB^T = C

    For given matrices A ∈ F^{m×m}, B ∈ F^{n×n}, and C ∈ F^{m×n} over an arbitrary field F, the matrix equation AX − XB^T = C has a unique solution X ∈ F^{m×n} if and only if A and B have disjoint spectra. We describe an algorithm that computes the solution X for m, n ≤ N with O(N^β · log N) arithmetic operations in F, where β > 2 is such that M×M matrices can be multiplied with O(M^β) arithmetic operations, e.g., β = 2.376. It seems that no bound better than O(m^3 · n^3) arithmetic operations was previously known. The state of the art in numerical analysis is O(n^3 + m^3) flops, but these algorithms (due to Bartels/Stewart and Golub/Nash/van Loan) involve Schur decompositions, i.e., they compute the eigenvalues of at least one of A and B, and hence cannot be transferred to general F
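
For orientation, the O(m^3 · n^3) baseline mentioned in this abstract corresponds to treating AX − XB^T = C as one linear system in the mn unknown entries of X and solving it by elimination. The sketch below does exactly that over the rationals, using Python's `Fraction` as an example of an exact field. It is not the fast algorithm the paper describes, and the function name `solve_sylvester_naive` is hypothetical.

```python
from fractions import Fraction


def solve_sylvester_naive(a, b, c):
    """Solve A X - X B^T = C exactly over the rationals via one (mn) x (mn) system."""
    m, n = len(a), len(b)
    size = m * n
    # Unknown X[i][j] gets index i*n + j. The equation for entry (i, j) is
    # sum_k A[i][k] X[k][j] - sum_l B[j][l] X[i][l] = C[i][j].
    rows = []
    for i in range(m):
        for j in range(n):
            row = [Fraction(0)] * (size + 1)
            for k in range(m):
                row[k * n + j] += Fraction(a[i][k])
            for l in range(n):
                row[i * n + l] -= Fraction(b[j][l])
            row[size] = Fraction(c[i][j])  # right-hand side in the last column
            rows.append(row)
    # Gauss-Jordan elimination on the augmented system.
    for col in range(size):
        pivot = next((r for r in range(col, size) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("no unique solution: spectra of A and B are not disjoint")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = 1 / rows[col][col]
        rows[col] = [v * inv for v in rows[col]]
        for r in range(size):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [[rows[i * n + j][size] for j in range(n)] for i in range(m)]


# A has eigenvalues 1 and 3, B has eigenvalues 5 and 7, so the solution is unique.
x = solve_sylvester_naive([[1, 2], [0, 3]], [[5, 0], [1, 7]], [[2, -5], [-6, -19]])
```

Elimination on an mn × mn system is what produces the m^3 · n^3 operation count; the approach uses only field operations, so it works over any field in which exact arithmetic is available.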

    Real-time implementation of some bilateral teleoperation schemes

    Today's technology has made possible much that we once thought impossible. One such technology is bilateral teleoperation. Bilateral teleoperation allows an operator to control a robot at a distance over a communication medium and to receive feedback of the interaction forces between the robot's end-effector and the remote environment. This technology has applications in, for instance, telesurgery, hazardous material handling, and underwater repairs. Our main research aim is to compare different control methods for such a system, using a six degree of freedom (DOF) parallel robot as the master device and a 6 DOF serial robot as the slave device. According to the choice of the variables transmitted over the communication medium, we attempt to show the inherent differences between transmitting position/sensor force information, velocity/control force information and, thereafter, wave variables. The Position Error Based (PEB) controller, the Kinesthetic Force Based (KFB) controller, and the transmission of wave variables are implemented. Experiments with and without time delays are carried out and their results compared. For our implementations a haptic device based on the twin pantograph architecture is used as the master manipulator, while the slave manipulator is a 6 DOF serial robot manipulator, the A465 Robot by CRS Robotics