
    Photonic integrated reconfigurable linear processors as neural network accelerators

    Reconfigurable linear optical processors can be used to perform linear transformations and are instrumental in efficiently computing the matrix–vector multiplications required in each neural network layer. In this paper, we characterize and compare two thermally tuned photonic integrated processors, realized in silicon-on-insulator and silicon nitride platforms, suited for extracting feature maps in convolutional neural networks. The reduction in bit resolution when crossing the processor is mainly due to optical losses, in the range 2.3–3.3 bits for the silicon-on-insulator chip and 1.3–2.4 bits for the silicon nitride chip. However, the lower extinction ratio of the Mach–Zehnder elements in the latter platform limits their expressivity (i.e., the capacity to implement any transformation) to 75%, compared to 97% for the former. Finally, the silicon-on-insulator processor outperforms the silicon nitride one in terms of footprint and energy efficiency.
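    The building block behind such processors is the Mach–Zehnder interferometer (MZI), which acts as a tunable 2×2 unitary on two optical modes; a mesh of MZIs then realizes an N×N linear transform, and applying it to an input field vector is the optical matrix–vector multiply. A minimal numerical sketch (not the paper's implementation; phase parameterization up to a global phase, assuming an ideal lossless device):

    ```python
    import numpy as np

    def mzi(theta, phi):
        """2x2 transfer matrix of an ideal lossless MZI, up to a global phase.

        theta: internal phase shift (sets the splitting ratio)
        phi:   external phase shift on the upper input arm
        """
        return np.array([
            [np.exp(1j * phi) * np.sin(theta / 2), np.exp(1j * phi) * np.cos(theta / 2)],
            [np.cos(theta / 2),                    -np.sin(theta / 2)],
        ])

    # Applying the transfer matrix to an input field vector is the
    # optical analogue of one matrix-vector product in a network layer.
    U = mzi(0.7, 1.2)
    x = np.array([1.0 + 0j, 0.5 + 0j])
    y = U @ x

    # An ideal (lossless) MZI is unitary, so optical power is conserved;
    # the optical losses discussed in the abstract break this in practice.
    assert np.allclose(U.conj().T @ U, np.eye(2))
    ```

    Real devices deviate from this ideal: the finite extinction ratio mentioned above means the effective splitting ratio cannot reach all values, which is exactly what limits expressivity.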

    Optimization landscape of deep neural networks

    It has been empirically observed in deep learning that training deep over-parameterized neural networks does not appear to suffer from suboptimal local minima, despite the hardness results proven in the literature. In many cases, local search algorithms such as (stochastic) gradient descent converge to a globally optimal solution. In an attempt to better understand this phenomenon, this thesis studies sufficient conditions on the network architecture under which the landscape of the associated loss function is guaranteed to be well-behaved, which is favorable to local search algorithms. Our analysis touches upon fundamental aspects of the problem such as the existence of solutions with zero training error, global optimality of critical points, and the topology of level sets and sublevel sets of the loss. Building on this analysis, we propose a new class of network architectures that are practically relevant and carry strong theoretical guarantees on the loss surface. We empirically investigate the generalization ability of these networks and other related phenomena observed in deep learning, such as the implicit bias of stochastic gradient descent. Finally, we study the limitations of deep and narrow neural networks in learning connected decision regions, and draw connections to adversarial manipulation problems. The results and analysis presented in this thesis suggest that having a sufficiently wide layer in the architecture is not only helpful in making the loss surface well-behaved but also important for the expressive power of neural networks.
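    The benefit of over-parameterization can be illustrated with a toy example (not taken from the thesis): when a model has more parameters than training points, an interpolating solution with zero training error exists, and plain gradient descent on a least-squares loss finds it.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 10, 50                       # 10 training points, 50 parameters: over-parameterized
    X = rng.standard_normal((n, d))     # random features
    y = rng.standard_normal(n)          # random targets

    # Plain gradient descent on the mean-squared-error loss.
    w = np.zeros(d)
    lr = 0.01
    for _ in range(5000):
        grad = X.T @ (X @ w - y) / n    # gradient of 0.5 * mean((Xw - y)^2)
        w -= lr * grad

    train_loss = np.mean((X @ w - y) ** 2)
    print(f"final training loss: {train_loss:.2e}")  # essentially zero
    ```

    With d > n the system X w = y is under-determined, so interpolating solutions exist and the quadratic landscape has no spurious local minima; the thesis studies when analogous guarantees hold for genuinely nonlinear networks with a sufficiently wide layer.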

    Attacking Graph Neural Networks with Bit Flips: Weisfeiler and Lehman Go Indifferent

    Prior attacks on graph neural networks have mostly focused on graph poisoning and evasion, neglecting the network's weights and biases. Traditional weight-based fault injection attacks, such as the bit flip attacks used against convolutional neural networks, do not consider the unique properties of graph neural networks. We propose the Injectivity Bit Flip Attack, the first bit flip attack designed specifically for graph neural networks. Our attack targets the learnable neighborhood aggregation functions in quantized message passing neural networks, degrading their ability to distinguish graph structures and thereby destroying the expressive power of the Weisfeiler-Lehman test. Our findings suggest that exploiting mathematical properties specific to certain graph neural network architectures can significantly increase their vulnerability to bit flip attacks. Injectivity Bit Flip Attacks can degrade maximally expressive Graph Isomorphism Networks, trained on various graph property prediction datasets, to random output by flipping only a small fraction of the network's bits, demonstrating a higher destructive power than a bit flip attack transferred from convolutional neural networks. Our attack is transparent and motivated by theoretical insights, which are confirmed by extensive empirical results.
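    The generic mechanism such attacks exploit can be sketched in a few lines (this is the basic bit-flip idea, not the paper's injectivity-targeted selection strategy): in an int8-quantized network, flipping a single high-order bit of one stored weight changes its value drastically.

    ```python
    import numpy as np

    def flip_bit(w_int8, bit):
        """Flip one bit (0 = LSB .. 7 = sign bit) of an int8-quantized weight."""
        u = w_int8.astype(np.uint8) ^ np.uint8(1 << bit)  # XOR toggles the chosen bit
        return u.astype(np.int8)                          # reinterpret as signed int8

    w = np.int8(23)          # stored quantized weight, bit pattern 0b00010111
    print(flip_bit(w, 0))    # LSB flip: small perturbation -> 22
    print(flip_bit(w, 7))    # sign-bit flip: large jump -> -105
    ```

    A single low-order flip barely perturbs the model, while a sign- or high-order-bit flip moves the weight across most of its range; the attack's contribution is choosing *which* bits to flip so that the aggregation function loses injectivity with as few flips as possible.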