23 research outputs found
Robust low-rank training via approximate orthonormal constraints
With the growth of model and data sizes, a broad effort has been made to
design pruning techniques that reduce the resource demand of deep learning
pipelines, while retaining model performance. In order to reduce both inference
and training costs, a prominent line of work uses low-rank matrix
factorizations to represent the network weights. Although these methods can retain
accuracy, we observe that they tend to compromise model robustness
against adversarial perturbations. By modeling robustness in terms of the
condition number of the neural network, we argue that this loss of robustness
is due to the exploding singular values of the low-rank weight matrices. Thus,
we introduce a robust low-rank training algorithm that maintains the network's
weights on the low-rank matrix manifold while simultaneously enforcing
approximate orthonormal constraints. The resulting model reduces both training
and inference costs while ensuring well-conditioning and thus better
adversarial robustness, without compromising model accuracy. This is supported by
extensive numerical evidence and by our main approximation theorem, which shows
that the computed robust low-rank network closely approximates the ideal full model,
provided a high-performing low-rank sub-network exists.
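As an illustration of the two quantities this abstract relies on, the sketch below computes a condition number from a list of singular values and a penalty that measures how far a factor matrix is from having orthonormal columns. The penalty form (squared Frobenius norm of U^T U - I) and the helper names are assumptions chosen for illustration; the abstract does not specify how the approximate orthonormal constraints are enforced.

```python
def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def orthonormality_penalty(u):
    """Squared Frobenius norm of U^T U - I (hypothetical penalty form).

    It is zero exactly when the columns of U are orthonormal.
    """
    gram = matmul(transpose(u), u)
    r = len(gram)
    return sum((gram[i][j] - (1.0 if i == j else 0.0)) ** 2
               for i in range(r) for j in range(r))


def condition_number(singular_values):
    """Ratio of the largest to the smallest (nonzero) singular value."""
    return max(singular_values) / min(singular_values)


# Orthonormal columns give zero penalty; a stretched column does not.
u_orthonormal = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
u_stretched = [[2.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
print(orthonormality_penalty(u_orthonormal))
print(orthonormality_penalty(u_stretched))
print(condition_number([4.0, 0.5]))
```

If the outer factors of a factorization W = U S V^T have orthonormal columns, the singular values of W are those of the small matrix S, so keeping S well-conditioned keeps W well-conditioned.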
Rank-adaptive spectral pruning of convolutional layers during training
The computing cost and memory demand of deep learning pipelines have grown
fast in recent years and thus a variety of pruning techniques have been
developed to reduce model parameters. The majority of these techniques focus on
reducing inference costs by pruning the network after a pass of full training.
A smaller number of methods address the reduction of training costs, mostly
based on compressing the network via low-rank layer factorizations. Despite
their efficiency for linear layers, these methods fail to effectively handle
convolutional filters. In this work, we propose a low-parametric training
method that factorizes the convolutions into tensor Tucker format and
adaptively prunes the Tucker ranks of the convolutional kernel during training.
Leveraging fundamental results from geometric integration theory of
differential equations on tensor manifolds, we obtain a robust training
algorithm that provably approximates the full baseline performance and
guarantees loss descent. A variety of experiments against the full model and
alternative low-rank baselines are implemented, showing that the proposed
method drastically reduces the training costs, while achieving high
performance, comparable to or better than the full baseline, and consistently
outperforms competing low-rank approaches
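To give a sense of the savings a Tucker factorization of a convolutional kernel can offer, the sketch below counts parameters for a full four-way kernel and for its Tucker format (one core tensor plus one factor matrix per mode). The kernel shape and the ranks are hypothetical values picked for illustration, not figures from the paper.

```python
from math import prod


def full_conv_params(shape):
    """Number of entries in a dense kernel of the given shape."""
    return prod(shape)


def tucker_conv_params(shape, ranks):
    """Parameters of a Tucker format: core tensor plus one factor per mode."""
    core = prod(ranks)
    factors = sum(n * r for n, r in zip(shape, ranks))
    return core + factors


# Hypothetical kernel: (out_channels, in_channels, height, width).
shape = (64, 32, 3, 3)
# Hypothetical Tucker ranks; the spatial modes are left at full rank.
ranks = (16, 8, 3, 3)

print(full_conv_params(shape))           # 18432
print(tucker_conv_params(shape, ranks))  # 2450
```

Lowering the ranks during training shrinks both the core and the factor matrices, which is where the reduction in training cost would come from.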
AC/DC: The FERMI FEL Split and Delay Optical Device for Ultrafast X-ray Science
Free-electron lasers (FELs) are the most advanced class of light sources, by virtue of their unique capability to lase high-brightness pulses with wavelengths spanning the extreme-ultraviolet, soft X-ray, and hard X-ray spectral domains and with durations on the femtosecond (fs) timescale. The next step in pushing the current standards of ultrafast X-ray science is strongly linked to the possibility of engineering and exploiting time-resolved experiments that use FEL pulses exclusively, ideally with different colors tunable to specific electronic resonances of the chemical elements. At the seeded FERMI FEL (Trieste, Italy) this goal is entrusted to the optical device known as AC/DC, which stands for auto correlator/delay creator. AC/DC is designed to double the incoming FEL pulse by splitting the photon beam with a grazing-incidence flat mirror, thus preserving the spectral and temporal properties, and to delay one of the two pulses in time. It can also independently tune the FEL pulse fluence on the two optical paths by means of solid-state filters. Here, we present a detailed description of this optical device. Strong emphasis is placed on the AC/DC opto-mechanical design and on the laser-based feedback systems implemented to compensate for any mismatch affecting the FEL optical trajectory, ascribable to both mechanical imperfections and paraxial errors arising during a temporal delay scan.
FEL stochastic spectroscopy revealing silicon bond softening dynamics
Time-resolved X-ray Emission/Absorption Spectroscopy (Tr-XES/XAS) is an
informative experimental tool sensitive to electronic dynamics in materials,
widely exploited in diverse research fields. Typically, Tr-XES/XAS requires
X-ray pulses with both a narrow bandwidth and sub-picosecond pulse duration, a
combination that in principle finds its optimum with Fourier transform-limited
pulses. In this work, we explore an alternative experimental approach, capable
of simultaneously retrieving information about unoccupied (XAS) and occupied
(XES) states from the stochastic fluctuations of broadband extreme ultraviolet
pulses of a free-electron laser. We used this method, in combination with
singular value decomposition and Tikhonov regularization procedures, to
determine the XAS/XES response from a crystalline silicon sample at the
L2,3-edge, with an energy resolution of a few tens of meV. Finally, we combined
this spectroscopic method with a pump-probe approach to measure structural and
electronic dynamics of a silicon membrane. Tr-XAS/XES data obtained after
photoexcitation with an optical laser pulse at 390 nm allowed us to observe
perturbations of the band structure, which are compatible with the formation of
the predicted precursor state of a non-thermal solid-liquid phase transition
associated with a bond softening phenomenon
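The abstract above mentions Tikhonov regularization for retrieving the spectral response. A minimal sketch of the generic technique follows, for a linear problem A x = b solved through the regularized normal equations (A^T A + lam I) x = A^T b. The matrix, data vector, and regularization strength are placeholders, not quantities from the experiment, and the paper's own procedure (which also uses a singular value decomposition) may differ in detail.

```python
def tikhonov_solve(a, b, lam):
    """Minimize ||A x - b||^2 + lam * ||x||^2 for small dense problems.

    Builds the regularized normal equations and solves them by Gaussian
    elimination with partial pivoting (illustrative only, not optimized).
    """
    n = len(a[0])
    # Assemble M = A^T A + lam * I and rhs = A^T b.
    m = [[sum(row[i] * row[j] for row in a) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    rhs = [sum(row[i] * bi for row, bi in zip(a, b)) for i in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= f * m[col][c]
            rhs[r] -= f * rhs[col]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (rhs[i] - s) / m[i][i]
    return x


# Placeholder problem: identity operator, so the effect of lam is easy to see.
a = [[1.0, 0.0], [0.0, 1.0]]
b = [2.0, 4.0]
print(tikhonov_solve(a, b, 0.0))  # unregularized solution
print(tikhonov_solve(a, b, 1.0))  # solution shrunk toward zero
```

In terms of the singular values s_i of A, this solution is the same as damping each component by the factor s_i / (s_i^2 + lam), which suppresses the noise-dominated small singular values.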
Widely tunable two-colour seeded free-electron laser source for resonant-pump resonant-probe magnetic scattering
The advent of free-electron laser (FEL) sources delivering two synchronized pulses of different wavelengths (or colours) has made available a whole range of novel pump–probe experiments. This communication describes a major step forward using a new configuration of the FERMI FEL-seeded source to deliver two pulses with different wavelengths, each tunable independently over a broad spectral range with adjustable time delay. The FEL scheme makes use of two seed laser beams of different wavelengths and of a split radiator section to generate two extreme ultraviolet pulses from distinct portions of the same electron bunch. The tunability range of this new two-colour source meets the requirements of double-resonant FEL pump/FEL probe time-resolved studies. We demonstrate its performance in a proof-of-principle magnetic scattering experiment in Fe–Ni compounds, by tuning the FEL wavelengths to the Fe and Ni 3p resonances
Timing methodologies and studies at the FERMI free-electron laser.
Time-resolved investigations have opened a new era of chemistry and physics, enabling real-time monitoring of the dynamics of chemical reactions and matter. Induced transient optical absorption is a basic ultrafast electronic effect, originating from a partial depletion of the valence band, that can be triggered by exposing insulators and semiconductors to sub-picosecond extreme-ultraviolet pulses. Besides its fundamental scientific implications, this process is important in practice, as it is routinely applied in free-electron laser (FEL) facilities to achieve temporal overlap between FEL and optical laser pulses with an accuracy of tens of femtoseconds. Here, we present a set of methodologies developed at the FERMI facility, based on ultrafast effects in condensed materials and employed to determine the FEL/laser cross correlation.
Modeling attention dynamics in social networks
In this thesis we present an analytically tractable model for
the spreading of information in an online social network. The approach draws inspiration from ecology: a social network can be regarded as an ecosystem in which contents are species competing for the main resource (attention) in order to survive.
Low-rank lottery tickets: finding efficient low-rank neural networks via matrix differential equations
Neural networks have achieved tremendous success in a large variety of
applications. However, their memory footprint and computational demand can
render them impractical in application settings with limited hardware or energy
resources. In this work, we propose a novel algorithm to find efficient
low-rank subnetworks. Remarkably, these subnetworks are determined and adapted
already during the training phase, and the overall time and memory resources
required for both training and evaluating them are significantly reduced. The
main idea is to restrict the weight matrices to a low-rank manifold and to
update the low-rank factors rather than the full matrix during training. To
derive training updates that are restricted to the prescribed manifold, we
employ techniques from dynamic model order reduction for matrix differential
equations. Moreover, our method automatically and dynamically adapts the ranks
during training to achieve a desired approximation accuracy. The efficiency of
the proposed method is demonstrated through a variety of numerical experiments
on fully-connected and convolutional networks
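The idea of storing low-rank factors instead of the full matrix, and of adapting the rank to a target accuracy, can be sketched as follows. The truncation rule used here (keep the smallest rank whose discarded singular values fall below a relative tolerance) is an assumption for illustration; the abstract only states that ranks are adapted to reach a desired approximation accuracy. The matrix sizes and singular values are likewise hypothetical.

```python
import math


def low_rank_params(m, n, r):
    """Parameters of a factorization W = U S V^T with U (m x r), S (r x r), V (n x r)."""
    return r * (m + n) + r * r


def truncation_rank(singular_values, rel_tol):
    """Smallest rank r whose discarded tail satisfies
    sqrt(sum_{i > r} s_i^2) <= rel_tol * sqrt(sum_i s_i^2).

    Assumes singular_values is sorted in decreasing order.
    """
    total = math.sqrt(sum(s * s for s in singular_values))
    for r in range(1, len(singular_values) + 1):
        tail = math.sqrt(sum(s * s for s in singular_values[r:]))
        if tail <= rel_tol * total:
            return r
    return len(singular_values)


# Hypothetical singular values of one layer's coefficient matrix.
sv = [10.0, 5.0, 1.0, 0.1]
print(truncation_rank(sv, 0.1))   # 2
print(truncation_rank(sv, 0.01))  # 3

# Storage for a 512 x 512 layer: full matrix versus rank-16 factors.
print(512 * 512)                      # 262144
print(low_rank_params(512, 512, 16))  # 16640
```

A tighter tolerance keeps more singular values and hence a higher rank, so the tolerance acts as the knob that trades compression against approximation accuracy.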
In situ single-shot diffractive fluence mapping for X-ray free-electron laser pulses
Free-electron laser beam profile characterization is usually performed separately from the actual measurements, which leads to considerable uncertainty in the results. Here the authors demonstrate measurement of the FEL beam profile simultaneously with the experiment by using integrated gratings.