Can we Pretrain a SotA Legal Language Model on a Budget From Scratch?
Even though many efficient transformers have been proposed, only a few such models are available for specialized domains. Additionally, since the pretraining process is extremely costly in general – but even more so as the sequence length increases – it is often only within reach of large research labs. One way of making pretraining cheaper is the Replaced Token Detection (RTD) task, which provides more signal during training than MLM, since the loss can be computed over all tokens. In this work, we train Longformer models with the efficient RTD task on long-context legal data to showcase that pretraining efficient LMs is possible using less than 12 GPU days. We evaluate the trained models on challenging summarization tasks requiring the model to summarize complex long texts. We find that both the small and base models outperform their baselines on the in-domain BillSum and out-of-domain PubMed tasks in their respective parameter range. We publish our models as a resource for researchers and practitioners
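The claim that RTD provides more signal than MLM can be illustrated with a minimal sketch. It is not the authors' training code: the function names, the NumPy implementation, and the 15% masking rate are assumptions chosen for illustration. The MLM loss is averaged over the masked positions only, while the RTD loss is a binary cross-entropy averaged over every position in the sequence.

```python
import numpy as np

def mlm_loss(logits, targets, masked):
    """Cross-entropy averaged over the masked positions only.

    logits: (seq_len, vocab), targets: (seq_len,), masked: bool (seq_len,)
    """
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    token_nll = -log_probs[np.arange(len(targets)), targets]
    return token_nll[masked].mean()

def rtd_loss(disc_logits, replaced):
    """Binary cross-entropy (original vs. replaced) averaged over all positions.

    disc_logits: (seq_len,), replaced: bool or 0/1 (seq_len,)
    """
    y = np.asarray(replaced, dtype=float)
    # Numerically stable form of BCE-with-logits.
    per_token = (np.maximum(disc_logits, 0) - disc_logits * y
                 + np.log1p(np.exp(-np.abs(disc_logits))))
    return per_token.mean()

def supervised_positions(seq_len, mask_rate=0.15):
    """Number of positions contributing to each loss.

    mask_rate=0.15 is an assumed, commonly used value, not taken from the abstract.
    """
    return {"mlm": max(1, int(round(seq_len * mask_rate))), "rtd": seq_len}
```

Under these assumptions, a 4096-token sequence contributes roughly 600 supervised positions to the MLM loss and all 4096 to the RTD loss, so the gap in training signal per sequence grows with the sequence length.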
Ab initio Modelling of the Early Stages of Precipitation in Al-6000 Alloys
Age hardening induced by the formation of (semi)-coherent precipitate phases
is crucial for the processing and final properties of the widely used Al-6000
alloys. Early stages of precipitation are particularly important from the
fundamental and technological side, but are still far from being fully
understood. Here, an analysis of the energetics of nanometric precipitates of
the meta-stable phases is performed, identifying the bulk, elastic
strain and interface energies that contribute to the stability of a nucleating
cluster. Results show that needle-shape precipitates are unstable to growth
even at the smallest size formula unit, i.e. there is no energy
barrier to growth. The small differences between different compositions point
toward the need for the study of possible precipitate/matrix interface
reconstruction. A classical semi-quantitative nucleation theory approach
including elastic strain energy captures the trends in precipitate energy
versus size and composition. This validates the use of mesoscale models to
assess stability and interactions of precipitates. Studies of smaller 3D
clusters also show stability relative to the solid solution state, indicating
that the early stages of precipitation may be diffusion-limited. Overall, these
results demonstrate the important interplay among composition-dependent bulk,
interface, and elastic strain energies in determining nanoscale precipitate
stability and growth
Early Stages of Precipitation In Aluminum Alloys by First-Principles and Machine-Learning Atomistic Simulations
Age hardening induced by the formation of (semi)-coherent precipitate phases is crucial for the processing and final properties of the widely used Al-6000 alloys, yet the early stages of precipitation are still far from being fully understood. This crucial step in the technology of Al-based alloys is studied by means of multi-scale simulations that include first-principles atomistic modeling, surrogate models based on statistical learning, as well as kinetic Monte Carlo and continuum elasticity models to bridge time and length scales. We begin with an analysis of the energetics of nanometric precipitates of the meta-stable beta'' phases (which play a crucial role in this system), identifying the bulk, elastic strain and interface energies that contribute to the stability of a nucleating cluster. Results show that needle-shape precipitates are unstable to growth even at the smallest size beta'' formula unit. This study made it possible to develop a semi-quantitative classical nucleation theory model, also including elastic strain energy, that captures the trends in precipitate energy versus size and composition. This validates the use of mesoscale models to assess stability and interactions of beta'' precipitates. Studies of smaller 3D clusters also show stability relative to the solid solution state, indicating that the early stages of precipitation may be diffusion-limited.
Our results thus point toward the need for a systematic study of the energetics of aggregates in the Guinier-Preston zone regime, and of the interactions between those aggregates and vacancies and/or trace elements, to understand and fine-tune the behavior of Al-6000 alloys in the early stages of precipitation. To enable full atomistic-level simulations of the whole precipitation sequence of this important alloy system, two Neural Network (NN) potentials have been created, one representing only 2-body interactions and the other also including 3-body interactions. For the latter, we developed an automatic scheme to determine the most appropriate representation of the structural features of this ternary alloy. Training of the NN uses an extensive database of energies and forces computed using Density Functional Theory, including complex precipitate phases. The NN potentials accurately reproduce most of the properties of pure Al that are relevant to the mechanical behavior, as well as the formation energies of small solute clusters and precipitates that are required for modeling the precipitation and mechanical strengthening. This success not only enables future detailed studies of Al-Mg-Si but also highlights the ability of machine learning methods to generate useful potentials in complex alloy systems.
Finally, we used this NN potential to implement a kinetic Monte Carlo scheme to study the formation of pre-precipitation clusters. While quantitative accuracy will probably require further refinement of its training set to achieve a more complete description of the interactions between solute atoms and vacancies, we could already observe some of the key mechanisms that determine the ultra-fast formation of aggregates. This work lays the foundations for a thorough investigation of the behavior of Al-6000 alloys over time and size scales that are technologically relevant and demonstrates a combination of atomistic modeling techniques that could be adapted to a large number of similar metallic alloys
Automatic selection of atomic fingerprints and reference configurations for machine-learning potentials
Machine learning of atomic-scale properties is revolutionizing molecular
modelling, making it possible to evaluate inter-atomic potentials with
first-principles accuracy, at a fraction of the costs. The accuracy, speed and
reliability of machine-learning potentials, however, depend strongly on the
way atomic configurations are represented, i.e. the choice of descriptors used
as input for the machine learning method. The raw Cartesian coordinates are
typically transformed into "fingerprints", or "symmetry functions", that are
designed to encode, in addition to the structure, important properties of the
potential-energy surface like its invariances with respect to rotation,
translation and permutation of like atoms. Here we discuss automatic protocols
to select a number of fingerprints out of a large pool of candidates, based on
the correlations that are intrinsic to the training data. This procedure can
greatly simplify the construction of neural network potentials that strike the
best balance between accuracy and computational efficiency, and has the
potential to accelerate by orders of magnitude the evaluation of Gaussian
Approximation Potentials based on the Smooth Overlap of Atomic Positions
kernel. We present applications to the construction of neural network
potentials for water and for an Al-Mg-Si alloy, and to the prediction of the
formation energies of small organic molecules using Gaussian process
regression
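One way to select fingerprints from a large candidate pool using only the correlations in the training data is sketched below. It is a hedged illustration, not the protocol of the paper: the function name select_fingerprints, the greedy farthest-point strategy over feature columns, and the choice of the highest-variance column as the starting point are all assumptions. Each column of X holds one candidate fingerprint evaluated on every training configuration. After standardization, the distance between two columns equals sqrt(2 - 2r), where r is their Pearson correlation, so strongly correlated candidates are selected last.

```python
import numpy as np

def select_fingerprints(X, n_select):
    """Greedy farthest-point selection over the columns of X.

    X: (n_configurations, n_candidates) array of candidate fingerprint values.
    Returns the indices of the selected columns, in order of selection.
    """
    n_select = min(n_select, X.shape[1])
    # Center and normalize each column so that distances reflect correlation.
    Z = X - X.mean(axis=0)
    norms = np.linalg.norm(Z, axis=0)
    norms[norms == 0] = 1.0  # constant columns carry no information
    Z = Z / norms
    # Assumed starting point: the candidate with the largest variance.
    selected = [int(np.argmax(X.var(axis=0)))]
    # Distance from every column to its nearest already-selected column.
    min_dist = np.linalg.norm(Z - Z[:, [selected[0]]], axis=0)
    while len(selected) < n_select:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        dist_to_new = np.linalg.norm(Z - Z[:, [nxt]], axis=0)
        min_dist = np.minimum(min_dist, dist_to_new)
    return selected
```

In this sketch, a candidate that is a linear rescaling of an already selected fingerprint has zero distance to it and is therefore picked only after all less redundant candidates.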
Sensory Characteristics and Nutritional Quality of Food Products Made with a Biofortified and Lectin Free Common Bean (Phaseolus vulgaris L.) Flour
Common beans (Phaseolus vulgaris L.) are an important source of nutrients with beneficial effects on human health. However, they contain lectins, which limit the direct use of flour in food preparations without thermal treatment, and phytic acid, which reduces mineral cation bioavailability. The objectives of this research were to obtain biofortified snacks and a cream using an untreated common bean flour devoid of active lectins (lec−) and with reduced content of phytic acid (lpa), and to evaluate the sensory appreciation of these products. The main results of the present work were: the products with the lpa lec− flour did not retain residual hemagglutinating activity due to lectins; they showed higher residual α-amylase inhibitor activity (from 2.2 to 135 times), reduced in vitro predicted glycemic index (about 5 units reduction) and increased iron bioavailability compared to the products with wild type flour; products with common bean flour were less appreciated than the reference ones without this flour, but the presence of an intense umami taste can be a positive attribute. Results confirmed that the use of the lpa lec− flour has important advantages in the preparation of safe and nutritionally improved products, and provide useful information to identify target consumers, such as children and elderly people