
16 research outputs found

The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only

Author: Almazrouei Ebtesam
Alobeidli Hamza
Cappelli Alessandro
Cojocaru Ruxandra
Hesslow Daniel
Launay Julien
Malartic Quentin
Pannier Baptiste
Penedo Guilherme
Publication date: 01/06/2023

Large language models are commonly trained on a mixture of filtered web data and curated high-quality corpora, such as social media conversations, books, or technical papers. This curation process is believed to be necessary to produce performant models with broad zero-shot generalization abilities. However, as larger models requiring pretraining on trillions of tokens are considered, it is unclear how scalable is curation and whether we will run out of unique high-quality data soon. At variance with previous beliefs, we show that properly filtered and deduplicated web data alone can lead to powerful models; even significantly outperforming models from the state-of-the-art trained on The Pile. Despite extensive filtering, the high-quality data we extract from the web is still plentiful, and we are able to obtain five trillion tokens from CommonCrawl. We publicly release an extract of 600 billion tokens from our RefinedWeb dataset, and 1.3/7.5B parameters language models trained on it

arXiv.org e-Print Archive

The Falcon Series of Open Language Models

Author: Almazrouei Ebtesam
Alobeidli Hamza
Alshamsi Abdulaziz
Cappelli Alessandro
Cojocaru Ruxandra
Debbah Mérouane
Goffinet Étienne
Hesslow Daniel
Launay Julien
Malartic Quentin
Mazzotta Daniele
Noune Badreddine
Pannier Baptiste
Penedo Guilherme
Publication date: 29/11/2023

We introduce the Falcon series: 7B, 40B, and 180B parameters causal decoder-only models trained on a diverse high-quality corpora predominantly assembled from web data. The largest model, Falcon-180B, has been trained on over 3.5 trillion tokens of text--the largest openly documented pretraining run. Falcon-180B significantly outperforms models such as PaLM or Chinchilla, and improves upon concurrently developed models such as LLaMA 2 or Inflection-1. It nears the performance of PaLM-2-Large at a reduced pretraining and inference cost, making it, to our knowledge, one of the three best language models in the world along with GPT-4 and PaLM-2-Large. We report detailed evaluations, as well as a deep dive into the methods and custom tooling employed to pretrain Falcon. Notably, we report on our custom distributed training codebase, allowing us to efficiently pretrain these models on up to 4,096 A100s on cloud AWS infrastructure with limited interconnect. We release a 600B tokens extract of our web dataset, as well as the Falcon-7/40/180B models under a permissive license to foster open-science and accelerate the development of an open ecosystem of large language models

arXiv.org e-Print Archive

Brain metastasis and renal cell carcinoma : prognostic scores assessment in the era of targeted therapies

Author: Awada Ahmad
Barthelemy Philippe
Clavier Jean-Baptiste
Devrient Daniel
El Ali Ziad
Gil Thierry
Kotecki Nuria
Pannier Diane
Penel Nicolas
Rottey Sylvie
Ryckewaert Thomas
Van Paemel Ruben
Vermassen Tijl
Waisse Waissi
Publication venue: 'Anticancer Research USA Inc.'
Publication date: 01/01/2019

Aim: This study aimed at exploring several brain metastatic prognostic scores in patients with renal cell carcinoma. Patients and Methods: We retrospectively analyzed data of 93 metastatic renal cell carcinoma patients who were diagnosed with brain metastases between October 2005 and July 2016 who received targeted therapy. Potential prognostic factors (RTOG RPA, BS-BM, and a newly developed score CERENAL) were analyzed. Results: A total of 75 patients received targeted therapy. All scores showed prognostic value in progression-free survival after first-line treatment with CERENAL being the sole independent prognostic factor associated with improved duration of first-line treatment. Both RTOG RPA and CERENAL were potential prognosticators for overall survival, whereas only the CERENAL score was associated with prolonged disease-specific survival. Conclusion: Several prognostic scores can be useful to predict survival of patients with brain metastases from renal cancer, especially the newly developed CERENAL score

Ghent University Academic Bibliography

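Vers la prédiction de l’évolution de la microstructure sous irradiation d’alliages ferritiques modèles par une approche hybride AKMC-OKMC [Towards predicting the microstructure evolution of model ferritic alloys under irradiation using a hybrid AKMC-OKMC approach]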

Author: Pannier Baptiste
Publication date: 27/06/2017

This PhD thesis first set out to accelerate an atomic kinetic Monte Carlo (AKMC) model that simulates the microstructure evolution, under neutron irradiation, of FeCuMnNiSiP model alloys representative of reactor pressure vessel steels. The acceleration was needed to reach doses and fluxes comparable to experiment in a reasonable amount of time. To that end, the LAKIMOCA code was first optimized algorithmically; the various optimizations sped up the code by a factor of 7. Since this was not sufficient, a hybrid approach combining atomic kinetic Monte Carlo and object kinetic Monte Carlo (OKMC) was developed. Parameterizing the object model gave a better understanding of the macro events at play in the simulations, but proved very difficult once the chemical complexity of the objects became too high. Nevertheless, the hybrid approach reduced computation times by about two orders of magnitude, making it possible to simulate doses corresponding to 40 years of irradiation in service conditions. These results highlighted several limitations of the model and of its parameterization. The model's difficulty in reproducing flux effects was addressed by adding an absorber that reduces the sink strength of grain boundaries, and by adding traps to account for the presence of impurities in pure iron. High-dose simulations in FeCuMnNiSiP alloys also revealed differences between the simulated microstructures and those observed experimentally. In a second step, a new cohesive model based on pair interactions that depend on local concentration was therefore developed and parameterized.
Although the new cohesive model is numerically heavier, the target dose could be reached by coupling it with the hybrid approach. The results obtained are in better agreement with recent DFT calculations and with experimental microstructures

Theses.fr

Atomic scale mechanisms for the amorphisation of irradiated graphite

Author: Baranek Philippe
Chartier Alain
Pannier Baptiste
van Brutzel Laurent
Publication venue: 'Elsevier BV'
Publication date: 01/09/2015


Crossref

HAL-CEA

Hal-Diderot

GNN-based structural information to improve DNN-based basal ganglia segmentation in children following early brain lesion

Author: Coupeau Patty
Dinomais Mickael
Fasquel Jean-Baptiste
Hertz-Pannier Lucie
Publication venue: Elsevier
Publication date: 01/07/2024

Analyzing the basal ganglia following an early brain lesion is crucial due to their noteworthy role in sensory–motor functions. However, the segmentation of these subcortical structures on MRI is challenging in children and is further complicated by the presence of a lesion. Although current deep neural networks (DNN) perform well in segmenting subcortical brain structures in healthy brains, they lack robustness when faced with lesion variability, leading to structural inconsistencies. Given the established spatial organization of the basal ganglia, we propose enhancing the DNN-based segmentation through post-processing with a graph neural network (GNN). The GNN conducts node classification on graphs encoding both class probabilities and spatial information regarding the regions segmented by the DNN. In this study, we focus on neonatal arterial ischemic stroke (NAIS) in children. The approach is evaluated on both healthy children and children after NAIS using three DNN backbones: U-Net, UNETr, and MSGSE-Net. The results show an improvement in segmentation performance, with an increase in the median Dice score by up to 4% and a reduction in the median Hausdorff distance (HD) by up to 93% for healthy children (from 36.45 to 2.57) and up to 91% for children suffering from NAIS (from 40.64 to 3.50). The performance of the method is compared with atlas-based methods. Severe cases of neonatal stroke result in a decline in performance in the injured hemisphere, without negatively affecting the segmentation of the contra-injured hemisphere. Furthermore, the approach demonstrates resilience to small training datasets, a widespread challenge in the medical field, particularly in pediatrics and for rare pathologies

HAL-CEA

Detecting cerebral palsy in neonatal stroke children: GNN-based detection considering the structural organization of basal ganglia

Author: Coupeau Patty
Dinomais Mickael
Démas Josselin
Fasquel Jean-Baptiste
Hertz-Pannier Lucie
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 18/04/2023

As a long-term consequence of neonatal arterial ischaemic stroke (NAIS), the presence of cerebral palsy (CP) depends on the structural integrity of brain areas, especially of basal ganglia. Yet, it remains challenging to establish an early diagnosis of CP from a conventional structural MRI. In this study, we introduce a graph neural network-based classification for the recognition of NAIS children and mainly for the detection of children with CP among the NAIS ones. From the structural MRI of 68 children aged 7 years old and their corresponding segmentation of basal ganglia, we construct graphs where nodes represent structures, carrying on node and edge attributes structural information (volumes, distances). The classification accuracy achieved by the proposed method is of 86% for the detection of NAIS and of 89% for the detection of CP among neonatal stroke children

HAL-CEA

Hand function after neonatal stroke: a graph model based on basal ganglia and thalami structure

Author: Chabrier Stéphane
Coupeau Patty
Dinomais Mickael
Démas Josselin
Fasquel Jean-Baptiste
Hertz-Pannier Lucie
Publication venue: Elsevier
Publication date: 01/01/2024

Introduction: Neonatal arterial ischemic stroke (NAIS) is a common model to study the impact of a unilateral early brain insult on developmental brain plasticity and the appearance of long-term outcomes. Motor difficulties that may arise are typically related to poor function of the affected (contra-lesioned) hand, but surprisingly also of the ipsilesional hand. Although many longitudinal studies after NAIS have shown that predicting the occurrence of gross motor difficulties is easier, accurately predicting hand motor function (for both hands) from morphometric MRI remains complicated. The hypothesis of an association between the structural organization of the basal ganglia (BG) and thalamus with hand motor function seems intuitive, given their key role in sensorimotor function. Neuroimaging studies have frequently investigated these structures to evaluate the correlation between their volumes and motor function following early brain injury. However, the results have been controversial. We hypothesize the involvement of other structural parameters. Method: The study involves 35 children (mean age 7.3 years, SD 0.4) with middle cerebral artery NAIS who underwent a structural T1-weighted 3D MRI and clinical examination to assess manual dexterity using the Box and Blocks Test (BBT). Graphs are used to represent high-level structural information of the BG and thalami (volumes, elongations, distances) measured from the MRI. A graph neural network (GNN) is proposed to predict children’s hand motor function through a graph regression. To reduce the impact of external factors on motor function (such as behavior and cognition), we calculate a BBT score ratio for each child and hand. Results: The results indicate a significant correlation between the score ratios predicted by our method and the actual score ratios of both hands (p < 0.05), together with a relatively high accuracy of prediction (mean L1 distance < 0.03).
The structural information seems to have a different influence on each hand’s motor function. The affected hand’s motor function is more correlated with the volume, while the ‘unaffected’ hand function is more correlated with the elongation of the structures. Experiments emphasize the importance of considering the whole macrostructural organization of the basal ganglia and thalami networks, rather than the volume alone, to predict hand motor function. Conclusion: There is a significant correlation between the structural characteristics of the basal ganglia/thalami and motor function in both hands. These results support the use of MRI macrostructural features of the basal ganglia and thalamus as an early biomarker for predicting motor function in both hands after early brain injury

Directory of Open Access Journals

HAL-CEA

Hand function after neonatal stroke: A graph model based on basal ganglia and thalami structure

Author: Jean-Baptiste Fasquel
Josselin Démas
Lucie Hertz-Pannier
Mickael Dinomais
Patty Coupeau
Stéphane Chabrier
Publication venue: Elsevier
Publication date: 01/01/2024

Introduction: Neonatal arterial ischemic stroke (NAIS) is a common model to study the impact of a unilateral early brain insult on developmental brain plasticity and the appearance of long-term outcomes. Motor difficulties that may arise are typically related to poor function of the affected (contra-lesioned) hand, but surprisingly also of the ipsilesional hand. Although many longitudinal studies after NAIS have shown that predicting the occurrence of gross motor difficulties is easier, accurately predicting hand motor function (for both hands) from morphometric MRI remains complicated. The hypothesis of an association between the structural organization of the basal ganglia (BG) and thalamus with hand motor function seems intuitive given their key role in sensorimotor function. Neuroimaging studies have frequently investigated these structures to evaluate the correlation between their volumes and motor function following early brain injury. However, the results have been controversial. We hypothesize the involvement of other structural parameters. Method: The study involves 35 children (mean age 7.3 years, SD 0.4) with middle cerebral artery NAIS who underwent a structural T1-weighted 3D MRI and clinical examination to assess manual dexterity using the Box and Blocks Test (BBT). Graphs are used to represent high-level structural information of the BG and thalami (volumes, elongations, distances) measured from the MRI. A graph neural network (GNN) is proposed to predict children’s hand motor function through a graph regression. To reduce the impact of external factors on motor function (such as behavior and cognition), we calculate a BBT score ratio for each child and hand. Results: The results indicate a significant correlation between the score ratios predicted by our method and the actual score ratios of both hands (p < 0.05), together with a relatively high accuracy of prediction (mean L1 distance < 0.03). 
The structural information seems to have a different influence on each hand’s motor function. The affected hand’s motor function is more correlated with the volume, while the ‘unaffected’ hand function is more correlated with the elongation of the structures. Experiments emphasize the importance of considering the whole macrostructural organization of the basal ganglia and thalami networks, rather than the volume alone, to predict hand motor function. Conclusion: There is a significant correlation between the structural characteristics of the basal ganglia/thalami and motor function in both hands. These results support the use of MRI macrostructural features of the basal ganglia and thalamus as an early biomarker for predicting motor function in both hands after early brain injury

Directory of Open Access Journals