16 research outputs found
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
Large language models are commonly trained on a mixture of filtered web data
and curated high-quality corpora, such as social media conversations, books, or
technical papers. This curation process is believed to be necessary to produce
performant models with broad zero-shot generalization abilities. However, as
larger models requiring pretraining on trillions of tokens are considered, it
is unclear how scalable curation is and whether we will run out of unique
high-quality data soon. At variance with previous beliefs, we show that
properly filtered and deduplicated web data alone can lead to powerful models;
even significantly outperforming models from the state-of-the-art trained on
The Pile. Despite extensive filtering, the high-quality data we extract from
the web is still plentiful, and we are able to obtain five trillion tokens from
CommonCrawl. We publicly release an extract of 600 billion tokens from our
RefinedWeb dataset, and 1.3B/7.5B-parameter language models trained on it.
The Falcon Series of Open Language Models
We introduce the Falcon series: 7B-, 40B-, and 180B-parameter causal
decoder-only models trained on diverse high-quality corpora predominantly
assembled from web data. The largest model, Falcon-180B, has been trained on
over 3.5 trillion tokens of text--the largest openly documented pretraining
run. Falcon-180B significantly outperforms models such as PaLM or Chinchilla,
and improves upon concurrently developed models such as LLaMA 2 or
Inflection-1. It nears the performance of PaLM-2-Large at a reduced pretraining
and inference cost, making it, to our knowledge, one of the three best language
models in the world along with GPT-4 and PaLM-2-Large. We report detailed
evaluations, as well as a deep dive into the methods and custom tooling
employed to pretrain Falcon. Notably, we report on our custom distributed
training codebase, allowing us to efficiently pretrain these models on up to
4,096 A100s on cloud AWS infrastructure with limited interconnect. We release a
600B-token extract of our web dataset, as well as the Falcon-7/40/180B models
under a permissive license to foster open science and accelerate the
development of an open ecosystem of large language models.
Brain metastasis and renal cell carcinoma: prognostic scores assessment in the era of targeted therapies
Aim: This study aimed at exploring several brain metastatic prognostic scores in patients with renal cell carcinoma. Patients and Methods: We retrospectively analyzed data of 93 metastatic renal cell carcinoma patients who were diagnosed with brain metastases between October 2005 and July 2016 who received targeted therapy. Potential prognostic factors (RTOG RPA, BS-BM, and a newly developed score CERENAL) were analyzed. Results: A total of 75 patients received targeted therapy. All scores showed prognostic value in progression-free survival after first-line treatment with CERENAL being the sole independent prognostic factor associated with improved duration of first-line treatment. Both RTOG RPA and CERENAL were potential prognosticators for overall survival, whereas only the CERENAL score was associated with prolonged disease-specific survival. Conclusion: Several prognostic scores can be useful to predict survival of patients with brain metastases from renal cancer, especially the newly developed CERENAL score
Towards predicting the microstructure evolution of model ferritic alloys under irradiation using a hybrid AKMC-OKMC approach
This PhD thesis first set out to accelerate an atomic kinetic Monte Carlo (AKMC) model that simulates the microstructure evolution of FeCuMnNiSiP model alloys, representative of reactor pressure vessel steels, under neutron irradiation. The acceleration was needed to reach doses and flux conditions comparable to experiment in a reasonable amount of time. To that end, the LAKIMOCA code was first optimized algorithmically; the various optimizations sped up the code by a factor of 7. Since this was not sufficient, a hybrid approach combining atomic and object kinetic Monte Carlo (AKMC and OKMC) was developed. Parameterizing the object model gave a better understanding of the macro-events at play in the simulations, but proved very difficult once the chemical complexity of the objects became too high. Nevertheless, the hybrid approach reduced computing times by about two orders of magnitude, making it possible to simulate doses corresponding to 40 years of irradiation in service conditions. These results highlighted several limitations of the model and of its parameterization. The model's difficulty in reproducing flux effects was addressed by adding an absorber that reduces the sink strength of grain boundaries, and by adding traps to account for impurities in pure iron. High-dose simulations in FeCuMnNiSiP model alloys also revealed differences between the simulated microstructures and those observed experimentally.
In a second step, a new cohesive model based on pair interactions that depend on local concentration was therefore developed and parameterized. Although the new cohesive model is numerically heavier than the previous one, the target dose could be reached by coupling it with the hybrid approach. The results obtained are in better agreement with recent DFT calculations and with experimental microstructures.
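For readers unfamiliar with kinetic Monte Carlo, the following is a minimal sketch of one step of the standard residence-time algorithm that AKMC and OKMC codes build on. It is not taken from LAKIMOCA or from the thesis: the function names `kmc_step` and `run` and the fixed rate list are assumptions made for illustration only.

```python
import math
import random
from itertools import accumulate


def kmc_step(rates, u_event, u_time):
    """One residence-time KMC step (generic sketch, not the LAKIMOCA code).

    rates   : list of positive event rates
    u_event : uniform random number in [0, 1), selects the event
    u_time  : uniform random number in (0, 1], sets the time increment
    Returns (index of the chosen event, time increment).
    """
    total = sum(rates)
    target = u_event * total
    chosen = len(rates) - 1  # fallback guards against floating-point round-off
    for i, cumulative in enumerate(accumulate(rates)):
        if target < cumulative:
            chosen = i
            break
    # The waiting time is exponentially distributed with mean 1 / total.
    dt = -math.log(u_time) / total
    return chosen, dt


def run(rates, n_steps, seed=0):
    """Repeat kmc_step with fixed rates; a real simulation would update
    the rate list after every event. Returns (event counts, elapsed time)."""
    rng = random.Random(seed)
    counts = [0] * len(rates)
    t = 0.0
    for _ in range(n_steps):
        # 1 - random() lies in (0, 1], which keeps log() finite.
        i, dt = kmc_step(rates, rng.random(), 1.0 - rng.random())
        counts[i] += 1
        t += dt
    return counts, t
```

Each event is chosen with probability proportional to its rate, and the clock advances by an amount that shrinks as the total rate grows, which is why simulating long irradiation times at the atomic scale is costly.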
Atomic scale mechanisms for the amorphisation of irradiated graphite
GNN-based structural information to improve DNN-based basal ganglia segmentation in children following early brain lesion
Analyzing the basal ganglia following an early brain lesion is crucial due to their noteworthy role in sensory-motor functions. However, the segmentation of these subcortical structures on MRI is challenging in children and is further complicated by the presence of a lesion. Although current deep neural networks (DNN) perform well in segmenting subcortical brain structures in healthy brains, they lack robustness when faced with lesion variability, leading to structural inconsistencies. Given the established spatial organization of the basal ganglia, we propose enhancing the DNN-based segmentation through post-processing with a graph neural network (GNN). The GNN conducts node classification on graphs encoding both class probabilities and spatial information regarding the regions segmented by the DNN. In this study, we focus on neonatal arterial ischemic stroke (NAIS) in children. The approach is evaluated on both healthy children and children after NAIS using three DNN backbones: U-Net, UNETr, and MSGSE-Net. The results show an improvement in segmentation performance, with an increase in the median Dice score by up to 4% and a reduction in the median Hausdorff distance (HD) by up to 93% for healthy children (from 36.45 to 2.57) and up to 91% for children suffering from NAIS (from 40.64 to 3.50). The performance of the method is compared with atlas-based methods. Severe cases of neonatal stroke result in a decline in performance in the injured hemisphere, without negatively affecting the segmentation of the contra-injured hemisphere. Furthermore, the approach demonstrates resilience to small training datasets, a widespread challenge in the medical field, particularly in pediatrics and for rare pathologies.
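The two metrics reported in this abstract can be sketched in a few lines. This is a minimal illustration using the textbook definitions of the Dice score and the symmetric Hausdorff distance on sets of voxel coordinates; it is not the authors' evaluation code, and the function names are assumptions. The last helper simply checks the arithmetic behind the quoted reductions.

```python
from math import dist


def dice(a, b):
    """Dice score between two collections of voxel coordinates."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention: two empty segmentations agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(src, dst):
        # Largest distance from a point of src to its nearest point in dst.
        return max(min(dist(p, q) for q in dst) for p in src)
    return max(directed(a, b), directed(b, a))


def relative_reduction(before, after):
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


# Median HD values quoted in the abstract:
# 36.45 -> 2.57 is a reduction of about 93%, 40.64 -> 3.50 about 91%.
healthy = relative_reduction(36.45, 2.57)
nais = relative_reduction(40.64, 3.50)
```

The Hausdorff distance is driven by the single worst-placed voxel, which is why removing a few structurally inconsistent regions can cut it sharply while the Dice score moves by only a few percent.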
Detecting cerebral palsy in neonatal stroke children: GNN-based detection considering the structural organization of basal ganglia
As a long-term consequence of neonatal arterial ischaemic stroke (NAIS), the presence of cerebral palsy (CP) depends on the structural integrity of brain areas, especially the basal ganglia. Yet it remains challenging to establish an early diagnosis of CP from conventional structural MRI. In this study, we introduce a graph neural network-based classification to recognize children with NAIS and, primarily, to detect children with CP among them. From the structural MRI of 68 children aged 7 years and the corresponding segmentations of their basal ganglia, we construct graphs in which nodes represent structures, with node and edge attributes carrying structural information (volumes, distances). The proposed method achieves a classification accuracy of 86% for the detection of NAIS and 89% for the detection of CP among children with neonatal stroke.
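The graph construction described above (nodes for structures, volumes on nodes, distances on edges) could look roughly like the sketch below. Everything specific here is an assumption for illustration: the abstract does not say how distances are measured or which edges are kept, so this version uses centroid-to-centroid distances on a fully connected graph, and the structure names and numbers in the usage example are placeholders, not data from the study.

```python
from itertools import combinations
from math import dist


def build_graph(structures):
    """Build a fully connected attributed graph from segmented structures.

    structures: dict mapping a structure name to a dict with
                'volume' (float) and 'centroid' (tuple of coordinates).
    Returns (nodes, edges): node attributes keyed by name, and edge
    attributes keyed by an alphabetically ordered pair of names.
    """
    nodes = {name: {"volume": s["volume"]} for name, s in structures.items()}
    edges = {
        (a, b): {"distance": dist(structures[a]["centroid"],
                                  structures[b]["centroid"])}
        for a, b in combinations(sorted(structures), 2)
    }
    return nodes, edges


# Placeholder values, for illustration only.
example = {
    "caudate": {"volume": 1.0, "centroid": (0.0, 0.0, 0.0)},
    "putamen": {"volume": 2.0, "centroid": (3.0, 4.0, 0.0)},
    "thalamus": {"volume": 3.0, "centroid": (0.0, 0.0, 2.0)},
}
nodes, edges = build_graph(example)
```

A graph neural network would then take these node and edge attributes as input features; the classifier itself is outside the scope of this sketch.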
Hand function after neonatal stroke: a graph model based on basal ganglia and thalami structure
Introduction: Neonatal arterial ischemic stroke (NAIS) is a common model to study the impact of a unilateral early brain insult on developmental brain plasticity and the appearance of long-term outcomes. Motor difficulties that may arise are typically related to poor function of the affected (contra-lesioned) hand, but surprisingly also of the ipsilesional hand. Although many longitudinal studies after NAIS have shown that predicting the occurrence of gross motor difficulties is easier, accurately predicting hand motor function (for both hands) from morphometric MRI remains complicated. The hypothesis of an association between the structural organization of the basal ganglia (BG) and thalamus with hand motor function seems intuitive, given their key role in sensorimotor function. Neuroimaging studies have frequently investigated these structures to evaluate the correlation between their volumes and motor function following early brain injury. However, the results have been controversial. We hypothesize the involvement of other structural parameters. Method: The study involves 35 children (mean age 7.3 years, SD 0.4) with middle cerebral artery NAIS who underwent a structural T1-weighted 3D MRI and clinical examination to assess manual dexterity using the Box and Blocks Test (BBT). Graphs are used to represent high-level structural information of the BG and thalami (volumes, elongations, distances) measured from the MRI. A graph neural network (GNN) is proposed to predict children's hand motor function through a graph regression. To reduce the impact of external factors on motor function (such as behavior and cognition), we calculate a BBT score ratio for each child and hand. Results: The results indicate a significant correlation between the score ratios predicted by our method and the actual score ratios of both hands (p < 0.05), together with a relatively high accuracy of prediction (mean L1 distance < 0.03).
The structural information seems to have a different influence on each hand's motor function. The affected hand's motor function is more correlated with the volume, while the "unaffected" hand function is more correlated with the elongation of the structures. Experiments emphasize the importance of considering the whole macrostructural organization of the basal ganglia and thalami networks, rather than the volume alone, to predict hand motor function. Conclusion: There is a significant correlation between the structural characteristics of the basal ganglia/thalami and motor function in both hands. These results support the use of MRI macrostructural features of the basal ganglia and thalamus as an early biomarker for predicting motor function in both hands after early brain injury.
Hand function after neonatal stroke: A graph model based on basal ganglia and thalami structure
Introduction: Neonatal arterial ischemic stroke (NAIS) is a common model to study the impact of a unilateral early brain insult on developmental brain plasticity and the appearance of long-term outcomes. Motor difficulties that may arise are typically related to poor function of the affected (contra-lesioned) hand, but surprisingly also of the ipsilesional hand. Although many longitudinal studies after NAIS have shown that predicting the occurrence of gross motor difficulties is easier, accurately predicting hand motor function (for both hands) from morphometric MRI remains complicated. The hypothesis of an association between the structural organization of the basal ganglia (BG) and thalamus with hand motor function seems intuitive given their key role in sensorimotor function. Neuroimaging studies have frequently investigated these structures to evaluate the correlation between their volumes and motor function following early brain injury. However, the results have been controversial. We hypothesize the involvement of other structural parameters. Method: The study involves 35 children (mean age 7.3 years, SD 0.4) with middle cerebral artery NAIS who underwent a structural T1-weighted 3D MRI and clinical examination to assess manual dexterity using the Box and Blocks Test (BBT). Graphs are used to represent high-level structural information of the BG and thalami (volumes, elongations, distances) measured from the MRI. A graph neural network (GNN) is proposed to predict children's hand motor function through a graph regression. To reduce the impact of external factors on motor function (such as behavior and cognition), we calculate a BBT score ratio for each child and hand. Results: The results indicate a significant correlation between the score ratios predicted by our method and the actual score ratios of both hands (p < 0.05), together with a relatively high accuracy of prediction (mean L1 distance < 0.03).
The structural information seems to have a different influence on each hand's motor function. The affected hand's motor function is more correlated with the volume, while the "unaffected" hand function is more correlated with the elongation of the structures. Experiments emphasize the importance of considering the whole macrostructural organization of the basal ganglia and thalami networks, rather than the volume alone, to predict hand motor function. Conclusion: There is a significant correlation between the structural characteristics of the basal ganglia/thalami and motor function in both hands. These results support the use of MRI macrostructural features of the basal ganglia and thalamus as an early biomarker for predicting motor function in both hands after early brain injury.