Navigating the landscape of particle accelerators has become increasingly
challenging with recent surges in contributions. These intricate devices
challenge comprehension, even within individual facilities. To address this, we
introduce PACuna, a fine-tuned language model refined through publicly
available accelerator resources like conferences, pre-prints, and books. We
automated data collection and question generation to minimize expert
involvement and make the data publicly available. PACuna demonstrates
proficiency in addressing intricate accelerator questions, validated by
experts. Our approach shows adapting language models to scientific domains by
fine-tuning technical texts and auto-generated corpora capturing the latest
developments can further produce pre-trained models to answer some intricate
questions that commercially available assistants cannot and can serve as
intelligent assistants for individual facilities