This thesis presents two research contributions: one improves the adaptation of large language models (LLMs) through parameter-efficient fine-tuning (PEFT); the other addresses the modelling of history-dependent stochastic processes, specifically Volterra processes, which are widely used in quantitative finance.
In the first part, I introduce an accessible adaptation pipeline that brings the performance of a standard foundation model much closer to that of a fully fine-tuned, task-specific version, while using far less compute and memory and keeping data private. The pipeline draws on existing learnable low-rank adapters (LoRA) trained on known datasets and predicts adapter weights for new datasets from this readily available pool. Its main advantage is that it runs on a standard laptop with no GPU required, so data never leaves the device. The method closes about half of the performance gap between an untuned base model and a fully fine-tuned one, making specialized models more accessible to researchers, practitioners, and everyday users who lack expensive infrastructure or who work with sensitive data on devices such as smartphones.
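The abstract does not spell out the prediction mechanism, so the following is a minimal sketch of one plausible reading: adapter weights for an unseen dataset are formed as a similarity-weighted combination of LoRA adapters from known datasets. All names, shapes, embeddings, and the weighting scheme here are illustrative assumptions, not the thesis's actual implementation; the sketch only shows that such a prediction runs cheaply on a CPU.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, emb_dim, n_tasks = 768, 8, 32, 5  # hidden size, LoRA rank, descriptor size, known datasets

# Hypothetical library of trained adapters (A_i, B_i) plus a cheap
# per-dataset descriptor used to compare datasets.
library = [
    {
        "A": rng.normal(size=(r, d)),    # LoRA down-projection for task i
        "B": rng.normal(size=(d, r)),    # LoRA up-projection for task i
        "emb": rng.normal(size=emb_dim), # descriptor of dataset i
    }
    for _ in range(n_tasks)
]

def predict_adapter(new_emb, library, temperature=1.0):
    """Hypothetical scheme: softmax over cosine similarities between the
    new dataset's descriptor and each known dataset's descriptor, then a
    weighted average of the stored LoRA factors."""
    sims = np.array([
        np.dot(entry["emb"], new_emb)
        / (np.linalg.norm(entry["emb"]) * np.linalg.norm(new_emb))
        for entry in library
    ])
    weights = np.exp(sims / temperature)
    weights /= weights.sum()
    A = sum(w * entry["A"] for w, entry in zip(weights, library))
    B = sum(w * entry["B"] for w, entry in zip(weights, library))
    return A, B

new_emb = rng.normal(size=emb_dim)       # descriptor of the unseen dataset
A_new, B_new = predict_adapter(new_emb, library)
delta_W = B_new @ A_new                  # low-rank update added to a frozen base weight W
print(delta_W.shape)                     # (768, 768)
```

Because everything above is a handful of dot products and weighted sums over small matrices, a prediction of this kind needs neither a GPU nor any transfer of the user's data off the device, which is consistent with the pipeline's stated advantages.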
The second part addresses a computational challenge: translating the non-Markovian Volterra process into a format suitable for computation. The difficulty is that the dimension of the history affecting the current state grows with the length of the path. I propose a two-step approach to make this manageable: first, the Volterra process is mapped onto a simpler, lower-dimensional manifold; then a geometric deep learning model, a "hypernetwork" designed for the manifold's structure, is applied. We provide both mathematical and computational evidence of the model's effectiveness and practicality (with proofs developed by co-authors available in the main paper), along with extensive testing of each parameter to validate the approach.
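To make the history-dependence concrete, the following is a standard illustration of the problem and of one classical route to a finite-dimensional state, stated under the assumption of a convolution kernel $K$; it is not necessarily the construction used in the thesis. A stochastic Volterra process solves

$$X_t = X_0 + \int_0^t K(t-s)\,b(X_s)\,ds + \int_0^t K(t-s)\,\sigma(X_s)\,dW_s,$$

so the kernel $K$ makes $X_t$ depend on the entire past of the path, and any naive discretization must carry a state that grows with the path length. Approximating the kernel by a finite sum of exponentials, $K(t) \approx \sum_{i=1}^{n} c_i e^{-x_i t}$, introduces auxiliary factors

$$dY_t^i = \bigl(-x_i Y_t^i + b(X_t)\bigr)\,dt + \sigma(X_t)\,dW_t, \qquad Y_0^i = 0,$$

with $X_t \approx X_0 + \sum_{i=1}^{n} c_i Y_t^i$, so the state $(Y_t^1,\dots,Y_t^n)$ is Markovian with fixed dimension $n$ regardless of the path length.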
This thesis presents two contributions at the intersection of artificial intelligence and mathematics.

First, I introduce a novel method for adapting large language models on widely available hardware. This approach recovers half of the performance lost when an untuned base model is used instead of a GPU fine-tuned one, while running on a single laptop with minimal cost and energy consumption. It makes specialized models more accessible, preserves privacy by keeping data local, and promotes environmentally responsible computing.
Second, I develop a practical framework for working with history-dependent stochastic processes commonly used in quantitative finance. Such processes depend on their entire past, which makes them expensive to compute directly. The method proposed here compresses them into a low-dimensional representation and then applies a computational model, enabling efficient simulation, estimation, and practical application.
Together, these contributions introduce novel algorithms that address real-world problems from fresh perspectives.