The Simons Observatory (SO), due to start full science operations in early 2025, aims to set tight constraints on inflationary physics by inferring the tensor-to-scalar ratio r from measurements of CMB polarization B-modes. Its nominal design targets a precision σ(r=0) ≤ 0.003 without delensing. Achieving this goal and further reducing uncertainties requires the mitigation of other sources of large-scale B-modes such as Galactic foregrounds and weak gravitational lensing. We present an analysis pipeline aiming to estimate r by including delensing within a cross-spectral likelihood, and demonstrate it on SO-like simulations. Lensing B-modes are synthesised using internal CMB lensing reconstructions as well as Planck-like CIB maps and LSST-like galaxy density maps. This B-mode template is then introduced into SO's power-spectrum-based foreground-cleaning algorithm by extending the likelihood function to include all auto- and cross-spectra between the lensing template and the SAT B-modes. Within this framework, we demonstrate the equivalence of map-based and cross-spectral delensing and use it to motivate an optimized pixel-weighting scheme for power spectrum estimation. We start by validating our pipeline in the simplistic case of uniform foreground spectral energy distributions (SEDs). In the absence of primordial B-modes, σ(r) decreases by 37% as a result of delensing. Tensor modes at the level of r=0.01 are successfully detected by our pipeline. Even with more realistic foreground models including spatial variations in the dust and synchrotron spectral properties, we obtain unbiased estimates of r by employing the moment-expansion method. In this case, delensing-related improvements range between 27% and 31%. These results constitute the first realistic assessment of the delensing performance at SO's nominal sensitivity level. (Abridged)
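The stated equivalence of map-based and cross-spectral delensing can be illustrated with a single-mode toy model. The sketch below is not the pipeline described above: it assumes one Gaussian B-mode with no foregrounds, and all power levels (`c_tens`, `c_lens`, `n_sat`, `n_tmpl`) are hypothetical values in arbitrary units. It compares the Fisher information on r obtained from the joint covariance of the observed B-mode and a noisy lensing template with that obtained after subtracting the Wiener-filtered template from the map.

```python
import numpy as np


def fisher(cov, dcov):
    """Fisher information on one parameter for a zero-mean Gaussian:
    F = 1/2 Tr[C^-1 dC C^-1 dC]."""
    m = np.linalg.inv(cov) @ dcov
    return 0.5 * np.trace(m @ m)


# Hypothetical per-mode powers, arbitrary units (illustrative only).
c_tens = 1.0  # tensor B-mode power per unit r
c_lens = 2.0  # lensing B-mode power
n_sat = 1.0   # noise power in the observed B-mode
n_tmpl = 1.0  # noise power in the lensing template
r = 0.0       # fiducial tensor-to-scalar ratio

# No delensing: only the observed B-mode auto-spectrum is used.
c_bb = r * c_tens + c_lens + n_sat
f_none = fisher(np.array([[c_bb]]), np.array([[c_tens]]))

# Cross-spectral: joint covariance of (observed B, template),
# i.e. all auto- and cross-spectra enter the likelihood.
cov_joint = np.array([[c_bb, c_lens],
                      [c_lens, c_lens + n_tmpl]])
dcov_joint = np.array([[c_tens, 0.0],
                       [0.0, 0.0]])
f_cross = fisher(cov_joint, dcov_joint)

# Map-based: subtract the Wiener-filtered template from the B-mode.
w = c_lens / (c_lens + n_tmpl)
c_resid = c_bb - 2.0 * w * c_lens + w**2 * (c_lens + n_tmpl)
f_map = fisher(np.array([[c_resid]]), np.array([[c_tens]]))

# Ratio of sigma(r) with delensing to sigma(r) without.
sigma_ratio = np.sqrt(f_none / f_cross)

print(f"F(no delensing)   = {f_none:.4f}")
print(f"F(cross-spectral) = {f_cross:.4f}")
print(f"F(map-based)      = {f_map:.4f}")
print(f"sigma(r) ratio    = {sigma_ratio:.4f}")
```

In this toy the two delensed Fisher values agree because the residual power after Wiener-filtered subtraction equals the Schur complement of the template block in the joint covariance. The size of the improvement depends entirely on the assumed power levels and should not be compared with the 27% to 37% figures quoted above, which come from full simulations with foregrounds.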