We present a fully probabilistic, physical model of the non-linearly evolved
density field, as probed by realistic galaxy surveys. Our model is valid in the
linear and mildly non-linear regimes and uses second-order Lagrangian
perturbation theory to connect the initial conditions with the final density
field. Our parameter space consists of the 3D initial density field, and our
method allows a fully Bayesian exploration of the sets of initial conditions
that are consistent with the galaxy distribution sampling the final density
field. A natural byproduct of this technique is an optimal non-linear
reconstruction of the present density and velocity fields, including a full
propagation of the observational uncertainties. A test of these methods on
simulated data mimicking the survey mask, selection function and galaxy number
of the SDSS DR7 main sample shows that this physical model gives accurate
reconstructions of the underlying present-day density and velocity fields on
scales larger than ~6 Mpc/h. Our method naturally and accurately reconstructs
non-linear features, such as walls and filaments, that are associated with
three-point and higher-order correlation functions. Simple tests of the reconstructed
initial conditions show statistical consistency with the Gaussian simulation
inputs. Our test demonstrates that statistical approaches based on physical
models of the large-scale structure distribution are now becoming feasible for
realistic current and future surveys.

Comment: 20 pages, 8 figures