We study the phase transition phenomenon inherent in the shuffled (permuted)
regression problem, which has found numerous applications in databases,
privacy, and data analysis. We aim to precisely locate the phase transition
points by leveraging techniques from message passing (MP). We first recast the
permutation recovery problem as inference in a probabilistic graphical model.
We then use analytical tools rooted in the MP algorithm to derive an equation
that tracks its convergence. By linking this equation to the
branching random walk process, we are able to characterize the impact of the
signal-to-noise ratio (\snr) on permutation recovery. Depending on
whether the signal is given or not, we separately investigate the oracle case
and the non-oracle case. The bottleneck in identifying the phase transition
regimes lies in deriving closed-form formulas for the corresponding critical
points, which admit such precise expressions only in rare scenarios. To
tackle this challenge, we propose a Gaussian approximation method, which
yields closed-form formulas in almost all
scenarios. In the oracle case, our method predicts the phase transition \snr
fairly accurately. In the non-oracle case, our method can predict the
maximum allowed number of permuted rows and uncover its dependence on the
number of samples.
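
To make the oracle case concrete, the following is a minimal simulation sketch and not the message-passing analysis itself. It assumes the standard shuffled regression model y = Pi X beta + w with a single scalar response per row, Gaussian design and noise, and the convention snr = ||beta||^2 / sigma^2; the function names and these conventions are illustrative assumptions. With the signal beta known, the least-squares permutation estimate for scalar responses reduces to matching y and X beta by rank, which lets one observe empirically how the exact-recovery rate changes with the \snr.

```python
import random


def recover_permutation(y, z):
    """Return perm minimizing sum_i (y[i] - z[perm[i]])**2.

    For scalar responses, pairing the entries of y and z by rank is
    optimal (rearrangement inequality), so sorting suffices.
    """
    order_y = sorted(range(len(y)), key=lambda i: y[i])
    order_z = sorted(range(len(z)), key=lambda j: z[j])
    perm = [0] * len(y)
    for i, j in zip(order_y, order_z):
        perm[i] = j
    return perm


def simulate_once(n, p, snr, rng):
    """Draw one oracle-case instance and report whether recovery is exact.

    Assumed convention (hypothetical): snr = ||beta||^2 / sigma^2.
    """
    beta = [rng.gauss(0.0, 1.0) for _ in range(p)]
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    z = [sum(x * b for x, b in zip(row, beta)) for row in X]  # X @ beta
    true_perm = list(range(n))
    rng.shuffle(true_perm)
    sigma = (sum(b * b for b in beta) / snr) ** 0.5
    # Row i of the response observes the shuffled row true_perm[i], plus noise.
    y = [z[true_perm[i]] + rng.gauss(0.0, sigma) for i in range(n)]
    return recover_permutation(y, z) == true_perm


def success_rate(n, p, snr, trials, seed=0):
    """Empirical probability of exact permutation recovery."""
    rng = random.Random(seed)
    return sum(simulate_once(n, p, snr, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    # Sweep the SNR on a log scale to see where exact recovery sets in.
    for exponent in range(0, 9, 2):
        snr = 10.0 ** exponent
        print(f"snr=1e{exponent}: rate={success_rate(20, 2, snr, 200):.2f}")
```

The rank-matching shortcut applies only to scalar responses; with multiple response columns the oracle estimate is a general linear assignment problem and would need a dedicated solver.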