Knowledge of customer phase connection in low-voltage distribution networks is important for Distribution
System Operators (DSOs). This paper presents a novel data-driven phase identification method based on Bayesian
inference, which uses load consumption profiles as inputs. This method uses a non-linear function to establish the
probability of a customer being connected to a given phase, based on variations in the customer’s consumption
and those in the phase feeders. Because it relies on Bayesian inference, the proposed method provides an up-to-date
certainty about the phase connection of each customer. To improve the detection of customers that are
more difficult to identify, once the up-to-date certainty has been obtained for all users, the consumption of those
whose certainty lies above a given percentile within the substation (those that are
more likely to be correctly classified) is subtracted from the phase to which they are assigned. The performance
of the proposed method was evaluated using a real (non-synthetic) low-voltage distribution network. Favourable
results (with accuracies higher than 97 %) were obtained in almost all cases, regardless of the percentage of
Smart Meter penetration and the size of the substation. A comparison with other state-of-the-art methods showed
that the proposed method matches or outperforms them. The proposed method does not necessarily require
previously labelled data; however, it can handle such data even if they contain errors. Prior information
(partial or complete) improves phase identification performance and makes it possible to correct erroneous
previous labelling.
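
The recursive update summarised above can be illustrated with a minimal sketch. It is not the implementation evaluated in this paper: the exponential likelihood, the `scale` parameter, and all function names are assumptions chosen for illustration, since the abstract does not specify the non-linear function. The sketch updates the phase probabilities of one customer from the step-to-step variations in its consumption and in each phase feeder, and then subtracts a confidently classified customer's load from its assigned phase.

```python
import math


def update_posterior(prior, customer_delta, feeder_deltas, scale=1.0):
    """One Bayesian update of the phase probabilities for a single customer."""
    # Hypothetical non-linear likelihood (an assumption, not the paper's
    # function): it is larger when a feeder's variation is close to the
    # customer's variation. A small floor avoids an all-zero posterior.
    likelihoods = [
        max(math.exp(-abs(customer_delta - fd) / scale), 1e-12)
        for fd in feeder_deltas
    ]
    unnormalised = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def identify_phase(customer_load, feeder_loads, prior=None, scale=1.0):
    """Return the posterior phase probabilities after processing all samples."""
    n_phases = len(feeder_loads)
    # Without previous labels, start from a uniform prior over the phases.
    if prior is None:
        posterior = [1.0 / n_phases] * n_phases
    else:
        posterior = list(prior)
    for t in range(1, len(customer_load)):
        customer_delta = customer_load[t] - customer_load[t - 1]
        feeder_deltas = [f[t] - f[t - 1] for f in feeder_loads]
        posterior = update_posterior(posterior, customer_delta, feeder_deltas, scale)
    return posterior


def subtract_confident(feeder_loads, customer_load, phase):
    """Remove a confidently classified customer's load from its assigned phase."""
    residual = [list(f) for f in feeder_loads]  # copy; inputs stay unchanged
    residual[phase] = [f - c for f, c in zip(residual[phase], customer_load)]
    return residual


# Toy data (invented for illustration): the customer is connected to phase 0.
customer = [1.0, 3.0, 2.0, 5.0, 1.0]
feeders = [
    [5.0, 7.0, 7.0, 10.0, 5.0],  # phase 0, includes the customer's load
    [3.0, 4.0, 3.0, 5.0, 4.0],   # phase 1
    [6.0, 5.0, 5.0, 4.0, 6.0],   # phase 2
]
posterior = identify_phase(customer, feeders)
best_phase = posterior.index(max(posterior))
residual_feeders = subtract_confident(feeders, customer, best_phase)
print(best_phase, posterior)
```

In this sketch, previously labelled data would enter through the `prior` argument, for example `[0.1, 0.8, 0.1]` for a customer recorded on the second phase; because the prior is only a starting point, sufficiently strong evidence in the consumption variations can still overturn an erroneous label.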