Machine learning, a family of algorithms that extract empirical knowledge from
data, can be used to classify data, one of the most common tasks in observational
astronomy. In this paper, we focus on Bayesian data classification algorithms
using the Gaussian mixture model and show two applications in pulsar astronomy.
After reviewing the Gaussian mixture model and the related
Expectation-Maximization algorithm, we present a data classification method
using the Neyman-Pearson test. To demonstrate the method, we apply the
algorithm to two classification problems. First, we apply it to the well-known
period-period derivative diagram, where we find that the pulsar
distribution can be modeled with six Gaussian clusters, with two clusters for
millisecond pulsars (recycled pulsars) and the rest for normal pulsars. From
this distribution, we derive an empirical definition for millisecond pulsars as
$\frac{\dot{P}}{10^{-17}} \leq 3.23 \left(\frac{P}{100\,\mathrm{ms}}\right)^{-2.34}$. The two
millisecond pulsar clusters may have different evolutionary origins, since the
companion stars of the pulsars in the two clusters show different chemical
compositions. Four clusters are found for normal pulsars, and we also discuss
their possible implications. Our second example is to calculate the
likelihood of unidentified \textit{Fermi} point sources being pulsars and rank
them accordingly. In the ranked point source list, the top 5% of sources contain
50% of the known pulsars, the top 50% contain 99% of the known pulsars, and no known active
galaxy (the other major population) appears in the top 6%. Such a ranked list
can help guide future follow-up observations searching for pulsars among
unidentified \textit{Fermi} point sources.

Comment: 9 pages, 4 figures, accepted by MNRA
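
The empirical millisecond-pulsar criterion quoted in the abstract can be evaluated directly for a given pulsar. The following is a minimal sketch, not code from the paper: the function name `is_millisecond_pulsar` is hypothetical, and it assumes the period is given in seconds and the period derivative is dimensionless (s/s). The input values in the usage lines are illustrative only and do not refer to any catalogued pulsar.

```python
def is_millisecond_pulsar(period_s, period_derivative):
    """Check the criterion Pdot / 1e-17 <= 3.23 * (P / 100 ms)^(-2.34).

    period_s          -- spin period P in seconds (assumed unit)
    period_derivative -- period derivative Pdot in s/s (assumed unit)
    """
    if period_s <= 0:
        raise ValueError("period must be positive")
    # Right-hand side of the inequality; 100 ms = 0.1 s.
    threshold = 3.23 * (period_s / 0.1) ** -2.34
    # Left-hand side: Pdot in units of 1e-17.
    return period_derivative / 1e-17 <= threshold


# Illustrative values only: a fast, slowly decelerating pulsar
# and a slow pulsar with a larger period derivative.
fast = is_millisecond_pulsar(0.005, 1e-20)
slow = is_millisecond_pulsar(1.0, 1e-15)
```

With these inputs, `fast` evaluates to `True` and `slow` to `False`, since the threshold falls steeply with increasing period.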