This paper presents a model that uses the information that sellers publish on
real estate market websites to predict whether a property has a higher or lower
price than the average price of similar properties. The model learns the
correlation between price and information (text descriptions and features) of
real estate properties through automatic identification of latent semantic
content given by a machine learning model based on doc2vec and xgboost. The
proposed model was evaluated with a data set of 57,516 real estate listings
collected from 2016 to 2018 in the city of Bogot\'a. Results show
that the accuracy of a classifier that involves text descriptions is slightly
higher than that of a classifier that only uses features of the real estate
properties, as text descriptions tend to contain detailed information about the
property.
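
To make the prediction target concrete, the following is a minimal sketch of how a binary "above or below the average of similar properties" label could be derived. It is only an illustration under stated assumptions: the definition of "similar" used here (same zone and number of rooms), the field names, and the toy prices are hypothetical and are not taken from the data set described above.

```python
from collections import defaultdict
from statistics import mean


def label_against_group_mean(listings, group_key):
    """Return 1 where a listing's price exceeds the mean price of its group, else 0.

    `group_key` maps a listing to the group of "similar" properties it belongs to.
    How similarity is defined is an assumption of this sketch.
    """
    prices_by_group = defaultdict(list)
    for listing in listings:
        prices_by_group[group_key(listing)].append(listing["price"])

    # Average price of each group of similar properties.
    group_mean = {group: mean(prices) for group, prices in prices_by_group.items()}

    # A price equal to the group mean is labeled 0 (not strictly higher).
    return [
        1 if listing["price"] > group_mean[group_key(listing)] else 0
        for listing in listings
    ]


# Hypothetical toy listings; field names and values are illustrative only.
listings = [
    {"zone": "A", "rooms": 2, "price": 300.0},
    {"zone": "A", "rooms": 2, "price": 500.0},
    {"zone": "B", "rooms": 3, "price": 900.0},
    {"zone": "B", "rooms": 3, "price": 700.0},
    {"zone": "B", "rooms": 3, "price": 800.0},
]

labels = label_against_group_mean(listings, lambda l: (l["zone"], l["rooms"]))
print(labels)
```

In a pipeline of the kind summarized here, such labels would serve as the classification target, with the text descriptions and property features as inputs.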