The recent proliferation of embedding models has made text classification more accessible. The central challenge, however, is evaluating and selecting the most effective embedding model for a specific domain from the many available options. In this study, we address this challenge by assessing embedding models according to their effectiveness in a downstream task. We analyze consultation records maintained by an apartment management body in South Korea and convert this textual data into numerical representations using various embedding models. The vectorized text is then grouped using the k-means clustering algorithm. The downstream task, namely the classification of consultation records, is evaluated using a quantitative metric (the Silhouette score) and qualitative approaches (domain-specific knowledge and visual inspection). The qualitative approaches yield more reliable results than the quantitative approach. These findings are expected to be valuable to the various stakeholders in property management.
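The quantitative part of this pipeline (vectorized text, k-means clustering, Silhouette score) can be illustrated with a minimal sketch. It is not the study's implementation: the two-dimensional `toy_embeddings` are made-up stand-ins for real embedding vectors, and the helpers `kmeans` and `silhouette_score` are hypothetical, standard-library-only versions written for illustration, assuming Euclidean distance.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per vector."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)  # k distinct data points as initial centroids
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: euclidean(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels


def silhouette_score(vectors, labels):
    """Mean silhouette coefficient (b - a) / max(a, b) over all vectors."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette score needs at least two clusters")
    scores = []
    for i, v in enumerate(vectors):
        own = clusters[labels[i]]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(euclidean(v, vectors[j]) for j in own if j != i) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            sum(euclidean(v, vectors[j]) for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Made-up 2-D "embeddings" forming two well-separated groups (illustrative only).
toy_embeddings = [
    [0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
    [5.0, 5.1], [5.1, 5.0], [5.0, 5.0],
]

labels = kmeans(toy_embeddings, k=2)
score = silhouette_score(toy_embeddings, labels)
print("cluster labels:", labels)
print("silhouette score:", round(score, 3))
```

In a comparison of embedding models, the same clustering and scoring steps would be repeated on the vectors produced by each model, and a library implementation would normally replace these helpers. A high score indicates compact, well-separated clusters only; it does not by itself show that the clusters match meaningful consultation categories, which is why the study also relies on qualitative checks.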