5 research outputs found

    Which Clustering Do You Want? Inducing Your Ideal Clustering with Minimal Feedback

    Abstract: While traditional research on text clustering has largely focused on grouping documents by topic, it is conceivable that a user may want to cluster documents along other dimensions, such as the author's mood, gender, age, or sentiment. Without knowing the user's intention, a clustering algorithm will only group documents along the most prominent dimension, which may not be the one the user desires. To address the problem of clustering documents along the user-desired dimension, previous work has focused on learning a similarity metric from data manually annotated with the user's intention, or on having a human construct a feature space interactively during the clustering process. With the goal of reducing the reliance on human knowledge that these approaches require for fine-tuning the similarity function or selecting the relevant features, we propose a novel active clustering algorithm, which allows a user to easily select the dimension along which she wants to cluster the documents by inspecting only a small number of words. We demonstrate the viability of our algorithm on a variety of commonly used sentiment datasets.

    Feedback Clustering for Online Travel Agencies Searches: a Case Study

    Understanding the choices made by online customers is a growing need in the travel industry. In many practical situations, the only available information is the flight search query performed by the customer, with no additional profile knowledge. In general, customer flight bookings are driven by price, duration, number of connections, and so on. However, not all customers assign the same importance to each of those criteria. Hence the need to group together all flight searches performed by the same kind of customer, that is, customers having the same booking criteria. The effectiveness of a set of recommendations for a single cluster can be measured in terms of the number of bookings historically performed. This effectiveness measure plays the role of feedback, that is, external knowledge which can be recombined to iteratively obtain a final segmentation. In this paper, we describe our Online Travel Agencies (OTA) flight search use case and highlight its specific features. We address the flight search segmentation problem motivated above by proposing a novel algorithm called Split-or-Merge (S/M). This algorithm is a variation of the Split-Merge-Evolve (SME) method. The SME method has already been introduced in the community as an iterative process updating a clustering given by the K-means algorithm by splitting and merging clusters subject to feedback independent evaluations. To the best of our knowledge, no previous application of the SME method to real-world data has been reported in the literature. Here, we provide experimental evaluations of the SME and S/M methods over real-world data. The impact on our domain-specific metrics obtained with the SME and S/M methods suggests that feedback clustering techniques are very promising for handling OTA flight searches.
