8 research outputs found

    Semi-supervised deep embedded clustering

    National Research Foundation (NRF) Singapore

    Deep Clustering: A Comprehensive Survey

    Cluster analysis plays an indispensable role in machine learning and data mining, and learning a good data representation is crucial for clustering algorithms. Recently, deep clustering, which learns clustering-friendly representations using deep neural networks, has been applied to a wide range of clustering tasks. Existing surveys of deep clustering mainly focus on single-view settings and network architectures, ignoring the complex application scenarios of clustering. To address this issue, this paper provides a comprehensive survey of deep clustering from the perspective of data sources. Given different data sources and initial conditions, we systematically distinguish the clustering methods in terms of methodology, prior knowledge, and architecture. Concretely, deep clustering methods are introduced in four categories: traditional single-view deep clustering, semi-supervised deep clustering, deep multi-view clustering, and deep transfer clustering. Finally, we discuss the open challenges and potential future opportunities in different fields of deep clustering.

    Segment-based CO2 emission evaluations from passenger cars based on deep learning techniques

    The overall level of emissions from Swiss passenger cars depends strongly on the fleet composition. Despite technology improvements, the Swiss passenger car fleet remains emissions-intensive. To analyze the root of this problem and evaluate potential solutions, this paper applies deep learning techniques to evaluate the inter-class (micro, small, middle, upper middle, large, and luxury class) and intra-class (sport utility vehicle and non-sport utility vehicle) differences in carbon dioxide (CO2) emissions. The paper makes full use of novel semi-supervised fuzzy C-means (SSFCM), random forest, and AdaBoost models, as well as model fusion, to successfully classify passenger vehicles and enable segment-based CO2 emission evaluations.

    Robust vehicle classification based on deep features learning

    This paper introduces a semi-supervised fuzzy C-means (SSFCM) clustering approach for passenger car classification based on feature learning. The proposed method classifies passenger vehicles into the micro, small, middle, upper middle, large, and luxury classes. The performance of the algorithm is analyzed and compared with an unsupervised fuzzy C-means (FCM) clustering algorithm and a Swiss expert classification dataset. Experimental results demonstrate that the SSFCM classification correlates better with the expert classification than the traditional unsupervised algorithm does. These results show that SSFCM can reduce the sensitivity of FCM to the initial cluster centroids with the help of labeled instances. Furthermore, SSFCM improves classification performance by using resampling to deal with the multi-class imbalance problem and by eliminating irrelevant and redundant features.
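
To illustrate how labeled instances can reduce the sensitivity of fuzzy C-means to its initial centroids, here is a minimal sketch of one simple semi-supervised variant. It is not the paper's algorithm: the function name `ssfcm`, its parameters, and the choice to seed centroids from labeled class means and to clamp labeled memberships to one-hot values are all assumptions made for illustration. It also assumes every cluster has at least one labeled point.

```python
import math


def ssfcm(points, labels, n_clusters, m=2.0, n_iter=50):
    """Illustrative semi-supervised fuzzy C-means (hypothetical helper).

    points: list of equal-length tuples of floats.
    labels: list with a cluster index for labeled points, None otherwise.
    m:      fuzzifier (> 1); larger values give softer memberships.
    """
    dim = len(points[0])
    exponent = 2.0 / (m - 1.0)

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def memberships(centroids):
        rows = []
        for p, label in zip(points, labels):
            if label is not None:
                # Assumption: labeled points keep a fixed one-hot membership.
                rows.append([1.0 if c == label else 0.0
                             for c in range(n_clusters)])
            else:
                # Standard FCM membership update for unlabeled points.
                d = [max(dist(p, v), 1e-12) for v in centroids]
                rows.append([
                    1.0 / sum((d[i] / d[j]) ** exponent
                              for j in range(n_clusters))
                    for i in range(n_clusters)
                ])
        return rows

    # Seed each centroid with the mean of its labeled points instead of a
    # random start; this is where the labels remove the initialization
    # sensitivity of plain FCM.
    centroids = []
    for c in range(n_clusters):
        members = [p for p, label in zip(points, labels) if label == c]
        centroids.append(tuple(sum(p[k] for p in members) / len(members)
                               for k in range(dim)))

    for _ in range(n_iter):
        u = memberships(centroids)
        new_centroids = []
        for c in range(n_clusters):
            # Standard FCM centroid update: mean weighted by u^m.
            w = [row[c] ** m for row in u]
            total = sum(w)
            new_centroids.append(tuple(
                sum(wk * p[k] for wk, p in zip(w, points)) / total
                for k in range(dim)))
        centroids = new_centroids

    return centroids, memberships(centroids)


# Toy usage: two well-separated 1-D groups, one labeled point per cluster.
points = [(0.0,), (0.2,), (0.1,), (5.0,), (5.2,), (4.9,)]
labels = [0, None, None, 1, None, None]
centroids, u = ssfcm(points, labels, n_clusters=2)
```

Seeding from the labeled points makes the result deterministic for a given dataset, whereas plain FCM typically starts from random memberships or centroids and may converge to different partitions across runs.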

    Advances in Unsupervised Learning and Applications: Subspace Clustering with Background Knowledge, Semantic Password Guessing, and Learned Index Structures

    Over the past few years, advances in data science, machine learning and, in particular, unsupervised learning have enabled significant progress in many scientific fields and even in everyday life. Unsupervised learning methods are usually successful whenever they can be tailored to specific applications using appropriate requirements based on domain expertise. This dissertation shows how purely theoretical research can lead to circumstances that favor overly optimistic results, and what advantages application-oriented research based on specific background knowledge offers. These observations apply to traditional unsupervised learning problems such as clustering, anomaly detection, and dimensionality reduction. Therefore, this thesis presents extensions of these classical problems, such as subspace clustering and principal component analysis, as well as several specific applications with relevant interfaces to machine learning. Examples include password guessing using semantic word embeddings and learning spatial index structures using statistical models. In essence, this thesis shows that application-oriented research has many advantages for current and future research.