52 research outputs found

    Multilevel Weighted Support Vector Machine for Classification on Healthcare Data with Missing Values

    This work is motivated by the needs of predictive analytics on healthcare data as represented by Electronic Medical Records. Such data is invariably problematic: it is noisy, has missing entries, and has imbalanced classes of interest, all of which lead to serious bias in predictive modeling. Since standard data mining methods often perform poorly on such data, we argue for the development of specialized techniques for data preprocessing and classification. In this paper, we propose a new method that simultaneously classifies large datasets and reduces the effects of missing values. It is based on a multilevel framework of the cost-sensitive SVM and the expectation-maximization imputation method for missing values, which relies on iterated regression analyses. We compare classification results of multilevel SVM-based algorithms on public benchmark datasets with imbalanced classes and missing values, as well as on real data from health applications, and show that our multilevel SVM-based method produces fast, more accurate, and more robust classification results.
    Comment: arXiv admin note: substantial text overlap with arXiv:1503.0625
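
For orientation, the cost-sensitive (weighted) SVM mentioned in the abstract above is commonly written as the soft-margin problem below. This is the textbook formulation, not necessarily the exact one used in the paper. The symbols $C^{+}$ and $C^{-}$ (per-class misclassification costs), $\xi_i$ (slack variables) and $\phi$ (feature map) are standard notation assumed here and do not appear in the listing.

```latex
\min_{w,\,b,\,\xi}\quad
  \frac{1}{2}\lVert w\rVert^{2}
  + C^{+}\sum_{i:\,y_i=+1}\xi_i
  + C^{-}\sum_{i:\,y_i=-1}\xi_i
\qquad\text{s.t.}\qquad
  y_i\bigl(w^{\top}\phi(x_i)+b\bigr)\ \ge\ 1-\xi_i,\quad \xi_i\ \ge\ 0 .
```

Choosing a larger cost for the minority class makes errors on that class more expensive, which is how this formulation counteracts class imbalance.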

    Shape Retrieval Methods for Architectural 3D Models

    This thesis introduces new methods for content-based retrieval of architecture-related 3D models. We consider two overall types of architectural 3D models. The first type consists of context objects that are used for the detailed design and decoration of 3D building model drafts, for example furnishings for interior design or barriers and fences for shaping the exterior environment. The second type consists of actual building models. To enable efficient content-based retrieval for both model types that is tailored to the user requirements of the architectural domain, type-specific algorithms must be developed. On the one hand, context objects such as furnishings that provide similar functions (e.g. seating furniture) often share a similar shape. Nevertheless, they may belong to different object classes from an architectural point of view (e.g. armchair, elbow chair, swivel chair). The distinction rests on small geometric details and is sometimes obvious only to a domain expert. Building models, on the other hand, are often distinguished according to their underlying floor and room plans. Topological floor plan properties, for example, serve as a starting point for telling apart residential and commercial buildings. The first contribution of this thesis is a new meta descriptor for 3D retrieval that combines different types of local shape descriptors using a supervised learning approach. The approach enables the differentiation of object classes according to small geometric details and at the same time integrates expert knowledge from the field of architecture. We evaluate our approach on a database containing arbitrary 3D models as well as on one that consists only of models from the architectural domain. We then extend our approach by adding a sophisticated shape descriptor localization strategy.
    Additionally, we exploit knowledge about the spatial relationships of object components to further enhance retrieval performance. In the second part of the thesis, we introduce attributed room connectivity graphs (RCGs) as a means of characterizing a 3D building model according to the structure of its underlying floor plans. We first describe how RCGs are inferred from a given building model and discuss how substructures of this graph can be queried efficiently. We then introduce a new descriptor, denoted Bag-of-Attributed-Subgraphs, that transforms attributed graphs into a vector-based representation using subgraph embeddings. Finally, we evaluate the retrieval performance of this new method on a database of building models with different floor plan types. All methods presented in this thesis aim at a workflow for indexing and retrieval that is as automated as possible, so that only minimal human interaction is required. Accordingly, the only inputs required are polygon soups, which do not need to be manually repaired or structured. Human effort is needed only for offline ground-truth generation to enable supervised learning, and for providing information about the orientation of building models and the unit of measurement used for modeling.

    Rails Quality Data Modelling via Machine Learning-Based Paradigms


    High-performance learning machines for classification problems - application to the classification of buildings into energy classes

    The topic of this thesis is the study and analysis of the Extreme Learning Machine (ELM) algorithm and its application to classification problems. Feed-forward neural networks have been used extensively for classification, but training them with traditional learning algorithms is extremely slow. Two key reasons are that slow gradient-based learning algorithms are widely used to train them, and that all network parameters are tuned iteratively by such algorithms. The ELM algorithm differs from traditional learning algorithms in that it is extremely fast while still achieving high classification accuracy. In addition, the parameters of the network are selected randomly and are not recalculated during the rest of the training process. Several ELM variants are presented that overcome difficulties and disadvantages of the classical algorithm and achieve better training times and accuracy, along with some hybrid models that combine the classical ELM algorithm with a traditional learning algorithm. Some applications of ELM to classification problems in multiple scientific fields, such as biomedicine, engineering, biology, and music, are also mentioned. Next, a classification problem is analyzed: assigning buildings to an energy class according to their required heating and cooling loads. The thesis describes in detail how the total primary energy consumed is derived from the heating and cooling loads, and the methodology by which the energy class of each building is then determined. Finally, the algorithm is evaluated experimentally on this classification problem and compared with other classification algorithms.
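
The training scheme described in the abstract above (random parameters that are never re-tuned, followed by a single non-iterative fit) can be illustrated with a minimal sketch. It assumes the usual ELM formulation, in which the randomly drawn parameters are those of the hidden layer and the output weights come from one least-squares solve. The names `TinyELM`, `solve_linear`, `n_hidden` and `ridge`, the tanh activation, and the small ridge term are all illustrative assumptions, not the thesis's implementation.

```python
import math
import random


def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.

    `a` is an n x n matrix and `b` an n x m matrix, both as lists of rows.
    """
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0.0:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


class TinyELM:
    """Single-hidden-layer classifier trained in the ELM style (illustrative)."""

    def __init__(self, n_hidden=20, ridge=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.ridge = ridge  # assumed regulariser; keeps the solve well-posed
        self.rng = random.Random(seed)

    def _hidden(self, x):
        # Hidden activations tanh(w . x + b) for one input vector.
        return [math.tanh(sum(w * xi for w, xi in zip(ws, x)) + b)
                for ws, b in zip(self.weights, self.biases)]

    def fit(self, xs, labels):
        n_in = len(xs[0])
        k = self.n_hidden
        self.classes = sorted(set(labels))
        # Hidden-layer parameters: drawn once at random, never updated.
        self.weights = [[self.rng.uniform(-1, 1) for _ in range(n_in)]
                        for _ in range(k)]
        self.biases = [self.rng.uniform(-1, 1) for _ in range(k)]
        h = [self._hidden(x) for x in xs]
        t = [[1.0 if y == c else 0.0 for c in self.classes] for y in labels]
        # Output weights from the normal equations (H^T H + ridge I) beta = H^T T.
        hth = [[sum(row[i] * row[j] for row in h) + (self.ridge if i == j else 0.0)
                for j in range(k)] for i in range(k)]
        htt = [[sum(row[i] * trow[c] for row, trow in zip(h, t))
                for c in range(len(self.classes))] for i in range(k)]
        self.beta = solve_linear(hth, htt)
        return self

    def predict(self, xs):
        out = []
        for x in xs:
            h = self._hidden(x)
            scores = [sum(hi * self.beta[i][c] for i, hi in enumerate(h))
                      for c in range(len(self.classes))]
            out.append(self.classes[scores.index(max(scores))])
        return out


if __name__ == "__main__":
    # Toy data: two Gaussian blobs standing in for two classes.
    rng = random.Random(1)
    xs = [[rng.gauss(c, 0.5), rng.gauss(c, 0.5)]
          for c in (-2.0, 2.0) for _ in range(30)]
    ys = ["low"] * 30 + ["high"] * 30
    model = TinyELM().fit(xs, ys)
    hits = sum(p == y for p, y in zip(model.predict(xs), ys))
    print("training accuracy:", hits / len(ys))
```

Because training reduces to one linear solve, there is no gradient loop at all, which is the source of the speed advantage the abstract claims over iterative gradient-based training.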