52 research outputs found

    Radar-based Road User Classification and Novelty Detection with Recurrent Neural Network Ensembles

    Radar-based road user classification is an important yet still challenging task towards autonomous driving applications. The resolution of conventional automotive radar sensors results in a sparse data representation which is tough to recover by subsequent signal processing. In this article, classifier ensembles originating from a one-vs-one binarization paradigm are enriched by one-vs-all correction classifiers. They are utilized to efficiently classify individual traffic participants and also identify hidden object classes which have not been presented to the classifiers during training. For each classifier of the ensemble, an individual feature set is determined from a total set of 98 features. As a result, the overall classification performance improves over previous methods and, additionally, novel classes can be identified much more accurately. Furthermore, the proposed structure offers new insights into the importance of features for the recognition of individual classes, which is crucial for the development of new algorithms and sensor requirements.
    Comment: 8 pages, 9 figures, accepted paper for 2019 IEEE Intelligent Vehicles Symposium (IV), Paris, France, June 201

    Duality between Feature Selection and Data Clustering

    The feature-selection problem is formulated from an information-theoretic perspective. We show that the problem can be efficiently solved by an extension of the recently proposed info-clustering paradigm. This reveals the fundamental duality between feature selection and data clustering, which is a consequence of the more general duality between the principal partition and the principal lattice of partitions in combinatorial optimization.

    Effect of Feature Selection on Gene Expression Datasets Classification Accuracy

    Feature selection attracts researchers who deal with machine learning and data mining. It consists of selecting the variables that have the greatest impact on the dataset classification, and discarding the rest. This dimensionality reduction allows classifiers to be faster and more accurate. This paper examines the effect of feature selection on the accuracy of classifiers widely used in the literature. These classifiers are compared on three real datasets which are pre-processed with feature selection methods. An improvement of more than 9% in classification accuracy is observed, and k-means appears to be the classifier most sensitive to feature selection.
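    The idea of keeping the most influential variables and discarding the rest can be illustrated with a simple filter method. The sketch below is not the method evaluated in the paper: it assumes binary labels (0 and 1), and the scoring rule (gap between class means divided by the summed class spreads), the function names and the toy data are all hypothetical choices made for illustration.

```python
from statistics import mean, pstdev


def class_separation_scores(rows, labels):
    """Score each feature by how far apart the two class means are,
    relative to the within-class spread (assumed, illustrative criterion)."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, y in zip(rows, labels) if y == 1]
        neg = [row[j] for row, y in zip(rows, labels) if y == 0]
        spread = pstdev(pos) + pstdev(neg)
        # The small constant avoids division by zero for constant features.
        scores.append(abs(mean(pos) - mean(neg)) / (spread + 1e-12))
    return scores


def select_top_k(rows, labels, k):
    """Return the indices of the k best-scoring features and the reduced data."""
    scores = class_separation_scores(rows, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    reduced = [[row[j] for j in keep] for row in rows]
    return keep, reduced


# Toy data (made up): feature 0 separates the classes, features 1 and 2 barely do.
rows = [
    [5.1, 0.2, 7.0],
    [4.9, 0.9, 6.8],
    [1.0, 0.4, 7.1],
    [1.2, 0.7, 6.9],
]
labels = [1, 1, 0, 0]

keep, reduced = select_top_k(rows, labels, k=1)
print(keep, reduced)
```

    A classifier would then be trained on `reduced` instead of `rows`. Filter scores like this one look at each feature in isolation, so they can miss features that are only informative in combination.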

    Information-theoretic Feature Selection via Tensor Decomposition and Submodularity

    Feature selection by maximizing high-order mutual information between the selected feature vector and a target variable is the gold standard in terms of selecting the best subset of relevant features that maximizes the performance of prediction models. However, such an approach typically requires knowledge of the multivariate probability distribution of all features and the target, and involves a challenging combinatorial optimization problem. Recent work has shown that any joint Probability Mass Function (PMF) can be represented as a naive Bayes model, via Canonical Polyadic (tensor rank) Decomposition. In this paper, we introduce a low-rank tensor model of the joint PMF of all variables and indirect targeting as a way of mitigating complexity and maximizing the classification performance for a given number of features. Through low-rank modeling of the joint PMF, it is possible to circumvent the curse of dimensionality by learning principal components of the joint distribution. By indirectly aiming to predict the latent variable of the naive Bayes model instead of the original target variable, it is possible to formulate the feature selection problem as maximization of a monotone submodular function subject to a cardinality constraint - which can be tackled using a greedy algorithm that comes with performance guarantees. Numerical experiments with several standard datasets suggest that the proposed approach compares favorably to the state of the art for this important problem.
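    The greedy step mentioned in the abstract can be sketched generically. The code below is a minimal illustration, not the paper's implementation: the objective is a toy set-coverage function standing in for the mutual-information objective, and `greedy_max`, `coverage` and the `COVERS` table are hypothetical names. For a monotone submodular objective under a cardinality constraint, this greedy rule is known to reach at least a (1 - 1/e) fraction of the optimal value.

```python
def greedy_max(candidates, value, k):
    """Select up to k items, each time adding the item with the largest
    marginal gain in `value` (a set function taking a frozenset)."""
    selected = []
    remaining = set(candidates)
    current = value(frozenset())
    for _ in range(k):
        best_item, best_gain = None, 0.0
        for item in sorted(remaining):  # sorted for deterministic tie-breaking
            gain = value(frozenset(selected) | {item}) - current
            if gain > best_gain:
                best_item, best_gain = item, gain
        if best_item is None:  # no remaining item improves the objective
            break
        selected.append(best_item)
        remaining.remove(best_item)
        current += best_gain
    return selected


# Toy objective (made up): each feature "explains" a set of samples, and the
# value of a feature subset is the number of samples explained at least once.
COVERS = {
    "f1": {1, 2, 3, 4},
    "f2": {3, 4, 5},
    "f3": {5, 6},
    "f4": {1, 6},
}


def coverage(subset):
    covered = set()
    for name in subset:
        covered |= COVERS[name]
    return float(len(covered))


chosen = greedy_max(COVERS, coverage, k=2)
print(chosen, coverage(frozenset(chosen)))
```

    With a budget of two features the sketch picks `f1` first (gain 4) and then `f3` (gain 2), covering all six samples. Swapping in a different monotone submodular `value` function leaves `greedy_max` unchanged.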

    Automated assessment of movement impairment in Huntington's disease

    Quantitative assessment of movement impairment in Huntington’s disease (HD) is essential to monitoring of disease progression. This study aimed to develop and validate a novel low-cost, objective automated system for the evaluation of upper limb movement impairment in HD in order to eliminate the inconsistency of the assessor and offer a more sensitive, continuous assessment scale. Patients with genetically confirmed HD and healthy controls were recruited to this observational study. Demographic data including age (years), gender and Unified Huntington’s Disease Rating Scale Total Motor Score (UHDRS-TMS) were recorded. For the purposes of this study a modified upper limb motor impairment score (mULMS) was generated from the UHDRS-TMS. All participants completed a brief, standardized clinical assessment of upper limb dexterity whilst wearing a tri-axial accelerometer on each wrist and on the sternum. The captured acceleration data were used to develop an automatic classification system for discriminating between healthy and HD participants and to automatically generate a continuous Movement Impairment Score (MIS) that reflected the degree of the movement impairment. Data from 48 healthy and 44 HD participants were used to validate the developed system, which achieved 98.78% accuracy in discriminating between healthy and HD participants. The Pearson correlation coefficient between the automatic MIS and the clinician-rated mULMS was 0.77 with a p-value < 0.01. The approach presented in this study demonstrates the possibility of an automated, objective, consistent and sensitive assessment of HD movement impairment.
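    For readers unfamiliar with the agreement measure reported above, the Pearson correlation coefficient between two score lists can be computed as in the sketch below. The score values are made up for illustration and are not data from the study; `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the
    standard deviations (the 1/n factors cancel, so sums are used)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up example scores (NOT study data): an automatic score and a
# clinician rating for five hypothetical participants.
automatic = [1.0, 2.0, 3.0, 4.0, 5.0]
clinician = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(automatic, clinician)
print(round(r, 2))
```

    A value near 1 means the two scores rise and fall together almost linearly; the sketch does not compute the accompanying p-value.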

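    Procedimiento para mejorar la precisión en el acierto de los fracasos en implantes dentales mediante técnicas de ciencia de datos [Procedure for improving the accuracy of predicting dental implant failures using data science techniques]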

    Nowadays, dental implant failure is predicted through clinical and radiological evaluation. For this reason, predictions are highly dependent on the implantologist's experience. In addition, it is crucial to detect in time whether a dental implant is going to fail, because of time, cost, trauma to the patient and postoperative problems, among other factors. This paper proposes a procedure that uses multiple feature selection methods and classification algorithms to improve the accuracy of predicting dental implant failures in the province of Misiones, Argentina, validated by human experts. The experimentation is performed with two data sets: a set of dental implants compiled for the case study and an artificially generated set. The proposed approach reveals the most relevant features and improves the accuracy in the classification of the target class (dental implant failure), which avoids biasing decision making toward the application and results of individual methods. The proposed approach achieves an accuracy of 79% on failures, while individual classifiers achieve a maximum of 72%.
    Spanish abstract, translated: Today, dental implant failure is predicted through clinical and radiological evaluation. For this reason, predictions depend largely on the implantologist's experience. In addition, it is crucial to detect in time whether a dental implant is going to fail, for reasons of time, cost, trauma to the patient and postoperative problems, among others. This work proposes a procedure that uses multiple feature selection methods and classification algorithms to improve the accuracy of predicting dental implant failures in the province of Misiones, Argentina, validated by human experts. The experimentation is performed with four data sets: a set of dental implants compiled for the case study, an artificially generated set, and two further sets obtained from different data repositories. The proposed procedure revealed the most relevant features and improved the accuracy in the classification of the target class (dental implant failure), which avoids biasing decision making toward the application and results of individual methods. The proposed procedure achieves an accuracy of 79% on failures, while individual classifiers reach a maximum of 72%.
    Affiliation: Ganz, Nancy Beatriz. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Nordeste. Instituto de Materiales de Misiones. Universidad Nacional de Misiones. Facultad de Ciencias Exactas Químicas y Naturales. Instituto de Materiales de Misiones; Argentina
    Affiliation: Ares, Alicia Esther. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Nordeste. Instituto de Materiales de Misiones. Universidad Nacional de Misiones. Facultad de Ciencias Exactas Químicas y Naturales. Instituto de Materiales de Misiones; Argentina
    Affiliation: Kuna, Horacio Daniel. Universidad Nacional de Misiones; Argentina

    Feature Selection and Overlapping Clustering-Based Multilabel Classification Model

    Multilabel classification (MLC) learning, which is widely applied in real-world applications, is a very important problem in machine learning. Some studies show that a clustering-based MLC framework performs effectively compared to a nonclustering framework. In this paper, we explore the clustering-based MLC problem. Multilabel feature selection also plays an important role in classification learning because many redundant and irrelevant features can degrade performance, and a good feature selection algorithm can reduce computational complexity and improve classification accuracy. In this study, we consider feature dependence and feature interaction simultaneously, and we propose a multilabel feature selection algorithm as a preprocessing stage before MLC. Typically, existing cluster-based MLC frameworks employ a hard clustering method. In practice, such frameworks assign each instance of a multilabel dataset to a single cluster; however, multilabel instances overlap by nature, so in real-life applications an instance may not belong to only a single class. Therefore, we propose an MLC model that combines feature selection with an overlapping clustering algorithm. Experimental results demonstrate that various clustering algorithms show different performance for MLC, and the proposed overlapping clustering-based MLC model may be more suitable.