52 research outputs found
Radar-based Road User Classification and Novelty Detection with Recurrent Neural Network Ensembles
Radar-based road user classification is an important yet still challenging
task towards autonomous driving applications. The resolution of conventional
automotive radar sensors results in a sparse data representation which is tough
to recover by subsequent signal processing. In this article, classifier
ensembles originating from a one-vs-one binarization paradigm are enriched by
one-vs-all correction classifiers. They are utilized to efficiently classify
individual traffic participants and also identify hidden object classes which
have not been presented to the classifiers during training. For each classifier
of the ensemble an individual feature set is determined from a total set of 98
features. Thereby, the overall classification performance can be improved when
compared to previous methods and, additionally, novel classes can be identified
much more accurately. Furthermore, the proposed structure yields new
insights into the importance of features for the recognition of individual
classes, which is crucial for the development of new algorithms and sensor
requirements.
Comment: 8 pages, 9 figures, accepted paper for 2019 IEEE Intelligent Vehicles
Symposium (IV), Paris, France, June 201
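The combination of one-vs-one voting with one-vs-all correction described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the weighting rule, the `novelty_threshold` parameter, the class names, and all probabilities are hypothetical, and the recurrent networks that would produce the probabilities are omitted.

```python
from itertools import combinations


def ovo_ova_decision(pairwise, one_vs_all, classes, novelty_threshold=0.5):
    """Combine one-vs-one votes with one-vs-all correction scores.

    pairwise[(a, b)] is the assumed probability that the sample is class a
    rather than class b, keyed in the order the classes are listed.
    one_vs_all[c] is the assumed probability that the sample is class c
    rather than anything else. Both would come from trained classifiers.
    """
    scores = {c: 0.0 for c in classes}
    for a, b in combinations(classes, 2):
        p_a = pairwise[(a, b)]
        # Hypothetical correction rule: down-weight each pairwise vote by
        # the one-vs-all confidence of the class that receives the vote.
        scores[a] += p_a * one_vs_all[a]
        scores[b] += (1.0 - p_a) * one_vs_all[b]
    best = max(classes, key=lambda c: scores[c])
    # If even the winning class is rejected by its one-vs-all classifier,
    # report the sample as belonging to an unseen ("novel") class.
    if one_vs_all[best] < novelty_threshold:
        return "novel", scores
    return best, scores


classes = ["car", "pedestrian", "bike"]
pairwise = {("car", "pedestrian"): 0.9, ("car", "bike"): 0.8,
            ("pedestrian", "bike"): 0.4}
known_label, _ = ovo_ova_decision(
    pairwise, {"car": 0.85, "pedestrian": 0.1, "bike": 0.2}, classes)
novel_label, _ = ovo_ova_decision(
    pairwise, {"car": 0.1, "pedestrian": 0.1, "bike": 0.1}, classes)
```

With these made-up inputs the first sample is labeled "car", while the second is flagged as novel because no one-vs-all classifier accepts it.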
Duality between Feature Selection and Data Clustering
The feature-selection problem is formulated from an information-theoretic
perspective. We show that the problem can be efficiently solved by an extension
of the recently proposed info-clustering paradigm. This reveals the fundamental
duality between feature selection and data clustering, which is a consequence of
the more general duality between the principal partition and the principal
lattice of partitions in combinatorial optimization.
Effect of Feature Selection on Gene Expression Datasets Classification Accuracy
Feature selection attracts researchers who work in machine learning and data mining. It consists of selecting the variables that have the greatest impact on the classification of a dataset and discarding the rest. This dimensionality reduction allows classifiers to be faster and more accurate. This paper examines the effect of feature selection on the accuracy of classifiers widely used in the literature. These classifiers are compared on three real datasets that are pre-processed with feature selection methods. An improvement of more than 9% in classification accuracy is observed, and k-means appears to be the classifier most sensitive to feature selection.
Information-theoretic Feature Selection via Tensor Decomposition and Submodularity
Feature selection by maximizing high-order mutual information between the
selected feature vector and a target variable is the gold standard in terms of
selecting the best subset of relevant features that maximizes the performance
of prediction models. However, such an approach typically requires knowledge of
the multivariate probability distribution of all features and the target, and
involves a challenging combinatorial optimization problem. Recent work has
shown that any joint Probability Mass Function (PMF) can be represented as a
naive Bayes model, via Canonical Polyadic (tensor rank) Decomposition. In this
paper, we introduce a low-rank tensor model of the joint PMF of all variables
and indirect targeting as a way of mitigating complexity and maximizing the
classification performance for a given number of features. Through low-rank
modeling of the joint PMF, it is possible to circumvent the curse of
dimensionality by learning principal components of the joint distribution. By
indirectly aiming to predict the latent variable of the naive Bayes model
instead of the original target variable, it is possible to formulate the
feature selection problem as maximization of a monotone submodular function
subject to a cardinality constraint - which can be tackled using a greedy
algorithm that comes with performance guarantees. Numerical experiments with
several standard datasets suggest that the proposed approach compares favorably
to the state of the art for this important problem.
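The greedy step mentioned in this abstract can be illustrated with a minimal sketch. For a monotone submodular objective under a cardinality constraint, the standard greedy algorithm is guaranteed to reach at least a (1 - 1/e) fraction of the optimal value. The objective below is a simple coverage function standing in for the paper's information-theoretic objective, and the feature names and covered items are hypothetical.

```python
def greedy_max(ground_set, f, k):
    """Greedily pick up to k items, each time adding the item with the
    largest marginal gain f(S + [item]) - f(S)."""
    selected = []
    for _ in range(k):
        base = f(selected)
        best_item, best_gain = None, 0.0
        for item in ground_set:
            if item in selected:
                continue
            gain = f(selected + [item]) - base
            if gain > best_gain:
                best_item, best_gain = item, gain
        if best_item is None:  # no remaining item improves the objective
            break
        selected.append(best_item)
    return selected


# Stand-in objective (an assumption): each feature "covers" a set of items,
# and f(S) counts the items covered by the union. Coverage functions are
# monotone and submodular.
covers = {"f1": {1, 2, 3}, "f2": {3, 4}, "f3": {4, 5, 6, 7}, "f4": {1, 2}}


def coverage(subset):
    covered = set()
    for name in subset:
        covered |= covers[name]
    return len(covered)


chosen = greedy_max(list(covers), coverage, k=2)
chosen_value = coverage(chosen)
```

Here the greedy pass first takes `f3` (gain 4) and then `f1` (gain 3), covering all seven items with two features.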
Automated assessment of movement impairment in Huntington's disease
Quantitative assessment of movement impairment in Huntington’s disease (HD) is essential for monitoring disease progression. This study aimed to develop and validate a novel, low-cost, objective, automated system for the evaluation of upper limb movement impairment in HD, in order to eliminate assessor inconsistency and offer a more sensitive, continuous assessment scale. Patients with genetically confirmed HD and healthy controls were recruited to this observational study. Demographic data including age (years), gender and Unified Huntington’s Disease Rating Scale Total Motor Score (UHDRS-TMS) were recorded. For the purposes of this study, a modified upper limb motor impairment score (mULMS) was generated from the UHDRS-TMS. All participants completed a brief, standardized clinical assessment of upper limb dexterity whilst wearing a tri-axial accelerometer on each wrist and on the sternum. The captured acceleration data were used to develop an automatic classification system for discriminating between healthy and HD participants and to automatically generate a continuous Movement Impairment Score (MIS) that reflected the degree of movement impairment. Data from 48 healthy and 44 HD participants were used to validate the developed system, which achieved 98.78% accuracy in discriminating between healthy and HD participants. The Pearson correlation coefficient between the automatic MIS and the clinician-rated mULMS was 0.77 with a p-value < 0.01. The approach presented in this study demonstrates the possibility of an automated, objective, consistent and sensitive assessment of movement impairment in HD.
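The agreement statistic reported in this abstract is the Pearson correlation coefficient, which can be computed as in the sketch below. The score lists are made-up placeholders, not data from the study, and the variable names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the
    standard deviations (the common 1/n factors cancel out)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Placeholder scores (assumptions): an automatic score and a clinician
# rating for the same four participants.
automatic_score = [1.0, 2.0, 3.0, 4.0]
clinician_score = [2.0, 4.0, 6.0, 8.0]
r_linear = pearson_r(automatic_score, clinician_score)
r_inverse = pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

A perfectly linear relation gives a coefficient of 1, a perfectly inverse one gives -1, and a value such as the reported 0.77 indicates a strong but imperfect positive association.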
Procedure for improving the accuracy of dental implant failure prediction using data science techniques
Currently, dental implant failure is predicted through clinical and radiological evaluation, so predictions depend heavily on the implantologist’s experience. Detecting in time whether a dental implant is going to fail is crucial because of the time, cost, patient trauma and postoperative problems involved, among other factors. This paper proposes a procedure that uses multiple feature selection methods and classification algorithms to improve the accuracy of predicting dental implant failures in the province of Misiones, Argentina, validated by human experts. The experimentation is performed with two data sets: a set of dental implants made for the case study and an artificially generated set. The proposed approach identifies the most relevant features and improves the classification accuracy for the target class (dental implant failure), which avoids biasing decision making toward the application and results of individual methods. The proposed approach achieves an accuracy of 79% on failures, while individual classifiers achieve a maximum of 72%.
Fil: Ganz, Nancy Beatriz. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Nordeste. Instituto de Materiales de Misiones. Universidad Nacional de Misiones. Facultad de Ciencias Exactas Químicas y Naturales. Instituto de Materiales de Misiones; Argentina
Fil: Ares, Alicia Esther. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Nordeste. Instituto de Materiales de Misiones. Universidad Nacional de Misiones. Facultad de Ciencias Exactas Químicas y Naturales. Instituto de Materiales de Misiones; Argentina
Fil: Kuna, Horacio Daniel. Universidad Nacional de Misiones; Argentina
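One simple way to combine several feature selection methods so that no single method dominates is a vote count over the features each method selects. The sketch below is an assumption for illustration only: the abstract does not say how the methods are combined, and the feature names and the `min_votes` threshold are hypothetical.

```python
from collections import Counter


def consensus_features(selections, min_votes):
    """Keep features chosen by at least min_votes of the selection methods.

    selections is a list of feature-name lists, one per method.
    """
    # set(sel) makes sure a method cannot vote twice for the same feature.
    votes = Counter(name for sel in selections for name in set(sel))
    return sorted(name for name, count in votes.items() if count >= min_votes)


# Hypothetical output of three feature selection methods.
selections = [
    ["age", "bone_density", "smoker"],
    ["bone_density", "smoker"],
    ["age", "smoker", "implant_length"],
]
consensus = consensus_features(selections, min_votes=2)
```

With these placeholder inputs, `implant_length` is dropped because only one method selected it, and the remaining three features form the consensus set passed on to the classifiers.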
Single-center versus multi-center biparametric MRI radiomics approach for clinically significant peripheral zone prostate cancer
Feature Selection and Overlapping Clustering-Based Multilabel Classification Model
Multilabel classification (MLC) learning, which is widely applied in real-world applications, is a very important problem in machine learning. Some studies show that a clustering-based MLC framework performs effectively compared to a nonclustering framework. In this paper, we explore the clustering-based MLC problem. Multilabel feature selection also plays an important role in classification learning because many redundant and irrelevant features can degrade performance, and a good feature selection algorithm can reduce computational complexity and improve classification accuracy. In this study, we consider feature dependence and feature interaction simultaneously, and we propose a multilabel feature selection algorithm as a preprocessing stage before MLC. Typically, existing cluster-based MLC frameworks employ a hard clustering method, which places each instance of a multilabel dataset in a single cluster; however, multilabel instances overlap by nature, and in real-life applications an instance may not belong to only a single class. Therefore, we propose an MLC model that combines feature selection with an overlapping clustering algorithm. Experimental results demonstrate that various clustering algorithms show different performance for MLC, and the proposed overlapping clustering-based MLC model may be more suitable.
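The difference between hard and overlapping cluster assignment can be sketched as follows. This is a minimal illustration, not the algorithm proposed in the paper: the distance-ratio rule, the `tolerance` parameter, and the centroid coordinates are all assumptions.

```python
import math


def overlapping_assign(point, centroids, tolerance=1.25):
    """Return the indices of every cluster whose centroid is nearly as close
    as the nearest one. A hard assignment would return only the nearest."""
    dists = [math.dist(point, c) for c in centroids]
    nearest = min(dists)
    return [i for i, d in enumerate(dists) if d <= tolerance * nearest]


# Hypothetical 2-D centroids.
centroids = [(0.0, 0.0), (4.0, 0.0), (10.0, 10.0)]
shared = overlapping_assign((2.0, 0.0), centroids)     # between clusters 0 and 1
exclusive = overlapping_assign((0.5, 0.0), centroids)  # clearly in cluster 0
```

An instance halfway between two centroids is assigned to both clusters, so the classifiers trained on either cluster can contribute labels for it, whereas an instance close to a single centroid stays in one cluster.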