6 research outputs found

    Feature Relevance Bounds for Ordinal Regression

    The increasing occurrence of ordinal data, mainly sociodemographic, has led to renewed research interest in ordinal regression, i.e. the prediction of ordered classes. Besides model accuracy, the interpretation of these models is itself of high relevance, and existing approaches therefore enforce, e.g., model sparsity. For high-dimensional or highly correlated data, however, this might be misleading due to strong variable dependencies. In this contribution, we aim for an identification of feature relevance bounds which - besides identifying all relevant features - explicitly differentiates between strongly and weakly relevant features.

    FRI -- Feature Relevance Intervals for Interpretable and Interactive Data Exploration

    Most existing feature selection methods are insufficient for analytic purposes when dealing with high-dimensional data or redundant sensor signals, since features can be selected due to spurious effects or correlations rather than causal effects. To support the search for causal features in biomedical experiments, we present FRI, an open-source Python library that can be used to identify all-relevant variables in linear classification and (ordinal) regression problems. Building on the recently proposed feature relevance method, FRI provides a basis for further general experimentation and, in particular, can facilitate the search for alternative biomarkers. It can be used in an interactive context, through model manipulation and visualization methods, or in a batch process as a filter method. Comment: Addition of IEEE copyright notice. Accepted for CIBCB 2019 (https://cibcb2019.icas.xyz/

    Time-series representation framework based on multi-instance similarity measures

    Time series analysis plays an essential role in today's society because information is so easily accessible. It appears in most applications that involve sensors, but in recent years, thanks to technological advances, it has been directed toward complex signals that lack periodicity or even exhibit non-stationary dynamics, such as brain activity signals or magnetic resonance and satellite images. The main challenges in time series analysis concern how the series are represented, for which methodologies based on similarity measures have been proposed. However, these approaches are oriented toward measuring local, point-to-point patterns in the signals using shape-based metrics. Selecting relevant information from the representations is also highly important, in order to eliminate noise and train classifiers with discriminant information for the analysis tasks; this selection, however, is usually made at the feature level, leaving aside the global signal information. Likewise, applications have recently emerged that require analyzing time series from different or multimodal information sources, for which existing methods achieve acceptable performance but lack interpretability. In this regard, we propose a framework based on similarity representations and multiple-instance learning that allows relevant information to be selected for classification tasks, in order to improve the performance and interpretability of the models. Degree level: Maestría (Master's).

    Feature Relevance Bounds for Ordinal Regression

    Pfannschmidt L, Jakob J, Biehl M, Tino P, Hammer B. Feature Relevance Bounds for Ordinal Regression. In: Verleysen M, ed. Proceedings of the 27th European Symposium on Artificial Neural Networks (ESANN 2019). Louvain-la-Neuve: i6doc; 2019. The increasing occurrence of ordinal data, mainly sociodemographic, led to a renewed research interest in ordinal regression, i.e. the prediction of ordered classes. Besides model accuracy, the interpretation of these models itself is of high relevance, and existing approaches therefore enforce e.g. model sparsity. For high dimensional or highly correlated data, however, this might be misleading due to strong variable dependencies. In this contribution, we aim for an identification of feature relevance bounds which - besides identifying all relevant features - explicitly differentiates between strongly and weakly relevant features.