6 research outputs found

    Variable selection in classification for multivariate functional data

    When classification methods are applied to high-dimensional data, selecting a subset of the predictors may improve the predictive ability of the estimated model, in addition to reducing the model complexity. In Functional Data Analysis (FDA), i.e., when data are functions, selecting a subset of predictors corresponds to selecting a subset of individual time instants in the time interval in which the functional data are measured. In this paper, we address the problem of selecting the most informative time instants in multivariate functional data, a case much less studied than its univariate counterpart. Our proposal makes it simple to use higher-order information about the data, e.g. monotonicity or convexity, by means of the derivatives of the functional data. The problem is addressed with tools of Global Optimization in continuous variables: the time instants are selected to maximize the correlation between the class label and the Support Vector Machine score used for classification. The effectiveness of the proposal is shown on univariate and multivariate datasets.
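As an illustration of the selection idea in the abstract above, the following minimal Python sketch scores candidate time instants by the absolute correlation between the class label and a classification score. It is a simplified stand-in, not the paper's method: the raw function value at a single instant replaces the Support Vector Machine score, an exhaustive grid search replaces continuous Global Optimization, and the toy curves and the names `pearson`, `best_time_instant`, and `make_curve` are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def best_time_instant(curves, labels, grid):
    """Return the grid instant whose function values correlate best with the labels.

    Assumption: the score of a curve is simply its value at the instant,
    standing in for the SVM score mentioned in the abstract.
    """
    best_t, best_corr = None, -1.0
    for t in grid:
        scores = [f(t) for f in curves]
        corr = abs(pearson(scores, labels))
        if corr > best_corr:
            best_t, best_corr = t, corr
    return best_t, best_corr

def make_curve(label, shift):
    """Hypothetical toy curve: the classes differ only in a bump near t = 0.5."""
    return lambda t: shift + label * math.exp(-((t - 0.5) ** 2) / 0.01)

curves = [make_curve(+1, 0.1), make_curve(+1, -0.2),
          make_curve(-1, 0.0), make_curve(-1, 0.3)]
labels = [1, 1, -1, -1]
grid = [i / 20 for i in range(21)]  # candidate instants in [0, 1]

t_star, corr = best_time_instant(curves, labels, grid)
print(t_star, round(corr, 3))
```

On this toy data the search settles on the instant where the class-dependent bump peaks. A fuller version would select several instants jointly and evaluate derivatives of the curves as additional inputs, as the abstract describes.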

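    Application of machine learning techniques for the characterization and classification of patients with obsessive-compulsive disorder (original title: Aplicación de técnicas de aprendizaje máquina para la caracterización y clasificación de pacientes con trastorno obsesivo compulsivo)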

    This bachelor's thesis builds on the increasingly common use of machine learning methods to classify and characterize psychiatric disorders. Specifically, the designed system aims to support the diagnosis of OCD (obsessive-compulsive disorder) through the analysis of magnetic resonance imaging (MRI) scans. Its goal is an algorithm able to diagnose patients with OCD and, above all, to characterize the disease by automatically detecting the neuroanatomical regions related to the disorder. To that end, it uses a modular architecture built on two fundamental premises. 1. Analysis by functional and/or neuroanatomical areas. Each MRI scan is divided into approximately one hundred subsets of voxels, each associated with a functional area or neuroanatomical region of the brain. The goal is to apply a classifier that facilitates the selection of the voxel sets relevant to detecting the disease. 2. Characterization and fusion of functional areas. The system applies feature selection methods to the outputs of these classifiers in order to select automatically the areas relevant to diagnosing the pathology. The last step studies how the areas relate to one another, using both linear and nonlinear classifiers. Once the algorithm has been developed and applied, the results are used both to compare the patient classification with earlier results obtained by traditional methods [1], [2] and to analyze the pattern of neuroanatomical areas responsible for the disorder. Ingeniería de Sistemas Audiovisuale
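The two-stage architecture described above (one classifier per brain region, then selection over the classifier outputs) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the thesis's system: a nearest-centroid score stands in for the unspecified per-region classifier, training accuracy with a threshold stands in for the unspecified feature selection method, and the region names, voxel values, and the `min_accuracy` parameter are all hypothetical.

```python
def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def region_scores(region_data, labels):
    """Stage 1: one score per subject for a single region.

    A positive score means the subject's voxels lie closer to the
    patient centroid (label 1) than to the control centroid (label 0).
    """
    pos = centroid([v for v, y in zip(region_data, labels) if y == 1])
    neg = centroid([v for v, y in zip(region_data, labels) if y == 0])
    return [sq_dist(v, neg) - sq_dist(v, pos) for v in region_data]

def select_regions(regions, labels, min_accuracy=0.9):
    """Stage 2: keep regions whose stage-1 output separates the classes.

    Accuracy is measured on the training subjects only, which is
    optimistic; a real study would use held-out data.
    """
    kept = []
    for name, data in regions.items():
        preds = [1 if s > 0 else 0 for s in region_scores(data, labels)]
        acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
        if acc >= min_accuracy:
            kept.append((name, acc))
    return sorted(kept, key=lambda item: -item[1])

# Hypothetical data: four subjects, two voxels per region.
labels = [1, 1, 0, 0]
regions = {
    "region_A": [[1.0, 1.1], [0.9, 1.2], [0.1, 0.0], [0.0, 0.2]],  # informative
    "region_B": [[0.9, 0.2], [0.1, 0.2], [0.8, 0.2], [0.3, 0.2]],  # uninformative
}
selected = select_regions(regions, labels)
print(selected)
```

Keeping the per-region scores as the inputs to the second stage is what makes the result interpretable: each retained feature corresponds to one anatomical region rather than to an individual voxel.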

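    Feature selection by integrating domain knowledge through decision-making methods in predictive modeling of time series with supervised machine learning (original title: Izbor atributa integracijom znanja o domenu primenom metoda odlučivanja kod prediktivnog modelovanja vremenskih serija nadgledanim mašinskim učenjem)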

    The aim of the research presented in this doctoral dissertation is to develop a feature selection methodology that integrates domain-specific knowledge by applying mathematical methods of decision-making, in order to improve the feature selection process and the precision of supervised machine learning methods for predictive modeling of time series. To integrate domain-specific knowledge, a multi-criteria decision-making method is used, namely the analytic hierarchy process, which has proven successful in numerous studies to date. This approach was selected because it allows a set of factors to be selected based on their relevance, even when the criteria conflict with one another. For predicting the movement of time series, the possibility of integrating feature relevance into support vector machines to improve their prediction accuracy was studied. The proposed methodology was applied as a feature selection method for the predictive modeling of the movement of financial time series. Unlike existing approaches, where feature selection is based on a quantitative analysis of the input values, the proposed methodology carries out a qualitative evaluation of the attributes in relation to the prediction domain and provides a means of integrating a priori knowledge of the prediction domain.
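To make the analytic hierarchy process step concrete, here is a minimal Python sketch that turns a pairwise comparison matrix into feature relevance weights. It uses the row geometric mean approximation of the priority vector, which is one common way to compute the weights and is not necessarily the variant used in the dissertation. The feature names and the comparison values are hypothetical.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric mean method.

    pairwise[i][j] states how much more relevant item i is judged to be
    than item j, with pairwise[j][i] == 1 / pairwise[i][j].
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

# Hypothetical expert judgments for three candidate features.
features = ["momentum", "volume", "volatility"]
pairwise = [
    [1.0,       3.0,       5.0],
    [1.0 / 3.0, 1.0,       2.0],
    [1.0 / 5.0, 1.0 / 2.0, 1.0],
]

weights = ahp_weights(pairwise)
ranking = sorted(zip(features, weights), key=lambda item: -item[1])
for name, weight in ranking:
    print(f"{name}: {weight:.3f}")
```

The resulting weights sum to one, so they can be used either to rank and truncate the feature set or, as the abstract suggests, to weight features inside a support vector machine.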

    Information-theoretic feature selection for functional data classification

    The classification of functional or high-dimensional data requires selecting a reduced subset of features from the initial set, both to help fight the curse of dimensionality and to help interpret the problem and the model. The mutual information criterion may be used in that context, but it is difficult to estimate from a finite set of samples. Efficient estimators are not designed specifically for classification, and thus suffer from further drawbacks and difficulties. This paper presents an estimator of mutual information that is specifically designed for classification tasks, including multi-class ones. It is combined with a recently published stopping criterion in a traditional forward feature selection procedure. Experiments on both traditional benchmarks and an industrial functional classification problem show the added value of this estimator. (C) 2009 Elsevier B.V. All rights reserved.
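The forward selection procedure mentioned above can be sketched in a few lines of Python. This is only an illustration under explicit assumptions: a plug-in (histogram) estimator on discrete features replaces the paper's classification-specific estimator, and a fixed minimum gain (`min_gain`, a hypothetical parameter) replaces the published stopping criterion. The toy dataset is made up.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def forward_select(columns, labels, min_gain=1e-3):
    """Greedy forward selection maximizing joint MI with the class label.

    columns maps a feature name to its list of discrete values.
    Stops when the best candidate adds less than min_gain bits.
    """
    selected, current = [], 0.0
    remaining = list(columns)

    def joint_mi(name):
        # Treat the selected features plus the candidate as one joint variable.
        joint = list(zip(*(columns[f] for f in selected + [name])))
        return mutual_information(joint, labels)

    while remaining:
        best = max(remaining, key=joint_mi)
        gain = joint_mi(best) - current
        if gain < min_gain:
            break
        selected.append(best)
        remaining.remove(best)
        current += gain
    return selected, current

# Hypothetical four-class problem: the label is 2 * a + b, and c is noise.
columns = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],
    "b": [0, 0, 1, 1, 0, 0, 1, 1],
    "c": [0, 1, 0, 1, 0, 1, 0, 1],
}
labels = [0, 0, 1, 1, 2, 2, 3, 3]

chosen, total_mi = forward_select(columns, labels)
print(chosen, total_mi)
```

The plug-in estimator becomes unreliable as the joint variable grows, because most joint values are observed only once. That estimation difficulty is the motivation the abstract gives for a dedicated estimator and stopping rule.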