25 research outputs found
MPAgenomics : An R package for multi-patients analysis of genomic markers
MPAgenomics, standing for multi-patients analysis (MPA) of genomic markers,
is an R-package devoted to: (i) efficient segmentation, and (ii) genomic marker
selection from multi-patient copy number and SNP data profiles. It provides
wrappers for commonly used packages to facilitate their repeated (sometimes
difficult) use, offering an easy-to-use pipeline for beginners in R. The
segmentation of successive multiple profiles (finding losses and gains) is
based on a new automatic choice of influential parameters, since the default ones
were misleading in the original packages. Considering multiple profiles at the
same time, MPAgenomics wraps efficient penalized regression methods to select
relevant markers associated with a given response.
Multi-patient analysis of genomic data
National audience. MPAgenomics, standing for multi-patients analysis (MPA) of genomic markers, is an R-package devoted to: (i) efficient segmentation, and (ii) genomic marker selection from multi-patient copy number and SNP data profiles. It provides wrappers for commonly used packages to facilitate their repeated (sometimes difficult) use, offering an easy-to-use pipeline for beginners in R. The segmentation of successive multiple profiles (finding losses and gains) is based on a new automatic choice of influential parameters, since the default ones were misleading in the original packages. Considering multiple profiles at the same time, MPAgenomics wraps efficient penalized regression methods to select relevant markers associated with a given response.
Categorical functional data analysis. The cfda R package
International audience
Selection of groups of correlated variables by hierarchical clustering and group-lasso
National audience. In a variable selection context, using penalized regressions in the presence of high correlations can be problematic: only a subset of the correlated variables is selected. Aggregating related variables beforehand can help both selection and interpretation. However, clustering methods require the calibration of additional parameters. We will introduce a new method combining hierarchical clustering and group selection.
Categorical functional data analysis. The R package cfda
International audience. We present how to handle categorical functional data, represented by the trajectories of a continuous-time jump process with a finite set of states. As an extension of multiple correspondence analysis to an infinite set of variables, the optimal encodings of the states over time are approximated on an arbitrary finite basis of functions. This allows dimension reduction, optimal representation and visualisation of the data in lower-dimensional spaces. We have implemented this methodology in the R package cfda, available on CRAN; in this communication we present how it can be applied to real data in a clustering framework.
cfda: an R Package for Categorical Functional Data Analysis
Categorical functional data represented by paths of a stochastic jump process with continuous time and finite set of states are considered. As an extension of the multiple correspondence analysis to an infinite set of variables, optimal encodings of states over time are approximated using an arbitrary finite basis of functions. That allows dimension reduction, optimal representation and visualisation of data in lower dimensional spaces. The methodology is implemented in the cfda R package and is illustrated using a real data set in the clustering framework
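As a minimal sketch of the kind of data object this abstract describes, one trajectory of a continuous-time jump process with finitely many states can be stored as a sorted list of (jump time, state) pairs. The Python helpers below are hypothetical illustrations and are not part of the cfda package or its API; they assume a right-continuous path and only show how to read the state at a given time and the total time spent in each state.

```python
from bisect import bisect_right
from collections import defaultdict


def state_at(path, t):
    """Return the state occupied at time t for a right-continuous jump path.

    `path` is a list of (jump_time, state) pairs sorted by time; the first
    jump_time is the start of the observation window (an assumed convention).
    """
    times = [jump_time for jump_time, _ in path]
    idx = bisect_right(times, t) - 1
    if idx < 0:
        raise ValueError("t is before the start of the path")
    return path[idx][1]


def time_in_states(path, t_end):
    """Total time spent in each state between the first jump time and t_end."""
    totals = defaultdict(float)
    # Pair every jump with the next one; the last state lasts until t_end.
    for (start, state), (stop, _) in zip(path, path[1:] + [(t_end, None)]):
        totals[state] += max(0.0, min(stop, t_end) - start)
    return dict(totals)


# Hypothetical trajectory: state A on [0, 1.5), B on [1.5, 4), then A again.
path = [(0.0, "A"), (1.5, "B"), (4.0, "A")]
```

With this toy path, `state_at(path, 1.0)` reads the state before the first jump, and `time_in_states(path, 6.0)` summarises the trajectory over the window [0, 6].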
Selection of groups of correlated variables by hierarchical clustering and group-lasso
National audience. In a variable selection context, using penalized regressions in the presence of high correlations can be problematic: only a subset of the correlated variables is selected. Aggregating related variables beforehand can help both selection and interpretation. However, variable clustering methods require the calibration of additional parameters. We will present a new method combining hierarchical clustering and selection of groups of variables.
Variable selection with Multi-Layer Group Lasso
International audience. The MLGL (Multi-Layer Group-Lasso) R package implements a new procedure of variable selection in the context of redundancy between explanatory variables, which holds true with high-dimensional data. A sparsity assumption is made, that is, only a few variables are assumed to be relevant for predicting the response variable. In this context, the performance of classical Lasso-based approaches strongly deteriorates as the redundancy strengthens. The proposed approach combines variable aggregation and selection in order to improve interpretability and performance. First, a hierarchical clustering procedure provides at each level a partition of the variables into groups. Then, the set of groups of variables from the different levels of the hierarchy is given as input to group-Lasso, with weights adapted to the structure of the hierarchy. At this step, group-Lasso outputs sets of candidate groups of variables for each value of the regularization parameter. The versatility offered by MLGL to choose groups at different levels of the hierarchy a priori induces a high computational complexity. MLGL, however, exploits the structure of the hierarchy and the weights used in group-Lasso to greatly reduce the final time cost. The final choice of the regularization parameter, and therefore the final choice of groups, is made by a multiple hierarchical testing procedure.
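The first step described above, collecting the groups of variables that appear at every level of a hierarchy, can be sketched in a few lines of Python. This is an illustration only and not the MLGL implementation: the function name `hierarchy_groups`, the use of single linkage, and the toy distance matrix are all assumptions, and the subsequent weighted group-Lasso and testing steps are not shown.

```python
def hierarchy_groups(dist):
    """Single-linkage agglomerative clustering on a symmetric distance matrix.

    Returns the distinct groups (sorted tuples of variable indices) that
    appear in the partitions at all levels of the hierarchy. Single linkage
    is an assumption made for this sketch.
    """
    n = len(dist)
    clusters = [(i,) for i in range(n)]
    groups = list(clusters)  # lowest level: every variable is its own group
    while len(clusters) > 1:
        # Find the closest pair of clusters under single linkage.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = tuple(sorted(clusters[a] + clusters[b]))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
        groups.append(merged)  # each level contributes exactly one new group
    return groups


# Hypothetical distances (e.g. 1 - |correlation|) between four variables:
# variables 0 and 1 are strongly related, as are variables 2 and 3.
dist = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.2],
    [0.9, 0.9, 0.2, 0.0],
]
groups = hierarchy_groups(dist)
```

For p variables the hierarchy yields 2p - 1 distinct groups (p singletons plus one new group per merge), and these overlap across levels, which is why the abstract notes that choosing among them is a priori computationally demanding.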
From fictions, technologies, controls and terrors: Psycho-Pass between power, production, normalization and resistance
This work presents a detailed analysis of the Japanese anime Psycho-Pass which, set in a future marked by a social configuration of great technological advance and control, offers a very interesting narrative framework for sociological analysis.
In Psycho-Pass, technological devices, practices, discourses and forms of knowledge combine to shape a social order in which each individual has a "crime coefficient" inscribed in their body, indicating their status within the range of the correct, the acceptable and the impossible/eliminable. Likewise, this coefficient, the security services and even the simplest arrangements of daily life are supervised and guided by the Sibyl system, a computer system made up (apparently) of algorithms and behavior patterns that classifies and orders social life in this futuristic Japan.
Following these parameters, the narrative moves through several settings where main, secondary and antagonistic characters must face various practical, moral and power situations, acting in conformity with, in doubt about, or in total revolt against the prevailing order.
In short, Psycho-Pass opens a window onto a potential social configuration, not so far removed from the current one, that invites analysis and reflection.
Facultad de Periodismo y Comunicación Social
MLGL: An R package implementing correlated variable selection by hierarchical clustering and group-Lasso
International audience. The MLGL R-package, standing for Multi-Layer Group-Lasso, implements a new procedure of variable selection in the context of redundancy between explanatory variables, which holds true with high-dimensional data. A sparsity assumption is made, that is, only a few variables are assumed to be relevant for predicting the response variable. In this context, the performance of classical Lasso-based approaches strongly deteriorates as the redundancy strengthens. The proposed approach combines variable aggregation and selection in order to improve interpretability and performance. First, a hierarchical clustering procedure provides at each level a partition of the variables into groups. Then, the set of groups of variables from the different levels of the hierarchy is given as input to group-Lasso, with weights adapted to the structure of the hierarchy. At this step, group-Lasso outputs sets of candidate groups of variables for each value of the regularization parameter. The versatility offered by MLGL to choose groups at different levels of the hierarchy a priori induces a high computational complexity. MLGL, however, exploits the structure of the hierarchy and the weights used in group-Lasso to greatly reduce the final time cost. The final choice of the regularization parameter, and therefore the final choice of groups, is made by a multiple hierarchical testing procedure.