
Clicks and Cliques. Exploring the Soul of the Community

Author: Alvarez-Castro Ignacio
da Silva Natalia
Publication date: 09/10/2017

In this paper we analyze 26 communities across the United States with the objective of understanding what attaches people to their community and how this attachment differs among communities. How different are attached people from unattached people? What attaches people to their community? How different are the communities? What are the key drivers behind emotional attachment? To address these questions, graphical, supervised and unsupervised learning tools were used, and information from the Census Bureau and the Knight Foundation was combined. Using the same pre-processed variables as Knight (2010) would most likely drive the results towards the same conclusions as the Knight Foundation, so this paper does not use those variables.

arXiv.org e-Print Archive

Bayesian analysis of high-dimensional count data

Author: Alvarez-Castro Ignacio
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2017

This thesis describes my research work over the past years in the Statistics Department of Iowa State University. Several key statistical features are common to the whole thesis. First, all the statistical methods are developed from a Bayesian perspective on statistical inference. Second, both main parts correspond to high-dimensional problems: in the first part because a large amount of information is available for a few individuals, and in the second part because the model space is very large, which brings computational intractability issues. Finally, the response variable in all data used here is a positive count; in the first part it is associated with gene expression, while in the second part it represents a number of automobile crashes.

Digital Repository @ Iowa State University (ISU)

Student performance predictive models using LMS data in Primary Schools

Author: Alvarez-Castro Ignacio
Publication venue: International Conference on Data Science 2023
Publication date: 09/11/2023

Plan Ceibal is a public policy implemented in Uruguay as part of the global initiative One Laptop per Child (OLPC, 2005). Its basic feature is providing every primary school student and teacher with a laptop or tablet and internet access. Different data sets were combined: student and teacher activities registered in the Learning Management System (LMS) and students' performance in national standardized tests. The data were used to compute student engagement indexes combining motivation, creativity, velocity and performance. Statistical models were used to determine key drivers of LMS use, which is relevant for defining evidence-based educational policies. Models for LMS use are fitted for several regional levels. Additionally, statistical learning methods were fitted to predict students' performance in national standardized tests, using as predictors different usage indexes constructed from the LMS platform. A major challenge was how to handle the sub-grouping data structure in machine learning algorithms, which are usually developed for independent observations. Initial results suggest that school district is the main driver of technology usage in the classroom.

ANI

REDI - Digital Repository of the National Agency of Research and Innovation

Student performance predictive models using LMS data in Primary Schools

Author: Alvarez-Castro Ignacio
da Silva Natalia
Publication venue: Joint Statistical Meetings 2023
Publication date: 06/08/2023

Plan Ceibal is a public policy implemented in Uruguay as part of the global initiative One Laptop per Child (OLPC, 2005). Its basic feature is providing every primary school student and teacher with a laptop or tablet and internet access. Different data sets were combined: student and teacher activities registered in the Learning Management System (LMS) and students' performance in national standardized tests. The data were used to compute student engagement indexes combining motivation, creativity, velocity and performance. Statistical models were used to determine key drivers of LMS use, which is relevant for defining evidence-based educational policies. Models for LMS use are fitted for several regional levels. Additionally, statistical learning methods were fitted to predict students' performance in national standardized tests, using as predictors different usage indexes constructed from the LMS platform. A major challenge was how to handle the sub-grouping data structure in machine learning algorithms, which are usually developed for independent observations. Initial results suggest that school district is the main driver of technology usage in the classroom.

ANI

REDI - Digital Repository of the National Agency of Research and Innovation

Priorcovmatrix: explorar, visualizar y estimar matrices de covarianzas [Priorcovmatrix: exploring, visualizing and estimating covariance matrices]

Author: Alvarez Castro Ignacio
Publication date: 01/02/2019

Covariance matrix estimation arises in multivariate problems such as the multivariate normal distribution or generalized mixed regression models in which the random effects are modeled jointly. Bayesian inference on a covariance matrix requires specifying a probability distribution for that matrix. Distributions whose domain is the set of covariance matrices have not received much attention in terms of characterizing their properties. This work presents the priorcovmatrix package, which allows fitting, simulating and visualizing some of the multivariate distributions used to model covariance matrices. The inverse Wishart distribution, the scaled inverse Wishart distribution, and other distributions are part of the library.

Sociedad Argentina de Informática e Investigación Operativ

Priorcovmatrix: explorar, visualizar y estimar matrices de covarianzas [Priorcovmatrix: exploring, visualizing and estimating covariance matrices]

Author: Alvarez Castro Ignacio
Publication date: 01/02/2019

Servicio de Difusión de la Creación Intelectual

Fully Bayesian analysis of allele-specific RNA-seq data

Author: Alvarez-Castro Ignacio
Niemi Jarad
Publication venue: Iowa State University Digital Repository
Publication date: 23/08/2019

Diploid organisms have two copies of each gene, called alleles, that can be separately transcribed. The RNA abundance associated with any particular allele is known as allele-specific expression (ASE). When two alleles have polymorphisms in transcribed regions, ASE can be studied using RNA-seq read count data. ASE has characteristics that differ from regular RNA-seq expression: ASE cannot be assessed for every gene, measures of ASE can be biased towards one of the alleles (the reference allele), and ASE provides two measures of expression for a single gene for each biological sample, which leads to additional complications for single-gene models. We present statistical methods for modeling ASE and detecting genes with differential allelic expression. We propose a hierarchical, overdispersed count regression model to deal with ASE counts. The model accommodates gene-specific overdispersion, has an internal measure of the reference allele bias, and uses random effects to model the gene-specific regression parameters. Fully Bayesian inference is obtained using the fbseq package, which implements a parallel strategy to make computation times reasonable. Simulation and real data analysis suggest the proposed model is a practical and powerful tool for the study of differential ASE.

Digital Repository @ Iowa State University (ISU)

SpICE: An interpretable method for spatial data

Author: Alvarez-Castro Ignacio
da Silva Natalia
Moreno Leonardo
Sosa Andrés
Publication date: 11/11/2023

Statistical learning methods are widely used to tackle complex problems because of their flexibility, good predictive performance and ability to capture complex relationships among variables. Additionally, recently developed automatic workflows have provided a standardized approach to implementing statistical learning methods across various applications. However, these tools highlight a main drawback of statistical learning: the lack of interpretability of its results. In the past few years a substantial amount of research has focused on methods for interpreting black-box models. Interpretable statistical learning methods are relevant for gaining a deeper understanding of the model. In problems where spatial information is relevant, combining interpretable methods with spatial data can help to better understand the problem and interpret the results. This paper focuses on individual conditional expectation (ICE) plots, a model-agnostic method for interpreting statistical learning models, and combines them with spatial information. An extension of the ICE plot is proposed in which spatial information is used as a restriction to define Spatial ICE curves (SpICE). Spatial ICE curves are estimated using real data in the context of an economic problem concerning property valuation in Montevideo, Uruguay. Understanding the key factors that influence property valuation is essential for decision-making, and spatial data plays a relevant role in this regard.

arXiv.org e-Print Archive

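Introducción a la estadística Bayesiana con aplicaciones de estimación en áreas pequeñas usando software STAN [Introduction to Bayesian statistics with applications to small area estimation using STAN software]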

Author: Alvarez-Castro Ignacio
Goyeneche Juan José
Publication venue: Sociedad colombiana de Estadística
Publication date: 09/10/2023

This mini-course presents a brief introduction to Bayesian statistics using the STAN program. It takes an applied approach, covering the basic features of Bayesian modeling and its implementation in STAN through concrete applications. Small area estimation problems are used as working examples.

ANII, Fundación Ceiba

REDI - Digital Repository of the National Agency of Research and Innovation

Uso de plataformas educativas del Plan Ceibal [Use of Plan Ceibal educational platforms]

Author: Alvarez Castro Ignacio
da Silva Natalia
Publication date: 02/08/2024

This work presents the development of indicators to evaluate the use of the educational platforms used by Plan Ceibal, focusing specifically on the CREA platform. It also analyzes how usage evolved before and during the pandemic years and studies the main factors that explain its variability. The results indicate that use of CREA was 5 times more intense in 2021 than in pre-pandemic years. The main factors explaining this variability in platform use are the department, the socioeconomic context, and the teacher's use of the platform. In particular, the impact of teacher use by socioeconomic context differs across the country's departments.

Fundación Ceibal, ANI

REDI - Digital Repository of the National Agency of Research and Innovation