2,502 research outputs found

    Clicks and Cliques. Exploring the Soul of the Community

    In this paper we analyze 26 communities across the United States with the objective of understanding what attaches people to their community and how this attachment differs among communities. How different are attached people from unattached people? What attaches people to their community? How different are the communities? What are the key drivers behind emotional attachment? To address these questions, graphical, supervised, and unsupervised learning tools were used, and information from the Census Bureau and the Knight Foundation was combined. Using the same pre-processed variables as Knight (2010) would most likely drive the results towards the same conclusions as the Knight Foundation, so this paper does not use those variables.

    Bayesian analysis of high-dimensional count data

    This thesis describes my research work in past years in the Statistics Department of Iowa State University. Several key statistical features are common to the whole thesis. First, all the statistical methods are developed from a Bayesian perspective on statistical inference. Second, both main parts correspond to high-dimensional problems: in the first part because a large amount of information is available for a few individuals, and in the second because the model space is very large, which brings computational intractability issues. Finally, the response variable in all data used here is a positive count: in the first part it is associated with gene expression, while in the second it represents a number of automobile crashes.

    Student performance predictive models using LMS data in Primary Schools

    Plan Ceibal is a public policy implemented in Uruguay as part of the global initiative One Laptop per Child (OLPC, 2005). Its basic feature is providing every primary school student and teacher with a laptop or tablet and internet access. Different data sets were combined: student and teacher activity registered in the Learning Management System (LMS) and students' performance in national standardized tests. The data were used to compute student engagement indexes combining motivation, creativity, velocity and performance. Statistical models were used to determine the key drivers of LMS use, which is relevant for defining evidence-based educational policies. Models for LMS use are fitted at several regional levels. Additionally, statistical learning methods were fitted to predict students' performance in the national standardized tests, using as predictors different usage indexes constructed from the LMS platform. A major challenge was how to handle the sub-grouped data structure in machine learning algorithms, which are usually developed for independent observations. Initial results suggest that school district is the main driver of technology usage in the classroom.

    Priorcovmatrix: exploring, visualizing and estimating covariance matrices

    Covariance matrix estimation arises in multivariate problems such as the multivariate normal distribution or generalized mixed regression models in which the random effects are modeled jointly. Bayesian inference on a covariance matrix requires specifying a probability distribution for that matrix. Distributions whose domain is the set of covariance matrices have received little attention in terms of characterizing their properties. This work presents the priorcovmatrix package, which allows fitting, simulating, and visualizing several multivariate distributions used to model covariance matrices. The inverse Wishart, the scaled inverse Wishart, and other distributions are included in the library.

    Fully Bayesian analysis of allele-specific RNA-seq data

    Diploid organisms have two copies of each gene, called alleles, that can be separately transcribed. The RNA abundance associated with any particular allele is known as allele-specific expression (ASE). When two alleles have polymorphisms in transcribed regions, ASE can be studied using RNA-seq read count data. ASE has characteristics different from regular RNA-seq expression: ASE cannot be assessed for every gene, measures of ASE can be biased towards one of the alleles (the reference allele), and ASE provides two measures of expression for a single gene in each biological sample, which leads to additional complications for single-gene models. We present statistical methods for modeling ASE and detecting genes with differential allelic expression. We propose a hierarchical, overdispersed count regression model to deal with ASE counts. The model accommodates gene-specific overdispersion, has an internal measure of the reference allele bias, and uses random effects to model the gene-specific regression parameters. Fully Bayesian inference is obtained using the fbseq package, which implements a parallel strategy to make computation times reasonable. Simulation and real data analysis suggest the proposed model is a practical and powerful tool for the study of differential ASE.

    SpICE: An interpretable method for spatial data

    Statistical learning methods are widely used to tackle complex problems because of their flexibility, good predictive performance, and ability to capture complex relationships among variables. Additionally, recently developed automatic workflows provide a standardized approach to implementing statistical learning methods across various applications. However, these tools highlight one of the main drawbacks of statistical learning: the lack of interpretability of its results. In the past few years, a substantial amount of research has focused on methods for interpreting black-box models. Interpretable statistical learning methods are needed to gain a deeper understanding of the model. In problems where spatial information is relevant, combining interpretable methods with spatial data can improve understanding of the problem and interpretation of the results. This paper focuses on the individual conditional expectation plot (ICE-plot), a model-agnostic method for interpreting statistical learning models, and combines it with spatial information. An extension of the ICE-plot is proposed in which spatial information is used as a restriction to define Spatial ICE curves (SpICE). Spatial ICE curves are estimated using real data in the context of an economic problem concerning property valuation in Montevideo, Uruguay. Understanding the key factors that influence property valuation is essential for decision-making, and spatial data plays a relevant role in this regard.

    Introduction to Bayesian statistics with applications to small area estimation using the STAN software

    This mini-course presents a brief introduction to Bayesian statistics using the STAN program. It takes an applied approach, covering the basic features of Bayesian modeling and its implementation in STAN through concrete applications. Small area estimation problems are used as working examples.

    Use of Plan Ceibal educational platforms

    This work presents the development of indicators to evaluate the use of the educational platforms used by Plan Ceibal, focusing specifically on the CREA platform. It also analyzes how usage evolved before and during the pandemic years and studies the main factors that explain its variability. The results indicate that CREA use was 5 times more intense in 2021 than in pre-pandemic years. The main factors explaining this variability in platform use are the department, the socioeconomic context, and the teacher's use of the platform. In particular, the effect of teacher use by socioeconomic context differs across the country's departments.