229,829 research outputs found

    An Investigation of the Big Five and Narrow Personality Traits in Relation to Academic Performance

    The present study investigated the relationship of the Big Five personality traits (agreeableness, conscientiousness, emotional stability, extroversion, and openness), as well as of narrower personality traits, with academic performance. Whether narrow traits add incremental validity to the Big Five in predicting academic performance was investigated using archival data collected from 552 university students. Results from a correlation analysis indicated that openness, conscientiousness, agreeableness, and emotional stability were all significantly related to college grade-point average (GPA), while extroversion was not. Because of a significant gender difference in college GPA, gender interaction terms with each of the Big Five factors were included in the regression analyses. The regression analyses indicated that GPA was related to openness, emotional stability, and agreeableness. Bivariate correlation analyses showed that, of the five narrow traits, aggression, self-directed learning, optimism, and work drive were related to GPA. Regression analysis indicated that aggression, self-directed learning, tough-mindedness, and work drive accounted for partial effects in GPA. Significant interactions were noted between gender and optimism and between gender and self-directed learning. Finally, a sequential multiple regression revealed that the following traits explained variance in college GPA, with the narrow traits adding incremental validity to the Big Five: conscientiousness from the Big Five, and the narrow traits of self-directed learning, aggression, tough-mindedness, and work drive. These findings were interpreted as supporting the usefulness of both broad and narrow personality traits for predicting real-world outcomes. Furthermore, these findings illuminate the relationship between personality and academic performance.

    Distributed Supervised Statistical Learning

    We live in the era of big data. Nowadays, many companies face data of massive size that, in most cases, cannot be stored and processed on a single computer. Often such data has to be distributed over multiple computers, which makes storage, pre-processing, and data analysis possible in practice. Distributed learning has therefore gained popularity as a method for managing enormous datasets. In this thesis, we focus on distributed supervised statistical learning, where sparse linear regression analysis is performed in a distributed framework. These methods are frequently applied to large-scale dataset analysis in a variety of disciplines, including engineering, economics, and finance. One key question in distributed learning is how to efficiently aggregate multiple estimators that are obtained from data subsets stored on multiple computers. We review recent studies on distributed statistical inference. There have been many efforts to propose efficient ways of aggregating local estimates, and the most popular methods are discussed in this thesis. Recently, the important question of how many machines to deploy has been addressed for several estimation methods, and notable answers to this question are reviewed. We consider a specific class of Liu-type shrinkage estimation methods for distributed statistical inference. We also conduct a Monte Carlo simulation study to assess the performance of the Liu-type shrinkage estimation methods in a distributed framework.
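
The aggregation question raised in this abstract can be illustrated with a minimal sketch. It assumes the simplest possible setting, which is not the thesis's method: a one-feature ordinary least-squares fit on each machine and a plain average of the local estimates, in place of the sparse and Liu-type shrinkage estimators the thesis studies. The function names `ols_fit` and `averaged_fit` and the simulated data are hypothetical.

```python
import random


def ols_fit(xs, ys):
    """Return (intercept, slope) of a one-feature least-squares fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def averaged_fit(shards):
    """Fit every shard locally, then average the local estimates."""
    local = [ols_fit(xs, ys) for xs, ys in shards]
    k = len(local)
    intercept = sum(b0 for b0, _ in local) / k
    slope = sum(b1 for _, b1 in local) / k
    return intercept, slope


# Simulated data (an assumption for illustration): y = 2 + 3x + noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(3000)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 0.1) for x in xs]

# Pretend the data live on three machines; each sees every third row.
shards = [(xs[i::3], ys[i::3]) for i in range(3)]
intercept, slope = averaged_fit(shards)
print(intercept, slope)
```

Only the local estimates, not the raw rows, cross machine boundaries here, which is the appeal of one-shot averaging. Whether a plain average is adequate depends on the estimator and on the number of machines, which is the question the abstract says the thesis reviews.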

    Large-scale Heteroscedastic Regression via Gaussian Process

    Heteroscedastic regression, which accounts for varying noise among observations, has many applications in fields such as machine learning and statistics. Here we focus on heteroscedastic Gaussian process (HGP) regression, which integrates the latent function and the noise function in a unified non-parametric Bayesian framework. Though showing remarkable performance, HGP suffers from cubic time complexity, which strictly limits its application to big data. To improve scalability, we first develop a variational sparse inference algorithm, named VSHGP, to handle large-scale datasets. Furthermore, two variants are developed to improve the scalability and capability of VSHGP. The first is stochastic VSHGP (SVSHGP), which derives a factorized evidence lower bound, thus enhancing efficient stochastic variational inference. The second is distributed VSHGP (DVSHGP), which (i) follows the Bayesian committee machine formalism to distribute computations over multiple local VSHGP experts with many inducing points; and (ii) adopts hybrid parameters for experts to guard against over-fitting and capture local variety. The superiority of DVSHGP and SVSHGP over existing scalable heteroscedastic/homoscedastic GPs is then extensively verified on various datasets. Comment: 14 pages, 15 figures

    Predictive geospatial analytics using principal component regression

    Nowadays, with geospatial (spatial) data growing exponentially around the globe, geospatial data analytics deserves attention as a means of handling voluminous geodata that arrive in various forms and at high velocity. In addition, dimensionality reduction plays a key role for high-dimensional big data sets, including spatial data sets, which keep growing not only in observations but also in features or dimensions. In this paper, predictive analytics on geospatial big data using Principal Component Regression (PCR), that is, a traditional Multiple Linear Regression (MLR) model improved with Principal Component Analysis (PCA), is implemented on a distributed, parallel big data processing platform. The main objective of the system is to improve the predictive power of the MLR model by combining it with PCA, which removes insignificant and irrelevant variables or dimensions from that model. Moreover, the paper shows how data mining and machine learning approaches can be efficiently utilized in predictive geospatial data analytics. For experimentation, OpenStreetMap (OSM) data is used to develop a one-way road prediction for the city of Yangon, Myanmar. Experimental results show that the hybrid approach of PCA and MLR can be efficiently utilized not only for road prediction using OSM data but also for improving the traditional MLR model.
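
The PCA-then-regression idea in this abstract can be sketched as follows. This is a single-machine illustration under stated assumptions, not the paper's distributed implementation: it keeps only the first principal component, finds it by power iteration, and uses synthetic data with two nearly collinear features. The names `top_component` and `pcr_fit` are hypothetical.

```python
import random


def top_component(rows, iters=200):
    """Leading eigenvector of the sample covariance of centered rows,
    found by power iteration."""
    n, p = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def pcr_fit(X, y):
    """Regress y on the first principal component score of X and map the
    result back to an intercept and coefficients on the original features."""
    n, p = len(X), len(X[0])
    means = [sum(r[j] for r in X) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in X]
    v = top_component(centered)
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centered]
    y_mean = sum(y) / n
    gamma = (sum(s * (t - y_mean) for s, t in zip(scores, y))
             / sum(s * s for s in scores))
    coefs = [gamma * vj for vj in v]
    intercept = y_mean - sum(c * m for c, m in zip(coefs, means))
    return intercept, coefs


# Synthetic data (an assumption): two nearly collinear features, y = 1 + 2z.
rng = random.Random(1)
z = [rng.uniform(-1, 1) for _ in range(500)]
X = [[zi, zi + rng.gauss(0, 0.01)] for zi in z]
y = [1.0 + 2.0 * zi + rng.gauss(0, 0.1) for zi in z]

intercept, coefs = pcr_fit(X, y)
prediction = intercept + coefs[0] * 1.0 + coefs[1] * 1.0
print(intercept, coefs, prediction)
```

With collinear inputs an ordinary MLR fit splits the effect between the two features unstably, whereas regressing on the leading component shares it evenly, which is the kind of improvement over plain MLR the abstract describes.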