257 research outputs found

    How diverse is your team? Investigating gender and nationality diversity in GitHub teams

    Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
    Background: Building an effective team of developers is a complex task faced by both software companies and open source communities. The problem of forming a “dream” team involves many variables, including human factors, and it is not a dilemma that can be solved mathematically. Empirical studies might provide interesting insights into which factors need to be taken into account when building a team of developers and which levers act to optimise productivity among developers.
    Aim: In this paper, we present the results of an empirical study investigating the link between team diversity (i.e., gender, nationality) and productivity (issue fixing time).
    Method: We consider solved issues from the GHTorrent dataset, inferring the gender and nationality of each team’s members. We also evaluate the politeness of all comments involved in issue resolution.
    Results: Higher gender diversity is linked with a lower team average issue fixing time (higher productivity), nationality diversity is linked with lower team politeness, and gender diversity is linked with higher sentiment.
    Peer reviewed. Final Published versio

    PlosOne Reviewer

    Contribuições multivariadas na decomposição de uma série temporal [Multivariate contributions to the decomposition of a time series]

    One of the goals of time series analysis is to extract essential features from the series for exploratory or predictive purposes. Singular Spectrum Analysis (SSA) is a method used for this purpose: it transforms the original series into a Hankel matrix, also called the trajectory matrix. Its only parameter is the so-called window length. The singular value decomposition of the trajectory matrix allows the components of the series to be separated, since the structure in terms of singular values and vectors is somehow associated with the trend, oscillatory component, and noise. However, the visualization of the steps of the method is little explored or lacks interpretability. In this work, we take advantage of the results of a particular singular value decomposition, computed with the NIPALS algorithm, to implement a graphical display of the principal components using HJ-biplots, naming the method SSA-HJ-biplot. It is an exploratory tool whose main objective is to increase the visual interpretability of SSA, facilitating the grouping step and, consequently, the identification of characteristics of the time series. By exploiting the properties of HJ-biplots and setting the window length to half the series length, rows and columns of the trajectory matrix can be represented in the same SSA-HJ-biplot simultaneously and optimally. To circumvent the potential problem of structural changes in the time series, which can make it difficult to visualize the separation of the components, we propose a methodology for detecting change points and applying the SSA-HJ-biplot in homogeneous intervals, that is, between change points. This detection approach is based on sudden changes in the direction of the principal components, which are evaluated by a distance metric created for this purpose. Finally, we developed another visualization method based on SSA to estimate the dominant periodicities of a time series through geometric patterns, which we call the SSA Biplot Area. In this part of the research, we implemented a package in R called areabiplot, available on the Comprehensive R Archive Network (CRAN).
    Programa Doutoral em Matemátic
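
The embedding and decomposition steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation (their code is the R package areabiplot) and it does not draw any biplot. All function names below are hypothetical. The sketch builds the trajectory matrix, extracts only the leading singular triple with a NIPALS-style alternating iteration, and maps the rank-one matrix back to a series by diagonal averaging, which is the usual SSA reconstruction step and is assumed here rather than taken from the abstract.

```python
import math


def trajectory_matrix(series, window):
    """Embed a series of length N into an L x K Hankel matrix, K = N - L + 1."""
    n = len(series)
    if not 1 < window < n:
        raise ValueError("window length L must satisfy 1 < L < N")
    k = n - window + 1
    return [[float(series[i + j]) for j in range(k)] for i in range(window)]


def leading_component_nipals(X, tol=1e-12, max_iter=1000):
    """Leading singular triple (sigma, u, v) via NIPALS-style alternating updates."""
    rows, cols = len(X), len(X[0])
    # Start from the row with the largest norm so the first projection is nonzero.
    start = max(X, key=lambda r: sum(x * x for x in r))
    norm = math.sqrt(sum(x * x for x in start))
    if norm == 0.0:
        raise ValueError("matrix is all zeros")
    v = [x / norm for x in start]
    sigma = 0.0
    for _ in range(max_iter):
        # u <- X v, normalised (left singular vector estimate)
        u = [sum(X[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        norm_u = math.sqrt(sum(x * x for x in u))
        u = [x / norm_u for x in u]
        # w <- X^T u; its norm estimates sigma, its direction estimates v
        w = [sum(X[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        new_sigma = math.sqrt(sum(x * x for x in w))
        v = [x / new_sigma for x in w]
        converged = abs(new_sigma - sigma) <= tol * new_sigma
        sigma = new_sigma
        if converged:
            break
    return sigma, u, v


def rank_one(sigma, u, v):
    """Elementary matrix sigma * u * v^T."""
    return [[sigma * ui * vj for vj in v] for ui in u]


def diagonal_average(M):
    """Average the anti-diagonals of an L x K matrix into a series of length L + K - 1."""
    rows, cols = len(M), len(M[0])
    sums = [0.0] * (rows + cols - 1)
    counts = [0] * (rows + cols - 1)
    for i in range(rows):
        for j in range(cols):
            sums[i + j] += M[i][j]
            counts[i + j] += 1
    return [s / c for s, c in zip(sums, counts)]


# Toy series (an assumption for illustration): linear trend plus a period-6 oscillation.
series = [0.5 * t + math.sin(2 * math.pi * t / 6) for t in range(24)]
X = trajectory_matrix(series, window=len(series) // 2)  # window = half the series length
sigma, u, v = leading_component_nipals(X)
smooth = diagonal_average(rank_one(sigma, u, v))
print(len(smooth), [round(x, 2) for x in smooth[:5]])
```

Further components could be obtained by subtracting the rank-one matrix from the trajectory matrix and repeating the iteration (deflation). Grouping those components is the step the SSA-HJ-biplot is meant to support.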

    Mind the Gap between Demand and Supply. A behavioral perspective on demand forecasting

    Cryptocurrency ecosystems and social media environments: An empirical analysis through Hawkes’ models and natural language processing

    Copyright © 2021 The Author(s). Using a mixture of statistical models and natural language processing techniques, we analyse what happened on social media from June 2019 onwards to understand the relationships between cryptocurrencies’ prices and social media, focusing on the rise of the Bitcoin and Ethereum prices. In particular, we identify and model the relationship between cryptocurrency market price changes and the occurrence of sentiment and topic discussions on social media, using Hawkes’ model. We find that the occurrence of some topics and rises in sentiment on social media precede certain types of price movements. Specifically, discussions concerning governments, trading, and the Ethereum cryptocurrency as an exchange currency appear to negatively affect Bitcoin and Ethereum prices. Those concerning investments appear to explain price rises, whilst discussions related to new decentralized realities and technological applications explain price falls. Finally, we validate our model using a real case study: the already famous case of “Wallstreetbet and GameStop” that took place in January 2021.
    Funding: No funding was received for this work
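
As a rough illustration of what a Hawkes model expresses, the sketch below evaluates the conditional intensity of a univariate Hawkes process. The abstract does not state which kernel or parameterisation the authors use. The exponential kernel, the function name `hawkes_intensity`, and the parameter names `mu`, `alpha`, and `beta` are assumptions made here for illustration only.

```python
import math


def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Intensity at time t: mu + sum over past events t_i < t of alpha * exp(-beta * (t - t_i)).

    mu    : baseline rate of events
    alpha : jump in intensity caused by each past event
    beta  : exponential decay rate of that excitation
    """
    return mu + sum(
        alpha * math.exp(-beta * (t - ti)) for ti in event_times if ti < t
    )


# Hypothetical event times, e.g. moments at which a topic appears on social media.
events = [1.0, 1.5, 4.0]
for t in (0.5, 2.0, 3.0, 4.5):
    print(t, round(hawkes_intensity(t, events, mu=0.2, alpha=0.5, beta=1.0), 4))
```

Each past event raises the intensity by `alpha`, and that contribution decays at rate `beta`, so clustered events produce bursts of elevated intensity. Relating one event stream (posts) to another (price moves), as the abstract describes, would need a multivariate extension that this sketch does not cover.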

    Geo-physical parameter forecasting on imagery-based data sets using machine learning techniques

    Magister Scientiae - MSc
    This research objectively investigates the effectiveness of machine learning (ML) tools in predicting several geophysical parameters. It is motivated by a large number of studies that have reported high levels of prediction success using ML in the field. Several widely used ML tools, coupled with a number of different feature sets, are therefore used to predict six geophysical parameters, namely rainfall, groundwater, evaporation, humidity, temperature, and wind. The results of the research indicate that: a) a large number of related studies in the field are prone to specific pitfalls that lead to over-estimated results in favour of ML tools; b) the use of Gaussian mixture models as global features can provide a higher accuracy compared to other local feature sets; c) ML tools never outperform simple statistically-based estimators on highly seasonal parameters, and providing error bars is key to objectively evaluating the relative performance of the ML tools used; and d) ML tools can be effective for parameters that are slow-changing such as groundwater

    Experience Innovation in Tourism: The Role of Front-line Employees
